Nijotech DBMS
All content following this page was uploaded by Oluwafemi A. Sarumi on 05 September 2021.
Abstract
The evaluation of mobile crowdsourcing activities and reports requires a viable and large volume of data. These data are gathered
in real-time and from a large number of paid or unpaid volunteers over a period. A high volume of quality data from smartphones
or mobile devices is pivotal to the accuracy and validity of the results. Therefore, there is a need for a robust and scalable database
structure that can effectively manage and store the large volumes of data collected from various volunteers without compromising
the integrity of the data. An in-depth review of various database designs to select the most suitable that will meet the needs of a
real-time, robust and large volunteer data handling system is presented. A non-relational database, Google Cloud Firestore, was
proposed for the mobile-end database, specifically due to its support for mobile client implementation. This choice also makes the
integration of data from the mobile end-users to the cloud-hosted database relatively easier, since all proposed services are part
of the Google Cloud Platform, although Cloud Firestore is not as popular as some other database services. Separate comparative reviews of the
Database Management System (DBMS) performance demonstrated that MongoDB (a non-relational database) performed better
when reading large datasets and performing full-text queries, while MySQL (relational) and Cassandra (non-relational) performed
much better for data insertion. Google BigQuery was proposed as an appropriate data warehouse solution. It will provide continuity
and direct integration with Cloud Firestore and its Application Programming Interface (API) for data migration from Cloud
Firestore to BigQuery and the local server. Google BigQuery also provides machine learning support for data analytics.
Keywords: Database management system, mobile applications, crowdsourcing, big data
*Corresponding author (Tel: +234 (0) 8101564160)
Email addresses: [email protected] (F. M. Dahunsi), [email protected] (A. J. Joseph), [email protected] (O.A. Sarumi), [email protected] (O.O. Obe)

1.0 INTRODUCTION
A database is a collection of data stored electronically such that information relevant to an enterprise, organization or entity can be retrieved conveniently and efficiently [1]. Although the terms “database” and “database management system (DBMS)” are used interchangeably, the former correctly refers to the collection of data which comprises all data significant to an organization, while the latter refers to the entire system, which comprises the collection of stored data (database) and a set of programs that handle operations through transactions on the stored data [2]. Appropriate database architecture is critical in ensuring reliable data, minimizing data duplication, effective query execution, high-performance usage and integration. The database architecture becomes more relevant at the point of analysis for the extraction of meaningful information. An important issue in the management of large amounts of data handled by a DBMS is data inconsistency. This occurs when different versions of the same data exist at different places and is usually made worse by data redundancy, which occurs when there are multiple entries of the same data in different places [3]. Modern DBMSs handle data consistency, redundancy, and several other issues by ensuring that every transaction has the following properties [2], [4]:

1. Atomicity: This requires that all concurrent operations or co-dependent operations must be successfully executed; if an error or failure occurs during the execution of any of these operations, the transaction should be terminated and all changes reversed.

2. Consistency: This ensures that changes made through transactions occur only in predefined ways by always following certain rules and constraints. An example of a constraint is the data type for a column in a relational database, which dictates the type of values that can be stored in that specific column; any attempt to store data of the wrong type results in a failed transaction.

3. Isolation: This defines the rule for cascading transactions and it restricts the use of data to only one transaction at a time and that data can only be
used by another once the current transaction is done. This property determines how changes made by one or more users are committed in the database and become visible to other users. It usually involves a trade-off between perfectly isolated transactions and concurrent transactions through the use of strategies such as serializability, which executes concurrent transactions serially depending on the business logic [5].

4. Durability: This enforces the changes made by a successful transaction and ensures that those changes cannot be reversed or lost except by the execution of another transaction.

5. Serializability: This property ensures that concurrent transactions are properly scheduled serially to ensure consistency in results.

DBMSs have become more important to the efficiency of several tasks ranging from day-to-day activities of large and small organizations to research in various fields. This is especially so in the past few decades, in which the amount of data collected and stored has risen exponentially according to several indicators [6]. This high growth rate of created data has highlighted how data can be underutilized due to poor management and the limitations of the present technology. Some of the features considered at various levels in ensuring that a suitable database was selected are normalization, data structure, and prioritized operations.
Database normalization involves the structuring of data into a specified normal form to reduce avoidable duplication of data and improve the integrity of database operations. This gives an advantage of efficiency for relational databases, which hold structured data, but this is not the case for non-relational databases.
In addition, whether the data will be in a structured, semi-structured, or unstructured format depends mainly on whether the DBMS is relational or non-relational and on the intended application of the data. In addition to database normalization and data structure, prioritized operations were examined to determine which operations have precedence. For instance, at the point of data collection, data write operations have priority, while at the most crucial stage (retrieving data for analysis) data read and update operations take priority. Hence priority varies, though retrieving data is of greater importance in this application because the accuracy of results hinges on the retrieved data.

2.0 OVERVIEW OF EXISTING APPROACH
Mobile crowdsourcing applications are software applications developed to leverage the widespread use of mobile devices, especially smartphones, which have an estimated 3.6 billion users worldwide [7] and 25-30 million users in Nigeria [8], and to interact with specific user demographics. They also gather data that reflects actual user experience and opinions, provided that the crowdsourcing project is properly planned and challenges such as crowd engagement with the applications, data and user privacy, data verification and crowdsourcing architecture, which are highlighted in [9]–[12], are considered. Some examples of crowdsourcing projects which integrated mobile applications are Waze [13], OpenStreetMap [14], OpenSignal [15], and disaster management and emergency routing with Google Maps [16].
This research focuses on the comparative analysis of various database management systems (DBMS) and selecting an appropriate one to be used for a mobile crowdsourcing application. The particular mobile crowdsourcing application system considered is one developed to evaluate the performance of mobile communication services through the measurement of certain voice and broadband key performance indicators (KPIs) on volunteers’ mobile devices. The KPIs were selected based on requirements from the Nigerian Communications Commission (NCC) for quality-of-service evaluations. For database design, a brief description and the expected data type for each KPI are presented in this section, though the data type can be changed if a different approach is taken for the computation of the metrics [17].

2.1 Voice Service KPIs
The Voice Service KPIs considered are the Call Setup Success Rate (CSSR), Radio Signal Quality and Strength, Handover Success Rate (HSR), Bit Error Rate (BER), and Traffic Channel Congestion (TCH-CONG).
• Call Setup Success Rate (CSSR): The call setup success rate is the percentage of successfully linked calls. The result from a single attempted call is stored as a Boolean value to indicate success or failure.
• Radio Signal Quality and Strength: This metric is a measure of the strength and the quality of signal received by a mobile device antenna. It is computed as integer values measured in dBm.
• Handover Success Rate (HSR): This is the percentage of successful switches between cell towers, which are usually attempted when a mobile device moves to a distance where the signal quality it is currently receiving is weak. The result from a handover attempt is stored as a Boolean value to indicate success or failure.
• Bit Error Rate (BER): This is a measurement of the end-to-end bit error that occurs in the data transmitted during a voice call. The result from a single voice call is stored as a floating-point value.
• Traffic Channel Congestion (TCH-CONG): This indicates the level of unavailability of resources needed for voice services, which results in blocked calls. A single test gives a Boolean value result that indicates whether or not there is congestion.
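The data types listed for the voice KPIs can be illustrated with a minimal sketch. The class name, field names, sample values, and the choice to record "no handover attempted" as `None` are all assumptions made for illustration; only the per-KPI data types come from the text. The sketch also shows how percentage metrics such as CSSR and HSR might be derived from the stored Boolean results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceKpiSample:
    """One voice-service measurement (hypothetical field names)."""
    call_setup_success: bool          # CSSR: result of a single attempted call
    signal_strength_dbm: int          # radio signal quality and strength, in dBm
    handover_success: Optional[bool]  # HSR: None if no handover was attempted (assumption)
    bit_error_rate: float             # BER for a single voice call
    tch_congested: bool               # TCH-CONG: whether congestion was observed


def success_rate(flags: List[bool]) -> float:
    """Return the percentage of True values, or 0.0 for an empty list."""
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Made-up sample values, for illustration only.
samples = [
    VoiceKpiSample(True, -85, True, 0.001, False),
    VoiceKpiSample(True, -97, None, 0.004, False),
    VoiceKpiSample(False, -110, False, 0.020, True),
    VoiceKpiSample(True, -90, True, 0.002, False),
]

cssr = success_rate([s.call_setup_success for s in samples])
hsr = success_rate(
    [s.handover_success for s in samples if s.handover_success is not None]
)
print(f"CSSR = {cssr:.1f}%, HSR = {hsr:.1f}%")
```

Storing the raw per-attempt Booleans rather than a precomputed percentage keeps the aggregation step adjustable, which matches the remark that the data type may change if the metric is computed differently.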
2.2 Broadband Service KPIs
The Broadband Service KPIs considered are Download Speed, Upload Speed, Domain Name Service (DNS) Lookup, Network availability, and Video streaming experience.
• Download Speed: This is a measure in megabits per second of the amount of data a user receives from a server. The measured values are stored as floating-point values.
• Upload Speed: This is a measure in megabits per second of the amount of data a user sends to a server. The measured values are stored as floating-point values.
• Domain Name Service (DNS) Lookup: This metric is the time taken in milliseconds to successfully send a query for the internet protocol (IP) address of a domain name and get a response. Results are stored as integer values.
• Network availability: This is the percentage measure of how often broadband service can be accessed in a given period. The measured values are stored as floating-point values.
• Video streaming experience: This represents the perceived experience a user gets when using a video streaming service. The measured values are stored as floating-point values.
This study consulted official documentation of standard DBMSs, their features and specifications, and previous works where similar systems were implemented. Some important requirements of a mobile crowdsourcing application database are:

a. a flexible schema design that makes it much easier to update the database to handle changing application requirements.
b. an ability to seamlessly and effectively scale the database and local server as data grows.
c. ready availability and ease of use.
The quality of analysis gleaned from gathered data is highly dependent on the database implementation. Several researchers have used different database architectures in the implementation of mobile crowdsourcing applications. In [18], mobile broadband performance measurement was carried out using a MySQL database design for about one hundred Mobile Network Operator (MNO) subscribers in two cities in Nigeria. This coverage is less than the projected number of users and geographical coverage for this current research.
The SQLite database technology used in [19] is also SQL-based and implements a structured database, but it has constraints similar to [18], namely the limited number of users and geographical coverage. Aside from these research papers mentioning the DBMS used, no further insight was offered as to why these specific systems were used. A comparison of a relational (MySQL) and a non-relational (MongoDB) database was done in [20]. The approach of the comparative study carried out there was to first explain the difference in how data was organized, with the former implementing a table format while the latter uses a document-collection based structure. The review then moved on to benchmarking four major operations (insert, select, update and delete) for large amounts of data, and the results showed that MongoDB provided better execution times for all four operations. It concludes that although a non-relational database performed better for large amounts of data, the choice ultimately depends on the particular application that is to be integrated with the database. A more specific comparative study of document-based DBMSs was carried out in [21]. It focused on certain features of the compared DBMSs, with an eventual conclusion that each DBMS classification addresses specific requirements that suit different application implementations.
Section three of this study explains the types of databases based on usage and application. Section four presents the evaluation of DBMS using four distinct parameters of the system: mobile application, cloud application, web application, and the local server. Section five discusses the results and presents the comparative analysis. Section six presents the critical analysis gleaned from the review and proposed research areas that could be investigated in subsequent works.

3.0 AN OVERVIEW OF DATABASE SYSTEMS
It is important to note that the term “database” covers more than just data but is used to generally refer to the data, database management system (DBMS), and all associated applications [22]. An operational database is used to manage and store data in real-time. It is the source of information for the data warehouse and it is set up to work efficiently with a high volume of transactional processing. It deals mainly with operational information, which is the type of information required for day-to-day routine activities [23].
A data warehouse system is used for services that involve data analysis and decision-making [23]. Data warehouses deal with strategic information, which must have a uniform and consistent view and be conveniently available, accessible, and correct. These systems are optimized mainly for read operations, with medium access frequency, larger amounts of accessed data, and much more complex queries compared to operational databases, although they may frequently interact with the operational database for data. In a basic sense, the create, read, update and delete operations are the main activities of an operational database, which includes both relational and non-relational databases.

3.1 Comparative Analysis of Database Types
A relational database uses a structure that helps the user to define and access data in the database in relation to other pieces of data. It is a collection of tables representing both data and data relationships [22].
The physical arrangement of each table (known as a relation) is similar to that of a spreadsheet, having multiple columns which are labelled with unique names and records (rows) of various types. An instance of a record specifies a set number of fields or attributes, and table columns refer to the attributes of the record type, constraints, and data types. The outcome of a relational database is structured data that fits perfectly into defined fields and columns, and it is managed using Structured Query Language (SQL), which has several flavours such as SQLite and MySQL.
A non-relational database is a database that does not follow the rows and column tabular structure used in most conventional database systems. Instead, non-relational databases utilize a retrieval model that is designed to satisfy the particular needs of the data format being processed [24]. The outcome of this is unstructured and semi-structured data. Unstructured data has loose formatting with limited structure; it consists mainly of qualitative data (text, audio, video, satellite imagery, etc.) [1]. Semi-structured data is data comprising semantic tags and elements (known as metadata) but not perfectly compatible with the framework associated with traditional relational databases [1]. This data model has some form of structure because similar entities are grouped and organized in a hierarchical format, and it can consist of both quantitative and qualitative data. Both unstructured and semi-structured data are managed using NoSQL technologies like MongoDB and Firebase.
A distributed database is a collection of different databases at different geographical locations (sites), connected over a network and managed as a single database [1]. It provides the advantage of improved performance and system reliability. A distributed database can be implemented by data fragmentation, allocation, and replication [1], [25]. Fragmentation involves breaking up data into chunks and storing them in the constituent systems of the distributed database, allocation is the operation by which fragments are placed in the available distributed infrastructures, and replication involves the duplication of data in the various available systems that make up the distributed database. This duplication serves as backup and fallback in the event of the failure of any constituent system but comes at the expense of cost and data redundancy. Most cloud database services use this model to ensure that data is secure and always available to users quickly and consistently.
A comparison between relational databases (structured data) and non-relational databases (semi-structured and unstructured data) is presented in Table 1 from the standpoint of data structure, which is fundamental to all databases. Relational database options for data with a predefined schema that is not easily changed are available across various platforms, e.g. MySQL. These are quite straightforward to use alongside object-relational mapping (ORM) libraries, which are available in various programming languages, but they do not easily scale up when data grows quickly [26]. However, a relational database is better suited for quantitative data, which favours statistical analysis, compared to a non-relational database.
A non-relational database permits a structure that does not necessarily have a rigid schema but can easily accommodate changes to how data is organized or how relationships exist. It is also readily available in variations, e.g. MongoDB. It can be integrated using well-established object-relational mapping (ORM) libraries in various programming languages. A non-relational database scales better with data that grows quickly and can hold both unstructured and semi-structured data. The former holds mainly qualitative data (useful for categorization) and the latter includes both qualitative and quantitative data; this property makes it suitable for complex data collection and analysis.
It is important to note that gathering and storing data for analysis goes beyond an eventual purpose of displaying data in charts; more importantly, it involves extracting useful information from the gathered data [29], and the structure of the data inevitably affects this. The measurement of parameters for mobile communication quality of service can be implemented using several algorithms. This implies that during the development of the mobile application different algorithms may be implemented, with the consequence of modifying the database schema to accommodate new attributes of the data such as the data type or association constraints [29]. Hence there is a need for a system that can accommodate such changes. Even though a relational database is best suited for quantitative data and a non-relational database for qualitative data, both database types can still handle either form of data well. An example is the PostgreSQL relational database, which has a JSON data type that can hold qualitative data [30]. As mobile telecommunications technology evolves with the recent introduction of newer services such as 5G and Voice over LTE (VoLTE), the infrastructure for this research must suitably accommodate changes and continued research. Hence, a mutable database structure will be an advantage here.
Additional comparative insights and rankings were retrieved from the DB-Engines ranking, which calculates ranking scores based on the number of mentions of the system online, frequency of technical discussions, general interest, etc. [31], and not on the actual performance of the DBMS. Figure 1 shows the ranking scores of some select database systems. It shows that relational databases like Oracle and MySQL are much more popular than their non-relational counterparts like MongoDB. This can be attributed to the fact that they have been around longer than non-relational DBMSs and hence have been and are still used more frequently.
Table 1: Comparison between Relational Databases (structured data) and Non-relational Databases (semi-structured and unstructured data).
Characteristics | Relational (Structured) | Non-relational (Semi-structured and unstructured)
Type of data | Predefined and rigid schema | Dynamic schema configuration
Size of data | Scaling is vertical and expensive because when relational databases become large they have to be scaled to a more powerful server which comes at a higher cost [26]. | Scaling is horizontal and cheaper since non-relational databases can be scaled by distributing a single database over multiple servers that are not required to be more powerful than the ones already in use [27]. An example can be seen from the pricing of DigitalOcean droplets: horizontally scaling two basic droplets cost $10/month each and will provide the same specifications as a single $20/month basic droplet [28] but with an extra 20GB of SSD disk storage.
Applicability | Well suited for quantitative data | Well suited for qualitative data
Availability | Readily available across various platforms (MySQL, SQL Server, etc.) | Readily available across various platforms (MongoDB, Firebase, etc.)
Ease of use | Modifiable, supported, friendly user interface (MySQL Workbench) and availability of object-relational mapping (ORM) libraries | Modifiable and supported friendly user interfaces (MongoDB Atlas, Firebase Console), and availability of object-relational mapping (ORM) libraries
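The "Type of data" row of Table 1 can be illustrated with a small sketch of what happens when a measurement algorithm starts producing a new attribute. This is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for a relational DBMS, a plain list of dictionaries stands in for a document store, and the table, column, and field names (`download_speed`, `latency_ms`, and so on) are hypothetical.

```python
import sqlite3

# Relational side: the schema is fixed up front (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE download_speed (device_id TEXT NOT NULL, mbps REAL NOT NULL)"
)
conn.execute("INSERT INTO download_speed VALUES (?, ?)", ("dev-1", 12.5))

# Writing a new attribute fails until the schema itself is migrated.
try:
    conn.execute(
        "INSERT INTO download_speed (device_id, mbps, latency_ms) VALUES (?, ?, ?)",
        ("dev-2", 9.8, 41),
    )
    rejected_before_migration = False
except sqlite3.OperationalError:
    rejected_before_migration = True

conn.execute("ALTER TABLE download_speed ADD COLUMN latency_ms INTEGER")
conn.execute(
    "INSERT INTO download_speed (device_id, mbps, latency_ms) VALUES (?, ?, ?)",
    ("dev-2", 9.8, 41),
)
rows = conn.execute(
    "SELECT device_id, mbps, latency_ms FROM download_speed ORDER BY device_id"
).fetchall()

# Document side: each record carries its own structure, so a new field
# can appear on newer documents without touching the older ones.
documents = [
    {"device_id": "dev-1", "mbps": 12.5},
    {"device_id": "dev-2", "mbps": 9.8, "latency_ms": 41},
]

print(rejected_before_migration, rows)
print([sorted(d) for d in documents])
```

The relational path keeps every row uniform (older rows gain a NULL in the new column), while the document path leaves older records untouched and pushes the handling of missing fields to the reader of the data.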
Figure 2: System architecture for the crowdsourced mobile communication quality of service analysis system
4.1 Mobile Application DBMS
Databases are essential to most smartphone applications and selecting the correct one is key to the performance of the application. User details and interactions at specific points during the use of the application are required to be stored for various uses such as displaying the same information at a later time or using the information to ensure consistent and correct updates and read on a more diverse database. Features of databases for mobile applications are:
read/write, or misappropriation of user data and even the mobile client application.
viii. Secure data at rest and in motion: Data at the stages where it is stored both locally or on the cloud and when it is being moved between the client and the cloud database should be secured from external and unauthorized access and also should not be lost.
ix. Good access to data: The database should be readily accessible at any instant when there is a request by the mobile client.
efficiently handled by the database host service provider.
2. Speed: This is a foreground feature that defines operational capability. Delays attributable to requests in the database would be significant because this is tangible and impacts how the client perceives the application’s performance [1]. Improvements can be made through strategies such as distributed databases, and browser features like caching can be used to improve speed performance for data retrieval on web applications.
3. Structure: This depends mainly on the business solutions for which the application will be used and subsequently the type of data that will be generated and stored; this data may be structured, semi-structured, or unstructured.
4. Availability: Data is always available to connected clients with multiple views for specific users [34].
5. Data consistency and security: Security is essential in preventing data breaches and loss, and in preserving availability [35]. Also, data consistency must be guaranteed and modifications must be constrained to set down rules to ensure that any document or data field can only be changed in a specific way [36].

4.4 The Local Server DBMS
The database becomes increasingly important down the workflow. Data on the cloud database will be shipped to the local server for further analysis and storage. Exporting data from the cloud to the local server can be handled by a scheduled job that performs automated shipping of data from Cloud Firestore to the local server, and it is best implemented using Firestore server SDKs [37]. The exported data retains its document-based structure, which does not pose many difficulties for analysis, and the management of data on the local server can be done using the Firestore server SDKs or cloud functions [38].
Data archived in the local server database is stored in formats that are suited for how they will be used. This research on mobile communication quality of service, for instance, requires this data to be available locally for analysis; therefore, data that is pulled from the cloud database will be stored in a format (possibly .csv) that will serve the purpose of data analysis. The focus of the review here is on the server application and not particularly on the data. Some features of databases for the Local Server DBMS are:

1. High storage capacity.
2. Cost-effective
3. Low downtimes
4. High-end computers to ensure minimum downtime.

5.0 SELECTED DATABASE FOR MOBILE CROWDSOURCING APPLICATION

5.1 Database Models (Services) for Cloud DBMS
Database Models (Services) considered for the Cloud DBMS are Firebase Real-Time Database, Cloud Firestore, MongoDB, PostgreSQL, Redis, Neo4j, and Cassandra. The non-relational databases were chosen in a way that the four major categories (Document stores, Column stores, Key-value stores, and Graph stores) are represented; PostgreSQL is the exception, being a relational database.
a. Firebase Real-Time Database
It is a NoSQL cloud-hosted database service offered by Firebase Inc. Data is stored as a JSON tree and synchronized in real-time to every connected client, and it remains available when the application goes offline. Because data is stored as one simple large JSON tree, this generally gives efficiency and guarantees optimized performance in the execution of high-volume queries without delay or loss (low-latency advantages) [39].
b. Cloud Firestore
It is a NoSQL, document-based cloud-hosted database service also offered by Firebase Inc. Documents are grouped into collections and may further point to other sub-collections. It features queries that are much faster and more efficient than those of Firebase Real-Time Database and it also has better scalability [40]. The speed advantage it has over Firebase Real-Time Database is attributed to all queries being indexed by default; this ensures that query performance is proportional to the size of the result set, unlike Firebase Real-Time Database, whose querying performance degrades as data grows [41].
c. MongoDB
It is similar to Cloud Firestore in that each document is grouped into collections and it is a NoSQL, document database. It is an open-source DBMS service offered by MongoDB Inc. and it is the most popular NoSQL database according to DB-Engines rankings [31], [42].
d. PostgreSQL
This is an open-source relational database where data is organized in tables, columns, and rows. It can be deployed on a self-managed cloud server or a fully managed cloud service.
e. Redis
This is a key-value store type non-relational database that natively provides fast response time, hence its common use as a caching database and for applications that carry out heavy computation on query results that are to be sent to a client (mobile or web), since it significantly reduces query time.
f. Neo4j
This is a graph store type non-relational database mostly used for systems that are heavily reliant on
relationships that exist between data and uses its graph architecture to optimize complex queries.
g. Cassandra
This is a column store type non-relational database particularly built for storing a large amount of data quickly, and it easily scales when data becomes large.

Each of the compared database services provides various data types that cover whatever will be required, easy data migration features using REST APIs or command-line tools [43], [44], and can be easily integrated with mobile clients using reliable SDKs. However, MongoDB’s solution does not provide as wide a range of cloud database management tools compared to the others [45]–[47], especially when the database is not hosted using MongoDB Atlas, which is a more expensive alternative to hosting on a personal cloud server. Management tools are also provided as part of the cloud database service for Firebase Real-Time Database and Cloud Firestore.
Limits to data storage size are based solely on subscription plans in the cases of Cloud Firestore and Firebase Real-Time Database [48], while in the case of MongoDB they are based on the configuration setup on MongoDB Atlas [49]. Firebase Real-Time Database has the best presence support for native mobile applications [41] amongst all three, although Cloud Firestore can leverage Firebase Real-Time Database for this functionality [41], and pricing is generally cost-effective on pay-as-you-go plans [42], [48], [50].
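Section 4.4 notes that data exported from the cloud database retains its document-based structure and may be stored locally in a format such as .csv. A minimal sketch of one possible flattening step is given below. It uses plain Python dictionaries in place of exported documents and does not call any Firestore SDK; the function names, field names, and dotted-column convention are assumptions made for illustration.

```python
import csv
import io
from typing import Any, Dict, List


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested dictionaries into a single level with dotted keys."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_csv(docs: List[Dict[str, Any]]) -> str:
    """Write documents with differing fields into one CSV table."""
    rows = [flatten(d) for d in docs]
    # The column set is the union of all fields seen in any document.
    columns = sorted({column for row in rows for column in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical exported documents; the second lacks one field and adds another.
exported = [
    {"device_id": "dev-1",
     "broadband": {"download_mbps": 12.5, "dns_lookup_ms": 38}},
    {"device_id": "dev-2",
     "broadband": {"download_mbps": 9.8},
     "voice": {"call_setup_success": True}},
]

csv_text = to_csv(exported)
print(csv_text)
```

Missing fields become empty cells, so documents written under different versions of the mobile application can still land in one table for analysis.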
Table 2: Comparison between Selected Database Models (Services) for the Cloud DBMS.
Characteristics | Firebase Real-Time Database | Cloud Firestore | MongoDB | PostgreSQL | Redis | Neo4j | Cassandra
Type of data | Supports most data types and generally structured in a JSON tree. | Supports most data types, Cloud Firestore references and generally structured in a JSON tree. | Data structures are represented in JSON, internal data is stored as BSON. | Supports a wide range of data types for table columns which constrains the values to be stored [51]. | Supports a wide range of data types with the basic being String which covers various data types [52] | Basic data types are supported [53]. | Supports a wide range of data types with the aforementioned basic data types covered by its native data type
The comparison in Table 2 shows that all the databases discussed have features that cover each requirement, except in some specific cases:
1. PostgreSQL does not easily scale when data grows large.
2. Mobile client integration is provided by default only by Cloud Firestore and Firebase Realtime Database; the others require an additional API layer to connect the database to the mobile application.
The proposed procedure for gathering data in this research is not labour-intensive (unlike a physical survey), which makes the exact amount of data to be gathered unpredictable. At this point, this window of uncertainty will only be reflected in the pricing incurred from the cloud storage provider and in the physical storage capacity and limits of the local server used for analysis.
A cloud-hosted database removes the burden of setting up synchronization configuration procedures with the mobile client because it comes with an easy-to-configure cloud service. Cloud-hosted databases also solve issues of data distribution during scale-up operations (which is handled by the cloud host) and availability of data [41]. Cloud Firestore, for instance, is a Backend-as-a-Service (BaaS) deployment option under the larger Google Cloud Platform, providing server-side services for mobile and web clients.
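The effect of the uncertain data volume noted above on storage pricing and local capacity can be bounded with a rough calculation. The sketch below is illustrative only: the volunteer counts, records per day, and record size are hypothetical, and the two unit prices are the storage rates quoted in this paper for Cloud Firestore ($0.18/GiB) and Firebase Realtime Database ($5/GB) [42], [48], [50], assumed here to be monthly rates. Only storage is costed; any other usage charges are ignored.

```python
# Rough storage estimate for a crowdsourcing deployment.
# Workload figures are hypothetical; prices are the storage rates quoted
# in this paper, assumed (not confirmed here) to be monthly rates.

GIB = 1024 ** 3  # bytes in one gibibyte (binary unit)
GB = 1000 ** 3   # bytes in one gigabyte (decimal unit)

FIRESTORE_USD_PER_GIB = 0.18
REALTIME_DB_USD_PER_GB = 5.00


def stored_bytes(volunteers, records_per_day, bytes_per_record, days):
    """Total bytes accumulated after `days` of collection."""
    return volunteers * records_per_day * bytes_per_record * days


def monthly_storage_cost(total_bytes):
    """Return (Cloud Firestore, Realtime Database) storage cost in USD."""
    firestore = total_bytes / GIB * FIRESTORE_USD_PER_GIB
    realtime = total_bytes / GB * REALTIME_DB_USD_PER_GB
    return firestore, realtime


# Compare a low and a high participation scenario (both hypothetical).
for label, volunteers in [("low", 500), ("high", 50_000)]:
    size = stored_bytes(volunteers, records_per_day=100,
                        bytes_per_record=1_000, days=30)
    fs, rt = monthly_storage_cost(size)
    print(f"{label}: {size / GB:.1f} GB stored, "
          f"Firestore ${fs:.2f}, Realtime Database ${rt:.2f}")
```

The same byte count also gives a lower bound on the disk space the local analysis server would need for one extraction window.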
The database proposed for the cloud DBMS is a non-relational database. For cloud storage, Cloud Firestore is proposed; the main reasons for choosing it are as follows:
1. Cloud Firestore does not have native presence support for instant synchronization of data, but it can leverage Firebase Realtime Database's support by syncing Cloud Firestore and Realtime Database using Cloud Functions. The other databases compared do not have this feature; hence they would require the software developers to create an additional API service to connect the mobile application to the cloud database.
2. Cloud Firestore scales further than Firebase Realtime Database: Cloud Firestore scales up to 1 million concurrent connections, while Firebase Realtime Database scales to about 200,000 concurrent connections [40]. PostgreSQL does not scale very well, as it is a relational database that has to scale vertically, and its distributed clusters are not as easy to manage as those of the non-relational databases compared. MongoDB, Redis, Neo4j, and Cassandra all scale easily.
3. MongoDB was filtered out because it does not have active support for mobile clients, compared with both Firebase Realtime Database and Cloud Firestore [45]–[47], which are more commonly used for mobile applications and are preferred by mobile application developers.
4. Hosting Cloud Firestore is much cheaper than hosting Firebase Realtime Database: according to the official pricing lists, Cloud Firestore storage is priced at $0.18/GiB and Firebase Realtime Database storage at $5/GB [42], [48], [50]. The cost of hosting a cloud instance of the other compared databases depends on whether it is hosted on a self-managed server or a fully-managed database service; the former is usually charged for just the infrastructure, while the latter is charged based on storage and query usage.
Firebase Realtime Databases are limited to zonal availability in a single region, whereas Cloud Firestore ensures data is shared across multiple data centres at once, which will provide strong consistency of data at any instant [41].
5.2 Database Models (Services) for the Web Application DBMS
The requirements for a Web Application DBMS, namely operational database operations such as the management, authentication, and authorization of users (stakeholders) as seen in Figure 2, and the evaluation of aggregated results on the web application, can be efficiently handled by the existing cloud database, which is already managing metrics measurement data. This can be achieved using an API layer that will interface with the database and execute the database operations needed by the web application for operational data. Therefore, the review already carried out for the Cloud Application Database covers the database service for the web application. The decision to use an API layer instead of a separate database for the web application ensures that a centralized cloud database service is used for the entire system, which subsequently saves the cost of running multiple cloud database services.
5.3 Database Models (Services) for the Local Server DBMS
The requirements for the local database server are that the server performs an intermittent extraction and storage of data from the cloud database to local storage and also grants data access to other computers that are securely connected locally for analysis. Therefore, the decisions to be made concerning the server depend on optimization plans, developers’ preference, and the existing cloud database service.
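The intermittent extraction described above can be sketched as a small incremental sync job. This is a minimal illustration under stated assumptions, not the procedure used in this study: `fetch_since` is a hypothetical stand-in for whatever SDK call or export script the deployed cloud service provides, the record fields (`id`, `ts`, `payload`) are invented for the example, and SQLite is used only because it ships with Python, not as a recommended local DBMS.

```python
import sqlite3


def fetch_since(cloud_records, last_ts):
    """Hypothetical stand-in for a cloud query: records newer than last_ts."""
    newer = (r for r in cloud_records if r["ts"] > last_ts)
    return sorted(newer, key=lambda r: r["ts"])


def open_local(path=":memory:"):
    """Open (or create) the local store used for analysis."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS measurements ("
        "id TEXT PRIMARY KEY, ts INTEGER NOT NULL, payload TEXT NOT NULL)"
    )
    return db


def sync_once(db, cloud_records):
    """Copy records newer than the newest local one; return rows copied."""
    (last_ts,) = db.execute(
        "SELECT COALESCE(MAX(ts), 0) FROM measurements"
    ).fetchone()
    new_rows = fetch_since(cloud_records, last_ts)
    with db:  # commits on success, rolls back if an insert fails
        db.executemany(
            "INSERT OR REPLACE INTO measurements VALUES (:id, :ts, :payload)",
            new_rows,
        )
    return len(new_rows)


# Example run against an in-memory list standing in for the cloud database.
cloud = [
    {"id": "a", "ts": 1, "payload": "{}"},
    {"id": "b", "ts": 2, "payload": "{}"},
]
local = open_local()
print(sync_once(local, cloud))  # first run copies both records
print(sync_once(local, cloud))  # second run finds nothing new
```

In practice `sync_once` would be run on a schedule, and other machines on the local network would read from the local store rather than from the cloud. Using the newest local timestamp as a watermark is a simplification: records that arrive late with an older or equal timestamp would be skipped, so a real job would need a more careful cursor.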
a. Optimization: The local server could be run on a regular desktop operating system environment like Windows, but the desktop environment would compete for the computing power of the computer with the primary assignments of the server mentioned above [61]. For an optimized server, the computer should run a dedicated server operating system like Ubuntu Server.
b. Developers’ Preference: This particularly determines the server’s operating system; the choice of the operating system depends on which one the developer who will set up the system can efficiently work with.
c. Existing Cloud Database Service: The already deployed cloud database service, from which the local server will extract data, determines what SDK or software is used for the extraction. For instance, Firebase has cloud functions that can be implemented with custom scripts that will run on the server to execute data exports [38].
6.0 DISCUSSION
A robust mobile crowdsourcing application data management system is heavily reliant on data for its functionality; the word robust here signifies a system that covers most of the requirements for a DBMS that were identified in this study. In this survey, different features of DBMSs were compared, along with previous implementations of crowdsourcing mobile quality of service measurement systems and the benchmark tests and reports carried out by [62], [63]. Although database normalization gives a huge advantage of reducing data redundancy, it is most suited for structured data, which requires a predefined and rigid schema. As pointed out in this review, normalization was therefore not so desirable, and a flexible data schema will be more suitable. Database query optimization by the DBMS service providers, in combination with standard object-relational mapping libraries, ensures that the efficiency lost by not normalizing structured data is compensated for. Therefore, a non-relational database was proposed for the mobile-end database, with Google Cloud Firestore proposed specifically due to its support for mobile client implementation. This choice also makes the integration of data from the mobile end-users to the cloud-hosted database relatively easier, with all proposed services being part of the Google Cloud Platform, although Cloud Firestore is not as popular as some other database services, as seen in Figures 1 and 3. Separate comparative reviews of DBMS performance by [62], [63] demonstrated that MongoDB (a non-relational database) performed better when reading large datasets and performing full-text queries, while MySQL (relational) and Cassandra (non-relational) performed much better for data insertion.
Google BigQuery was proposed as an appropriate data warehouse solution since it will provide continuity and direct integration with Cloud Firestore and its APIs for data migration from Cloud Firestore to BigQuery and to the local server. Google BigQuery also provides machine learning support for data analytics. Furthermore, this study eliminated the need for a separate database for the web application. Because the web application mainly serves to visualize results gathered from the analysis of data collected from the mobile clients, the required functional data can be provided through the API layer, which serves as an intermediary with specific endpoints and provides the web application with the required data. This also achieves separation of concerns, as the web application deals only with requesting data via the API endpoints and with visualization, while the API handles all queries on the database. It also optimizes the use of the already existing database for the mobile client, which the API will query for the functional data required by the web application.
7.0 CONCLUSION
In this review, a comparison of DBMSs was carried out for a mobile crowdsourcing application and its analysis. A broad overview of database types was given through the description of their features and suitable areas of application; this, together with previous implementations of mobile applications and database performance benchmarking reports, helped to narrow down the vast DBMS options. A mobile crowdsourcing application should be flexible, considering the continuous evolution of mobile communication technology. It is therefore necessary that new and existing systems be capable of accommodating changes, instead of entirely new systems being developed due to innovations; to this effect, a non-relational database was chosen in this review. Also, when choosing a DBMS, it is important to consider how efficient such a system will be at the point of extracting data for analysis and comprehensive results; it will not do much good if the data gathered cannot serve this end purpose well. Data should exist in a format or structure that will ensure that analysis and its results can be conveyed empirically. The proposed focus for subsequent works is that the process of choosing a DBMS should be properly documented, and the basis for the preference reviewed, in upcoming literature on mobile applications, because of its importance to overall system accuracy and sustainability.
ACKNOWLEDGEMENT
This research was funded by the Nigerian Communication Commission Research Fund Grant 2020.
REFERENCES
[1] Silberschatz, A., Korth, H. F. and Sudarshan, S. Database System Concepts (7th. edition), (2019).
[2] Carlos, C., Steven, M. and Rob, P. "Database Systems: Design, Implementation and Management", 13th ed.
CENGAGE, (2010).