Chapter 1 - Big Data Analytics: A Hands-On Approach
Module 1
Fundamentals of Big Data and Techniques
Objectives:
After completing this chapter, students should be able to
1. Define the term Analytics.
2. Describe the characteristics of Big Data.
3. Explain each domain of Big Data.
4. Use data visualization for analytics flow of Big Data.
5. Describe each Big Data stack using various databases.
6. Map the analytics flow to the Big Data stack.
7. Create analytics patterns.
What is Analytics?
ANALYTICS
- is a broad term that encompasses the processes, technologies, frameworks,
and algorithms used to extract meaningful insights from data.
Example (descriptive analytics): Computing the total number of likes for a particular
post, the average monthly rainfall, or the average number of visitors per month on a website.
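The summaries named in this example are simple aggregations. The following is a minimal sketch, assuming small in-memory lists; all variable names and numbers are hypothetical and were invented for illustration.

```python
from statistics import mean

# Hypothetical sample data, invented for illustration only.
post_likes = [12, 7, 30, 5]                      # likes recorded in batches for one post
monthly_rainfall_mm = [80.0, 65.5, 120.0, 40.5]  # rainfall for four months
monthly_visitors = [1500, 1800, 2100]            # website visitors for three months

total_likes = sum(post_likes)             # total number of likes for the post
avg_rainfall = mean(monthly_rainfall_mm)  # average monthly rainfall
avg_visitors = mean(monthly_visitors)     # average number of visitors per month

print(total_likes, avg_rainfall, avg_visitors)
```

At big data scale the same sums and averages would be computed by a distributed framework rather than on in-memory lists, but the operations themselves are unchanged.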
Diagnostic Analytics
- It can provide more insight into why a certain fault has occurred, based on
the patterns in the sensor data for previous faults.
Example: A system that collects and analyzes sensor data from machines for
monitoring their health and predicting failures.
Predictive Analytics
- It is done using predictive models that are trained on existing data.
These models learn the patterns and trends in the existing data and
predict the occurrence of an event or the likely outcome of an event
(classification models) or forecast numbers (regression models).
Example: Predictive analytics can be used for predicting when a fault will occur in a
machine, predicting whether a tumor is benign or malignant, predicting the occurrence of
a natural emergency, or forecasting pollution levels.
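A regression model that forecasts a number can be sketched with an ordinary least-squares line fit. This is a minimal illustration, assuming one input feature (the day number) and hypothetical pollution readings; `fit_line` is a helper written for this sketch, not part of any library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: a pollution index observed on days 1 to 5.
days = [1, 2, 3, 4, 5]
pollution = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(days, pollution)  # "training" on existing data
forecast_day6 = slope * 6 + intercept         # forecast for an unseen day

print(slope, intercept, forecast_day6)
```

A classification model follows the same train-then-predict pattern but outputs a label (for example benign or malignant) instead of a number.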
Prescriptive Analytics
- It uses multiple prediction models to predict various outcomes and the
best course of action for each outcome.
Example: Prescriptive analytics can be used to prescribe the best medicine for the
treatment of a patient based on the outcomes of various medicines for similar patients.
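The medicine example can be sketched as "estimate the outcome of each option, then pick the best one". The sketch below is a deliberate simplification: it uses the observed recovery rate among similar patients as a stand-in for a trained prediction model, and the records, medicine names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (medicine, recovered) outcomes for similar patients.
records = [
    ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", True),
    ("C", False), ("C", True),
]

def recovery_rates(records):
    """Estimate the outcome of each medicine as its observed recovery rate."""
    totals = defaultdict(int)
    recovered = defaultdict(int)
    for medicine, ok in records:
        totals[medicine] += 1
        recovered[medicine] += int(ok)
    return {m: recovered[m] / totals[m] for m in totals}

def recommend(records):
    """Prescribe the medicine with the best estimated outcome."""
    rates = recovery_rates(records)
    return max(rates, key=rates.get)

print(recovery_rates(records))
print(recommend(records))
```

A real system would replace `recovery_rates` with prediction models and would weigh more than one outcome; the sketch only shows the step from predicted outcomes to a recommended action.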
Types of Analytics
What is Big Data?
BIG DATA
- It is defined as collections of datasets whose volume, velocity or variety is
so large that it is difficult to store, manage, process, and analyze the data
using traditional databases and data processing tools.
Volume
- Big data is data whose volume is so large that it does not fit on a single
machine; specialized tools and frameworks are therefore required to store,
process, and analyze such data.
Velocity
- It refers to how fast the data is generated.
Variety
- It refers to the forms of data, such as structured, unstructured, or
semi-structured data.
Veracity
- It refers to how accurate the data is.
Value
- It refers to the usefulness of data for the intended purpose.
- The end goal of any big data analytics system is to extract value from the
data.
- The value of the data is also related to the veracity or accuracy of the data.
Domain Specific Examples of Big Data
Applications of Big Data
- They span a wide range of domains including (but not limited to) homes, cities,
environment, energy systems, retail, logistics, industry, agriculture, Internet of
Things, and healthcare.
• Web
• Financial
• Healthcare
• Internet of Things
• Environment
• Logistics and Transportation
• Industry
• Retail
Web
- Web analytics deals with the collection and analysis of data on user visits
to websites and cloud applications. The data is collected in two ways:
1. User visits are logged on the web server, which collects data such as the date
and time of the visit, the resource requested, the user's IP address, and the HTTP
status code.
2. Page tagging uses JavaScript embedded in the web page. Whenever a user
visits the page, the JavaScript collects user data and sends it to a
third-party data collection server.
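For the first approach, server-side logging, the fields listed above can be pulled out of each log line with a regular expression. The sketch below assumes the server writes lines in the widely used Common Log Format; the sample line, the pattern, and the `parse_log_line` helper are illustrative assumptions, not something the source specifies.

```python
import re
from datetime import datetime

# Pattern for a Common Log Format line (assumed format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def parse_log_line(line):
    """Return the visit fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "ip": match["ip"],
        "time": datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "resource": match["resource"],
        "status": int(match["status"]),
    }

# Hypothetical log line (203.0.113.0/24 is a documentation-only address range).
sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
entry = parse_log_line(sample)
print(entry)
```

Parsed entries like this one are the input to descriptive web analytics, for example counting visits per resource or per status code.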
- Performance Monitoring: Multi-tier web and cloud applications, such as
e-commerce, business-to-business, healthcare, banking and financial, retail,
and social networking applications, can experience rapid changes in their
workloads.
- Users are shown advertisements ("ads") along with the search results as
they search for specific keywords on a search engine.
- Content Recommendation: Content delivery applications that serve
content (such as music and video streaming applications) collect various
types of data, such as user search patterns, browsing history, history of
content consumed, and user ratings.
Financial
- Real-time big data analytics frameworks can help analyze data from
disparate sources and label transactions in real time.
Healthcare
The healthcare ecosystem consists of numerous entities including healthcare
providers (primary care physicians, specialists, or hospitals), payers
(government, private health insurance companies, employers),
pharmaceutical, device and medical service companies, IT solutions and
services firms, and patients.
Internet of Things
- The "Things" in IoT are the devices that can perform remote sensing,
actuating, and monitoring.
Some IoT applications that can benefit from big data systems:
• Intrusion Detection
• Smart Parking
• Smart Roads
• Structural Health Monitoring
• Smart Irrigation
Environment
- Environment monitoring systems generate high-velocity, high-volume
data.
Some environment monitoring applications that can benefit from big data systems:
• Weather monitoring
• Air pollution monitoring
• Noise pollution monitoring
• Forest fire detection
• River floods detection
• Water quality monitoring
Logistics and Transportation
Some logistics and transportation monitoring applications that can benefit
from big data systems:
Industry
Some industry monitoring applications that can benefit from big data
systems:
Retail
Some retail monitoring applications that can benefit from big data systems:
• Inventory Management
• Customer Recommendations
• Production Planning and Control
• Store Layout Optimization
• Forecasting Demand
Analytics flow for Big Data
• Data collection
• Data preparation
• Analysis types
• Analysis modes
Big Data Analysis Flow
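The first stages of this flow can be sketched as a chain of small functions. This is a minimal batch-mode illustration under stated assumptions: the raw readings are hypothetical, the analysis type is a basic descriptive summary, and the function names `collect`, `prepare`, and `analyze` were chosen for this sketch.

```python
from statistics import mean

def collect():
    """Data collection: return raw readings as text (hypothetical values)."""
    return ["21.5", "22.0", "", "n/a", "23.5"]

def prepare(raw):
    """Data preparation: convert to numbers and drop malformed entries."""
    cleaned = []
    for item in raw:
        try:
            cleaned.append(float(item))
        except ValueError:
            continue  # skip entries that are not valid numbers
    return cleaned

def analyze(values):
    """Analysis: a basic descriptive summary, computed in batch mode."""
    return {"count": len(values), "mean": mean(values), "max": max(values)}

summary = analyze(prepare(collect()))
print(summary)
```

Keeping each stage as its own function mirrors the flow: the collection source, the cleaning rules, or the analysis type can be swapped without touching the other stages.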