Chapter 1, Big Data Analytics: A Hands-On Approach

Module 1 covers the fundamentals of Big Data and analytics, defining key terms and explaining various types of analytics such as descriptive, diagnostic, predictive, and prescriptive. It discusses the characteristics of Big Data, including volume, velocity, variety, veracity, and value, along with domain-specific applications across industries like finance, healthcare, and logistics. The module also introduces the analytics flow for Big Data, outlining steps from data collection to analysis.

Uploaded by

tvtolentino1
Module 1: Fundamentals of Big Data and Techniques
Objectives:
After completing this chapter, students should be able to:
1. Define the term analytics.
2. Describe the characteristics of Big Data.
3. Explain each domain of Big Data.
4. Use data visualization for the analytics flow of Big Data.
5. Describe each Big Data stack using various databases.
6. Map the analytics flow to the Big Data stack.
7. Create analytics patterns.
What is Analytics?
ANALYTICS
- is a broad term that encompasses the processes, technologies, frameworks,
and algorithms used to extract meaningful insights from data.

- is the process of extracting and creating information from raw data by
filtering, processing, categorizing, condensing, and contextualizing the data.
Types of Analytics
1. Descriptive Analytics – what happened?

2. Diagnostic Analytics – why did it happen?

3. Predictive Analytics – what is likely to happen?

4. Prescriptive Analytics – what can we do to make it happen?


Descriptive Analytics

- It involves analyzing past data and presenting it in a summarized form that
can be easily interpreted.

- It uses statistical methods such as counts, maximum, minimum, mean,
top-N, and percentage.

- It is useful for summarizing data.

Example: Computing the total number of likes for a particular post, computing the
average monthly rainfall, or finding the average number of visitors per month on a website.
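The statistical methods listed above can be sketched in a few lines of Python. This is a minimal illustration only: the monthly visitor counts and the helper names (`describe`, `top_n`, `percentage`) are assumptions made up for this example and do not come from the module.

```python
from collections import Counter
from statistics import mean

# Hypothetical monthly website visitor counts (illustrative values only).
monthly_visitors = {"Jan": 1200, "Feb": 950, "Mar": 1430, "Apr": 1100}


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "mean": mean(values),
    }


def top_n(counts, n):
    """Return the n keys with the largest values, largest first."""
    return [key for key, _ in Counter(counts).most_common(n)]


def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


summary = describe(list(monthly_visitors.values()))
busiest = top_n(monthly_visitors, 2)
march_share = percentage(monthly_visitors["Mar"], sum(monthly_visitors.values()))

print(summary)
print(busiest)
print(round(march_share, 1))
```

Each helper maps to one of the methods named on the slide: counts, minimum, maximum and mean in `describe`, top-N in `top_n`, and percentage in `percentage`.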
Diagnostic Analytics

- It involves analyzing past data to diagnose why certain events happened.

- It can provide more insight into why a certain fault has occurred, based on
the patterns in the sensor data for previous faults.

Example: A system that collects and analyzes sensor data from machines for
monitoring their health and predicting failures.
Predictive Analytics

- It involves predicting the occurrence of an event, the likely outcome of
an event, or future values using a prediction model.

- It is done using predictive models that are trained on existing data.
These models learn the patterns and trends in the existing data and
predict the occurrence of an event or the likely outcome of an event
(classification models) or forecast numbers (regression models).

Example: Predictive analytics can be used for predicting when a fault will occur in a
machine, predicting whether a tumor is benign or malignant, predicting the occurrence of a
natural emergency, or forecasting pollution levels.
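As a rough sketch of the regression case, the code below fits a straight line to past readings by ordinary least squares and uses it to forecast the next value. The pollution readings are invented for illustration, and a straight line is only one possible model choice; the module does not prescribe a particular algorithm.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical daily pollution readings (illustrative values only).
days = [1, 2, 3, 4, 5]
levels = [52.0, 54.0, 56.0, 58.0, 60.0]

model = fit_line(days, levels)   # "training" on existing data
forecast = predict(model, 6)     # forecasting a future value
print(forecast)
```

A classification model follows the same train-then-predict pattern but outputs a label (for example benign or malignant) instead of a number.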
Prescriptive Analytics
- It uses multiple prediction models to predict various outcomes and the
best course of action for each outcome.

- It can predict the possible outcomes based on the current choice of
actions.

- It can be regarded as a type of analytics that uses different prediction models
for different inputs.

Example: Prescriptive analytics can be used to prescribe the best medicine for the
treatment of a patient based on the outcomes of various medicines for similar patients.
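The medicine example can be sketched as follows: predict an outcome for each candidate action, then recommend the action with the best predicted outcome. The outcome records, the medicine names, and the use of a simple per-medicine average as the "prediction model" are all assumptions made for illustration; a real system would use trained models.

```python
from statistics import mean

# Hypothetical records for similar patients: (medicine, recovery score in 0..1).
past_outcomes = [
    ("medicine_a", 0.6),
    ("medicine_a", 0.7),
    ("medicine_b", 0.9),
    ("medicine_b", 0.8),
    ("medicine_c", 0.5),
]


def predict_outcomes(records):
    """Predict the outcome of each action as the mean of its past outcomes.

    This average is a stand-in for a trained prediction model.
    """
    by_action = {}
    for action, outcome in records:
        by_action.setdefault(action, []).append(outcome)
    return {action: mean(values) for action, values in by_action.items()}


def prescribe(records):
    """Return the action with the highest predicted outcome."""
    predicted = predict_outcomes(records)
    return max(predicted, key=predicted.get)


best = prescribe(past_outcomes)
print(best)
```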
Types of Analytics
What is Big Data?

BIG DATA
- It is defined as a collection of datasets whose volume, velocity, or variety is
so large that it is difficult to store, manage, process, and analyze the data
using traditional databases and data processing tools.

- Big data analytics deals with the collection, storage, processing, and analysis
of this massive-scale data.
What is Big Data?
BIG DATA
Specialized tools and frameworks are required when:

- The volume of data involved is so large that it is difficult to store,
process, and analyze the data on a single machine.
- The velocity of data is very high and the data needs to be analyzed in real time.
- There is a variety of data involved, which can be structured, unstructured, or
semi-structured, and is collected from multiple data sources.
- Various types of analytics need to be performed to extract value from the
data, such as descriptive, diagnostic, predictive, and prescriptive analytics.
Characteristics of Big Data

Volume
- Big data is data whose volume is so large that it would not fit on a single
machine; specialized tools and frameworks are therefore required to store,
process, and analyze such data.

Example: Social media applications process billions of messages every day, industrial
and energy systems can generate terabytes of sensor data every day, and cab aggregation
applications can process millions of transactions in a day.
Characteristics of Big Data

Velocity
- It refers to how fast the data is generated.

- It is the primary reason for the exponential growth of data.

Example: Social media and sensor data.


Characteristics of Big Data

Variety
- It refers to the forms of data.

- Big data comes in different forms such as structured, unstructured, or
semi-structured, including text, image, audio, video, and sensor data.
Characteristics of Big Data

Veracity
- It refers to how accurate the data is.

- Data needs to be cleaned to remove noise. Data-driven applications can
reap the benefits of big data only when the data is meaningful and
accurate.

- Therefore, cleansing of data is important so that incorrect and faulty data
can be filtered out.
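A minimal sketch of such cleansing, assuming a list of temperature readings in which missing values are marked with `None` and faulty values fall outside a plausible range. The readings and the range limits are hypothetical, chosen only to illustrate the filtering step.

```python
# Hypothetical temperature readings in degrees Celsius; None marks a missing value.
raw_readings = [21.5, None, 22.1, -999.0, 23.0, 250.0, 22.4]


def clean(readings, low=-50.0, high=60.0):
    """Drop missing values and readings outside the plausible [low, high] range."""
    return [r for r in readings if r is not None and low <= r <= high]


cleaned = clean(raw_readings)
print(cleaned)
```

Here the missing value, the sentinel `-999.0`, and the implausible `250.0` are filtered out, leaving only readings that downstream analytics can rely on.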
Characteristics of Big Data

Value
- It refers to the usefulness of data for the intended purpose.

- The end goal of any big data analytics system is to extract value from the
data.

- The value of the data is also related to the veracity or accuracy of the data.
Domain Specific Examples of Big Data
Applications of Big Data
- They span a wide range of domains including (but not limited to) homes, cities,
environment, energy systems, retail, logistics, industry, agriculture, Internet of
Things, and healthcare.

• Web
• Financial
• Healthcare
• Internet of Things
• Environment
• Logistics and Transportation
• Industry
• Retail
Domain Specific Examples of Big Data
Web
- Web analytics deals with the collection and analysis of data on user visits
to websites and cloud applications.

Two approaches to collecting data

1. User visits are logged on the web server, which collects data such as the date
and time of the visit, the resource requested, the user’s IP address, and the HTTP
status code.
2. Page tagging uses JavaScript embedded in the web page.
Whenever a user visits a web page, the JavaScript collects user data and sends it
to a third-party data collection server.
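For the first approach, the fields mentioned above can be pulled out of a web server log line with a regular expression. The sketch below assumes the log uses the Common Log Format; the sample line, the pattern name `LOG_PATTERN`, and the function `parse_log_line` are illustrative and not taken from the module.

```python
import re
from datetime import datetime

# Pattern for one line in Common Log Format (an assumption about the log layout).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Return the visit fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the timestamp and status code to typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    return entry


# Hypothetical log line using a documentation-reserved IP address.
sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
entry = parse_log_line(sample)
print(entry["ip"], entry["time"], entry["resource"], entry["status"])
```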
Domain Specific Examples of Big Data
Web
- Performance Monitoring: Multi-tier web and cloud applications, such as
e-Commerce, Business-to-Business, Health care, Banking and
Financial, Retail, and Social Networking applications, can experience rapid
changes in their workloads.

- To ensure market readiness of such applications, adequate resources need
to be provisioned so that the applications can meet the demands of
specified workload levels and at the same time ensure that the service
level agreements are met.
Domain Specific Examples of Big Data
Web
Ad Targeting and Analytics: Search and display advertisements are the two
most widely used approaches for Internet advertising.

- In search advertising, users are shown advertisements ("ads") along with the
search results as they search for specific keywords on a search engine.
Domain Specific Examples of Big Data
Web
Content Recommendation: Content delivery applications that serve
content (such as music and video streaming applications) collect various
types of data such as user search patterns and browsing history, history of
content consumed, and user ratings.

Two Broad Categories

1. User-based recommendation - new items are recommended to a user based on
how similar users rate those items.
2. Item-based recommendation - new items are recommended to a user based on
how the user rated similar items.
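The user-based category can be sketched as a similarity-weighted average: the predicted rating for an item is the average of other users' ratings for it, weighted by how similar each of those users is to the target user. The ratings table, the user and item names, and the choice of cosine similarity are assumptions made for this illustration.

```python
from math import sqrt

# Hypothetical user ratings on a 1 to 5 scale (illustrative values only).
ratings = {
    "alice": {"song_a": 5, "song_b": 4, "song_c": 1},
    "bob": {"song_a": 5, "song_b": 5, "song_d": 4},
    "carol": {"song_a": 1, "song_c": 5, "song_d": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated; 0.0 if none are shared."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, all_ratings):
    """User-based prediction: similarity-weighted average of others' ratings."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in all_ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(all_ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[item]
        weight_total += similarity
    return weighted_sum / weight_total if weight_total else None


predicted = predict_rating("alice", "song_d", ratings)
print(round(predicted, 2))
```

An item-based recommender follows the same idea but computes similarity between items (columns of the table) rather than between users.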
Domain Specific Examples of Big Data
Financial
Credit Risk Modeling: Banking and financial institutions use credit risk
modeling to score credit applications and predict whether a borrower will default
in the future.

- Credit risk models are created from customer data that includes credit scores
obtained from credit bureaus, credit history, account balance data, account
transaction data, and the spending patterns of the customer.

- They generate numerical scores that summarize the creditworthiness of
customers.
Domain Specific Examples of Big Data
Financial
Fraud Detection: Banking and financial institutions can leverage big data
systems for detecting fraud such as credit card fraud, money laundering, and
insurance claim fraud.

- Real-time big data analytics frameworks can help analyze data from
disparate sources and label transactions in real time.
Domain Specific Examples of Big Data
Healthcare
The healthcare ecosystem consists of numerous entities including healthcare
providers (primary care physicians, specialists, or hospitals), payers
(government, private health insurance companies, employers),
pharmaceutical, device and medical service companies, IT solutions and
services firms, and patients.

- To promote better coordination of care across the multiple providers involved
with patients, their clinical information is increasingly aggregated from
diverse sources into Electronic Health Record (EHR) systems.
Domain Specific Examples of Big Data
Internet of Things
- It refers to things that have unique identities and are connected to the
Internet.

- The "Things" in IoT are the devices which can perform remote sensing,
actuating and monitoring.

Some IoT applications that can benefit from big data systems
• Intrusion Detection
• Smart Parking
• Smart Roads
• Structural Health Monitoring
• Smart Irrigation
Domain Specific Examples of Big Data
Environment
- Environment monitoring systems generate high-velocity and high-volume
data.

Some environment monitoring applications that can benefit from big data systems
• Weather monitoring
• Air pollution monitoring
• Noise pollution monitoring
• Forest fire detection
• River flood detection
• Water quality monitoring
Domain Specific Examples of Big Data
Logistics and Transportation
Some logistics and transportation monitoring applications that can benefit
from big data systems:

• Real-time Fleet Tracking
• Shipment Monitoring
• Remote Vehicle Diagnostics
• Route Generation & Scheduling
• Hyper-local Delivery
• Cab/Taxi Aggregators
Domain Specific Examples of Big Data

Industry
Some industry monitoring applications that can benefit from big data
systems:

• Machine Diagnosis & Prognosis
• Risk Analysis of Industrial Operations
• Production Planning and Control
Domain Specific Examples of Big Data
Retail
- Retailers can use big data systems for boosting sales, increasing profitability
and improving customer satisfaction.

Some retail monitoring applications that can benefit from big data systems:

• Inventory Management
• Customer Recommendations
• Production Planning and Control
• Store Layout Optimization
• Forecasting Demand
Analytics flow for Big Data

Flow for Big Data


- It is a novel data science and analytics application system design
methodology that can be used for big data analytics.

- It shows the analytics flow with the following steps:

> data collection
> data preparation
> analysis types
> analysis modes
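The steps above can be sketched as a small pipeline in which each step is a function. This is a toy illustration under stated assumptions: the hard-coded records stand in for a real data source, the function names (`collect`, `prepare`, `run_flow`) are hypothetical, and the analysis mode is not modeled separately, since the whole flow simply runs once over all the data.

```python
from statistics import mean


def collect():
    """Data collection: return raw records (hard-coded here for illustration)."""
    return ["12.5", "13.1", "", "bad", "14.0"]


def prepare(raw):
    """Data preparation: keep only records that parse as numbers."""
    prepared = []
    for item in raw:
        try:
            prepared.append(float(item))
        except ValueError:
            continue  # skip empty or malformed records
    return prepared


# Analysis types: a lookup from a type name to the function that performs it.
ANALYSES = {
    "mean": mean,
    "maximum": max,
}


def run_flow(analysis_type):
    """Run collection, preparation, and the chosen analysis in sequence."""
    raw = collect()
    prepared = prepare(raw)
    return ANALYSES[analysis_type](prepared)


print(run_flow("mean"))
print(run_flow("maximum"))
```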
Big Data Analysis Flow
