Overview of Big Data: Saidatul Rahah Hamidi
Companies can
• Accurately predict what specific segments of customers will want to buy
• Run their operations in a much more efficient way
https://www.youtube.com/watch?v=eVSfJhssXUA
https://www.youtube.com/watch?v=TzxmjbL-i4Y
The Brief History of Big Data
1989
Early use of the term Big Data in a magazine article by fiction author Erik Larson, commenting on advertisers’ use of data to target customers.
1991
The birth of the internet. Anyone can now go online and upload their own data, or analyze data uploaded by other people.
1997
Google launch their search engine, which will quickly become the most popular in the world.
Michael Lesk estimates the digital universe is increasing tenfold in size every year.
1999
First use of the term Big Data in an academic paper: Visually Exploring Gigabyte Datasets in Realtime (ACM).
First use of the term Internet of Things, in a business presentation by Kevin Ashton to Procter and Gamble.
2001
Three “Vs” of Big Data (Volume, Velocity, Variety) defined by Doug Laney.
2005
Hadoop, an open source Big Data framework now developed by Apache, is developed.
The birth of “Web 2.0”, the user-generated web.
2014
Mobile internet use overtakes desktop for the first time.
88% of executives responding to an international survey by GE say that big data analysis is a top priority.
2015
The data volumes are exploding: more data has been created in the past two years than in the entire previous history of the human race.
Big Data Analytics
Big data analytics is the process of collecting, organizing and analyzing large sets of data (called
Big Data) to discover patterns and other useful information.
Characteristics of big data
Velocity
The speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development. Big data is often available in real-time.
Volume
The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can be considered big data or not.
Variety
Data comes in all types of formats, from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, sensor data, stock ticker data and financial transactions.
Veracity
The quality of captured data can vary greatly, affecting the accuracy of analysis.
Sensor Data
We are increasingly surrounded by sensors that collect and share data. Take your smartphone: it contains a global positioning sensor to track exactly where you are every second of the day, and it includes an accelerometer to track the speed and direction at which you are travelling. We now have sensors in many devices and products.
Characteristics
Big data can provide analytics to support major business decisions such
as production planning, sales management and capital investments.
Issues and challenges of big data
Issues
Data Privacy
• Privacy violations can arise as soon as an institution relies on weak protective measures.
• Although the systems or software specialist remains chiefly responsible, violations can be prevented with stricter tools and protocols that shield vulnerable systems.
Data Security
• Data breaches expose customer information to people who would otherwise have no access to such sensitive personal information. With each new breach, your private information is at risk.
Data Discrimination
• Using access to digital information about an individual (for example usual online activity and online preferences) in a way that could adversely affect that individual's prospects, for example of securing a loan, without the person being able to verify the information, is considered decidedly unfair.
Storage and Transport Issues
• Big data needs big storage. Current disk technology limits are about 4 terabytes per disk, so 1 exabyte would require 250,000 disks.
• To handle this issue, the data should be processed “in place” and only the resulting information transmitted.
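The storage arithmetic and the "in place" idea can be checked with a minimal sketch. It assumes decimal units (1 terabyte = 10^12 bytes, 1 exabyte = 10^18 bytes) and the 4-terabyte disk limit quoted in the slide; under those assumptions 1 exabyte works out to 250,000 disks. The function names and the sample readings are hypothetical, chosen only for illustration.

```python
TERABYTE = 10**12  # bytes, decimal units (assumption)
EXABYTE = 10**18   # bytes, decimal units (assumption)


def disks_needed(total_bytes, disk_bytes=4 * TERABYTE):
    """Number of whole disks required to hold total_bytes (ceiling division)."""
    return (total_bytes + disk_bytes - 1) // disk_bytes


def summarize_in_place(readings):
    """Reduce raw readings to a small summary at the site that stores them."""
    return {"count": len(readings), "total": sum(readings)}


def combine(summaries):
    """Compute the overall mean from per-site summaries only."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / count


# Hypothetical readings held at two separate sites.
site_a = [20.0, 22.0, 21.0]
site_b = [30.0, 28.0]

# Only the two small summaries travel over the network, not the raw readings.
summaries = [summarize_in_place(site_a), summarize_in_place(site_b)]
overall_mean = combine(summaries)

print(disks_needed(EXABYTE))
print(overall_mean)
```

Each site sends two numbers regardless of how many readings it holds, which is the point of processing data where it is stored.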
Challenges of Big Data
Dealing with data growth
• In order to deal with data growth, organizations are turning to a number of different technologies. When it comes to storage, converged and hyperconverged infrastructure and software-defined storage can make it easier for companies to scale their hardware. And technologies like compression, deduplication and tiering can reduce the amount of space and the costs associated with big data storage.
Getting Data into Big Data Structure
• Arguably, one of the goals of big data management is analyzing and processing large amounts of data. The need to handle transformation and extraction is not limited to conventional relational data sets.
Analytical Challenge
• Big data brings with it some huge analytical challenges. Analyzing this huge amount of data, which can be unstructured, semi-structured or structured, requires a large number of advanced skills. This can be done using one of two techniques: either incorporate massive data volumes in the analysis, or determine upfront which big data is relevant.
Miscellaneous Challenges
Other challenges may occur while integrating big data. Some of the challenges include:
• integration of data
• skill availability
• solution cost
• the volume of data
• the rate of transformation of data
• veracity of data
• validity of data
• It is also a challenge to process a large amount of data at a reasonable speed so that information
is available to data consumers when they need it. Data sets are also validated while being
transferred from one source to another or to consumers.
Scalability:
The scalability issue of big data has led to cloud computing, which now aggregates multiple
disparate workloads with varying performance goals.
This requires a high level of resource sharing, which is expensive, and raises challenges such as how
to schedule and run jobs so that the goal of each workload is met cost-effectively.
E-Commerce & Customer Services
Retail/Customer
• Merchandizing and market basket analysis
Finances/Fraud Services
• Compliance and regulatory reporting
• Risk analysis and management
• Fraud detection and security analytics
• Credit risk, scoring and analysis
• High speed arbitrage trading
• Trade surveillance
• Abnormal trading pattern analysis
https://www.youtube.com/watch?v=XjmldAL9RQs
Telecommunication