UNIT 1 - Introduction and Data Analytics Life Cycle
Note: The material used to prepare this presentation has been taken from the internet
and is intended only for student reference, not for commercial use.
• Industries that gather and exploit data
• Credit card companies monitor purchases
• Good at identifying fraudulent purchases
• Mobile phone companies analyze calling patterns –
e.g., even on rival networks
• Look for customers who might switch providers
• For social networks, data is the primary product
• Intrinsic value increases as data grows
• Huge volume of data
• Not just thousands/millions, but billions of items
• Complexity of data types and structures
• Variety of sources, formats, structures
• Speed of new data creation and growth
• High velocity, rapid ingestion, fast analysis
• Volume
• Big Data observes and tracks what happens from various
sources which include business transactions, social media and
information from machine-to-machine or sensor data. This
creates large volumes of data.
• Variety
• Data comes in all formats: structured, numeric data in
traditional databases as well as unstructured text documents,
video, audio, email, and stock ticker data.
• Velocity
• Data streams in at high speed and must be dealt with in a timely manner.
Processing must also be fast: streamed data is analyzed to
produce real-time or near-real-time results.
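To make the velocity point concrete, here is a minimal sketch of analyzing data as it streams in rather than after it has all been stored. The function name `rolling_mean`, the window size, and the sample readings are illustrative assumptions, not part of the original material.

```python
from collections import deque

def rolling_mean(stream, window=3):
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # old readings drop out automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)

# Hypothetical sensor readings arriving one at a time
readings = [10, 20, 30, 40]
means = list(rolling_mean(readings, window=3))
print(means)  # [10.0, 15.0, 20.0, 30.0]
```

Each result is available as soon as its reading arrives, which is the sense in which streamed analysis can be near real time.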
• Cost Savings: helps identify more efficient ways of doing
business.
• Time Reductions: helps businesses analyze data
immediately and make quick decisions based on what they learn.
• New Product Development: By knowing the trends in
customer needs and satisfaction through analytics, you can
create products that match what customers want.
• Understand market conditions: By analyzing big data, you
can get a better understanding of current market conditions.
• Control online reputation: Big data tools can perform sentiment
analysis, so you can get feedback about who is saying
what about your company.
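As a rough illustration of the sentiment analysis mentioned above, the sketch below scores text by counting positive and negative words. The word lists, the function name `sentiment_score`, and the sample posts are hypothetical; real tools rely on far larger lexicons or trained models.

```python
# Hypothetical toy lexicon, for illustration only
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def sentiment_score(text):
    """Return (positive word count) minus (negative word count)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Made-up posts about a company
posts = ["Great service, love it!", "Terrible support and poor quality."]
scores = [sentiment_score(p) for p in posts]
print(scores)  # [2, -2]
```

A score above zero suggests favorable feedback and a score below zero suggests unfavorable feedback.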
• Mobile sensors – GPS, accelerometer, etc.
• Social media – 700 Facebook updates/sec in 2012
• Video surveillance – street cameras, stores, etc.
• Video rendering – processing video for display
• Smart grids – gather and act on information
• Geophysical exploration – oil, gas, etc.
• Medical imaging – reveals internal body structures
• Gene sequencing – more prevalent, less expensive,
healthcare would like to predict personal illnesses
• Structured – defined data type, format, structure
• Transactional data, OLAP cubes, RDBMS, CSV files, spreadsheets
• Semi-structured
• Text data with discernable patterns – e.g., XML data
• Quasi-structured
• Text data with erratic data formats – e.g., clickstream data
• Unstructured
• Data with no inherent structure – text docs, PDFs, images, video
Rno   Name   Address      Phone no
1     Amit   Nashik       9766543267
2     Neha   Pune         -
3     Jiya   Mumbai       -
4     Riya   Aurangabad   8990765432
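Because the table above has a fixed set of columns, it can be loaded directly by column name. The sketch below stores the same rows as CSV text and reads them with Python's standard `csv` module. Treating "-" as a missing phone number is an assumption about the table's notation.

```python
import csv
import io

# The same rows as the table, written as CSV text
raw = """Rno,Name,Address,Phone no
1,Amit,Nashik,9766543267
2,Neha,Pune,-
3,Jiya,Mumbai,-
4,Riya,Aurangabad,8990765432
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Assumption: "-" marks a missing value, so store it as None
for row in rows:
    if row["Phone no"] == "-":
        row["Phone no"] = None

with_phone = [row["Name"] for row in rows if row["Phone no"] is not None]
print(with_phone)  # ['Amit', 'Riya']
```

Every row has the same fields, so a query such as "who has a phone number" needs no text parsing.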
visiting 3 websites adds 3 URLs to user’s log files
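The clickstream case can be sketched as follows: each visit appends a line to a log, and the URLs have to be picked out of loosely formatted text. The log line layout, user ID, and URLs here are hypothetical; real clickstream logs vary widely, which is what makes them quasi-structured.

```python
from urllib.parse import urlparse

# Hypothetical log lines: timestamp, user ID, visited URL
log_lines = [
    "2024-01-05 10:00:01 user42 https://www.example.com/home",
    "2024-01-05 10:00:09 user42 https://news.example.org/article?id=7",
    "2024-01-05 10:00:30 user42 https://shop.example.net/cart",
]

def extract_hosts(lines):
    """Return the host name of every URL found in the log lines."""
    hosts = []
    for line in lines:
        for token in line.split():
            if token.startswith(("http://", "https://")):
                hosts.append(urlparse(token).netloc)
    return hosts

hosts = extract_hosts(log_lines)
print(hosts)  # ['www.example.com', 'news.example.org', 'shop.example.net']
```

Three visits yield three URLs, but recovering them depends on knowing, or guessing, how each log line is laid out.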
Video about Antarctica Expedition
Business Intelligence (BI) versus Data Science
Data Science
Current Analytical Architecture
Typical Analytic Architecture
Data sources must be well understood
Phase 1: Discovery
Phase 6: Operationalize