Big Data - Module 1
What is big data?
• Big data is data that exceeds the processing
capacity of conventional database systems.
• This data comes from everywhere: sensors used to
gather climate information, posts to social media
sites, digital pictures and videos, purchase
transaction records, and cell phone GPS signals, to
name a few. This data is big data.
• Big data usually includes data sets with sizes
beyond the ability of commonly used software
tools to capture, curate, manage, and process the
data within a tolerable elapsed time.
Categories of Big Data
• Structured
  • Written in a format that is easy for machines to understand.
  • Easily searchable by basic algorithms.
  • Examples: fields, tables, columns, RDBMS, spreadsheets
• Semi-structured
  • Uses markers or tags to separate elements.
  • Examples: XML, HTML
• Unstructured
  • No fields or attributes.
  • More like human language.
  • Free-form text (e-mail body, notes, articles, …)
  • Audio, video, and images
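The three categories above can be illustrated with a minimal Python sketch. The sample records, names, and amounts below are made up for illustration and do not come from the module. The sketch only aims to show how the amount of structure changes the way data is read.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, so every field is addressed directly by name.
# (Hypothetical order data, for illustration only.)
structured = "order_id,customer,amount\n1,Asha,250\n2,Ravi,120\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(int(row["amount"]) for row in rows)

# Semi-structured: tags separate the elements, but records need not
# share the same set of fields (only the second order has a <note>).
semi_structured = (
    "<orders>"
    "<order id='1'><customer>Asha</customer></order>"
    "<order id='2'><customer>Ravi</customer><note>gift</note></order>"
    "</orders>"
)
root = ET.fromstring(semi_structured)
customers = [order.findtext("customer") for order in root.findall("order")]
notes = [order.findtext("note") for order in root.findall("order")]

# Unstructured: free-form text with no fields or attributes, so anything
# we extract has to come from searching the text itself.
unstructured = "Hi team, Asha ordered again today and Ravi asked about a refund."
mentions = [name for name in ("Asha", "Ravi") if name in unstructured]

print(total)      # sum of the "amount" column
print(customers)  # customer names read from the tags
print(notes)      # None where an order has no <note> element
print(mentions)   # names found by plain substring search
```

The structured case needs only a column name, the semi-structured case has to tolerate missing elements, and the unstructured case falls back to text search. That is one reason unstructured data is usually the hardest of the three to analyze.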
Big Data Analytics