Unit-2 (Data Literacy)
INTELLIGENCE
UNIT-2
(Data Literacy)
WHAT IS DATA LITERACY? Data literacy is the ability to understand, analyze, and communicate with data
effectively. Data literacy is important because it helps you think critically, make better choices, and understand the
world of data.
1. Reading Data: It refers to the ability to read, analyze and understand data, which may be available in various
formats such as numbers, tables, graphs, and charts.
2. Working with Data: It refers to the process of collecting and managing data; checking facts and spotting
misleading information in it; storing and transmitting it in appropriate formats.
3. Communicating with Data: It refers to the process of understanding and interpreting data; spotting trends and
patterns in the data; and tabulating, reporting, and presenting data in diverse formats.
1. Spot Data Trends: Data literacy helps us analyze situations, spot trends, and predict outcomes. For
instance, by examining historical data and spotting the data trends, we can understand the causes of
significant events like economic recessions.
2. Foster Critical Thinking: With data literacy, instead of accepting information at face value, we learn to
question its source, reliability, and potential biases. This saves us from misinformation.
3. Make Informed Decisions: With data skills, we can gather and analyse relevant information. This helps
us make choices based on facts, and not on assumptions or guesses.
4. Communicate Effectively: Data literacy equips us with facts. We can use data to support our
arguments and ideas. With the help of data visualization, we can present information more clearly.
Data Literacy Process Framework: It mainly involves the following three steps:
Step 1: Identify
In this step, all the data and information is collected, whether it is numbers, words, or pictures. For example, if
we're studying the weather, the data might include temperature readings, rainfall amounts, wind speeds and so on.
Step 2: Analyse
In this step, data is carefully studied to uncover patterns, connections, correlations, trends, and outliers. For
example, analysing sales data might reveal which products are performing well and which ones need improvement.
Step 3: Interpret
In this step, the curated and visualised data goes through the lens of questions like, "What do these patterns
tell us?", "What can we learn from this data?" For instance, if you are analysing student test scores, you might
interpret the data to identify areas where students need extra help.
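The three steps above can be sketched in a few lines of Python. This is only a minimal illustration: the subject names, the scores, and the threshold of 50 are made-up assumptions, not values from these notes.

```python
from statistics import mean

# Step 1: Identify - collect the data (hypothetical test scores out of 100).
scores = {
    "Maths": [45, 52, 38, 60],
    "Science": [78, 85, 72, 90],
    "English": [66, 70, 58, 74],
}

# Step 2: Analyse - compute the average score for each subject.
averages = {subject: mean(marks) for subject, marks in scores.items()}

# Step 3: Interpret - subjects whose average falls below an assumed
# threshold are the areas where students may need extra help.
HELP_THRESHOLD = 50  # assumption chosen only for this illustration
needs_help = [subject for subject, avg in averages.items() if avg < HELP_THRESHOLD]

print(averages)
print("Needs extra help:", needs_help)
```

Here the interpretation step turns a table of averages into a decision: which subjects deserve attention first.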
DATA SECURITY, PRIVACY AND AI: Data security and privacy are the twin pillars of safeguarding our digital
lives. Let us talk about these two and their relation with AI.
Data Security:
Data security involves safeguarding data and information from unauthorized access, theft, or alteration. It
includes preventive measures to stop data breaches, which can have severe consequences.
Data security involves the following things:
Data security and privacy have a mixed relationship with AI. Data used to train AI should be secure and from
authentic sources. Compromised data can impact the performance of AI. At the same time, AI can play an important
role in detecting and protecting against data breaches.
BEST PRACTICES FOR CYBER SECURITY: Some best practices for cyber security are:
1. Use Strong Passwords: Use a unique, complex password for each of your accounts. Avoid using
easily guessable information like birthdays or pet names.
2. Keep Software and Systems Updated: Regularly update your operating system and anti-virus
software.
3. Be Cautious with E-mails and Links: Be cautious of suspicious links or attachments in emails or
messages.
4. Back Up Data Regularly: It is important to back up your work and data regularly.
5. Visit Secure Sites and Use Secure Connections: While browsing online, always make sure to
visit secure sites over secure connections.
6. Limit Access and Sharing of Sensitive Data: Only share personal or confidential
information when necessary.
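As an illustration of the first practice, a simple password check might look like the sketch below. The function name is_strong_password, the minimum length of 12, and the exact rules are assumptions made for this example; real systems may apply different rules.

```python
import string

def is_strong_password(password, guessable_words=()):
    """Return True if the password passes a few simple strength checks."""
    long_enough = len(password) >= 12  # assumed minimum length
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(c in string.punctuation for c in password)
    # Reject passwords that contain easily guessable personal information,
    # such as a pet name or a birth year.
    lowered = password.lower()
    has_guessable = any(word.lower() in lowered for word in guessable_words)
    return (long_enough and has_lower and has_upper
            and has_digit and has_symbol and not has_guessable)

# Hypothetical personal details that should not appear in a password.
personal_info = ["tommy", "2010"]
print(is_strong_password("Tommy@2010abc", personal_info))
print(is_strong_password("V9#kq!Lz2@wP", personal_info))
```

The first password mixes character types but still contains a pet name and a year, so the check rejects it.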
TYPES OF DATA: Broadly data can be divided into two primary types:
1. Qualitative Data (Categorical Data): Qualitative data describes qualities or characteristics of an
entity or phenomenon.
The following are some examples of qualitative data:
Customer feedback: Comments, reviews, and testimonials provide qualitative insights into
customer satisfaction, preferences, and experiences.
Interview transcripts: Conversations with individuals or focus groups yield qualitative data,
offering perspectives, opinions, and personal stories.
Social media sentiment: Posts, comments, and discussions on social media platforms reveal
qualitative insights into public opinion, trends, and sentiment.
2. Quantitative Data (Numerical Data): Quantitative data represents information about something
through numerical values.
The following are some examples of quantitative data:
Sales data: Transactional records, revenue figures, and purchase history provide
quantitative insights into sales performance, trends, and patterns.
Financial metrics: Stock prices, market indices, and financial statements furnish
quantitative insights into economic indicators, investment performance, and financial health.
Qualitative Data Vs. Quantitative Data:
Data Acquisition: Data Acquisition refers to processes, methods or systems that are used to collect
information related to a certain theme or objective, to document or analyse some phenomenon.
Some of the common data acquisition methods are:
1. Surveys. Surveys are structured questionnaires administered to individuals or groups to gather data
on opinions, preferences, behaviours, and demographics.
2. Interviews. Interviews involve direct conversations between researchers and participants to collect
qualitative data through open-ended questions and probing inquiries.
3. Observations. Observational studies involve systematic observation and recording of behaviours,
interactions, and phenomena in natural or controlled settings.
Best practices for acquiring data: For effective data acquisition, it is recommended to follow these
essential guidelines:
Data Pre-processing: Data Pre-processing refers to the process of making data appropriate for use
by removing discrepancies in it.
Some common data pre-processing techniques are:
1. Data Cleaning
2. Data Reduction
3. Data Transformation
4. Data Integration
1. Data Cleaning: Data Cleaning is the process of identifying and correcting errors, inconsistencies, and
anomalies in raw data.
There are multiple data cleaning methods, such as:
Removing duplicate records
Handling missing values
Standardizing formats
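The three cleaning methods listed above can be sketched with plain Python. The records below are made-up, and filling a missing value with the mean of the known values is only one possible choice, assumed here for illustration.

```python
# Hypothetical raw records: (name, city, marks); None marks a missing value.
raw_records = [
    ("Asha", " delhi ", 82),
    ("Ravi", "MUMBAI", None),
    ("Asha", " delhi ", 82),   # exact duplicate of the first record
    ("Meena", "Chennai", 74),
]

# 1. Removing duplicate records (dict keys are unique and keep their order).
unique_records = list(dict.fromkeys(raw_records))

# 2. Handling missing values: fill missing marks with the mean of known marks.
known_marks = [marks for _, _, marks in unique_records if marks is not None]
mean_marks = sum(known_marks) / len(known_marks)

# 3. Standardizing formats: trim spaces and use title case for city names.
cleaned = [
    (name, city.strip().title(), marks if marks is not None else mean_marks)
    for name, city, marks in unique_records
]

for record in cleaned:
    print(record)
```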
2. Data Reduction: Data Reduction is a process or a set of techniques used to reduce the size of a
dataset while still preserving the most important information.
Some data reduction methods are:
Sampling: Pick a sample that represents the full set of data. For example, when exit polls are
conducted after elections, not every voter is asked.
Data aggregation: Create aggregates or summaries of detailed data.
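Both reduction methods can be tried on a small, randomly generated dataset. The region names, the amounts, and the sample size of 50 are assumptions made for this sketch.

```python
import random
from collections import defaultdict

# Hypothetical detailed data: (region, sale_amount) for 1,000 transactions.
random.seed(42)  # fixed seed so the sketch gives the same data on every run
regions = ["North", "South", "East", "West"]
transactions = [
    (random.choice(regions), random.randint(100, 500)) for _ in range(1000)
]

# Sampling: pick 50 transactions at random to represent the full dataset.
sample = random.sample(transactions, k=50)

# Data aggregation: summarize the detailed rows as total sales per region.
totals = defaultdict(int)
for region, amount in transactions:
    totals[region] += amount

print(len(transactions), "rows reduced to a sample of", len(sample))
print(dict(totals))
```

The sample keeps individual rows but far fewer of them, while the aggregate keeps one summary number per region.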
3. Data Transformation: Data Transformation refers to the process of converting raw data into a
suitable format or representation for analysis, visualization, or modelling.
There are multiple ways of data transformation, some of these are :
4. Data Integration: Data Integration refers to the process of combining data from multiple sources or
formats into a unified dataset for analysis, reporting, or decision-making.
There are multiple ways of integrating data, some of these are:
Merging datasets: Combining datasets that have common identifiers or keys.
Joining tables: Linking tables based on shared fields or relationships to consolidate related
information.
Concatenating files: Appending files with similar structures or formats to
create a comprehensive dataset for analysis or reporting.
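A minimal sketch of these ideas, using lists of dictionaries as stand-ins for tables or files. The field names student_id, name, and score are hypothetical.

```python
# Hypothetical sources that share the key "student_id".
students = [
    {"student_id": 1, "name": "Asha"},
    {"student_id": 2, "name": "Ravi"},
]
scores = [
    {"student_id": 1, "score": 82},
    {"student_id": 2, "score": 74},
]

# Merging / joining: link rows from both sources on the common key.
scores_by_id = {row["student_id"]: row for row in scores}
merged = [
    {**student, "score": scores_by_id[student["student_id"]]["score"]}
    for student in students
    if student["student_id"] in scores_by_id
]

# Concatenating: append a second batch that has the same structure.
more_students = [{"student_id": 3, "name": "Meena"}]
all_students = students + more_students

print(merged)
print(all_students)
```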
Data Processing: Data Processing refers to manipulating, analyzing, and interpreting data to extract
meaningful information and derive insights.
Data processing involves:
1. Data analysis
2. Data interpretation
1. Data Analysis: Data Analysis is the process of applying various techniques to data to find trends,
correlations, outliers, and variations that convey a meaning or point to a specific result. For instance,
suppose you have a dataset of student test scores. You might analyze the data to see if there is a
relationship between study time and test performance by comparing the scores of students who studied a lot
with those of students who studied less.
(i) Descriptive Analysis. Descriptive analysis shows what happened. It describes data using
statistics, e.g., analysing sales data to get sales numbers for each employee and the average sales.
(ii) Diagnostic Analysis. Diagnostic analysis finds why something happened, e.g., hospitals suddenly
start seeing an increased number of patients. Diagnostic analysis may find that many of the patients had
the same virus symptoms, which points to the virus as the cause of the increase.
(iii) Predictive Analysis. Predictive analysis predicts what might happen in the future based on data
patterns, e.g., a product sells best in September and October each year, so high sales are predicted for
those months next year.
(iv) Prescriptive Analysis. Prescriptive analysis recommends actions to take based on the other
analysis types, e.g., create a marketing plan to boost sales during the slower months after the
September/October peak.
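The descriptive analysis example above (sales numbers for each employee and the average sales) can be sketched as follows. The employee names and amounts are made-up for illustration.

```python
from statistics import mean

# Hypothetical sales records: (employee, amount).
sales = [
    ("Asha", 1200),
    ("Ravi", 800),
    ("Asha", 600),
    ("Meena", 900),
    ("Ravi", 400),
]

# Descriptive analysis: what happened? Total sales for each employee.
totals = {}
for employee, amount in sales:
    totals[employee] = totals.get(employee, 0) + amount

# Average of the per-employee totals, and the employee with the highest total.
average_per_employee = mean(list(totals.values()))
top_seller = max(totals, key=totals.get)

print(totals)
print("Average sales per employee:", average_per_employee)
print("Top seller:", top_seller)
```

This only describes what happened. Explaining why, predicting future sales, or recommending actions would be the job of the other three analysis types.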
2. Data Interpretation: Data Interpretation involves making sense of the analyzed data, drawing
conclusions, and deriving actionable insights to inform decision-making and problem-solving.
For example, after analyzing the dataset of student test scores, you might interpret the result as showing that
students who study more tend to perform better on tests, suggesting a positive relationship between study time
and academic achievement.
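One common way to measure such a relationship is the Pearson correlation coefficient. The sketch below computes it by hand for a made-up set of study hours and test scores; the data and the helper name pearson_correlation are assumptions for illustration.

```python
from math import sqrt

# Hypothetical data: hours studied per week and test score for six students.
study_hours = [2, 4, 5, 7, 8, 10]
test_scores = [50, 58, 62, 70, 75, 85]

def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

r = pearson_correlation(study_hours, test_scores)
print(round(r, 3))
```

A value of r close to +1 suggests a strong positive relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship. A high correlation alone does not prove that studying more causes the higher scores.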