Uploaded by

likhita A.N

IIT_FDS_Assignment1

Data Analysis and Cleaning in a Retail Industry Use Case

1. Context

In the retail industry, data plays a crucial role in decision-making and strategic planning.
Companies rely on data from various sources, such as sales transactions, customer
feedback, and inventory management systems, to understand market trends, customer
preferences, and operational efficiency. Effective data management and analysis can
provide insights that lead to improved customer satisfaction, optimized inventory levels,
and increased profitability.

2. Content

This assignment focuses on key concepts related to data and databases, including the types
of data and attributes. You will learn how to import and export data using Python, load data
from various formats such as CSV, Excel, JSON, and HTML, and perform descriptive statistics
and data cleaning operations. These tasks will be integrated into a comprehensive analysis
of a retail dataset to simulate a real-world industry scenario.

3. Data Description

We will use a publicly available retail dataset for this assignment. The dataset contains
information about sales transactions, including the following attributes:
InvoiceNo: Invoice number
StockCode: Product code
Description: Product description
Quantity: Quantity of products sold
InvoiceDate: Date of the invoice
UnitPrice: Price per unit of the product
CustomerID: Customer identification number
Country: Country where the customer resides
You can download the dataset from the following link:
Retail Dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx

4. Objective

The objective of this assignment is to gain hands-on experience with data analysis and
cleaning using Python. By the end of this assignment, you will be able to:
1. Identify types of data and attributes in a dataset.
2. Import and export data using Python.
3. Perform descriptive statistics to understand the dataset.
4. Clean the dataset by handling missing values, duplicate entries, and outliers.

5. Tasks

1. Identify Data Types and Attributes

Load the retail dataset and identify the types of data (numeric, categorical) and types of
attributes (nominal, ordinal, interval, ratio).
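
One way to approach this task is sketched below. It is only an illustration: the rows in `sample` are made-up values that mimic the column names from the data description, the helper `summarize_attributes` is a hypothetical name, and the measurement levels in `measurement_level` are one reasonable reading that you should justify or revise for the real dataset.

```python
import pandas as pd

# Hypothetical sample rows that mimic the dataset's columns (not real data).
sample = pd.DataFrame({
    "InvoiceNo": ["A0001", "A0001", "A0002"],
    "StockCode": ["S100", "S200", "S100"],
    "Description": ["MUG", "LAMP", "MUG"],
    "Quantity": [6, 2, 12],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-04 10:00", "2011-01-04 10:00", "2011-01-05 09:30"]
    ),
    "UnitPrice": [2.5, 12.0, 2.5],
    "CustomerID": [17850.0, 17850.0, None],
    "Country": ["United Kingdom", "United Kingdom", "France"],
})

# Assumed measurement levels: a judgment call, not part of the dataset.
measurement_level = {
    "InvoiceNo": "nominal",
    "StockCode": "nominal",
    "Description": "nominal",
    "Quantity": "ratio",
    "InvoiceDate": "interval",
    "UnitPrice": "ratio",
    "CustomerID": "nominal",
    "Country": "nominal",
}


def summarize_attributes(df, levels):
    """Pair each column's storage type with its assumed measurement level."""
    rows = []
    for col in df.columns:
        if pd.api.types.is_datetime64_any_dtype(df[col]):
            storage = "datetime"
        elif pd.api.types.is_numeric_dtype(df[col]):
            storage = "numeric"
        else:
            storage = "categorical"
        rows.append({
            "attribute": col,
            "dtype": str(df[col].dtype),
            "storage": storage,
            "level": levels.get(col, "unassigned"),
        })
    return pd.DataFrame(rows).set_index("attribute")


summary = summarize_attributes(sample, measurement_level)
print(summary)
```

Note that a column can be stored as a number and still be nominal: in this sketch CustomerID is a float column (because of the missing value), yet it is only an identifier, so arithmetic on it is meaningless.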

2. Data Import and Export with Python

Load the dataset from different formats such as CSV, Excel, JSON, and HTML into a pandas
DataFrame.
Export the cleaned dataset to CSV and Excel formats.
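
A minimal sketch of the import and export round trip follows. The DataFrame contents and the file names are hypothetical, and only CSV and JSON are executed here because they need no extra packages. The Excel and HTML calls appear as comments, since `pd.read_excel` and `DataFrame.to_excel` need an Excel engine such as openpyxl and `pd.read_html` needs an HTML parser such as lxml.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical cleaned data standing in for the retail dataset.
df = pd.DataFrame({
    "InvoiceNo": ["A0001", "A0001", "A0002"],
    "Quantity": [6, 2, 12],
    "UnitPrice": [2.5, 12.0, 2.5],
    "Country": ["United Kingdom", "United Kingdom", "France"],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "retail_clean.csv"    # assumed file name
    json_path = Path(tmp) / "retail_clean.json"  # assumed file name

    # Export: index=False keeps the row index out of the file.
    df.to_csv(csv_path, index=False)
    df.to_json(json_path, orient="records")

    # Import the files back into DataFrames.
    from_csv = pd.read_csv(csv_path)
    from_json = pd.read_json(json_path)

# Not executed here (optional dependencies required):
#   excel_df = pd.read_excel("Online Retail.xlsx")   # needs e.g. openpyxl
#   tables = pd.read_html("page.html")               # returns a list of DataFrames
#   df.to_excel("retail_clean.xlsx", index=False)    # needs e.g. openpyxl

print(from_csv.head())
print(from_json.head())
```

Reading each file back and comparing it with the original is a cheap check that an export did not silently change column names or values.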

3. Descriptive Statistics
Calculate and interpret mean, median, mode, variance, standard deviation, skewness, and
correlation for the numeric attributes in the dataset.
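
The statistics listed above can be computed with built-in pandas methods, as in the sketch below. The numbers in `sample` are made up for illustration and are not taken from the retail dataset. pandas uses the sample (n - 1) denominator for `var` and `std` by default, and `mode` can return several values, so the sketch keeps only the first.

```python
import pandas as pd

# Hypothetical numeric attributes (illustrative values only).
sample = pd.DataFrame({
    "Quantity": [1, 2, 2, 3, 12],
    "UnitPrice": [5.0, 4.0, 4.0, 3.0, 1.0],
})

stats = pd.DataFrame({
    "mean": sample.mean(),
    "median": sample.median(),
    "mode": sample.mode().iloc[0],   # first mode of each column
    "variance": sample.var(),        # sample variance (ddof=1)
    "std_dev": sample.std(),         # sample standard deviation (ddof=1)
    "skewness": sample.skew(),
})

# Pearson correlation between every pair of numeric columns.
correlation = sample.corr()

print(stats)
print(correlation)
```

In this toy data the single large Quantity value pulls the mean (4.0) well above the median (2.0) and gives a positive skewness, which is the kind of observation the interpretation part of the task asks for.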

4. Data Cleaning
Handle missing values by applying appropriate techniques such as imputation or removal.
Identify and remove duplicate entries in the dataset.
Detect and handle outliers using statistical methods.
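
The three cleaning steps could look like the sketch below. It makes several assumptions that you should revisit for the real data: the rows in `raw` are invented, rows without a CustomerID are dropped while a missing Description is imputed with the placeholder "UNKNOWN", and outliers are flagged with the 1.5 × IQR rule, which is only one of several statistical methods you could choose.

```python
import pandas as pd

# Hypothetical raw data with a duplicate row, missing values, and one extreme quantity.
raw = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "A3", "A4", "A5", "A6"],
    "Description": ["MUG", "MUG", None, "LAMP", "MUG", "LAMP", "MUG"],
    "Quantity": [2, 2, 3, 4, 3, 2, 500],
    "CustomerID": [1.0, 1.0, 2.0, None, 3.0, 4.0, 5.0],
})

# 1. Missing values: remove rows without a customer, impute missing descriptions.
cleaned = raw.dropna(subset=["CustomerID"]).copy()
cleaned["Description"] = cleaned["Description"].fillna("UNKNOWN")

# 2. Duplicates: keep the first occurrence of each fully identical row.
cleaned = cleaned.drop_duplicates()

# 3. Outliers: flag quantities outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1 = cleaned["Quantity"].quantile(0.25)
q3 = cleaned["Quantity"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = ~cleaned["Quantity"].between(lower, upper)

outliers = cleaned[is_outlier]                       # keep for inspection
cleaned = cleaned[~is_outlier].reset_index(drop=True)

print(f"Rows before: {len(raw)}, after: {len(cleaned)}")
print(outliers)
```

Keeping the flagged rows in `outliers` rather than discarding them immediately makes it easier to decide whether they are data errors or genuine bulk orders.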

5. Implementation Using Python

Write Python code to implement the above tasks. Ensure that your code is well-
documented and includes comments explaining each step.
