Training

This document discusses a project analyzing Diwali sales data using Python for data science. It involves acquiring data from multiple sources, preparing the data by cleaning and transforming it, then exploring the data through visualizations and statistical analysis. The analysis identifies patterns like females purchasing more than males on average. Top selling products and customer demographics like married women aged 26-35 from certain states and occupations are identified. The conclusion provides insights to improve customer experience and increase company revenue.

Uploaded by

goels0798
Copyright
© All Rights Reserved

PYTHON WITH DATA SCIENCE
DIWALI SALES ANALYSIS
BY- Shivam Goel
ECE-3
Enrollment number – 04096302820
MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY
DATA SCIENCE

• Data science is about deriving useful insights from data in order to solve complex real-world problems.
Task Of A Data Scientist
• Data Acquisition - data is gathered from various sources such as databases, web servers, and APIs (application programming interfaces).
• Data Preparation - involves data cleaning and data transformation. Tools like Talend and Informatica are used to perform complex data transformations and help in better understanding the data.
• Exploratory Data Analysis - the critical process of performing initial investigations on data so as to discover patterns, spot anomalies, and check assumptions with the help of summary statistics and graphical representations.
• Data Modelling - ML models are applied to the data to create a data product (predict future outcomes, gain insights from the data). Data modelling is mostly done in Python.
• Data Visualization - a way to represent information graphically, highlighting patterns and trends in data and helping the reader to achieve quick insights.
PROJECT - DIWALI SALES ANALYSIS

• A company has provided us with its Diwali sales data. They want us to analyze each record and attribute in the table and share a summary at the end that the company can use to:

a) Improve customer experience by analyzing sales data

b) Increase its revenue

Technologies used in project
• JUPYTER NOTEBOOK - used to create and share documents that contain live code, equations, visualizations, and text.
• PYTHON - in data science, various Python libraries are used for fetching data and performing operations on it.
• NUMPY - for creating N-dimensional arrays of data. (An array is a special variable that can hold more than one value at a time.)
• PANDAS - used for cleaning and organizing data and for performing exploratory data analysis. It is better suited to this than spreadsheets/Excel because it has tools for reading and writing data between many formats (CSV file, Excel file, SQL database).
• MATPLOTLIB, SEABORN - used for data visualization.
IMPORTING LIBRARIES
Load the CSV file in Jupyter Notebook
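The slides do not reproduce the notebook cell for this step, so the following is only a minimal sketch of what importing the libraries and loading a CSV could look like. The column names and values are hypothetical, and an in-memory string stands in for the company's file, whose name is not given. Matplotlib and Seaborn would be imported in the same way when plotting is needed.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the company's CSV file; the real file name,
# columns, and values are not given in the slides.
sample_csv = io.StringIO(
    "User_ID,Gender,Age Group,State,Orders,Amount\n"
    "1001,F,26-35,Uttar Pradesh,2,23952.0\n"
    "1002,M,36-45,Karnataka,1,8000.5\n"
    "1003,F,18-25,Maharashtra,3,15000.0\n"
)

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(sample_csv)

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # first few records
```

With a real file, the `io.StringIO` object would be replaced by the path to the CSV.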
Cleaning of data - once we get our data, cleaning and organizing is done:
1) Drop unrelated or blank columns from our dataset
2) Rename a column
3) Check for the presence of null values and remove them
4) Convert columns to the correct data type
5) Use the 'describe' function, and apply it to a specific column
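The five cleaning steps above can be sketched with pandas as follows. This is an illustration under assumptions, not the project's actual notebook: the small table, its column names (`Marital_Status`, `Status`, `Amount`), and the new name `Married` are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Marital_Status": [1, 0, 1, 0],
    "Amount": [23952.0, np.nan, 15000.0, 8000.0],
    "Status": [np.nan, np.nan, np.nan, np.nan],  # entirely blank column
})

# 1) Drop unrelated or blank columns.
df = df.drop(columns=["Status"])

# 2) Rename a column.
df = df.rename(columns={"Marital_Status": "Married"})

# 3) Check for null values, then remove the rows that contain them.
print(df.isnull().sum())
df = df.dropna()

# 4) Convert a column to the correct data type.
df["Amount"] = df["Amount"].astype(int)

# 5) Summary statistics for the whole table and for one specific column.
print(df.describe())
print(df["Amount"].describe())
```

The blank column is dropped before `dropna()` on purpose: if it were still present, every row would contain a null and the whole table would be removed.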
DATA EXPLORATION
Data exploration is the first step in data analysis
involving the use of data visualization tools and
statistical techniques to uncover data set
characteristics and initial patterns.
1) Male buyers vs female buyers

Plot a bar chart of gender vs total amount.

From the graphs we can see that most of the buyers are female, and that the purchasing power of female buyers is also greater than that of male buyers.
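A comparison like this one can be sketched by counting buyers per gender and summing the amount per gender before plotting. The data and the column names `Gender` and `Amount` below are hypothetical, and plain Matplotlib is used here as an assumption; the original slides may have used Seaborn instead.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; this line can be omitted in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data; not the company's real figures.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "F", "M"],
    "Amount": [1000, 700, 1500, 500, 800],
})

# Number of buyers per gender.
buyers_by_gender = df["Gender"].value_counts()

# Total amount spent per gender, largest first.
sales_by_gender = (
    df.groupby("Gender", as_index=False)["Amount"]
      .sum()
      .sort_values("Amount", ascending=False)
)

# Bar chart of gender vs total amount.
fig, ax = plt.subplots()
ax.bar(sales_by_gender["Gender"], sales_by_gender["Amount"])
ax.set_xlabel("Gender")
ax.set_ylabel("Total amount")
ax.set_title("Total amount by gender")
```

In a Jupyter notebook the figure is displayed below the cell once it has run.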
2) Total amount vs age group

3) Total number of orders from the top 10 states

4) Marital status

5) Occupation

6) Product category

7) On the basis of 'product ID', we want to see our top-selling products
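Several of the views listed above (top 10 states by orders, top-selling products by product ID) follow the same pattern: group by one column, sum another, and keep the largest groups. A minimal sketch of that pattern is shown below; the helper `top_n`, the sample data, and the column names `State`, `Product_ID`, and `Orders` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; not the company's real records.
df = pd.DataFrame({
    "State": ["A", "B", "A", "C", "B", "A"],
    "Product_ID": ["P1", "P2", "P1", "P3", "P1", "P2"],
    "Orders": [2, 1, 4, 3, 2, 1],
})

def top_n(frame, group_col, value_col, n=10):
    """Sum value_col per group and return the n largest groups."""
    return frame.groupby(group_col)[value_col].sum().nlargest(n)

# Total number of orders per state, largest first (at most 10 states).
top_states = top_n(df, "State", "Orders")

# Top-selling products by total orders.
top_products = top_n(df, "Product_ID", "Orders")

print(top_states)
print(top_products)
```

The same helper could be applied to an age group, marital status, occupation, or product category column, with the amount as the summed value, and the resulting series passed to a bar chart.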
Conclusion - Married women in the 26-35 age group from UP, Maharashtra and Karnataka, working in IT, Healthcare and Aviation, are more likely to buy products from the Food, Clothing and Electronics categories.

Thank you
