Diwali_Sales_Analysis - Jupyter Notebook

Uploaded by

Vipin Gautam

10/15/23, 10:28 PM Diwali_Sales_Analysis - Jupyter Notebook

In [7]: # import python libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt # visualizing data
%matplotlib inline
import seaborn as sns

In [8]: # import csv file

df = pd.read_csv('Diwali Sales Data.csv', encoding= 'unicode_escape')

In [9]: df.shape

Out[9]: (11251, 15)

In [10]: df.head()

Out[10]:
   User_ID  Cust_name  Product_ID  Gender  Age Group  Age  Marital_Status  State
0  1002903  Sanskriti  P00125942   F       26-35      28   0               Maharashtra     W
1  1000732  Kartik     P00110942   F       26-35      35   1               Andhra Pradesh  So
2  1001990  Bindu      P00118542   F       26-35      35   1               Uttar Pradesh   C
3  1001425  Sudevi     P00237842   M       0-17       16   0               Karnataka       So
4  1000588  Joni       P00057942   M       26-35      28   1               Gujarat         W

In [11]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11251 entries, 0 to 11250
Data columns (total 15 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   User_ID           11251 non-null  int64
 1   Cust_name         11251 non-null  object
 2   Product_ID        11251 non-null  object
 3   Gender            11251 non-null  object
 4   Age Group         11251 non-null  object
 5   Age               11251 non-null  int64
 6   Marital_Status    11251 non-null  int64
 7   State             11251 non-null  object
 8   Zone              11251 non-null  object
 9   Occupation        11251 non-null  object
 10  Product_Category  11251 non-null  object
 11  Orders            11251 non-null  int64
 12  Amount            11239 non-null  float64
 13  Status            0 non-null      float64
 14  unnamed1          0 non-null      float64
dtypes: float64(3), int64(4), object(8)
memory usage: 1.3+ MB

localhost:8888/notebooks/Downloads/Python_Diwali_Sales_Analysis/Python_Diwali_Sales_Analysis/Diwali_Sales_Analysis.ipynb 1/12

In [12]: #drop unrelated/blank columns

df.drop(['Status', 'unnamed1'], axis=1, inplace=True)

In [13]: #check for null values

pd.isnull(df).sum()

Out[13]: User_ID             0
         Cust_name           0
         Product_ID          0
         Gender              0
         Age Group           0
         Age                 0
         Marital_Status      0
         State               0
         Zone                0
         Occupation          0
         Product_Category    0
         Orders              0
         Amount             12
         dtype: int64

In [14]: # drop null values

df.dropna(inplace=True)

In [15]: # change data type

df['Amount'] = df['Amount'].astype('int')

In [16]: df['Amount'].dtypes

Out[16]: dtype('int32')
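
The cleaning steps in cells 12 to 15 can be tried out on a small frame. The sketch below is only an illustration: the rows are invented (they are not the Diwali data), the names `toy` and `clean` are hypothetical, and it uses reassignment rather than `inplace=True`. Restricting `dropna` to the `Amount` column through `subset` is an assumption that goes slightly beyond what the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows standing in for the real CSV (all values are invented).
toy = pd.DataFrame({
    "Orders": [1, 3, 2, 4],
    "Amount": [23952.0, np.nan, 8109.5, 188.0],
    "Status": [np.nan] * 4,      # entirely empty, like the notebook's 'Status'
    "unnamed1": [np.nan] * 4,    # entirely empty, like the notebook's 'unnamed1'
})

# Drop the blank columns first. If dropna() ran before this step, it would
# remove every row, because each row has a NaN in those two columns.
clean = toy.drop(columns=["Status", "unnamed1"])

# Drop rows with a missing Amount, then cast the column to integer.
# The cast truncates the fractional part and would raise if NaN were still present.
clean = clean.dropna(subset=["Amount"])
clean = clean.astype({"Amount": int})

print(clean)
```

The order matters here: the two empty columns have to go before any row-wise `dropna`, which is also the order the notebook follows.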

In [17]: df.columns

Out[17]: Index(['User_ID', 'Cust_name', 'Product_ID', 'Gender', 'Age Group', 'Age',
                'Marital_Status', 'State', 'Zone', 'Occupation', 'Product_Category',
                'Orders', 'Amount'],
               dtype='object')


In [18]: # rename column (returns a renamed copy; df itself is unchanged because inplace=True is not set)

df.rename(columns= {'Marital_Status':'Shaadi'})

Out[18]:
       User_ID  Cust_name    Product_ID  Gender  Age Group  Age  Shaadi  State           Z
0      1002903  Sanskriti    P00125942   F       26-35      28   0       Maharashtra     Wes
1      1000732  Kartik       P00110942   F       26-35      35   1       Andhra Pradesh  Sout
2      1001990  Bindu        P00118542   F       26-35      35   1       Uttar Pradesh   Ce
3      1001425  Sudevi       P00237842   M       0-17       16   0       Karnataka       Sout
4      1000588  Joni         P00057942   M       26-35      28   1       Gujarat         Wes
...    ...      ...          ...         ...     ...        ...  ...
11246  1000695  Manning      P00296942   M       18-25      19   1       Maharashtra     Wes
11247  1004089  Reichenbach  P00171342   M       26-35      33   0       Haryana         Nort
11248  1001209  Oshin        P00201342   F       36-45      40   0       Madhya Pradesh  Ce
11249  1004023  Noonan       P00059442   M       36-45      37   0       Karnataka       Sout
11250  1002744  Brumley      P00281742   F       18-25      19   0       Maharashtra     Wes

11239 rows × 13 columns
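
Cell 18 calls `rename` without `inplace=True` and without assigning the result, so only the displayed output carries the new column name. A minimal sketch of that behaviour, using a hypothetical two-row frame (`toy` and `renamed` are illustrative names, and the values are invented):

```python
import pandas as pd

# Hypothetical two-row frame; only the column names matter here.
toy = pd.DataFrame({"Marital_Status": [0, 1], "Amount": [100, 200]})

# Without inplace=True (or reassignment), rename() returns a new DataFrame.
renamed = toy.rename(columns={"Marital_Status": "Shaadi"})

print(list(toy.columns))      # the original frame keeps its column names
print(list(renamed.columns))  # the returned copy carries the new name
```

This is why the later cells can still refer to `Marital_Status`.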

In [19]: # describe() method returns a description of the data in the DataFrame (summary statistics for the numeric columns)
df.describe()

Out[19]:
            User_ID           Age  Marital_Status        Orders        Amount
count  1.123900e+04  11239.000000    11239.000000  11239.000000  11239.000000
mean   1.003004e+06     35.410357        0.420055      2.489634   9453.610553
std    1.716039e+03     12.753866        0.493589      1.114967   5222.355168
min    1.000001e+06     12.000000        0.000000      1.000000    188.000000
25%    1.001492e+06     27.000000        0.000000      2.000000   5443.000000
50%    1.003064e+06     33.000000        0.000000      2.000000   8109.000000
75%    1.004426e+06     43.000000        1.000000      3.000000  12675.000000
max    1.006040e+06     92.000000        1.000000      4.000000  23952.000000


In [20]: # use describe() for specific columns

df[['Age', 'Orders', 'Amount']].describe()

Out[20]:
                Age        Orders        Amount
count  11239.000000  11239.000000  11239.000000
mean      35.410357      2.489634   9453.610553
std       12.753866      1.114967   5222.355168
min       12.000000      1.000000    188.000000
25%       27.000000      2.000000   5443.000000
50%       33.000000      2.000000   8109.000000
75%       43.000000      3.000000  12675.000000
max       92.000000      4.000000  23952.000000

Exploratory Data Analysis

Gender

In [21]: # plotting a bar chart for Gender and its count

ax = sns.countplot(x = 'Gender',data = df)

for bars in ax.containers:
    ax.bar_label(bars)


In [22]: # plotting a bar chart for gender vs total amount

sales_gen = df.groupby(['Gender'], as_index=False)['Amount'].sum().sort_val

sns.barplot(x = 'Gender',y= 'Amount' ,data = sales_gen)

Out[22]: <Axes: xlabel='Gender', ylabel='Amount'>
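
The `groupby` line in cell 22 is cut off at the page edge after `.sort_val`. One plausible completion, assuming the totals are sorted by `Amount` in descending order, is sketched below on invented rows. The sort arguments are an assumption, not the notebook's confirmed code, and `toy` is a hypothetical frame.

```python
import pandas as pd

# Hypothetical rows; the amounts are invented for illustration.
toy = pd.DataFrame({
    "Gender": ["F", "M", "F", "M", "F"],
    "Amount": [500, 300, 700, 100, 200],
})

# Total Amount per Gender, kept as a regular column via as_index=False.
sales_gen = (
    toy.groupby(["Gender"], as_index=False)["Amount"]
    .sum()
    .sort_values(by="Amount", ascending=False)  # assumed completion of the truncated call
)

print(sales_gen)
```

Because `as_index=False` keeps `Gender` as a column, the result can be passed straight to `sns.barplot(x='Gender', y='Amount', data=sales_gen)` as in the cell above.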

From the graphs above, we can see that most of the buyers are female and that the total purchase amount of female buyers is also greater than that of male buyers


Age

In [23]: ax = sns.countplot(data = df, x = 'Age Group', hue = 'Gender')

for bars in ax.containers:
    ax.bar_label(bars)


In [24]: # Total Amount vs Age Group

sales_age = df.groupby(['Age Group'], as_index=False)['Amount'].sum().sort_

sns.barplot(x = 'Age Group',y= 'Amount' ,data = sales_age)

Out[24]: <Axes: xlabel='Age Group', ylabel='Amount'>

From the graphs above, we can see that most of the buyers are women in the 26-35 yrs age group


State

In [25]: # total number of orders from top 10 states

sales_state = df.groupby(['State'], as_index=False)['Orders'].sum().sort_va

sns.set(rc={'figure.figsize':(15,5)})
sns.barplot(data = sales_state, x = 'State',y= 'Orders')

Out[25]: <Axes: xlabel='State', ylabel='Orders'>

In [26]: # total amount/sales from top 10 states

sales_state = df.groupby(['State'], as_index=False)['Amount'].sum().sort_va

sns.set(rc={'figure.figsize':(15,5)})
sns.barplot(data = sales_state, x = 'State',y= 'Amount')

Out[26]: <Axes: xlabel='State', ylabel='Amount'>
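
The `groupby` lines in cells 25 and 26 are truncated after `.sort_va`, but their comments say "top 10 states". A minimal sketch of one way to get a top-N table follows, assuming a descending sort followed by `head(n)`. The helper `top_states`, the state labels and all numbers are hypothetical.

```python
import pandas as pd

# Hypothetical rows; state labels and numbers are invented.
toy = pd.DataFrame({
    "State": ["A", "B", "C", "D", "A", "C", "E"],
    "Orders": [2, 1, 4, 3, 3, 2, 1],
    "Amount": [900, 400, 1500, 700, 600, 300, 100],
})

def top_states(frame, value_col, n=10):
    """Sum value_col per State and return the n largest totals (hypothetical helper)."""
    totals = frame.groupby(["State"], as_index=False)[value_col].sum()
    # Assumed completion: sort descending, then keep the first n rows.
    return totals.sort_values(by=value_col, ascending=False).head(n)

top_orders = top_states(toy, "Orders", n=3)
top_amount = top_states(toy, "Amount", n=3)

print(top_orders)
print(top_amount)
```

With the real data, `n=10` would match the comments in the notebook cells.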

From the graphs above, we can see that most of the orders and the highest total sales amounts come from Uttar Pradesh, Maharashtra and Karnataka, in that order


Marital Status

In [27]: sns.set(rc={'figure.figsize':(7,5)})  # set the figure size before drawing so it applies to this plot
ax = sns.countplot(data = df, x = 'Marital_Status')

for bars in ax.containers:
    ax.bar_label(bars)

In [28]: sales_state = df.groupby(['Marital_Status', 'Gender'], as_index=False)['Amo

sns.set(rc={'figure.figsize':(6,5)})
sns.barplot(data = sales_state, x = 'Marital_Status',y= 'Amount', hue='Gend

Out[28]: <Axes: xlabel='Marital_Status', ylabel='Amount'>
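
Cell 28 groups by two keys, but both of its lines are cut off at the page edge. The sketch below shows the two-key aggregation on invented rows, assuming the truncated call sums `Amount` and sorts it in descending order as the earlier cells do. The frame `toy` and the result name `sales_ms` are hypothetical.

```python
import pandas as pd

# Hypothetical rows; the amounts are invented for illustration.
toy = pd.DataFrame({
    "Marital_Status": [0, 0, 1, 1, 0, 1],
    "Gender": ["F", "M", "F", "M", "F", "M"],
    "Amount": [400, 250, 300, 150, 350, 50],
})

# One row per (Marital_Status, Gender) pair with the summed Amount.
sales_ms = (
    toy.groupby(["Marital_Status", "Gender"], as_index=False)["Amount"]
    .sum()
    .sort_values(by="Amount", ascending=False)  # assumed completion of the truncated call
)

print(sales_ms)
```

A frame of this shape has one row per bar, so `Gender` can serve as the `hue` column when it is passed to `sns.barplot`.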


From the graphs above, we can see that most of the buyers are married women, and they have high purchasing power

Occupation

In [29]: sns.set(rc={'figure.figsize':(20,5)})
ax = sns.countplot(data = df, x = 'Occupation')

for bars in ax.containers:
    ax.bar_label(bars)

In [30]: sales_state = df.groupby(['Occupation'], as_index=False)['Amount'].sum().so

sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Occupation',y= 'Amount')

Out[30]: <Axes: xlabel='Occupation', ylabel='Amount'>

From the graphs above, we can see that most of the buyers work in the IT, Healthcare and Aviation sectors


Product Category

In [31]: sns.set(rc={'figure.figsize':(20,5)})
ax = sns.countplot(data = df, x = 'Product_Category')

for bars in ax.containers:
    ax.bar_label(bars)

In [32]: sales_state = df.groupby(['Product_Category'], as_index=False)['Amount'].su

sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Product_Category',y= 'Amount')

Out[32]: <Axes: xlabel='Product_Category', ylabel='Amount'>

From the graphs above, we can see that most of the products sold are from the Food, Clothing and Electronics categories

In [33]: sales_state = df.groupby(['Product_ID'], as_index=False)['Orders'].sum().so

sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Product_ID',y= 'Orders')

Out[33]: <Axes: xlabel='Product_ID', ylabel='Orders'>


In [34]: # top 10 most sold products (same thing as above)

fig1, ax1 = plt.subplots(figsize=(12,7))
df.groupby('Product_ID')['Orders'].sum().nlargest(10).sort_values(ascending

Out[34]: <Axes: xlabel='Product_ID'>
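
Cell 34 is cut off after `sort_values(ascending`, so the sort direction and the plotting call are not visible. The sketch below reproduces only the aggregation on invented rows, assuming `ascending=False`. The product labels and counts are hypothetical, and `nlargest(3)` stands in for the notebook's `nlargest(10)` to keep the toy data small.

```python
import pandas as pd

# Hypothetical rows; product labels and order counts are invented.
toy = pd.DataFrame({
    "Product_ID": ["P1", "P2", "P3", "P1", "P4", "P3", "P3"],
    "Orders": [2, 1, 4, 3, 2, 1, 2],
})

# Total orders per product as a Series indexed by Product_ID.
top_products = (
    toy.groupby("Product_ID")["Orders"]
    .sum()
    .nlargest(3)                   # the notebook uses nlargest(10)
    .sort_values(ascending=False)  # assumed argument; the cell is truncated here
)

print(top_products)
```

The result is a Series, so it could be drawn with `top_products.plot(kind='bar')`, which would be consistent with the `Axes` object shown in `Out[34]`, though the original plotting call is not visible.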

Conclusion:

Married women in the 26-35 yrs age group from UP, Maharashtra and Karnataka, working in IT,
Healthcare and Aviation, are more likely to buy products from the Food, Clothing and Electronics
categories

Thank you!
