This document discusses a project analyzing Diwali sales data using Python for data science. It involves acquiring data from multiple sources, preparing the data by cleaning and transforming it, then exploring the data through visualizations and statistical analysis. The analysis identifies patterns like females purchasing more than males on average. Top selling products and customer demographics like married women aged 26-35 from certain states and occupations are identified. The conclusion provides insights to improve customer experience and increase company revenue.
Training
PYTHON WITH DATA SCIENCE
DIWALI SALES ANALYSIS
BY- Shivam Goel, ECE-3
Enrollment number – 04096302820
MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY
DATA SCIENCE
• Data science is about deriving useful insights from data in order to solve complex real-world problems.
Task Of A Data Scientist
• Data Acquisition – data is gathered from various sources such as databases, web servers and APIs (application programming interfaces).
• Data Preparation – involves data cleaning and data transformation. Tools like Talend and Informatica are used to perform complex data transformations and help in better understanding the data.
• Exploratory Data Analysis – the critical process of performing initial investigations on data so as to discover patterns, spot anomalies and check assumptions with the help of summary statistics and graphical representations.
• Data Modelling – ML models are applied to the data to create a data product (predict future outcomes, gain insights on the data) from it. Data modelling is mostly done in Python.
• Data Visualization – a way to represent information graphically, highlighting patterns and trends in data and helping the reader gain quick insights.
PROJECT - DIWALI SALES ANALYSIS
• A company has provided us with their Diwali sales data. They want us to analyze the data for each record and attribute in the table and share a summary with them at the end, which the company can use to:
a) Improve customer experience by analyzing sales data
b) Increase their revenue
Technologies used in project
• JUPYTER NOTEBOOK - used to create and share documents that contain live code, equations, visualizations, and text.
• PYTHON - in data science, various Python libraries are used for fetching data and performing operations on it.
• NUMPY - for creating N-dimensional arrays of data. (An array is a special variable which can hold more than one value at a time.)
• PANDAS - used for cleaning and organizing data and for performing exploratory data analysis. It is better than spreadsheets/Excel as it has tools for reading and writing data between many formats (CSV file, Excel file, SQL database).
• MATPLOTLIB, SEABORN - used for data visualization.
IMPORTING LIBRARIES
Load csv file in jupyter notebook
Cleaning of data- Once we get our data, cleaning and organizing is done:
1) drop unrelated or blank columns from our dataset
2) rename a column
3) check for presence of null values and remove them
4) convert columns to the correct data type
5) use of the ‘describe’ function and its application for a specific column
DATA EXPLORATION
Data exploration is the first step in data analysis, involving the use of data visualization tools and statistical techniques to uncover data set characteristics and initial patterns.
1.) Male buyers vs Female buyers
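The charts in this section assume the data has already been loaded and cleaned. The five cleaning steps listed earlier can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the in-memory CSV, the column names (`User_ID`, `Gender`, `Marital_Status`, `Amount`, `Status`) and all values are assumptions made up for the example. In a real notebook, the path to the company's CSV file would take the place of the `io.StringIO` buffer.

```python
import io

import pandas as pd

# Hypothetical stand-in for the company's CSV file; the column names and
# values are invented for illustration and are not the real dataset.
raw_csv = io.StringIO(
    "User_ID,Gender,Marital_Status,Amount,Status\n"
    "1,F,0,2500.0,\n"
    "2,M,1,1200.0,\n"
    "3,F,1,,\n"
    "4,M,0,800.0,\n"
)
df = pd.read_csv(raw_csv)

# 1) Drop unrelated or blank columns ("Status" is entirely empty here).
df = df.drop(columns=["Status"])

# 2) Rename a column.
df = df.rename(columns={"Marital_Status": "Married"})

# 3) Check for null values, then remove the rows that contain them.
print(df.isnull().sum())
df = df.dropna()

# 4) Convert a column to the correct data type (float to int).
df["Amount"] = df["Amount"].astype(int)

# 5) Summary statistics for a specific column.
print(df["Amount"].describe())
```

Dropping rows with nulls before the type conversion matters: a column that still contains missing values cannot be cast to a plain integer type.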
Plot a bar chart for gender vs total amount
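A minimal sketch of such a bar chart, assuming hypothetical `Gender` and `Amount` columns and invented sample values. It uses plain matplotlib; the slides also mention seaborn, which could draw the same chart.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; this line is not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; column names and values are assumptions.
sales = pd.DataFrame(
    {
        "Gender": ["F", "M", "F", "M", "F"],
        "Amount": [2500, 1200, 1800, 800, 700],
    }
)

# Total amount spent per gender, largest first.
amount_by_gender = (
    sales.groupby("Gender")["Amount"].sum().sort_values(ascending=False)
)

# One bar per gender, bar height = total amount.
fig, ax = plt.subplots()
ax.bar(amount_by_gender.index, amount_by_gender.values)
ax.set_xlabel("Gender")
ax.set_ylabel("Total amount")
ax.set_title("Total amount by gender")
```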
From the above graphs we can see that most of the buyers are females, and the purchasing power of females is also greater than that of males.
2.) Total amount vs age group
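For the age group view, one possible aggregation is to total the amount per age group and split it by gender. The sketch below assumes hypothetical `Age Group`, `Gender` and `Amount` columns; the rows are invented and only illustrate the shape of the result.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are assumptions.
sales = pd.DataFrame(
    {
        "Age Group": ["18-25", "26-35", "26-35", "36-45", "26-35", "18-25"],
        "Gender": ["F", "F", "M", "M", "F", "M"],
        "Amount": [500, 2000, 1500, 900, 1000, 300],
    }
)

# Rows = age group, columns = gender, cells = total amount.
# Combinations with no purchases are filled with 0.
amount_by_age = (
    sales.groupby(["Age Group", "Gender"])["Amount"].sum().unstack(fill_value=0)
)
print(amount_by_age)

# Age group with the highest total amount across both genders.
top_age_group = amount_by_age.sum(axis=1).idxmax()
print(top_age_group)
```

The resulting table can be passed to a grouped bar chart, with one cluster of bars per age group.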
3.) Total number of orders from top 10 states
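The top 10 ranking could be computed along the following lines. The `State` and `Orders` column names are assumptions, and the placeholder states ("State A" to "State L") and order counts are generated purely for illustration.

```python
import pandas as pd

# Twelve placeholder states, two rows each, so that the top-10 cut-off
# actually excludes something. Names and counts are invented.
state_names = [f"State {letter}" for letter in "ABCDEFGHIJKL"]
orders = pd.DataFrame(
    {
        "State": state_names * 2,
        "Orders": list(range(1, 13)) + list(range(12, 24)),
    }
)

# Sum the orders per state and keep the 10 largest totals, in descending order.
top_states = orders.groupby("State")["Orders"].sum().nlargest(10)
print(top_states)
```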
4.) marital status
5.) Occupation
6.) Product category
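The occupation and product category views follow the same group-and-rank pattern, so a small reusable helper may be convenient. `top_groups` is a hypothetical helper written for this sketch, not a pandas function, and the `Product_Category` and `Orders` columns and their values are assumptions.

```python
import pandas as pd


def top_groups(df, group_col, value_col, n=5):
    """Return the n groups with the largest total of value_col."""
    return df.groupby(group_col)[value_col].sum().nlargest(n)


# Hypothetical sample rows; the order counts are invented.
purchases = pd.DataFrame(
    {
        "Product_Category": [
            "Food", "Clothing", "Food", "Electronics", "Clothing", "Food",
        ],
        "Orders": [3, 2, 4, 1, 2, 1],
    }
)

top_categories = top_groups(purchases, "Product_Category", "Orders", n=2)
print(top_categories)
```

The same call with a different `group_col`, such as an occupation or product ID column, would rank those groups instead.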
7.) On the basis of ‘product ID’ we want to see our top-selling products
Conclusion- Married women in the 26-35 yrs age group from UP, Maharashtra and Karnataka working in IT, Healthcare and Aviation are more likely to buy products from the Food, Clothing and Electronics categories.