rithika.ppt


RETAIL TRANSACTIONS ANALYSIS

By
AGENDA
 Synopsis
 Introduction
 Data overview
 Data preprocessing
Data Cleaning
Data Conversion
Text Data Preprocessing
Data Aggregation
Data Splitting
Normalization and Scaling
 Data Analysis
 Data Visualization
 Conclusion
SYNOPSIS
INTRODUCTION
 The dataset provides detailed insights into transactions occurring within a retail environment.

 It captures multiple aspects of sales, products, and customer behavior, making it a
valuable resource for analyzing trends in the retail industry.

 The dataset includes both numerical and categorical features.

 By analyzing this dataset, retailers can uncover patterns in customer purchases, optimize
inventory management, and enhance pricing strategies.

 Analyzing temporal data can help retailers identify peak shopping times, while
store-level data can reveal regional sales patterns and product preferences.
DATA OVERVIEW

 The dataset records details of retail transactions, capturing variables such as transaction
IDs, customer names, products purchased, total cost, payment methods, store types, and
geographic locations.

 COLUMN DESCRIPTIONS:

• Transaction ID: Unique 10-digit ID for each purchase.

• Date: Date and time of the transaction.

• Customer Name: Name of the customer.

• Product: List of products bought in the transaction.

• Total Items: Total number of items purchased.

• Total Cost: Total cost of the transaction.

• Payment Method: Payment method used (e.g., credit card, cash).

• City: City where the purchase happened.

• Store Type: Type of store (e.g., supermarket, convenience store).

• Discount Applied: Whether a discount was applied (Yes/No).

• Customer Category: Category of the customer (e.g., age group).

• Season: Season when the purchase occurred (e.g., spring, summer).

• Promotion: Type of promotion used (e.g., BOGO, discount).


DATA CLEANING:
 Remove Duplicates:
• Identify and drop any duplicate transactions based on the Transaction ID column to ensure each
entry is unique.
 Handling Missing Data:
• Check for missing or null values in key columns (Customer Name, Product, Total Items, Total Cost,
Payment Method, etc.).
• For categorical data (Customer Name, City, Store Type), fill missing values with “Unknown” or
the mode.
• For numerical data (Total Cost, Total Items), use the mean or median to impute missing values.
 Outlier Detection:
• Check for extreme values in Total Cost and Total Items that don’t fit realistic purchasing behavior.
Outliers may indicate data entry errors or unusual transactions.
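
The three cleaning steps above can be sketched in plain Python. This is a minimal illustration, not the analysis actually used for these slides: the sample rows are invented, the median is chosen for imputation, and the 1.5 × IQR rule is one common (assumed) way to flag extreme values.

```python
from statistics import median, quantiles

# Hypothetical sample rows; column names follow the column descriptions.
rows = [
    {"Transaction ID": "1000000001", "City": "Chicago", "Total Cost": 25.0},
    {"Transaction ID": "1000000001", "City": "Chicago", "Total Cost": 25.0},  # duplicate
    {"Transaction ID": "1000000002", "City": None, "Total Cost": 30.0},       # missing city
    {"Transaction ID": "1000000003", "City": "Boston", "Total Cost": None},   # missing cost
    {"Transaction ID": "1000000004", "City": "Boston", "Total Cost": 28.0},
    {"Transaction ID": "1000000005", "City": "Dallas", "Total Cost": 5000.0}, # suspicious
]

def drop_duplicates(rows, key="Transaction ID"):
    """Keep only the first row seen for each transaction ID."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique

def fill_missing(rows):
    """Fill missing cities with "Unknown" and missing costs with the median."""
    known_costs = [r["Total Cost"] for r in rows if r["Total Cost"] is not None]
    cost_median = median(known_costs)
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if row["City"] is None:
            row["City"] = "Unknown"
        if row["Total Cost"] is None:
            row["Total Cost"] = cost_median
        filled.append(row)
    return filled

def flag_outliers(rows, column="Total Cost", k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[column] for r in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if not low <= r[column] <= high]

cleaned = fill_missing(drop_duplicates(rows))
outliers = flag_outliers(cleaned)
```

Flagged rows are returned rather than deleted, since an outlier may be a genuine large purchase and should be reviewed before removal.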
DATA CONVERSION:
 The goal of data conversion is to ensure that all columns in the dataset have the correct data types and are formatted
properly. The most common conversions include:
• Converting date columns to datetime format.
• Converting categorical columns to categorical types or encoding them if needed.
• Ensuring numerical columns are properly formatted as int or float.
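
A minimal sketch of these conversions for a single raw record, assuming every value arrives as a string. The date format string and the sample values are assumptions, not taken from the dataset, and should be adjusted to the real file.

```python
from datetime import datetime

# Assumed timestamp layout; change this to match the actual Date column.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

def convert_row(raw):
    """Convert string fields of one transaction to proper Python types."""
    return {
        "Date": datetime.strptime(raw["Date"], DATE_FORMAT),
        "Total Items": int(raw["Total Items"]),
        "Total Cost": float(raw["Total Cost"]),
        # Yes/No flag becomes a boolean
        "Discount Applied": raw["Discount Applied"].strip().lower() == "yes",
    }

# Hypothetical raw record as it might be read from a CSV file.
raw = {
    "Date": "2023-05-14 17:42:10",
    "Total Items": "3",
    "Total Cost": "71.65",
    "Discount Applied": "Yes",
}
converted = convert_row(raw)
```

`datetime.strptime`, `int`, and `float` all raise `ValueError` on malformed input, so badly formatted rows surface immediately instead of passing through silently.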

TEXT DATA PREPROCESSING:


• Lowercasing: Convert all text fields (e.g., Customer Name, Product, City) to lowercase to ensure
uniformity.
• Remove Special Characters: Strip unnecessary characters, punctuation, and symbols from text
fields.
• Handle Missing Values: Fill missing text data with placeholders like "Unknown" or impute
based on context.
• Text Encoding: Convert categorical text data (Customer Category, Store Type) into
numerical form using label encoding or one-hot encoding for model readiness.
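
The text steps above might look like the following sketch. The helper names (`clean_text`, `label_encode`, `one_hot`) and the sample store types are hypothetical. The regular expression keeps only ASCII letters, digits, and whitespace, which is an assumption that would need loosening for names with accented characters.

```python
import re

def clean_text(value, placeholder="unknown"):
    """Lowercase, drop punctuation and symbols, and fill missing text."""
    if value is None or not value.strip():
        return placeholder
    lowered = value.lower()
    no_symbols = re.sub(r"[^a-z0-9\s]", "", lowered)
    return re.sub(r"\s+", " ", no_symbols).strip()

def label_encode(values):
    """Map each distinct label to an integer code (alphabetical order)."""
    codes = {label: i for i, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

def one_hot(values):
    """Return one 0/1 indicator column per distinct label."""
    labels = sorted(set(values))
    return [[1 if v == label else 0 for label in labels] for v in values], labels

# Hypothetical Store Type values with inconsistent casing and a missing entry.
store_types = ["Supermarket", "convenience store", "SUPERMARKET!", None]
cleaned = [clean_text(s) for s in store_types]
encoded, codes = label_encode(cleaned)
matrix, labels = one_hot(cleaned)
```

Label encoding implies an ordering between categories, so one-hot encoding is usually the safer choice for unordered fields such as Store Type.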

DATA AGGREGATION:
Data aggregation involves grouping the data based on specific columns and performing
aggregations such as sum, mean, count, etc., on other columns.

Total Sales per Day: The data is grouped by transaction date (taken from the Date column), and
the Total Cost values for each day are summed.

Average Sales per Transaction: The mean Total Cost per transaction is calculated for each day.
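
Both aggregations can be sketched with a dictionary keyed by calendar day. The sample transactions are invented for illustration, and the Date values are assumed to have already been converted to `datetime` objects.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions; only the columns needed for aggregation.
transactions = [
    {"Date": datetime(2023, 1, 1, 9, 30), "Total Cost": 20.0},
    {"Date": datetime(2023, 1, 1, 18, 5), "Total Cost": 40.0},
    {"Date": datetime(2023, 1, 2, 12, 0), "Total Cost": 15.0},
]

def daily_sales(transactions):
    """Group by calendar day and compute total, mean, and count per day."""
    costs_by_day = defaultdict(list)
    for t in transactions:
        costs_by_day[t["Date"].date()].append(t["Total Cost"])
    return {
        day: {"total": sum(costs), "mean": sum(costs) / len(costs), "count": len(costs)}
        for day, costs in sorted(costs_by_day.items())
    }

summary = daily_sales(transactions)
```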
DATA SPLITTING:
Data splitting is a crucial step in machine learning and data analysis.
It involves dividing the dataset into subsets for training and testing, so that you can evaluate how well the
model generalizes to unseen data.
The dataset is split into:
Training Set: Used to train the model.
Testing Set: Used to evaluate the model's performance on unseen data.
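
A minimal sketch of a random split, assuming an 80/20 ratio and a fixed seed for reproducibility. Neither value is stated in the slides.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list keeps its order
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of ten row indices.
rows = list(range(10))
train, test = train_test_split(rows)
```

For forecasting tasks on time-ordered sales data, a chronological split (train on earlier dates, test on later ones) may be more appropriate than a random shuffle.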

NORMALIZATION AND SCALING:


Normalization and scaling are common steps in data preprocessing, especially when you are
working with numerical data that has varying ranges.

Min-Max Scaling: This rescales the data so that all feature values fall within a specified
range, typically between 0 and 1.
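
Min-max scaling maps each value x to (x - min) / (max - min), which lands in the range 0 to 1. A minimal sketch with invented Total Cost values follows. The function name and the handling of a constant column are illustrative choices.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so they span [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to new_min.
        return [new_min for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]

# Hypothetical Total Cost values.
costs = [10.0, 20.0, 40.0]
scaled = min_max_scale(costs)
```

When a train/test split is used, the minimum and maximum should be computed on the training set only and then reused for the test set, so that no information from the test data leaks into preprocessing.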
DATA ANALYSIS:
EXPLORATORY DATA ANALYSIS (EDA)

Conduct a detailed EDA to understand distributions, relationships, and trends in the data. This includes
visualizing the data and analyzing summary statistics.
A. Univariate Analysis
Examine each feature independently.
B. Bivariate Analysis
Examine the relationship between two variables.
C. Payment Method Analysis
Analyze the preferred payment methods used by customers.
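
The three analyses above can be sketched with the standard library. The sample rows are invented, and the specific statistics chosen (summary of Total Items, mean Total Cost per payment method, payment method counts) are only one possible reading of each step.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical transactions.
rows = [
    {"Total Items": 3, "Total Cost": 30.0, "Payment Method": "Cash"},
    {"Total Items": 5, "Total Cost": 50.0, "Payment Method": "Credit Card"},
    {"Total Items": 1, "Total Cost": 10.0, "Payment Method": "Cash"},
    {"Total Items": 7, "Total Cost": 90.0, "Payment Method": "Credit Card"},
    {"Total Items": 4, "Total Cost": 20.0, "Payment Method": "Cash"},
]

# A. Univariate: summary statistics for a single feature.
items = [r["Total Items"] for r in rows]
items_summary = {
    "min": min(items),
    "max": max(items),
    "mean": mean(items),
    "median": median(items),
}

# B. Bivariate: how Total Cost varies with Payment Method.
costs_by_method = defaultdict(list)
for r in rows:
    costs_by_method[r["Payment Method"]].append(r["Total Cost"])
mean_cost_by_method = {m: mean(c) for m, c in costs_by_method.items()}

# C. Payment method analysis: frequency of each method.
payment_counts = Counter(r["Payment Method"] for r in rows)
```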
DATA VISUALIZATION:
 Histogram: Distribution of Total Items Purchased

A histogram shows how many transactions fall within specific ranges of total items purchased.

 Bar Plot: Top 10 Purchased Products

Visualize the top 10 most purchased products to see which items were most popular.

 Pie Chart: Distribution of Store Types

A pie chart is used to show the proportion of each store type.

 Line Plot: Sales Trend Over Time

A line plot shows how total sales fluctuate over time, revealing the overall sales trend.
CONCLUSION

Comprehensive Analysis: The dataset provides valuable insights into customer behavior,
purchasing patterns, and sales performance across different regions and store types.
Business Optimization: It helps businesses make data-driven decisions, improving
inventory management, sales forecasting, and customer targeting strategies.
Promotional Effectiveness: Through detailed analysis of promotions and discounts,
companies can refine their marketing efforts to enhance customer engagement.
Predictive Power: With proper analysis, the dataset can be leveraged for predictive
modeling to forecast sales trends, identify high-value customers, and optimize store operations.
QUERIES ?
THANK YOU
