DA108 Lab 08 Assignment

The document outlines a lab assignment focused on retail sales data analysis using Object-Oriented Programming (OOP) and Pandas. It includes tasks such as loading and cleaning a sales dataset, implementing a SalesDataProcessor class with various methods for data analysis, and creating a subclass for customer-specific sales insights. Additionally, it suggests bonus tasks for data visualization, including bar charts and pie charts to represent sales data.

Uploaded by Arkodeep Ray
DA108 | Lab 08 Assignment

Objective A: Retail Sales Data Analysis using OOP and Pandas

You are working as a data analyst for a retail store. The store maintains sales data for its products, and
your task is to process, analyze, and generate reports using Object-Oriented Programming (OOP) and
Pandas.

Dataset:

You have been given a CSV file (sales_data.csv) with the following columns:

● OrderID: Unique identifier for the order
● Product: Name of the product
● Category: Product category (e.g., Electronics, Clothing)
● Quantity: Number of items sold
● PricePerUnit: Price per unit of the product
● TotalPrice: Total price (Quantity × PricePerUnit)
● Date: Date of purchase
● CustomerID: Unique identifier for the customer
● City: The city where the purchase was made
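Because TotalPrice is defined as Quantity × PricePerUnit, that relationship can serve as a quick sanity check on the data. The following is a minimal sketch, assuming a small hand-made DataFrame (the rows are hypothetical, not taken from sales_data.csv) and an arbitrary tolerance of 0.01 for rounding differences.

```python
import pandas as pd

# Hypothetical rows for illustration only; the real file is sales_data.csv.
df = pd.DataFrame({
    "OrderID": [1, 2, 3],
    "Quantity": [1, 3, 2],
    "PricePerUnit": [800.0, 15.0, 300.0],
    "TotalPrice": [800.0, 45.0, 650.0],  # last row is deliberately inconsistent
})

# Recompute the total from its definition and compare with the stored value.
expected = df["Quantity"] * df["PricePerUnit"]
mismatch = df[(df["TotalPrice"] - expected).abs() > 0.01]  # 0.01 is an assumed tolerance

print(mismatch)
```

Rows that show up in `mismatch` are candidates for correction or removal during cleaning.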

Task

Part 1: Data Handling with Pandas (Basic)

1. Load the dataset into a Pandas DataFrame.
2. Display the first few rows of the dataset.
3. Handle missing values (if any) by filling or removing them.
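The three steps above can be sketched as follows. This is one possible approach, not a required solution: the inline `SAMPLE_CSV` string is a hypothetical stand-in so the snippet runs without the real file, and the choice of which columns to drop on versus fill is an assumption.

```python
import io
import pandas as pd

# Hypothetical sample standing in for sales_data.csv (not the real lab data).
SAMPLE_CSV = """OrderID,Product,Category,Quantity,PricePerUnit,TotalPrice,Date,CustomerID,City
1,Laptop,Electronics,1,800.0,800.0,2024-01-05,C001,Delhi
2,T-Shirt,Clothing,3,15.0,45.0,2024-01-05,C002,Mumbai
3,Phone,Electronics,2,,,2024-01-06,C001,Delhi
4,Jeans,Clothing,,40.0,,2024-01-07,C003,
"""

# Step 1: load. In the lab this would be pd.read_csv("sales_data.csv").
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Step 2: display the first few rows and count missing values per column.
print(df.head())
print(df.isna().sum())

# Step 3a: remove rows that lack the numeric fields needed for analysis.
dropped = df.dropna(subset=["Quantity", "PricePerUnit"])

# Step 3b: alternatively, fill a missing text field with a placeholder.
filled = df.copy()
filled["City"] = filled["City"].fillna("Unknown")
```

Whether to drop or fill depends on the column: a missing City can be labelled "Unknown" without distorting totals, while a missing Quantity or PricePerUnit leaves the sale amount undefined.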

Part 2: Implementing OOP Concepts (Intermediate)

Create a Python class called SalesDataProcessor with the following:

● Attributes:
  ○ df: Stores the sales dataset as a Pandas DataFrame.
● Methods:
  ○ load_data(file_path): Loads the dataset.
  ○ clean_data(): Handles missing values and converts data types.
  ○ get_total_sales(): Returns total sales (sum of TotalPrice).
  ○ get_unique_products(): Returns a list of unique products.
  ○ get_sales_by_category(): Returns total sales per product category.
  ○ get_top_selling_product(): Returns the product with the highest sales.
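A minimal sketch of the class described above is shown here. Several details are assumptions rather than requirements of the assignment: missing TotalPrice values are recomputed from Quantity × PricePerUnit, rows that still lack numeric fields are dropped, and "highest sales" is interpreted as highest revenue rather than highest quantity. The sample CSV is hypothetical.

```python
import io
import pandas as pd


class SalesDataProcessor:
    def __init__(self):
        self.df = pd.DataFrame()  # holds the sales dataset

    def load_data(self, file_path):
        # pd.read_csv accepts a path or a file-like object.
        self.df = pd.read_csv(file_path)
        return self.df

    def clean_data(self):
        df = self.df.copy()
        # Convert data types; unparseable values become NaN/NaT.
        df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
        for col in ["Quantity", "PricePerUnit", "TotalPrice"]:
            df[col] = pd.to_numeric(df[col], errors="coerce")
        # Assumption: recompute a missing TotalPrice from its definition.
        df["TotalPrice"] = df["TotalPrice"].fillna(df["Quantity"] * df["PricePerUnit"])
        # Assumption: drop rows whose numeric fields are still missing.
        df = df.dropna(subset=["Quantity", "PricePerUnit", "TotalPrice"])
        df = df.astype({"Quantity": "int64"})
        self.df = df.reset_index(drop=True)
        return self.df

    def get_total_sales(self):
        return float(self.df["TotalPrice"].sum())

    def get_unique_products(self):
        return self.df["Product"].dropna().unique().tolist()

    def get_sales_by_category(self):
        return self.df.groupby("Category")["TotalPrice"].sum()

    def get_top_selling_product(self):
        # Assumption: "highest sales" means highest total revenue.
        return self.df.groupby("Product")["TotalPrice"].sum().idxmax()


# Hypothetical sample standing in for sales_data.csv.
SAMPLE_CSV = """OrderID,Product,Category,Quantity,PricePerUnit,TotalPrice,Date,CustomerID,City
1,Laptop,Electronics,1,800.0,800.0,2024-01-05,C001,Delhi
2,T-Shirt,Clothing,3,15.0,45.0,2024-01-05,C002,Mumbai
3,Phone,Electronics,2,300.0,,2024-01-06,C001,Delhi
4,Jeans,Clothing,,40.0,,2024-01-07,C003,Pune
"""

processor = SalesDataProcessor()
processor.load_data(io.StringIO(SAMPLE_CSV))  # in the lab: "sales_data.csv"
processor.clean_data()
print(processor.get_total_sales())
print(processor.get_sales_by_category())
print(processor.get_top_selling_product())
```

If the intended meaning of "highest sales" is units sold, `get_top_selling_product` could group on Quantity instead of TotalPrice.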

Part 3: Extending OOP with Inheritance (Advanced)

Create a subclass called CustomerSalesProcessor, which extends SalesDataProcessor and adds:

● New Methods:
  ○ get_total_sales_by_customer(customer_id): Returns total sales made by a specific customer.
  ○ get_frequent_customers(n): Returns the top n customers who made the most purchases.
  ○ get_sales_by_city(): Returns total sales per city.
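The subclass could be sketched as below. To keep the snippet runnable on its own, it includes a stripped-down stand-in for SalesDataProcessor (only `df` and `load_data`); in a real solution the full Part 2 class would be the parent. The interpretation of "most purchases" as the largest number of distinct orders is an assumption, and the sample data is hypothetical.

```python
import io
import pandas as pd


class SalesDataProcessor:
    """Minimal stand-in for the Part 2 base class so this sketch runs alone."""

    def __init__(self):
        self.df = pd.DataFrame()

    def load_data(self, file_path):
        self.df = pd.read_csv(file_path)
        return self.df


class CustomerSalesProcessor(SalesDataProcessor):
    def get_total_sales_by_customer(self, customer_id):
        mask = self.df["CustomerID"] == customer_id
        return float(self.df.loc[mask, "TotalPrice"].sum())

    def get_frequent_customers(self, n):
        # Assumption: "most purchases" = largest number of distinct orders.
        counts = self.df.groupby("CustomerID")["OrderID"].nunique()
        return counts.nlargest(n)

    def get_sales_by_city(self):
        return self.df.groupby("City")["TotalPrice"].sum()


# Hypothetical sample standing in for sales_data.csv.
SAMPLE_CSV = """OrderID,Product,Category,Quantity,PricePerUnit,TotalPrice,Date,CustomerID,City
1,Laptop,Electronics,1,800.0,800.0,2024-01-05,C001,Delhi
2,T-Shirt,Clothing,3,15.0,45.0,2024-01-05,C002,Mumbai
3,Phone,Electronics,2,300.0,600.0,2024-01-06,C001,Delhi
4,Jeans,Clothing,2,40.0,80.0,2024-01-07,C003,Pune
5,Socks,Clothing,6,5.0,30.0,2024-01-07,C001,Mumbai
6,Cap,Clothing,2,10.0,20.0,2024-01-08,C002,Mumbai
"""

customers = CustomerSalesProcessor()
customers.load_data(io.StringIO(SAMPLE_CSV))  # in the lab: "sales_data.csv"
print(customers.get_total_sales_by_customer("C001"))
print(customers.get_frequent_customers(2))
print(customers.get_sales_by_city())
```

Because the subclass inherits `df` and `load_data`, it only needs to define the three customer-focused methods.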

Part 4: Data Visualization (Bonus)

1. Plot a bar chart showing total sales by category.
2. Plot a line graph of daily sales trends.
3. Plot a pie chart showing the percentage contribution of different cities to total sales.
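One way to produce the three charts is with pandas' built-in plotting on top of Matplotlib, as sketched below. The sample data is hypothetical, the figure layout and titles are arbitrary choices, and the `Agg` backend line is an assumption that the script runs without a display; it can be removed in a notebook.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove when working in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample standing in for sales_data.csv.
SAMPLE_CSV = """OrderID,Product,Category,Quantity,PricePerUnit,TotalPrice,Date,CustomerID,City
1,Laptop,Electronics,1,800.0,800.0,2024-01-05,C001,Delhi
2,T-Shirt,Clothing,3,15.0,45.0,2024-01-05,C002,Mumbai
3,Phone,Electronics,2,300.0,600.0,2024-01-06,C001,Delhi
4,Jeans,Clothing,2,40.0,80.0,2024-01-07,C003,Pune
5,Socks,Clothing,6,5.0,30.0,2024-01-07,C001,Mumbai
6,Cap,Clothing,2,10.0,20.0,2024-01-08,C002,Mumbai
"""
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Date"])

# Aggregate once, then plot each series.
by_category = df.groupby("Category")["TotalPrice"].sum()
daily = df.groupby("Date")["TotalPrice"].sum().sort_index()
by_city = df.groupby("City")["TotalPrice"].sum()

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# 1. Bar chart: total sales by category.
by_category.plot(kind="bar", ax=axes[0], title="Total sales by category")
axes[0].set_ylabel("TotalPrice")

# 2. Line graph: daily sales trend.
daily.plot(kind="line", marker="o", ax=axes[1], title="Daily sales trend")
axes[1].set_ylabel("TotalPrice")

# 3. Pie chart: percentage contribution of each city.
by_city.plot(kind="pie", autopct="%1.1f%%", ax=axes[2], title="Sales share by city")
axes[2].set_ylabel("")

fig.tight_layout()
# plt.show()  # uncomment in an interactive session
```

The same aggregations could come from get_sales_by_category() and get_sales_by_city() if the classes from Parts 2 and 3 are available.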
