ESE Ques Pattern

Uploaded by

neeturaunak123
Copyright © All Rights Reserved

A) You are given the following dataset representing the monthly average sales (amount) of a city over a year:

months = ["January", "February", "March", "April", "May", "June", "July", "August",
"September", "October", "November", "December"]
sales = [3500, 5000, 9000, 13000, 18000, 22000, 25000, 24000, 20000, 14000, 8000, 4000]

Using Matplotlib in Python, write a script to perform the following tasks:

1. Create a line plot of the data where:
• The x-axis represents the months.
• The y-axis represents the average sales.
• The line is coloured blue and has square markers at each data point.
• The x-axis and y-axis have appropriate labels, and the plot has a title.

2. Add a second line to the same plot representing the monthly average bonus (amount) for the same city:
bonus = [7800, 6000, 5500, 4500, 6000, 7000, 8500, 8000, 7500, 7000, 8500, 9000]
• This line should be coloured green and have circular markers.
• Add a legend to distinguish between the sales and bonus lines.

3. Customize the plot by:

• Adding a grid to the plot.
• Adjusting the figure size to make the plot wider.
• Saving the final plot as an image file named weather_plot.png.
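
One possible way to approach Part A is sketched below. It is not a model answer. The axis label text, the title, the figure size of (12, 5), the tick rotation and the non-interactive "Agg" backend are all assumptions made for illustration; only the data, colours, markers and the output filename come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt

months = ["January", "February", "March", "April", "May", "June", "July", "August",
          "September", "October", "November", "December"]
sales = [3500, 5000, 9000, 13000, 18000, 22000, 25000, 24000, 20000, 14000, 8000, 4000]
bonus = [7800, 6000, 5500, 4500, 6000, 7000, 8500, 8000, 7500, 7000, 8500, 9000]

# Task 3: a wide figure (the exact size is an arbitrary choice)
fig, ax = plt.subplots(figsize=(12, 5))

# Task 1: blue line with square markers for sales
ax.plot(months, sales, color="blue", marker="s", label="Sales")

# Task 2: green line with circular markers for bonus
ax.plot(months, bonus, color="green", marker="o", label="Bonus")

# Labels and title (wording is illustrative)
ax.set_xlabel("Month")
ax.set_ylabel("Average amount")
ax.set_title("Monthly Average Sales and Bonus")
ax.tick_params(axis="x", rotation=45)  # keeps the month names from overlapping

ax.legend()    # distinguishes the two lines using the label= values
ax.grid(True)  # Task 3: grid

fig.tight_layout()
fig.savefig("weather_plot.png")  # filename as given in the question
```

In an interactive session the backend line can be dropped and plt.show() called after saving.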

B) You are provided with a dataset containing information about sales transactions from a
retail store. The dataset includes the following columns:

• TransactionID: Unique identifier for each transaction.
• ProductCategory: Category of the product sold (e.g., 'Electronics', 'Clothing', 'Groceries').
• SalesAmount: The amount of the transaction in INR.
• TransactionDate: The date of the transaction.

TransactionID  ProductCategory  SalesAmount  TransactionDate
1              Electronics      25000        2024-01-15
2              Clothing         15000        2024-01-17
3              Groceries        50000        2024-01-18
...            ...              ...          ...

Using Python's pandas library, write a script to perform the following tasks:
1. (3 marks) Group the data by ProductCategory and calculate the total sales
(SalesAmount) for each category. Display the resulting DataFrame.
2. (2 marks) Calculate the average sales amount per transaction for each
ProductCategory. Display the resulting DataFrame.
3. (3 marks) Group the data by both ProductCategory and TransactionDate, and
calculate the total sales for each group. Display the resulting DataFrame.
4. (2 marks) Identify the ProductCategory with the highest average sales amount per
transaction. Display the category name and the corresponding average sales amount.
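
The four tasks in Part B could be approached along the following lines. This is a sketch only: the question elides most of the dataset, so the DataFrame below holds the three rows shown in the table plus two rows (TransactionID 4 and 5) that are hypothetical and added purely so that the grouping has something to aggregate. The variable names are also illustrative.

```python
import pandas as pd

# Rows 1-3 come from the question; rows 4-5 are HYPOTHETICAL sample data.
df = pd.DataFrame({
    "TransactionID": [1, 2, 3, 4, 5],
    "ProductCategory": ["Electronics", "Clothing", "Groceries",
                        "Electronics", "Clothing"],
    "SalesAmount": [25000, 15000, 50000, 35000, 5000],
    "TransactionDate": ["2024-01-15", "2024-01-17", "2024-01-18",
                        "2024-01-15", "2024-01-18"],
})
df["TransactionDate"] = pd.to_datetime(df["TransactionDate"])

# Task 1: total sales per category
total_sales = df.groupby("ProductCategory", as_index=False)["SalesAmount"].sum()
print(total_sales)

# Task 2: average sales amount per transaction, per category
avg_sales = df.groupby("ProductCategory", as_index=False)["SalesAmount"].mean()
print(avg_sales)

# Task 3: total sales per (category, date) pair
by_cat_date = df.groupby(["ProductCategory", "TransactionDate"],
                         as_index=False)["SalesAmount"].sum()
print(by_cat_date)

# Task 4: category with the highest average sales amount
top = avg_sales.loc[avg_sales["SalesAmount"].idxmax()]
print(top["ProductCategory"], top["SalesAmount"])
```

Passing as_index=False keeps ProductCategory as an ordinary column, so each result is displayed as a DataFrame, as the tasks request, and not as a Series.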

C) You are provided with a dataset containing information about a company's
employees. The dataset includes the following columns:
• EmployeeID: Unique identifier for each employee.
• Name: The full name of the employee.
• Department: The department where the employee works (e.g., 'HR', 'Finance', 'IT').
• Salary: The annual salary of the employee in USD.
• JoinDate: The date when the employee joined the company.
• Gender: The gender of the employee, which contains some missing values.
• YearsAtCompany: The number of years the employee has been with the company,
which needs to be calculated.

EmployeeID  Name           Department  Salary  JoinDate    Gender  YearsAtCompany
1           Alice Johnson  HR          60000   2020-02-15  Female
2           Bob Smith      IT          75000   2018-08-10  Male
3           Charlie Brown  Finance     50000   2021-05-21
...         ...            ...         ...     ...         ...     ...

Using Python's pandas library, write a script to perform the following data wrangling tasks:
1. (3 marks) Handle missing values in the Gender column:
   • Fill missing Gender values with the mode (most frequent value).
   • Display the number of missing values in the Gender column before and after filling.
2. (2 marks) Calculate the YearsAtCompany for each employee based on the current
date (assume today's date is September 1, 2024) and the JoinDate. Populate the
YearsAtCompany column with the calculated values and display the first 5 rows of
the updated DataFrame.
3. (3 marks) Standardize the Name column by converting all names to title case (e.g.,
"alice johnson" should become "Alice Johnson"). Display the first 5 rows of the
updated DataFrame.
4. (2 marks) Remove any duplicate rows in the DataFrame based on the EmployeeID
column and display the number of rows before and after removing duplicates.
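
A minimal sketch of the Part C wrangling steps follows, under stated assumptions. The first three rows are those shown in the table; the fourth row (a lowercase duplicate of EmployeeID 1) is hypothetical and exists only to exercise the title-case and duplicate-removal steps. YearsAtCompany is computed here as whole completed years, which is one reasonable reading of the task; a fractional value such as days / 365.25 would be another.

```python
import pandas as pd

# Rows 1-3 come from the question; the last row is HYPOTHETICAL sample data.
df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 1],
    "Name": ["Alice Johnson", "Bob Smith", "Charlie Brown", "alice johnson"],
    "Department": ["HR", "IT", "Finance", "HR"],
    "Salary": [60000, 75000, 50000, 60000],
    "JoinDate": ["2020-02-15", "2018-08-10", "2021-05-21", "2020-02-15"],
    "Gender": ["Female", "Male", None, "Female"],
})
df["JoinDate"] = pd.to_datetime(df["JoinDate"])

# Task 1: fill missing Gender values with the mode
missing_before = df["Gender"].isna().sum()
df["Gender"] = df["Gender"].fillna(df["Gender"].mode()[0])
missing_after = df["Gender"].isna().sum()
print("Missing Gender before:", missing_before, "after:", missing_after)

# Task 2: completed years between JoinDate and the assumed current date
today = pd.Timestamp("2024-09-01")
join = df["JoinDate"]
# True where this year's work anniversary has not been reached yet
anniversary_pending = (join.dt.month > today.month) | (
    (join.dt.month == today.month) & (join.dt.day > today.day)
)
df["YearsAtCompany"] = today.year - join.dt.year - anniversary_pending.astype(int)
print(df.head())

# Task 3: standardize names to title case
df["Name"] = df["Name"].str.title()
print(df.head())

# Task 4: drop duplicate EmployeeID rows, keeping the first occurrence
rows_before = len(df)
df = df.drop_duplicates(subset="EmployeeID", keep="first")
rows_after = len(df)
print("Rows before:", rows_before, "after:", rows_after)
```

If several Gender values tie for most frequent, mode() returns them in sorted order and [0] picks the first, so the fill value in that case is a tie-break and not a true majority.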
