Aerofit_Case_Study
Data
The analysis was done on the data located at:
https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/001/125/original/aerofit_treadmill.csv
Libraries
Below are the libraries required for analysing and visualizing data
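The notebook's import cell is not reproduced here, so the following is only a minimal sketch of how the data could be loaded. It assumes pandas is the library behind the `df` object used throughout; the helper name `load_aerofit` is hypothetical, and any plotting libraries the original notebook may have imported are left out.

```python
import pandas as pd

# URL quoted in the Data section above.
DATA_URL = (
    "https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/"
    "000/001/125/original/aerofit_treadmill.csv"
)


def load_aerofit(source=DATA_URL):
    """Read the treadmill data from a URL, a local path, or a file-like object.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return pd.read_csv(source)
```

Calling `df = load_aerofit()` followed by `df.info()` should give a summary like the one shown next, provided the URL is reachable.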
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 180 entries, 0 to 179
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Product 180 non-null object
1 Age 180 non-null int64
2 Gender 180 non-null object
3 Education 180 non-null int64
4 MaritalStatus 180 non-null object
5 Usage 180 non-null int64
6 Fitness 180 non-null int64
7 Income 180 non-null int64
8 Miles 180 non-null int64
dtypes: int64(6), object(3)
memory usage: 12.8+ KB
None
*************************************************
A quick look at the data summary reveals that there are 180 rows and 9 columns, implying that 180
products were sold to different customers, with information on each customer such as age, gender and
income, to name a few. The Product, Gender and MaritalStatus columns have the “object” datatype and the
rest are int64. We can also infer that there are no missing values or nulls in the dataset.

A sample of the data is shown below:
Out[3]: Product Age Gender Education MaritalStatus Usage Fitness Income Miles
In [4]: df.describe()
The above table shows summary statistics of the data, such as the mean, minimum and maximum values. As
we can see, there is a large spread in the Income and Miles data.
Analysis
Detecting outliers
a. Outliers for every continuous variable
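In [5]: # helper function to detect outliers using the 1.5 * IQR rule
def detectOutliers(df):
    q1 = df.quantile(0.25)   # first quartile
    q3 = df.quantile(0.75)   # third quartile
    iqr = q3 - q1            # interquartile range
    # keep only the values that fall outside the lower or upper fence
    outliers = df[(df < (q1 - 1.5 * iqr)) | (df > (q3 + 1.5 * iqr))]
    return outliers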
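To make the IQR rule concrete, here is a small self-contained sketch. It restates the helper under the name `detect_outliers` so that it runs on its own, and applies it to a made-up series of mileage values that are not taken from the dataset.

```python
import pandas as pd


def detect_outliers(series: pd.Series) -> pd.Series:
    """Return the values lying outside the 1.5 * IQR fences."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return series[(series < lower) | (series > upper)]


# Made-up values for illustration only (not from the Aerofit data).
miles = pd.Series([50, 60, 70, 80, 90, 100, 300])
outliers = detect_outliers(miles)
print(outliers.tolist())  # [300]
```

Here q1 = 65 and q3 = 95, so the fences are 20 and 140, and only the value 300 is flagged.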
The values in each continuous column are limited (clipped) to the range between the 5th and 95th
percentiles of that column, so that extreme values do not bias the analysis.
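The clipping code itself is not shown, so the following is a sketch of one way to do it, assuming `Series.clip` with per-column quantiles. The function name `clip_to_percentiles` and the demo values are hypothetical.

```python
import pandas as pd


def clip_to_percentiles(df, columns, lower=0.05, upper=0.95):
    """Return a copy of df with each listed column clipped to its own
    [lower, upper] quantile range. Hypothetical helper for illustration."""
    clipped = df.copy()
    for col in columns:
        lo, hi = clipped[col].quantile([lower, upper])
        clipped[col] = clipped[col].clip(lower=lo, upper=hi)
    return clipped


# Made-up demo column holding the integers 0..100.
demo = pd.DataFrame({"Income": list(range(101))})
clipped = clip_to_percentiles(demo, ["Income"])
print(clipped["Income"].min(), clipped["Income"].max())
```

Clipping keeps all 180 rows and only pulls extreme values in to the percentile bounds, unlike dropping outliers, which would shrink the sample.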
Probability
a. Marginal probability of each product
In [12]: pd.crosstab(df['Product'], df['Product'], normalize=True)
44.4% of customers purchased KP281, 33.3% purchased KP481 and 22.2% purchased KP781.
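As a cross-check, the same marginal probabilities can be obtained with `value_counts(normalize=True)`. This is a simpler alternative to the `crosstab` call, not the notebook's own method. The sketch below rebuilds the Product column from the per-product totals reported in this analysis (80, 60 and 40) so that it runs without the CSV.

```python
import pandas as pd

# Product column rebuilt from the reported totals: 80, 60 and 40 units.
products = pd.Series(
    ["KP281"] * 80 + ["KP481"] * 60 + ["KP781"] * 40, name="Product"
)

# Share of each product among all 180 purchases.
product_marginal = products.value_counts(normalize=True)
print(product_marginal.round(3))
```

On the real data the equivalent call would be `df['Product'].value_counts(normalize=True)`.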
Gender   Female  Male  All
Product
KP281        40    40   80
KP481        29    31   60
KP781         7    33   40
Of the 180 customers who bought a product, 76 were female and 104 were male. So, the marginal probability
that a customer is female is 42.2% (76/180) and that a customer is male is 57.8% (104/180).
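The gender marginals can be reproduced from the Product by Gender counts above. The sketch below expands those counts into one row per customer, which is a reconstruction for illustration rather than the original CSV, and reads the marginals off the "All" row that `pd.crosstab(..., margins=True)` adds.

```python
import pandas as pd

# Counts taken from the Product x Gender table above.
counts = {
    ("KP281", "Female"): 40, ("KP281", "Male"): 40,
    ("KP481", "Female"): 29, ("KP481", "Male"): 31,
    ("KP781", "Female"): 7, ("KP781", "Male"): 33,
}

# Expand the counts into one row per customer.
rows = [
    {"Product": product, "Gender": gender}
    for (product, gender), n in counts.items()
    for _ in range(n)
]
customers = pd.DataFrame(rows)

# margins=True appends an "All" row and column holding the totals.
table = pd.crosstab(customers["Product"], customers["Gender"], margins=True)
gender_marginal = table.loc["All", ["Female", "Male"]] / table.loc["All", "All"]
print(gender_marginal.round(3))
```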
MaritalStatus  Partnered  Single  All
Product
KP281                 48      32   80
KP481                 36      24   60
KP781                 23      17   40
Similarly, the marginal probability that a customer is partnered is 59.4% (107/180) and that a customer
is single is 40.6% (73/180).
Fitness   1   2   3   4   5  All
Product
KP281     1  14  54   9   2   80
KP481     1  12  39   8   0   60
KP781     0   0   4   7  29   40
All       2  26  97  24  31  180
Based on the self-rated fitness level, a customer is most likely to be moderately fit (level 3): the
probability is 53.9% (97/180), higher than for any other fitness level.
c. Conditional probability
1. Given that a customer is female, the probability that she buys KP281, 52.6% (40/76), is much higher
than the probability that she buys KP781, 9.2% (7/76).
2. Given that a customer is male, the probability that he buys KP281, 38.5% (40/104), is a little higher
than for KP481 or KP781, which are almost the same: 29.8% (31/104) and 31.7% (33/104) respectively.
3. Given that a customer is partnered, the probability of buying KP281 is 44.9% (48/107), KP481 is
33.6% (36/107) and KP781 is 21.5% (23/107).
4. Given that a customer is single, the probability of buying KP281 is 43.8% (32/73), KP481 is
32.9% (24/73) and KP781 is 23.3% (17/73).
5. Given that a customer is moderately fit (level 3), the most likely purchase is KP281, at 55.7%
(54/97).
6. Given that a customer is extremely fit (level 5), the most likely purchase is KP781, at 93.5% (29/31).
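The conditional probabilities listed above are column-wise normalisations of the corresponding crosstab. As a sketch, the gender case can be reproduced with `normalize="columns"`, again using rows rebuilt from the reported Product by Gender counts rather than the original CSV.

```python
import pandas as pd

# Counts taken from the Product x Gender table reported earlier.
counts = {
    ("KP281", "Female"): 40, ("KP281", "Male"): 40,
    ("KP481", "Female"): 29, ("KP481", "Male"): 31,
    ("KP781", "Female"): 7, ("KP781", "Male"): 33,
}
rows = [
    {"Product": product, "Gender": gender}
    for (product, gender), n in counts.items()
    for _ in range(n)
]
customers = pd.DataFrame(rows)

# normalize="columns" divides each cell by its column total,
# giving P(Product | Gender).
cond = pd.crosstab(
    customers["Product"], customers["Gender"], normalize="columns"
)
print(cond.round(3))
```

Each column sums to 1, and `cond.loc["KP281", "Female"]` corresponds to the 40/76 in item 1. The same call with MaritalStatus or Fitness in place of Gender would give the remaining items.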
From the given dataset, it can be observed that Fitness and Miles are highly correlated, followed by
Usage and Miles. This is expected, as fit people tend to use the treadmill more often and run more miles.

On the other hand, Age appears to be unrelated to Usage, Miles and Fitness, which suggests that, in this
sample, fitness can be achieved at any age.
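The correlation code and its output are not reproduced here. A minimal sketch of how pairwise Pearson correlations might be computed with `DataFrame.corr()` follows. The six rows are made-up values chosen only to show the call; they are not the Aerofit data, and the coefficients they produce say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only (not from the Aerofit data).
sample = pd.DataFrame({
    "Age": [30, 22, 45, 28, 35, 24],
    "Usage": [2, 3, 3, 4, 5, 6],
    "Miles": [40, 60, 66, 85, 110, 150],
})

# Pearson correlation between every pair of numeric columns.
corr = sample[["Age", "Usage", "Miles"]].corr()
print(corr.round(2))
```

On the real data, the same call on the numeric columns of `df` would return the matrix behind the observations above.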
For KP281:
Age: Preferred by customers of all ages.
Gender: Preferred equally by male and female customers.
Education: Mostly preferred by customers who have completed less than 16
years of education.
MaritalStatus: Preferred more by partnered customers than by single
customers.
Usage: Preferred by customers who would use the treadmill fewer than 4
times/week.
Income: Preferred by low-income customers (46,000 dollars average income).
Fitness: Mostly preferred by customers with a fitness level of 3 or below.
Miles: Mostly preferred by customers who expect to walk/run 82 miles/week on
average.
For KP481:
Age: Preferred by customers of all ages.
Gender: Preferred almost equally by male and female customers.
Education: Mostly preferred by customers who have completed less than 16
years of education.
MaritalStatus: Preferred more by partnered customers than by single
customers.
Usage: Preferred by customers who would use the treadmill fewer than 4
times/week.
Income: Preferred by low-income customers (49,000 dollars average income).
Fitness: Mostly preferred by customers with a fitness level of 3 or below.
Miles: Mostly preferred by customers who expect to walk/run 88 miles/week on
average.
For KP781:
Age: Preferred by customers of all ages.
Gender: Mostly preferred by male customers.
Education: Mostly preferred by customers who have completed more than 16
years of education.
MaritalStatus: Preferred more by partnered customers than by single
customers.
Usage: Preferred by customers who would use the treadmill more than 4
times/week.
Income: Mostly preferred by high-income customers (75,000 dollars average
income).
Fitness: Mostly preferred by customers with a fitness level of 3 and above.
Miles: Mostly preferred by customers who expect to walk/run 167 miles/week
on average.
b. Recommendation
The products KP281 and KP481 should continue to be sold to customers of all
ages, genders and marital statuses who have a low to medium fitness level and
low income. They should also be selectively targeted at customers with low to
medium fitness but high income, to draw them into a fitness routine. These
customers are likely to buy the advanced model, KP781, later on, as cost
would not be a barrier, thereby increasing sales of all models.
The product KP781 is mostly purchased by males with high income and a high
fitness level. This model should be targeted at high-fitness individuals
with low income by providing easy finance options such as 0% EMI or a
subscription plan. This model should also be targeted at high-income
females to increase sales.