Major Project

This project uses machine learning to predict flight prices based on parameters like stops, dates, airlines, and locations. Random forest regression is used to train a model on historical flight data, and hyperparameters are tuned. The model is then deployed using Flask to create a web app that allows users to input features and receive predicted prices.

Uploaded by

RISHABH GIRI

Project Name:

FLIGHT FARE PREDICTION

NAME: RISHABH GIRI (1906191)
NAME: RITIK SINGH (1906193)
NAME: SEJAL BARNWAL (1906207)
NAME: AKRITI SINHA (1906010)
ABOUT PROJECT

This model predicts the price of a flight from parameters such as
total stops, journey day, journey month, airline (for example Air India
or IndiGo), source, and destination. We trained the model with a random
forest regressor and then fine-tuned it, a step also known as
hyperparameter tuning. We then saved the model and deployed this Flight
Fare Prediction model as a Flask application on localhost.
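
The tuning and saving steps could look roughly like the sketch below. The slides do not say which search method or parameter grid was used, so RandomizedSearchCV, the grid values, and the synthetic data are all assumptions made for illustration.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (assumption): 4 numeric features,
# with the price driven mostly by the first one (think "stops").
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(200, 4)).astype(float)
y = 3000 + 2500 * X[:, 0] + rng.normal(0, 300, size=200)

# Hypothetical search space; the project's actual grid is not given.
param_distributions = {
    "n_estimators": [20, 50],
    "max_depth": [5, 10, None],
    "min_samples_split": [2, 5],
}

# Try 4 random combinations with 3-fold cross-validation.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=4,
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
best_model = search.best_estimator_

# Save and reload the tuned model. This is done in memory here;
# pickle.dump / pickle.load on a file object works the same way.
saved = pickle.dumps(best_model)
restored = pickle.loads(saved)
```

The reloaded model gives the same predictions as the tuned one, which is what the web app relies on when it loads the saved model.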
OVERVIEW
We have 2 datasets here: a training set and a test set.

The training set contains the features, along with the prices of the flights. It
contains 10683 records, 10 input features and 1 output column, ‘Price’.

The test set contains 2671 records and 10 input features. The output ‘Price’
column needs to be predicted for this set. We use regression techniques here,
since the predicted output is a continuous value.

The following features are available in the dataset: Airline, Date_of_Journey,
Source, Destination, Route, Dep_Time, Arrival_Time, Duration, Total_Stops,
Additional_Info, Price.
Technology Used In Project

01 Python: Language used
02 Jupyter Notebook: Platform used
03 Machine Learning: Algorithm
04 Flask Framework: Python
Different Types of Processes

1) Install Jupyter Notebook: the IDE in which we write the code.

2) Install libraries: install all the important Python libraries.

Tools:-

Pandas - This library is used for data analysis.
NumPy - It is used for mathematical calculations.
Seaborn/Matplotlib - They are used for data visualization.
Scikit-learn - It is used to train, validate, and test our ML model.
XGBoost - It is used in supervised learning (regression and
classification problems).

3) Dataset: download a dataset from the Kaggle website.


4) Clean dataset: delete unnecessary data from the dataset. We check:

1. Missing values in the dataset
2. All the numerical variables and their distribution
3. Categorical variables
4. Outliers
5. Relationship between the independent features and the dependent feature (Price)
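
The checks listed above can be sketched with pandas as follows. The sample rows are made up, and the 1.5 * IQR rule for outliers is one common choice, not necessarily the one used in the project.

```python
import pandas as pd

# Tiny made-up sample that mimics a few columns of the flight dataset.
df = pd.DataFrame({
    "Airline": ["IndiGo", "Air India", "IndiGo", "Air India", "IndiGo"],
    "Total_Stops": ["non-stop", "1 stop", None, "2 stops", "1 stop"],
    "Price": [3900, 7600, 4100, 13800, 6200],
})

# 1. Count missing values per column, then drop the affected rows
#    (reasonable when only a few rows are incomplete).
missing = df.isnull().sum()
clean = df.dropna().reset_index(drop=True)

# 2. and 3. Separate numerical and categorical columns.
numeric_cols = clean.select_dtypes(include="number").columns.tolist()
categorical_cols = clean.select_dtypes(exclude="number").columns.tolist()

# 4. Flag outliers in Price with the 1.5 * IQR rule (an assumed choice).
q1, q3 = clean["Price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (clean["Price"] < q1 - 1.5 * iqr) | (clean["Price"] > q3 + 1.5 * iqr)
outliers = clean[is_outlier]

# 5. Relationship between a feature and the price: median price per airline.
median_price = clean.groupby("Airline")["Price"].median()
```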
5) Perform EDA

From the description we can see that Date_of_Journey has the object
data type. Therefore, we have to convert it to a timestamp in order
to use this column properly for prediction.

For this we use pandas to_datetime to convert the object data
type to the datetime dtype.
.dt.day extracts only the day of that date
.dt.month extracts only the month of that date
After converting, the journey day and month are available as separate columns.
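
A minimal sketch of this conversion is shown below. The sample dates, the day/month/year format, and the new column names Journey_day and Journey_month are assumptions made for illustration.

```python
import pandas as pd

# Made-up sample; the dates are assumed to be in day/month/year order.
df = pd.DataFrame({"Date_of_Journey": ["24/03/2019", "01/05/2019", "09/06/2019"]})

# Convert the object (string) column to the datetime dtype.
journey = pd.to_datetime(df["Date_of_Journey"], format="%d/%m/%Y")

# Extract the day and the month as separate integer columns.
df["Journey_day"] = journey.dt.day
df["Journey_month"] = journey.dt.month

# The original string column is no longer needed for modelling.
df = df.drop(columns=["Date_of_Journey"])
```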
6) Feature Engineering: we add, delete, and combine features of the
dataset for better performance, and prepare the input data so that it
is compatible with the ML algorithm.
List of feature engineering techniques:-
Encoding
Grouping Operations
Feature Split
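
The three techniques can be illustrated on a few made-up rows. The stop ordering, the derived column names (Duration_minutes, Airline_flights), and the choice of a per-airline count as the grouping operation are assumptions, not the project's confirmed steps.

```python
import pandas as pd

# Made-up rows in the style of the flight dataset.
df = pd.DataFrame({
    "Airline": ["IndiGo", "Air India", "IndiGo"],
    "Total_Stops": ["non-stop", "2 stops", "1 stop"],
    "Duration": ["2h 50m", "19h", "5h 25m"],
})

# Encoding (ordinal): stops have a natural order, so map them to integers.
stops_order = {"non-stop": 0, "1 stop": 1, "2 stops": 2, "3 stops": 3, "4 stops": 4}
df["Total_Stops"] = df["Total_Stops"].map(stops_order)

# Encoding (one-hot): airlines have no order, so use one column per airline.
airline_dummies = pd.get_dummies(df["Airline"], prefix="Airline", dtype=int)

# Feature split: break "2h 50m" into hours and minutes, then combine to minutes.
hours = df["Duration"].str.extract(r"(\d+)h")[0].fillna("0").astype(int)
minutes = df["Duration"].str.extract(r"(\d+)m")[0].fillna("0").astype(int)
df["Duration_minutes"] = hours * 60 + minutes

# Grouping operation: number of flights per airline as an extra feature.
df["Airline_flights"] = df.groupby("Airline")["Airline"].transform("count")

# Final numeric feature table.
features = pd.concat([df.drop(columns=["Airline", "Duration"]), airline_dummies], axis=1)
```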
7) Feature Selection: we find the correlation values between the features and the price through a heat map.
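
A sketch of this step on synthetic data follows. The column names and the generated values are assumptions; the plotting helper is defined but not called, and it needs seaborn and matplotlib installed.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): price depends on stops and duration, not on the day.
rng = np.random.default_rng(0)
n = 200
stops = rng.integers(0, 4, size=n)
duration = 90 + 180 * stops + rng.normal(0, 30, size=n)
day = rng.integers(1, 29, size=n)
price = 3000 + 2500 * stops + 5 * duration + rng.normal(0, 500, size=n)

df = pd.DataFrame({
    "Total_Stops": stops,
    "Duration_minutes": duration,
    "Journey_day": day,
    "Price": price,
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Rank the features by the strength of their correlation with Price.
price_corr = corr["Price"].drop("Price").abs().sort_values(ascending=False)


def plot_heatmap(corr_matrix):
    """Draw the correlation matrix as an annotated heat map."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    plt.figure(figsize=(6, 5))
    sns.heatmap(corr_matrix, annot=True, cmap="RdYlGn")
    plt.show()
```

Features that correlate weakly with Price, such as the journey day in this synthetic example, are candidates for removal.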
Fitting model using Random Forest

1. Split the dataset into a train set and a test set in order to
predict w.r.t. X_test
2. If needed, scale the data
Scaling is not required for random forest
3. Import the model
4. Fit the data
5. Predict w.r.t. X_test
6. For regression, check the RMSE score
7. Plot a graph
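
The steps above can be sketched with scikit-learn as follows. The data is synthetic, and the 80/20 split, the 100 trees, and the random seeds are assumed values rather than the project's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered flight features (assumption).
rng = np.random.default_rng(42)
n = 500
stops = rng.integers(0, 4, size=n)
day = rng.integers(1, 29, size=n)
month = rng.integers(1, 13, size=n)
duration = 90 + 180 * stops + rng.normal(0, 30, size=n)
X = np.column_stack([stops, day, month, duration])
y = 3000 + 2500 * stops + 5 * duration + rng.normal(0, 500, size=n)

# 1. Split into train and test sets (80/20).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. No scaling: tree-based models do not depend on feature scale.

# 3. and 4. Create the model and fit it on the training data.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 5. Predict w.r.t. X_test.
y_pred = model.predict(X_test)

# 6. RMSE: square root of the mean squared error.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
```

For step 7, a scatter plot of y_test against y_pred is a common choice: points close to the diagonal indicate good predictions.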
Checking accuracy of the model:

Evaluating model accuracy is an essential part of creating machine
learning models, since it describes how well the model performs in its
predictions. The MSE, MAE, and RMSE metrics are mainly used to evaluate
prediction error rates and model performance in regression analysis.

• MAE (Mean Absolute Error)
• MSE (Mean Squared Error)
• RMSE (Root Mean Squared Error)
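
The three metrics can be computed directly from their definitions. The prices below are made-up values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Made-up actual and predicted prices.
y_true = np.array([3000.0, 5000.0, 7000.0])
y_pred = np.array([2000.0, 5000.0, 10000.0])

errors = y_true - y_pred           # [1000, 0, -3000]

mae = np.mean(np.abs(errors))      # mean of absolute errors: 4000 / 3
mse = np.mean(errors ** 2)         # mean of squared errors: 10,000,000 / 3
rmse = np.sqrt(mse)                # square root of MSE, in the same unit as the price
```

Because the errors are squared before averaging, MSE and RMSE penalise the single large miss (3000) more heavily than MAE does.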
Model Deployment
Model deployment is one of the last stages of any machine learning project. Here, we
design a user interface. We used Flask to build an HTML page for flight price prediction. It
takes the input value for each feature and calculates the price of a flight.
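
A minimal sketch of such a Flask endpoint is given below. The route name, the form field names, and the predict_price stand-in are hypothetical; the real app would load the saved random forest model and render an HTML page instead of returning JSON.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_price(features):
    """Hypothetical stand-in for model.predict on the saved model."""
    return 3000.0 + 2500.0 * features["total_stops"]


@app.route("/predict", methods=["POST"])
def predict():
    form = request.form
    try:
        # Read one input value per feature from the submitted form.
        features = {
            "total_stops": int(form["total_stops"]),
            "journey_day": int(form["journey_day"]),
            "journey_month": int(form["journey_month"]),
        }
    except (KeyError, ValueError):
        # A field is missing or is not a whole number.
        return jsonify(error="invalid input"), 400

    return jsonify(predicted_price=round(predict_price(features), 2))

# Calling app.run() would serve this app on localhost.
```

Returning a 400 response for missing or non-numeric fields keeps bad form input from reaching the model.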
THANK YOU
