Linear Regression Experiment
Theory: Linear regression is a regression model that uses a straight line to describe the
relationship between variables. It finds the line of best fit through given data by searching for
the value of the regression coefficient(s) that minimizes the total error of the model.
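Concretely, for a single predictor the fitted line has the form y = b0 + b1·x, and the "total error" being minimized is the sum of squared residuals, Σ(y_i − (b0 + b1·x_i))²; this is the ordinary least squares criterion that R's lm() function uses.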
First, install the packages needed for the analysis (this only has to be done once):
install.packages("ggplot2")
install.packages("dplyr")
install.packages("broom")
install.packages("ggpubr")
Next, load the packages into the R environment by running this code (you need to do this every
time you restart R):
library(ggplot2)
library(dplyr)
library(broom)
library(ggpubr)
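The two example datasets then need to be read into R. A minimal sketch, assuming they have been saved as CSV files in the working directory (the file names here are assumptions; point them at the actual files):
# File names are assumed; adjust the paths to where the CSV files are saved
income.data <- read.csv("income.data.csv")
heart.data <- read.csv("heart.data.csv")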
Once the data is loaded, check that it has been read in correctly using summary().
1. Simple regression
summary(income.data)
Because both variables are quantitative, executing this function produces a table with a
numeric summary of the data. It shows the minimum, median, mean, and maximum values of the
independent variable (income) and the dependent variable (happiness):
2. Multiple regression
summary(heart.data)
Again, because the variables are quantitative, running the code produces a numeric
summary of the data for the independent variables (smoking and biking) and the
dependent variable (heart disease):
Step 3: To check whether the dependent variable follows a normal distribution, use
the hist() function.
hist(income.data$happiness)
Step 4: The relationship between the independent and dependent variable must be
linear. Test this visually with a scatter plot and check whether the distribution of data points
could be described with a straight line (a sketch of this plot is given below).
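One way to draw this scatter plot, using the ggplot2 package loaded above (a sketch; base R's plot() would work just as well):
# Scatter plot of happiness against income to check for a roughly linear pattern
ggplot(income.data, aes(x = income, y = happiness)) +
  geom_point()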
Step 5: Use the cor() function to test the relationship between independent variables and
make sure they aren’t too highly correlated.
cor(heart.data$biking, heart.data$smoking)
When this code is executed, the output is 0.015. The correlation between biking and smoking
is small (0.015 is only a 1.5% correlation), so both parameters can be included in the model.
Step 6: Use the hist() function to test whether your dependent variable follows a normal
distribution.
hist(heart.data$heart.disease)
Step 7: The linearity assumption is checked using two scatterplots: one for biking and heart
disease, and one for smoking and heart disease (sketched below).
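A sketch of these two plots, again using ggplot2 (the column names are the same ones used with cor() and hist() above):
# Heart disease against biking
ggplot(heart.data, aes(x = biking, y = heart.disease)) +
  geom_point()
# Heart disease against smoking
ggplot(heart.data, aes(x = smoking, y = heart.disease)) +
  geom_point()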
When the data meet the assumptions, perform a linear regression analysis to evaluate the
relationship between the independent and dependent variables.
Check if there’s a linear relationship between income and happiness in a survey of 500
people with incomes ranging from $15k to $75k, where happiness is measured on a scale of 1
to 10.
To perform a simple linear regression analysis and check the results, run the two lines
of code below. The first line creates the linear model, and the second prints out the
summary of the model:
income.happiness.lm <- lm(happiness ~ income, data = income.data)
summary(income.happiness.lm)
This output table first presents the model equation, then summarizes the model residuals (see
step 4).
The Coefficients section shows:
The estimates (Estimate) for the model parameters – the value of the y-intercept (in this case
0.204) and the estimated effect of income on happiness (0.713).
The final three lines are model diagnostics – the most important thing to note is the p-
value (here it is 2.2e-16, or almost zero), which indicates whether the model fits the data
well.
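The quantities described above can also be pulled out of the fitted model programmatically, for example with the broom package loaded earlier (a sketch):
tidy(income.happiness.lm)    # coefficient estimates, standard errors, t statistics, p-values
glance(income.happiness.lm)  # model-level statistics such as R-squared and the overall p-value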
Conclusion: The above result shows that there is a significant positive relationship between
income and happiness (p-value < 0.001), with a 0.713-unit (+/- 0.01) increase in happiness
for every unit increase in income.
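The same workflow applies to the multiple regression data prepared in Steps 5 to 7; a minimal sketch of the corresponding fit (the model object name is an assumption):
# Fit heart disease as a function of both predictors, then inspect the summary
heart.disease.lm <- lm(heart.disease ~ biking + smoking, data = heart.data)
summary(heart.disease.lm)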