r File Finall

Uploaded by

aditi modi

Affiliated to Dr. A. P. J.

Abdul Kalam Technical University, Lucknow, Uttar Pradesh

PRACTICAL FILE

PROGRAM: -MBA (BUSINESS ANALYTICS)

SEMESTER-1A

ACADEMIC YEAR: - 2023-2024

SUBJECT: - BASICS OF DATA MANAGEMENT WITH “R”

SUBJECT CODE: - KMBA152

SCHOLAR NUMBER: - 2301008

SUBMITTED BY: - ADITI MODI

SUBMITTED TO: - MS. NEETU SINGH (Assistant Professor)

INDEX
S.No. Module
Intro How to Install R and RStudio
1. Learn the Basic Syntax of R
1.1 R Script for Arithmetic Operators
1.2 R Script for Logical Operators
1.3 R Script for Relational Operators
1.4 R Script for Assignment Operators
1.5 R Script for Miscellaneous Operators
1.6 R Script for Conditional Statements
1.7 R Scripts for Looping
1.8 R Scripts for User-Defined Functions
1.9 R Scripts for Data Frames

2. Learn how to organize and modify data in R using data frames and dplyr: R Script for Data Manipulation with the help of the dplyr package
2.1 Filter Function
2.2 Distinct Function
2.3 Arrange Function
2.4 Select Function
2.5 Rename Function
2.6 Mutate Function
2.7 Transmute Function
2.8 Summarize Function

3. Learn how to prepare data for analysis in R using dplyr and tidyr: R Script for Data Manipulation with the help of the tidyr package
3.1 Gather Function
3.2 Separate Function
3.3 Unite Function
3.4 Spread Function

4. Learn the basics of how to create visualizations using the popular R package ggplot2
4.1 R Script for Summary of Data Set
4.2 R Script for Data Layers
4.3 R Script for Aesthetic Layer
4.4 R Script for Geometric Layer
4.5 R Script for Adding Size, Colour and Shape
4.6 R Script for Histogram Plot
4.7 R Script for Facet Layer
4.8 R Script for Statistics Layer
4.9 R Script for Coordinates Layer
4.10 R Script for coord_cartesian()
4.11 R Script for Theme Layer

5. Learn the basics of aggregate functions in R with dplyr, which let us calculate quantities that describe groups of data
5.1 R Script to create a data frame with 4 columns, group by subject, and get aggregates such as minimum, sum, and maximum
5.2 R Script to create a data frame with 4 columns, group by subject, and get the average (mean)

6. Learn the basics of joining tables together in R with dplyr
6.1 R Script for Inner Join
6.2 R Script for Left Join
6.3 R Script for Right Join
6.4 R Script for Full Join
6.5 R Script for Semi Join
6.6 R Script for Anti Join

7. Learn to use R or manually calculate the mean, median, and mode of real-world datasets
7.1 R Script for importing data using read.csv and finding the mean, median and mode

8. Learn how to quantify the spread of the dataset by calculating the variance and standard deviation in R

9. Learn how to calculate three important descriptive statistics (Quartiles, Quantiles, and Interquartile range) that describe the spread of the data

10. Learn about the statistics used to run hypothesis tests and use R to run different t-tests that compare distributions

HOW TO INSTALL R AND RSTUDIO

Steps for Downloading R.

Step – 1: Go to the CRAN R website.

Step – 2: Click on the "Download R for Windows" link.

Step – 3: Click on the "base" subdirectory link or the "install R for the first time" link.

Step – 4: Click "Download R X.X.X for Windows" (X.X.X stands for the latest version of R, e.g. 4.3.2) and save the executable .exe file.

Step – 5: Run the .exe file and follow the installation instructions.

Steps for Downloading RStudio.

Step – 1: With base R installed, move on to installing RStudio. Go to the RStudio download page and click on the download button for RStudio Desktop.

Step – 2: Click on the link for the Windows version of RStudio and save the .exe file.

Step – 3: Run the .exe file and follow the installation instructions.

R and RStudio are now installed on your computer.

R SCRIPT
Module 1- Basics of R syntax:

Program 1

1.1 Arithmetic operators: These operators perform mathematical operations such as addition, subtraction, multiplication and division. They are of 6 types:
a. Addition operator: The values at the corresponding positions of both the
operands are added.
Code:
a = c (1,4.9)
b = c (6, 4)
print (a+b)
Output:

b. Subtraction operator: The second operand values are subtracted from the first.
Code:
a = c (1,0.1)
b = c (2.33, 4)
print (a-b)
Output:

c. Multiplication operator (*): The corresponding elements of the two operands are multiplied.
Code:
a = c (1,8)
b = c (2, 4)
print (a*b)
Output:

d. Division operator (/): The first operand is divided by the second operand.
Code:
a = c (1,0.1)
b = c (2.33, 4)
print (a/b)
Output:

e. Power operator (^): The first operand is raised to the power of the second
operand.
Code:
a = c (1,2)
b = c (2, 4)
print (a^b)
Output:

f. Modulo operator (%%): The remainder of the first operand divided by the
second operand is returned.
Code:
a = c (1,6)
b = c (8 , 4)
print (a%%b)
Output:

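The operators above all work element by element. The following is a minimal sketch with made-up vectors `x` and `y` (not taken from this file). It also shows integer division `%/%`, which is not among the six types listed, and how a single number is recycled across a longer vector.

```r
# Illustrative values only; x and y are hypothetical.
x <- c(10, 7, 4)
y <- c(3, 2, 5)

sums       <- x + y    # element-wise addition
remainders <- x %% y   # element-wise remainder
quotients  <- x %/% y  # element-wise integer division
doubled    <- x * 2    # the single value 2 is recycled across x

print(sums)
print(remainders)
print(quotients)
print(doubled)
```

When the operands have different lengths, R repeats the shorter one, so it is worth checking lengths before combining vectors.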
1.2 Logical operators: Logical operators perform element-wise decisions between the operands and evaluate to a Boolean value, either TRUE or FALSE. Numeric operands are treated as TRUE when non-zero and FALSE when zero. They are of 5 types:
a. Element-wise Logical AND operator (&): Returns True if both the operands are
True.
Code:
a = c (7,5)
b = c (6, 7)
print (a&b)
Output:

b. Element-wise Logical OR operator (|): Returns True if either of the operands is
True.
Code:
a = c (7,10)
b = c (14, 5)
print (a|b)
Output:

c. NOT Operator(!): A unary operator that negates the status of the elements of
the operand.
Code:
a=7
print (!a)
Output:

d. Logical AND operator (&&): Returns TRUE if both operands are TRUE. Unlike &, it expects single values rather than vectors; since R 4.3.0 an operand with more than one element is an error.
Code:
a=2
b=5
print (a&&b)
Output:

e. Logical OR operator (||): Returns TRUE if either operand is TRUE. Like &&, it expects single values rather than vectors.
Code:
a=7
b=7
print (a||b)
Output:

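To make the difference between the element-wise and the single-value operators concrete, here is a small sketch. The vectors `p` and `q` are hypothetical and chosen only for illustration.

```r
# Hypothetical logical vectors for illustration.
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

elementwise_and <- p & q   # compares position by position
elementwise_or  <- p | q
negated         <- !p

# && and || expect single values, so reduce a vector first with all() or any().
all_true <- all(p) && all(q)
any_true <- any(p) || any(q)

# Non-zero numbers count as TRUE and zero counts as FALSE.
numbers_as_logical <- c(7, 0) & c(6, 7)

print(elementwise_and)
print(elementwise_or)
print(negated)
print(all_true)
print(any_true)
print(numbers_as_logical)
```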
1.3 Relational operators: The relational operators carry out comparison operations
between the corresponding elements of the operands. They are of 5 types:
a. Less than (<): Returns TRUE if the corresponding element of the first operand is
less than that of the second operand.
Code:
#less than
a <- c(5,3)
b <- c(4,7)
# Performing operations on Operands
print(a<b)
Output:

b. Less than or equal to (<=): Returns TRUE if the corresponding element of the first
operand is less than or equal to that of the second operand.
Code:
#less than equal to
a <- c(5,3)
b <- c(4,3)
# Performing operations on Operands
print(a<=b)
Output:

c. Greater than (>): Returns TRUE if the corresponding element of the first operand
is greater than that of the second operand.
Code:
#greater than
a <- c(5,3)
b <- c(4,7)
# Performing operations on Operands
print(a>b)
Output:

d. Greater than or equal to (>=): Returns TRUE if the corresponding element of the
first operand is greater than or equal to that of the second operand.
Code:
#greater than or equal to
a <- c(5,3)
b <- c(4,7)
# Performing operations on Operands
print(a>=b)
Output:

e. Not equal to (!=): Returns TRUE if the corresponding element of the first operand
is not equal to the second operand.
Code:
#not equal to
a <- c(5,7)
b <- c(4,7)
# Performing operations on Operands
print(a!=b)
Output:

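The list above does not include the equality operator `==`. The sketch below, with hypothetical values, shows it next to `all.equal()`, which is a common way to compare decimal numbers because exact equality can fail after floating-point rounding.

```r
# Hypothetical vectors for illustration.
a <- c(5, 3)
b <- c(5, 7)

equal_elements <- a == b   # element-wise equality

# Decimal arithmetic is stored in binary, so an exact comparison can be FALSE.
exact_match  <- (0.1 + 0.2) == 0.3
approx_match <- isTRUE(all.equal(0.1 + 0.2, 0.3))

print(equal_elements)
print(exact_match)
print(approx_match)
```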
1.4 Assignment operators: Assignment operators are used to assign values to various data objects in R. The objects may be integers, vectors, or functions. They are of 2 types:
a. Left assignment (<- or <<- or =): Assigns the value on the right to the name on the left.
Code:
#left assignment
a <- c(7:15)
# Performing operations on Operands
print(a)
Output:

b. Right assignment: (-> or ->>): Assigns a value to a vector.
Code:
#right assignment
c(12:25) -> b
# Performing operations on Operands
print(b)
Output:

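The `<<-` form is listed above but not demonstrated. As a rough sketch, assuming a variable named `counter` (a hypothetical name) defined outside two small functions, the difference from `<-` looks like this:

```r
# counter and both functions are hypothetical names for illustration.
counter <- 0

increment_local <- function() {
  counter <- counter + 1    # creates a new variable inside the function only
  invisible(counter)
}

increment_outer <- function() {
  counter <<- counter + 1   # modifies the variable in the enclosing environment
  invisible(counter)
}

increment_local()
after_local <- counter      # the outer counter is unchanged

increment_outer()
after_outer <- counter      # the outer counter has been increased

print(after_local)
print(after_outer)
```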
1.5 Miscellaneous operators: These operators cover tasks that do not fit the groups above, such as testing membership, generating sequences, and multiplying matrices. They are of 3 types:
a. %in% operator: Checks whether an element belongs to a vector and returns the Boolean value TRUE
if the value is present.
Code:
#%in% operator
a = 0.2
list1 = c(TRUE, 0.1, "apple")
print (a %in% list1)
Output:

b. Colon operator (:): Generates a sequence of numbers, in steps of 1, from the value before the colon to
the value after it.
Code:
#colon operator
a = (1:25)
print(a)
Output:

c. %*% operator: This operator performs matrix multiplication. In the example below a matrix is multiplied by its transpose.

Code:
#%*% operator
m = matrix(c(10,9,3,9,5,6,4,6,7), nrow=3, ncol=3)
print(m)
print(t(m))
p = m %*% t(m)
print(p)

Output:

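Since `*` and `%*%` are easy to confuse, the following sketch contrasts them on a small hypothetical 2 x 2 matrix `m`.

```r
# m is a hypothetical matrix; matrix() fills it column by column.
m <- matrix(c(1, 2, 3, 4), nrow = 2)

elementwise    <- m * m     # multiplies matching cells
matrix_product <- m %*% m   # rows of the first times columns of the second

print(elementwise)
print(matrix_product)
```

For `%*%` the number of columns of the left matrix must equal the number of rows of the right matrix, which is always the case when a matrix is multiplied by its own transpose.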
1.6 Decision making in R programming (Conditional statements): The decision-making
statements in R programming are as follows:
a. If statement: The keyword if tells the interpreter that this is a decision control instruction;
the condition following the keyword if is always enclosed within a pair of
parentheses. If the condition is TRUE the statement is executed, and if the
condition is FALSE the statement is skipped.
Code:
#IF statement
a = 58
b = 70
#FALSE condition: 58 > 70 is FALSE, so this block is skipped
if (a>b)
{
  c = a-b
  print("condition a>b is TRUE")
  print(paste("Difference between a,b is:", c))
}
#TRUE condition: 58 < 70 is TRUE, so this block runs
if (a<b)
{
  c = a-b
  print("condition a<b is TRUE")
  print(paste("Difference between a,b is:", c))
}
Output:

b. If-else statement: It provides an optional else block, which is
executed if the condition for the if block is FALSE. If the condition is
TRUE the statement within the if block is executed; otherwise the statement
within the else block is executed.
Code:
#Ifelse statement
a = 58
b = 70
if (a>b)
{
c = a-b
print("condition a>b is TRUE")
print(paste("Difference between a,b is:", c))
} else
{
c = a-b
print("condition a<b is TRUE")
print(paste("Difference between a,b is:", c))
}
Output:

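c. Nested if else statement: When an if-else block appears as a statement
within an if block, or optionally within an else block, it is called a nested
if-else statement. The example below nests calls to the vectorised ifelse() function, which returns its yes or no value depending on the test.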
Code:
#Nested IF Statement
ifelse(test = 15>16,
yes = ifelse(test = 15>14,
yes = 'TRUE TWICE',
no = "YES, & NO"),
no = "No")
Output:

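The example above uses the `ifelse()` function. For comparison, here is a sketch of a nested if-else statement in the block form that the description refers to. The function name `classify` and the cut-off marks are assumptions made up for this illustration.

```r
# classify() and its thresholds are hypothetical.
classify <- function(score) {
  if (score >= 50) {
    # this inner if-else is only reached when the outer condition is TRUE
    if (score >= 75) {
      "distinction"
    } else {
      "pass"
    }
  } else {
    "fail"
  }
}

single_result <- classify(80)

# ifelse() applies the same nested logic to every element of a vector.
scores <- c(40, 60, 80)
vector_result <- ifelse(scores >= 50,
                        ifelse(scores >= 75, "distinction", "pass"),
                        "fail")

print(single_result)
print(vector_result)
```

The block form handles one value at a time, while `ifelse()` is the usual choice when a whole column has to be classified.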
1.7 R scripts for looping: There are 3 types of loops in R programming:

a. For loop: Repeats a statement or group of statements a certain number of
times.
Code:
#For loop
for(value in seq (10,20,2))
{
print(value)
}
Output:

b. While loop: Tests a condition and repeats a statement or group of statements for as long as the condition is TRUE.
Code:
#while loop
a=2
while(a<=4)
{
print(a)
a=a+1
}
Output:

c. Repeat loop: Executes a sequence of statements repeatedly until a break statement is reached:
Code:
#repeat loop
a=3
repeat
{
print(a)
a=a+1
#checking stop condition
if(a>7)
{
#break statement to terminate loop
break
}
}

Output:

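Loops are often combined with `next` (skip to the next iteration) and `break` (leave the loop), the latter already used in the repeat example. A minimal sketch with a hypothetical vector `evens`:

```r
# Collect even numbers from 1 to 10, stopping once a value exceeds 8.
evens <- c()
for (i in 1:10) {
  if (i %% 2 != 0) next   # odd number: skip the rest of this iteration
  if (i > 8) break        # stop the loop entirely
  evens <- c(evens, i)
}
print(evens)
```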
1.8 User defined Functions: In R, we can create our own functions; such
functions are known as user-defined functions.

Code:

#CODE 1
# A simple R function to check
# whether x is a multiple of 2
multipleof2 = function(x){
  if(x %% 2 == 0)
    return("It is a multiple of 2")
  else
    return("It is not a multiple of 2")
}

print(multipleof2(4))
print(multipleof2(3))

Output:

Code:

#CODE 2
# A simple R program to demonstrate
# passing arguments to a function
Rectangle = function(length=4, width=7){
  area = length * width
  return(area)
}

# Case 1: arguments matched by position
print(Rectangle(12, 13))

# Case 2: arguments matched by name
print(Rectangle(width = 80, length = 45))

# Case 3: default arguments
print(Rectangle())

Output:

1.9 R scripts for data frames: Data frames in R are generic data objects used to store tabular data. A data frame resembles a matrix, except that each column can hold a different data type. A data frame is made up of three principal components: the data, the rows, and the columns.
a. Creating a data frame:
Code:

#creating a data frame
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

print(df1)

Output:

b. Getting structure of R data frame:

Code:

#using str()
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# str() prints the structure itself, so no print() is needed
str(df1)

Output:

c. Summary of data in data frame:

Code:

#getting the summary
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

summary(df1)

Output:

d. Extract data from data frame:

Code:

#extract data
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# $ extracts a single column as a vector
print(df1$Pulse)

Output:

e. Expand data frame:

Code 1(Adding Rows):

#expanding the dataframe
#adding rows
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# the new row is passed as a one-row data frame so Pulse and Duration stay numeric
New_row_DF = rbind(df1, data.frame(Training = "Training", Pulse = 110, Duration = 110))

print(New_row_DF)

Output:

Code 2(Adding Columns):

#expanding the dataframe
#adding columns
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

New_col_DF = cbind(df1, Steps = c(3000, 6000, 4000))

print(New_col_DF)

Output:

f. Getting Dimensions of the dataframe:

Code:

#dimensions of the dataframe
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# dim() returns the number of rows followed by the number of columns
print(dim(df1))

Output:

g. Count of Rows and Columns in the dataframe:

Code:

#count of rows and columns in dataframe
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

print(nrow(df1))
print(ncol(df1))

Output:

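Beyond extracting one column with `$`, rows and columns can be selected with square brackets. The sketch below reuses the `df1` data from this section; the object names `high_pulse` and `two_cols` and the derived column `Intensity` are assumptions added only for illustration.

```r
df1 <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# rows where Pulse is above 110, all columns (note the trailing comma)
high_pulse <- df1[df1$Pulse > 110, ]

# all rows, two named columns
two_cols <- df1[, c("Training", "Duration")]

# a new column computed from existing ones (hypothetical measure)
df1$Intensity <- df1$Pulse / df1$Duration

print(high_pulse)
print(two_cols)
print(df1)
```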
Module 2- Learn how to organize and modify data in R using data
frames and dplyr
Program 2: Data manipulation functions present in dplyr are:

2.1 Filter: It produces a subset of data frame.

Code:

#Filter

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

print(d)

d%>%filter(is.na(ht))

d%>%filter(!is.na(ht))

Output:

2.2 Distinct: Removes duplicate rows in a data frame.

Code:

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

print(distinct(d))

Output:

2.3 Arrange: It reorders rows of data frame.

Code:
library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

d.name<- arrange(d, school)

print(d.name)

Output:

2.4 Select: The select function extracts the required columns as a table; the columns are specified by name, by position, or with helper functions such as starts_with(), contains() and matches().

Code:

#Select

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

select(d, starts_with("ht"))

select(d, -starts_with("ht"))

select(d, 1:2)

select(d, contains("n"))

select(d, matches("na"))

Output:

2.5 Rename: It renames variables (columns).

Code:

#Rename

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

rename(d, height=ht)

Output:

2.6-2.7 Mutate and Transmute: mutate() creates new variables while keeping the existing ones; transmute() creates new variables and drops the existing ones.

Code:

#Mutate & Transmute

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

mutate(d, x3=ht-age)

transmute(d, x3=ht+age)

Output:

2.7 Summarize: Give summarized data like sum, average etc.

Code:

#Summarize

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

summarise(d, mean=mean(age))

summarise(d, min=min(age))

summarise(d, max=max(age))

summarise(d, sd=sd(age))

Output:

2.8 Sample: Returns a random sample of rows from the data frame, either a fixed number of rows (sample_n) or a fraction of the rows (sample_frac).

Code:

#Getting part of the dataframe

library(dplyr)

d=data.frame(name=c("Abhinav", "Bharay",

"Cameron", "Devon"),

age=c(17, 15, 19, 16),

ht=c(46, NA, NA, 69),

school=c("yes", "yes", "no", "no"))

sample_n(d,3)

sample_frac(d,0.50)

Output:

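The dplyr functions above are shown one at a time, but they are commonly chained with the pipe `%>%`. The following sketch assumes the dplyr package is installed and reuses the data frame `d` from this module; the column name `ht_minus_age` and the object name `result` are made up for illustration.

```r
library(dplyr)

d <- data.frame(name = c("Abhinav", "Bharay", "Cameron", "Devon"),
                age = c(17, 15, 19, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"),
                stringsAsFactors = FALSE)

result <- d %>%
  filter(!is.na(ht)) %>%              # keep rows with a recorded height
  mutate(ht_minus_age = ht - age) %>% # add a derived column
  arrange(desc(age)) %>%              # oldest first
  select(name, age, ht_minus_age)     # keep only the columns of interest

print(result)
```

Each step receives the output of the previous one, so the pipeline reads top to bottom in the order the operations are applied.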
Module 3- Data Manipulation in R with tidyr package
Program 3

Step 1: Creation of data:

Code:

#creating the dataframe

library(tidyr)

n = 10

tidydf = data.frame(

S.No = c(1:n),

Group.1 = c(23, 345, 76, 212, 88,

199, 72, 35, 90, 265),

Group.2 = c(117, 89, 66, 334, 90,

101, 178, 233, 45, 200),

Group.3 = c(29, 101, 239, 289, 176,

320, 89, 109, 199, 56))

print(tidydf)

Output:

Data manipulation functions present in tidyr are:

3.1 Gather: Collapses several columns into key-value pairs, turning wide data into long data.

Code:

#Gather

library(tidyr)

n = 10

tidydf = data.frame(

S.No = c(1:n),

Group.1 = c(23, 345, 76, 212, 88,

199, 72, 35, 90, 265),

Group.2 = c(117, 89, 66, 334, 90,

101, 178, 233, 45, 200),

Group.3 = c(29, 101, 239, 289, 176,

320, 89, 109, 199, 56))

tall=tidydf %>%

gather(Group, Frequency,

Group.1:Group.3)

print(tall)

Output:

3.2 Separate: Splits one column into several columns.

Code:

#Separate

library(tidyr)

n = 10

tidydf = data.frame(

S.No = c(1:n),

Group.1 = c(23, 345, 76, 212, 88,

199, 72, 35, 90, 265),

Group.2 = c(117, 89, 66, 334, 90,

101, 178, 233, 45, 200),

Group.3 = c(29, 101, 239, 289, 176,

320, 89, 109, 199, 56))

tall=tidydf %>%

gather(Group, Frequency,

Group.1:Group.3)

sep=tall %>%

separate(Group, c("Allotment",

"Number"))

print(sep)

Output:

3.3 Unite: Pastes several columns together into one column.

Code:

#Unite
library(tidyr)

n = 10
tidydf = data.frame(
  S.No = c(1:n),
  Group.1 = c(23, 345, 76, 212, 88,
              199, 72, 35, 90, 265),
  Group.2 = c(117, 89, 66, 334, 90,
              101, 178, 233, 45, 200),
  Group.3 = c(29, 101, 239, 289, 176,
              320, 89, 109, 199, 56))

# rebuild the long and separated forms from the previous steps
tall=tidydf %>%
  gather(Group, Frequency,
         Group.1:Group.3)

sep=tall %>%
  separate(Group, c("Allotment",
                    "Number"))

uni=sep %>%
  unite(Group, Allotment,
        Number, sep = ".")

print(uni)

Output:

3.4 Spread: Spreads a key-value pair of columns across multiple columns, turning long data back into wide data.

Code:

#Spread

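library(tidyr)

n = 10
tidydf = data.frame(
  S.No = c(1:n),
  Group.1 = c(23, 345, 76, 212, 88,
              199, 72, 35, 90, 265),
  Group.2 = c(117, 89, 66, 334, 90,
              101, 178, 233, 45, 200),
  Group.3 = c(29, 101, 239, 289, 176,
              320, 89, 109, 199, 56))

# rebuild the intermediate objects from the previous steps
tall=tidydf %>%
  gather(Group, Frequency,
         Group.1:Group.3)

sep=tall %>%
  separate(Group, c("Allotment",
                    "Number"))

uni=sep %>%
  unite(Group, Allotment,
        Number, sep = ".")

sp=uni %>%
  spread(Group, Frequency)

print(sp)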

Output:

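The tidyr documentation marks `gather()` and `spread()` as superseded by `pivot_longer()` and `pivot_wider()`. As a hedged sketch, assuming a tidyr version of 1.0.0 or later is installed, the same reshaping could look like this on a shortened, three-row version of the data (the object names `small_df`, `long` and `wide` are hypothetical):

```r
library(tidyr)

# first three rows of the data used above, kept short for illustration
small_df <- data.frame(
  S.No = 1:3,
  Group.1 = c(23, 345, 76),
  Group.2 = c(117, 89, 66),
  Group.3 = c(29, 101, 239))

# wide to long: one row per S.No and group
long <- pivot_longer(small_df,
                     cols = Group.1:Group.3,
                     names_to = "Group",
                     values_to = "Frequency")

# long back to wide: one column per group
wide <- pivot_wider(long,
                    names_from = Group,
                    values_from = Frequency)

print(long)
print(wide)
```

The older functions still work, so the scripts in this module remain valid; the newer pair mainly offers clearer argument names.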
Module 4- Basics of how to create visualisations using the popular R package
ggplot2
Program 4

4.1 Summary of dataset:

Code:

#installing and loading packages

install.packages("dplyr")

library(dplyr)

#summary of the dataset

summary(iris)

Output:

4.2 R script for data layers:

Code:

#Rscript for datalayers

library(ggplot2)

library(dplyr)

ggplot(data = iris)

Output:

4.3 R Script for Aesthetic Layer:

Code:

#aesthetic layer

ggplot(data = iris, aes(x = Sepal.Width, y=Petal.Length, col=Sepal.Length))

Output:

4.4 R Script for Geometric Layer:

Code:

#geometric layer

ggplot(data = iris, aes(x = Sepal.Width, y = Petal.Length))+geom_point()

Output:

4.5 R Script for Adding Size, Colour and Shape:

Code:

#Adding color & size

ggplot(data = iris,

aes(x = Sepal.Width, y = Petal.Length, col = Species)) + geom_point(size = 2)

Output:

Code:

# Adding colour and shape

ggplot(data = iris,

aes(x = Sepal.Width, y = Petal.Length, col = factor(Species),

shape = factor(Species))) +

geom_point()

Output:

4.6 R script for Histogram Plot

Code:

# Histogram

ggplot(data = iris, aes(x = Sepal.Length)) +

geom_histogram(binwidth = 0.5, fill = "red", color = "green", alpha = 0.7) +

labs(title = "Histogram of Sepal Length in Iris Dataset",

x = "Sepal Length",

y = "Frequency")

Output:

4.7 R script for Facet Layer

geom_point() +

stat_smooth(method = lm, col = "green") +

scale_y_continuous("sepal", limits = c(2, 10),

expand = c(0, 0)) +

scale_x_continuous("petal", limits = c(0, 10),

expand = c(0, 0)) + coord_equal()

Output:

4.10 R script for coord_cartesian()

Code:

#coord_cartesian

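# a fixed colour is set in the geoms; inside aes() the string "pink" would be
# treated as a data value and shown with the default colour and a legend
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +

geom_point(colour = "pink") + geom_smooth(colour = "pink") +

coord_cartesian(xlim = c(3, 6))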

Output:

4.11 R script for Theme layer

Code:

#theme layer

ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +

geom_point() + facet_grid(. ~ Species) +

theme(plot.background = element_rect(

fill = "pink", colour = "purple"))

Output:

Module 5- Learn the basics of aggregate functions in R with dplyr, which let us
calculate quantities that describe groups of data.
Program 5

a. Display data:

Code:

library(dplyr)

employee_data = data.frame(emp_id = c(101,201,301,401,501,601,701), name =

c("Chhavi", "Vilay", "Yuvraj", "Udit", "Yukta", "Nivi", "Kshama"), department =
c("Finance","HR","Marketing","HR","Sales",

"Marketing","HR"),

salary = c(34000,23000,41000,20000,35000,67000,87000))

print("Original Data frame")

print(employee_data)

Output:

5.1 R script to create a data frame with 4 columns, group it by department, and get aggregates such as
sum, maximum, and minimum.

Code:

#R script with aggregates

library(dplyr)

employee_data = data.frame(emp_id = c(101,201,301,401,501,601,701), name =

c("Chhavi", "Gaurav", "geet", "deepika", "aastha", "Nivi", "Kshama"), department
= c("Finance","HR","Marketing","HR","Sales",

"Marketing","HR"),

salary = c(34000,23000,41000,20000,35000,67000,87000))

# total salary per department
print(aggregate(employee_data$salary, list(employee_data$department), FUN = sum))

# highest salary per department
print(aggregate(employee_data$salary, list(employee_data$department), FUN = max))

# lowest salary per department
print(aggregate(employee_data$salary, list(employee_data$department), FUN = min))

Output:

5.2 R Script to create a data frame with 4 columns, group it by department, and get the average (mean).

Code:

#R script with aggregates

library(dplyr)

employee_data = data.frame(emp_id = c(101,201,301,401,501,601,701), name =

c("Chhavi", "Vikas", "bharat", "kiran", "srishti", "garima", "ritit"), department =
c("Finance","HR","Marketing","HR","Sales",

"Marketing","HR"),

salary = c(34000,23000,41000,20000,35000,67000,87000))

print(aggregate(employee_data$salary, list(employee_data$department), FUN =

mean))

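The `aggregate()` calls above label the result columns `Group.1` and `x`. A possible alternative, sketched here with base R only, is the formula interface, which keeps the original column names. The data frame reuses the department and salary values from this module; the object names `total_by_dept` and `mean_by_dept` are hypothetical.

```r
employee_data <- data.frame(
  department = c("Finance", "HR", "Marketing", "HR", "Sales", "Marketing", "HR"),
  salary = c(34000, 23000, 41000, 20000, 35000, 67000, 87000))

# salary ~ department reads as "salary grouped by department"
total_by_dept <- aggregate(salary ~ department, data = employee_data, FUN = sum)
mean_by_dept  <- aggregate(salary ~ department, data = employee_data, FUN = mean)

print(total_by_dept)
print(mean_by_dept)
```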
# Import the data using read.csv()

myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactor=F)

#printing first 6 rows

print(head(myData))

Output:

3. Calculating mean:

Code:

# R program to import data into R

# Import the data using read.csv()

myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactors = FALSE)

#calculating mean
# stored as mean_age so the built-in mean() function is not masked
mean_age = mean(myData$Age)

print(mean_age)

Output:

4. Calculating median:

Code:

# R program to import data into R

# Import the data using read.csv()

myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactors = FALSE)

#calculating median
# stored as median_age so the built-in median() function is not masked
median_age = median(myData$Age)

print(median_age)

Output:

5. Calculating mode:

Code:

# R program to import data into R

# Import the data using read.csv()

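myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactors = FALSE)

#Calculate mode
mode = function(){
  # table() counts each age; the name of the largest count is the mode
  return(as.numeric(names(sort(table(myData$Age), decreasing = TRUE))[1]))
}

mode()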

Output:

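R has no built-in function for the statistical mode, which is why a helper is written above. A self-contained sketch of such a helper, using only base R and made-up ages, could look like the following; the name `get_mode` is hypothetical, and unlike the script above it returns every value that ties for the highest count.

```r
# get_mode() is a hypothetical helper, not a base R function.
get_mode <- function(x) {
  unique_values <- unique(x)
  # count how often each unique value occurs in x
  counts <- tabulate(match(x, unique_values))
  unique_values[counts == max(counts)]
}

ages <- c(23, 25, 25, 31, 23, 25)   # made-up sample
single_mode <- get_mode(ages)

tied <- c(1, 1, 2, 2, 3)            # two values share the top count
tied_modes <- get_mode(tied)

print(single_mode)
print(tied_modes)
```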
6. Printing the last 6 rows and their mode

Code:

# R program to import data into R

# Import the data using read.csv()

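myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactors = FALSE)

#printing the last 6 rows
print(tail(myData))

mode = function(){
  # the most frequent age is the first entry after sorting counts in decreasing order
  return(as.numeric(names(sort(table(myData$Age), decreasing = TRUE))[1]))
}

mode()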
Output:

7. Calculating MFV (Most frequent value):

Code:

# R program to import data into R

# Import the data using read.csv()

myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactor=F)

#Most frequent value

library(modeest)

mode_a = mfv(myData$Age)

print(mode_a)

Output:

Module 8- Calculating the Variance and Standard deviation in R

Program 8

a. Calculating variance:

Code:

#R program to get variance of a vector
# named values so the built-in list() function is not masked
values=c(212,231,234,564,235)

#Calculating variance using var()
print(var(values))

Output:

b. Calculating standard deviation:

Code:

#R program to get standard deviation of a vector
# named values so the built-in list() function is not masked
values=c(212,231,234,564,235)

#Calculating standard deviation using sd()
print(sd(values))

Output:

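`var()` returns the sample variance, that is, the sum of squared deviations from the mean divided by n - 1, and `sd()` is its square root. The sketch below recomputes both by hand for the vector used above, as a check on what the functions do; the names `manual_var` and `manual_sd` are hypothetical.

```r
values <- c(212, 231, 234, 564, 235)

n <- length(values)
deviations <- values - mean(values)

# sample variance: squared deviations summed and divided by n - 1
manual_var <- sum(deviations^2) / (n - 1)
manual_sd  <- sqrt(manual_var)

print(manual_var)
print(var(values))   # should agree with manual_var
print(manual_sd)
print(sd(values))    # should agree with manual_sd
```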
c. Calculating Variance and Standard Deviation using iris dataset

Code:

#printing the first 6 rows of iris data set

head(iris)

sepal = iris$Sepal.Length

#printing variance and sd

print(var(sepal))

print(sd(sepal))

variance_value <- var(sepal)

std_dev_value <- sd(sepal)

# Print the results

cat("Sepal Length Variance:", variance_value, "\n")

cat("Sepal Length Standard Deviation:", std_dev_value,"\n")

Output:

Module 9- Learn how to calculate three important descriptive statistics-
Quartiles, Quantiles, and Interquartile range that describe the spread of the
data
Program 9
Quartile (0.25, 0.5, 0.75)
Code:
prob= iris$Sepal.Length
res1= quantile(prob, probs=c(0,0.25,0.5,0.75,1))
res1
Output:

OR
Code:
df<-data.frame(x=c(2,13,5,36,12,50),
y=c('a','b','c','c','c','b'))
res4<-quantile(df$x, probs=c(0,0.25,0.5,0.75,1))
res4
Output:

IQR= Inter quartile Range

Code:
prob= iris$Sepal.Length
IQR(prob)
Output:

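The interquartile range is the distance between the third and first quartiles, so `IQR()` can be checked against `quantile()`. The sketch below does this for the same iris column; the 1.5 x IQR fences for flagging possible outliers are a common convention added here as an assumption, not something taken from this file.

```r
x <- iris$Sepal.Length

q <- quantile(x, probs = c(0.25, 0.75))
q1 <- unname(q[1])
q3 <- unname(q[2])

manual_iqr <- q3 - q1   # should agree with IQR(x)

# conventional fences for flagging possible outliers (assumed rule of thumb)
lower_fence <- q1 - 1.5 * manual_iqr
upper_fence <- q3 + 1.5 * manual_iqr
n_outside <- sum(x < lower_fence | x > upper_fence)

print(manual_iqr)
print(IQR(x))
print(n_outside)
```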
Showing Quartiles and Inter quartile Range using Data set- Cardio good fitness
Code:

myData = read.csv("C:\\Users\\91858\\Downloads\\CardioGoodFitness.csv",
stringsAsFactors = FALSE)

values <- myData$Age

# first quartile, median and third quartile
quantile(values, 0.25)
quantile(values, 0.5)
quantile(values, 0.75)

# interquartile range
IQR(values)

Output:

Module 10- Learn about the statistics used to run hypothesis tests and use
R to run different t-tests that compare distributions

Program 10

CODE:
library(ggplot2)
library(dplyr)
library(tidyr)
library(magrittr)
library(gridExtra)
library(e1071)
midwest
head(midwest)
tail(midwest)
summary(midwest)
skewness(midwest$area)
kurtosis(midwest$area)
x = midwest$popdensity
# one-sample, two-sided t-test of the mean of x against mu = 0 (the default)
t.test(x, y = NULL, alternative = "two.sided",
       paired = FALSE, var.equal = FALSE, conf.level = 0.95)

# the same test with mu stated explicitly
t.test(x, y = NULL, alternative = "two.sided", mu = 0,
       paired = FALSE, var.equal = FALSE, conf.level = 0.95)

OUTPUT:

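To show what a one-sample `t.test()` computes, the sketch below works out the t statistic and two-sided p-value by hand and compares them with the function's result. The sample values and the hypothesised mean `mu0` are made up for illustration and are not from the midwest data.

```r
# hypothetical sample and hypothesised mean
sample_values <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2)
mu0 <- 5

n <- length(sample_values)

# t = (sample mean - hypothesised mean) / standard error of the mean
t_manual <- (mean(sample_values) - mu0) / (sd(sample_values) / sqrt(n))

# two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

result <- t.test(sample_values, mu = mu0, alternative = "two.sided")

print(t_manual)
print(p_manual)
print(result)
```

A small p-value suggests that the sample mean differs from `mu0` by more than chance alone would usually produce.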
#Creating boxplot

Code:
#view first 6 rows of "airquality"dataset

head(airquality)

#create boxplot for the variable"ozone"

boxplot(airquality$Ozone)

#boxplot of temperature by month (base R graphics)

boxplot(Temp ~ Month,

data=airquality,

main="temperature distribution by month",

xlab = "month",

ylab = "degrees(f)",

col="steelblue",

border="black")
Output:
