Data Science Using R - Lab Manual-Complete Ver 2.0 - Nov 2024

The document is a lab manual for a Data Science course using R, authored by Dr. P. Rajasekar. It includes a list of experiments covering basic mathematical functions, vector and matrix operations, data manipulation, and data visualization in R. Each experiment outlines aims, theoretical background, and practical assignments to enhance understanding of R programming.

Lab Manual

Data Science using R

Dr.P.RAJASEKAR
Associate Professor,
School of Computing
SRMIST
List of Experiments

1. Basic Mathematical functions in R

2. Implementation of vector data objects operations

3. Implementation of matrix, array and factors in R

4. Implementation and use of data frames in R

5. Create Sample (Dummy) Data in R and perform data manipulation with R

6. Write an R program to take input from the user (name and age) and display the values. Also print the version of R installation.

7. Write an R program to create a sequence of numbers from 20 to 50 and find the mean of numbers from 20 to 60 and sum of numbers from 51 to 91.

8. Write an R program to create three vectors a, b, c with 3 integers. Combine the three vectors to become a 3×3 matrix where each column represents a vector. Print the content of the matrix.

9. Write an R program to concatenate two given matrices with the same number of columns but different rows.

10. Write an R program to create a data frame from four given vectors.

11. Write an R program to sort a given data frame by multiple columns.

12. Write an R program to count the number of NA values in a data frame column.

13. Write an R program to create a simple bar plot of four subjects’ marks.

14. Write an R program to create a simple bar plot for ozone concentration in air with the “airquality” dataset.

15. Write an R program to create a histogram of the maximum daily temperature with the “airquality” dataset.

16. Write an R program to create a boxplot for the variable “Wind” with the “airquality” dataset.
Experiment No: 1
Aim: To perform the basic mathematical operations in R programming

Theory:

In R, the fundamental unit of shareable code is the package. A package bundles
together code, data, documentation, and tests and provides an easy method to share with
others. As of May 2017 there were over 10,000 packages available on CRAN. This huge
variety of packages is one of the reasons that R is so successful: chances are that someone has
already solved a problem that you’re working on, and you can benefit from their work by
downloading their package.

Installing Packages
The most common place to get packages from is CRAN. To install packages from CRAN you
use install.packages("packagename"). For instance, if you want to install the ggplot2
package, which is a very popular visualization package you would type the following in the
console:
# install package from CRAN
install.packages("ggplot2")

Loading Packages
Once the package is downloaded to your computer you can access the functions and
resources provided by the package in two different ways:
# load the package to use in the current R session
library(packagename)
# use a particular function from a package without loading the package
packagename::functionname

Getting Help on Packages

For more direct help on packages that are installed on your computer you can use the help
and vignette functions. Here we can get help on the ggplot2 package with the following:
help(package = "ggplot2") # provides details regarding contents of a package
vignette(package = "ggplot2") # list vignettes available for a specific package
vignette("ggplot2-specs") # view specific vignette
vignette() # view all vignettes on your computer

Assignment
The first operator you’ll run into is the assignment operator. The assignment operator is used
to assign a value. For instance we can assign the value 3 to the variable x using the <-
assignment operator.
# assignment
x <- 3
Interestingly, R actually allows for five assignment operators:
# leftward assignment
x <- value
x = value
x <<- value
# rightward assignment
value -> x
value ->> x
The original assignment operator in R was <- and has continued to be the preferred among R
users. The = assignment operator was added in 2001 primarily because it is the accepted
assignment operator in many other languages and beginners to R coming from other
languages were so prone to use it.
The <<- and ->> operators are normally used only inside functions, and we will not go into their details here.

Evaluation
We can then evaluate the variable by simply typing x at the command line, which will return
the value of x. Note that prior to the value returned you’ll see ## [1] in the command line.
The [1] is the index of the first element shown on that line of output; here the printed value
is the first (and only) element of x. Note that you can add comments to your code by
preceding them with the hash (#) symbol. Any values, symbols, and text following # will not
be evaluated.
# evaluation
x
## [1] 3

Case Sensitivity
Lastly, note that R is a case-sensitive programming language, meaning all variables,
functions, and objects must be called by their exact spelling:

x <- 1
y <- 3
z <- 4
x*y*z
## [1] 12
x*Y*z
## Error in eval(expr, envir, enclos): object 'Y' not found

Basic Arithmetic
At its most basic, R can be used as a calculator. When applying basic arithmetic, the
PEMDAS order of operations applies: parentheses first, followed by exponentiation,
multiplication and division, and finally addition and subtraction.

8+9/5^2
## [1] 8.36

8 + 9 / (5 ^ 2)
## [1] 8.36
8 + (9 / 5) ^ 2
## [1] 11.24
(8 + 9) / 5 ^ 2
## [1] 0.68
By default R will display seven significant digits, but this can be changed using options().
1/7
## [1] 0.1428571
options(digits = 3)
1/7
## [1] 0.143
pi
## [1] 3.14
options(digits = 22)
pi
## [1] 3.141592653589793115998
We can also perform integer divide (%/%) and modulo (%%) functions. The integer divide
function will give the integer part of a fraction while the modulo will provide the remainder.
42 / 4 # regular division
## [1] 10.5
42 %/% 4 # integer division
## [1] 10
42 %% 4 # modulo (remainder)
## [1] 2
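
As a small illustration of how the two operators fit together, the sketch below splits a duration into minutes and seconds and then rebuilds the original value. The variable names total_seconds, minutes, and seconds are chosen for this example only and do not come from the manual.

```r
# Split a duration into whole minutes and leftover seconds
total_seconds <- 200
minutes <- total_seconds %/% 60   # integer division: 3
seconds <- total_seconds %% 60    # remainder: 20

minutes
seconds

# Quotient and remainder recombine into the original value
minutes * 60 + seconds == total_seconds
```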

Miscellaneous Mathematical Functions

There are many built-in functions to be aware of. These include but are not limited to the
following. Go ahead and run this code in your console.
x <- 10
abs(x) # absolute value
sqrt(x) # square root
exp(x) # exponential transformation
log(x) # logarithmic transformation
cos(x) # cosine and other trigonometric functions

Infinite and NaN Numbers:

When performing undefined calculations, R will produce Inf (infinity) and NaN (not a
number) outputs.
1/0 # infinity
## [1] Inf
Inf - Inf # infinity minus infinity
## [1] NaN
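
Base R also provides functions for testing whether a value is one of these special numbers. The following is a minimal sketch; the vector name vals is an assumption made for illustration.

```r
# Values produced by undefined or unbounded calculations, plus one ordinary number
vals <- c(1 / 0, -1 / 0, 0 / 0, 5)

is.finite(vals)    # TRUE only for ordinary numbers
is.infinite(vals)  # TRUE for Inf and -Inf
is.nan(vals)       # TRUE only for NaN
```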

The workspace environment will also list your user defined objects such as vectors, matrices,
data frames, lists, and functions. For example, if you type the following in your console:
x <- 2
y <- 3
You will now see x and y listed in your workspace environment. To identify or remove the
objects (i.e. vectors, data frames, user defined functions, etc.) in your current R environment:

# list all objects

ls()

# identify if an R object with a given name is present

exists("x")

# remove defined object from the environment

rm(x)

# you can remove multiple objects

rm(x, y)

# basically removes everything in the working environment -- use with caution!

rm(list = ls())

Result:

Thus, we have understood the basics of R programming.

Experiment No: 2

Aim: Implementation of vector and List data objects operations

Theory:

With R, it’s important to understand that there is a difference between the actual
R object and the manner in which that R object is printed to the console. Often, the printed
output may have additional bells and whistles to make the output more friendly to the users.
However, these bells and whistles are not inherently part of the object.
R has five basic or “atomic” classes of objects:
• character
• numeric (real numbers)
• integer
• complex
• logical (TRUE/FALSE)
The most basic type of R object is a vector. Empty vectors can be created with the
vector() function. There is really only one rule about vectors in R, which is that a vector can
only contain objects of the same class. But of course, like any good rule, there is an
exception, which is a list, which we will get to a bit later. A list is represented as a vector but
can contain objects of different classes. Indeed, that’s usually why we use them.
There is also a class for “raw” objects, but they are not commonly used directly in data
analysis.

Creating Vectors

The c() function can be used to create vectors of objects by concatenating things together.

> x <- c(0.5, 0.6) ## numeric

> x <- c(TRUE, FALSE) ## logical

> x <- c(T, F) ## logical

> x <- c("a", "b", "c") ## character

> x <- 9:29 ## integer

> x <- c(1+0i, 2+4i) ## complex

Note that in the above example, T and F are short-hand ways to specify TRUE and FALSE.
However, in general one should try to use the explicit TRUE and FALSE values when
indicating logical values. The T and F values are primarily there for when you’re feeling lazy.

You can also use the vector() function to initialize vectors.

> x <- vector("numeric", length = 10)

> x

[1] 0 0 0 0 0 0 0 0 0 0

A vector is an object that contains a set of values called its elements.

Numeric vector

x <- c(1,2,3,4,5,6)

The operator <- is equivalent to the "=" sign when used for assignment.

Character vector

State <- c("DL", "MU", "NY", "DL", "NY", "MU")

To calculate the frequency of each value in the State vector, you can use the table function.

To calculate the mean of a numeric vector, you can use the mean function.

If the vector contains an NA (not available) value, the mean function returns NA.

To calculate the mean while excluding NA values, include the na.rm = TRUE
argument in the mean function.

You can use subscripts to refer to elements of a vector.
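
The operations described above can be sketched as follows. The State vector is the one defined earlier; the numeric vector marks, with one missing value, is hypothetical and only serves to show the effect of NA.

```r
# Frequency of each value in a character vector
State <- c("DL", "MU", "NY", "DL", "NY", "MU")
state_counts <- table(State)
state_counts

# Hypothetical numeric vector with one missing value
marks <- c(10, 20, NA, 40)
mean(marks)                 # NA, because one element is missing
mean(marks, na.rm = TRUE)   # mean of the three non-missing values

# Subscripts: the second element, then the first and fourth elements
marks[2]
marks[c(1, 4)]
```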

Convert a column "x" to numeric

data$x = as.numeric(data$x)

Some useful vectors can be created quickly with R. The colon operator is used to generate integer sequences:

> 1:10

[1] 1 2 3 4 5 6 7 8 9 10

> -3:4

[1] -3 -2 -1 0 1 2 3 4

> 9:5

[1] 9 8 7 6 5

More generally, the function seq() can generate any arithmetic progression.

> seq(from=2, to=6, by=0.4)

[1] 2.0 2.4 2.8 3.2 3.6 4.0 4.4 4.8 5.2 5.6 6.0

> seq(from=-1, to=1, length=6)

[1] -1.0 -0.6 -0.2 0.2 0.6 1.0

Sometimes it’s necessary to have repeated values, for which we use rep()

> rep(5,3)

[1] 5 5 5

> rep(2:5,each=3)

[1] 2 2 2 3 3 3 4 4 4 5 5 5

> rep(-1:3, length.out=10)

[1] -1 0 1 2 3 -1 0 1 2 3
We can also use R’s vectorization to create more interesting sequences:

> 2^(0:10)

[1] 1 2 4 8 16 32 64 128 256 512 1024

> 1:3 + rep(seq(from=0,by=10,to=30), each=3)

[1] 1 2 3 11 12 13 21 22 23 31 32 33

Lists:

A list allows you to store a variety of objects.

You can use subscripts to select the specific component of the list.
> x <- list(1:3, TRUE, "Hello", list(1:2, 5))

Here x has 4 elements: a numeric vector, a logical, a string and another list.

We can select an entry of x with double square brackets:

> x[[3]]

[1] "Hello"

To get a sub-list, use single brackets:

> x[c(1,3)]

[[1]]

[1] 1 2 3

[[2]]

[1] "Hello"

Notice the difference between x[[3]] and x[3].
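
One way to see the difference is to inspect the class of each result. This is a short sketch using the same list x as above.

```r
x <- list(1:3, TRUE, "Hello", list(1:2, 5))

# Double brackets return the element itself
class(x[[3]])   # "character"

# Single brackets return a list that contains the element
class(x[3])     # "list"
length(x[3])    # 1
```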

We can also name some or all of the entries in our list, by supplying argument names to list():

> x <- list(y=1:3, TRUE, z="Hello")

> x

$y

[1] 1 2 3

[[2]]

[1] TRUE

$z

[1] "Hello"

Notice that the [[1]] has been replaced by $y, which gives us a clue as to how we can recover the entries by their name. We can still use the numeric position if we prefer:

> x$y

[1] 1 2 3

> x[[1]]
[1] 1 2 3

The function names() can be used to obtain a character vector of all the names of objects in a list.

> names(x)

[1] "y" "" "z"

Result:

Thus, we have done Implementation of vector and list data objects operations using R.
Experiment No. 3

Aim: Implementation of various operations on matrix, array and factors in R

Theory:
Matrices are much used in statistics, and so play an important role in R. To create a matrix
use the function matrix(), specifying elements by column first:

> matrix(1:12, nrow=3, ncol=4)

[,1] [,2] [,3] [,4]

[1,] 1 4 7 10

[2,] 2 5 8 11

[3,] 3 6 9 12

This is called column-major order. Of course, we need only give one of the dimensions:

> matrix(1:12, nrow=3)

unless we want vector recycling to help us:

> matrix(1:3, nrow=3, ncol=4)

[,1] [,2] [,3] [,4]

[1,] 1 1 1 1

[2,] 2 2 2 2

[3,] 3 3 3 3

Sometimes it’s useful to specify the elements by row first

> matrix(1:12, nrow=3, byrow=TRUE)

There are special functions for constructing certain matrices:

> diag(3)

[,1] [,2] [,3]

[1,] 1 0 0

[2,] 0 1 0

[3,] 0 0 1
> diag(1:3)

[,1] [,2] [,3]

[1,] 1 0 0

[2,] 0 2 0

[3,] 0 0 3

> 1:5 %o% 1:5

[,1] [,2] [,3] [,4] [,5]

[1,] 1 2 3 4 5

[2,] 2 4 6 8 10

[3,] 3 6 9 12 15

[4,] 4 8 12 16 20

[5,] 5 10 15 20 25

The last operator performs an outer product, so it creates a matrix with (i, j)-th entry $x_i y_j$.
The function outer() generalizes this to any function f on two arguments, to create a matrix
with entries $f(x_i, y_j)$. (More on functions later.)

> outer(1:3, 1:4, "+")

[,1] [,2] [,3] [,4]

[1,] 2 3 4 5

[2,] 3 4 5 6

[3,] 4 5 6 7

Matrix multiplication is performed using the operator %*%, which is quite distinct from element-wise multiplication *.

> A <- matrix(c(1:8,10), 3, 3)

> x <- c(1,2,3)

> A %*% x # matrix multiplication

[,1]
[1,] 30

[2,] 36

[3,] 45

> A*x # NOT matrix multiplication

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 4 10 16

[3,] 9 18 30

Standard functions exist for common mathematical operations on matrices

> t(A) # transpose

[,1] [,2] [,3]

[1,] 1 2 3

[2,] 4 5 6

[3,] 7 8 10

> det(A) # determinant

[1] -3

> diag(A) # diagonal

[1] 1 5 10

> solve(A) # inverse

[,1] [,2] [,3]

[1,] -0.6667 -0.6667 1

[2,] -1.3333 3.6667 -2

[3,] 1.0000 -2.0000 1
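
As a check on the inverse, the sketch below multiplies A by solve(A) and also uses the two-argument form solve(A, b) to solve a linear system. The names A_inv, b, and z are introduced here for illustration.

```r
A <- matrix(c(1:8, 10), 3, 3)
A_inv <- solve(A)

# Multiplying a matrix by its inverse gives the identity (up to rounding)
all.equal(A %*% A_inv, diag(3))

# solve(A, b) solves the linear system A %*% z = b directly
b <- c(30, 36, 45)
z <- solve(A, b)
z
```

Since A %*% c(1, 2, 3) was shown above to give 30, 36, 45, the solution z recovers 1, 2, 3 up to floating-point rounding.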

Array:

Of course, if we have a data set consisting of more than two pieces of categorical information
about each subject, then a matrix is not sufficient. The generalization of matrices to higher
dimensions is the array. Arrays are defined much like matrices, with a call to the array()
command. Here is a 2 × 3 × 3 array:

> arr = array(1:18, dim=c(2,3,3))

> arr

,,1

[,1] [,2] [,3]

[1,] 1 3 5

[2,] 2 4 6

,,2

[,1] [,2] [,3]

[1,] 7 9 11

[2,] 8 10 12

,,3

[,1] [,2] [,3]

[1,] 13 15 17

[2,] 14 16 18

Each 2-dimensional slice defined by the last co-ordinate of the array is shown as a 2 × 3 matrix.
Note that we no longer specify the number of rows and columns separately, but use a single
vector dim whose length is the number of dimensions. You can recover this vector with the
dim() function.

> dim(arr)

[1] 2 3 3

Note that a 2-dimensional array is identical to a matrix. Arrays can be subsetted and modified in exactly the same way as a matrix, only using the appropriate number of co-ordinates:

> arr[1,2,3]

[1] 15

> arr[,2,]
[,1] [,2] [,3]

[1,] 3 9 15

[2,] 4 10 16

> arr[1,1,] = c(0,-1,-2) # change some values

> arr[,,1,drop=FALSE]

,,1

[,1] [,2] [,3]

[1,] 0 3 5

[2,] 2 4 6

Factors

R has a special data structure to store categorical variables. It tells R that a variable is
nominal or ordinal by making it a factor.

Simplest form of the factor function:

Ideal form of the factor function:

The factor function has three parameters:

1. Vector Name
2. Values (Optional)
3. Value labels (Optional)
Convert a column "x" to factor

data$x = as.factor(data$x)
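
The manual's own examples of the two forms are not reproduced in this copy, so the following is only an illustrative sketch of how factor() is commonly called with a vector, its values (levels), and value labels. The vector response and its labels are hypothetical.

```r
# Hypothetical survey responses coded as numbers
response <- c(1, 3, 2, 2, 1, 3)

# Vector only: the levels are taken from the sorted unique values
f1 <- factor(response)
levels(f1)

# Vector, values and value labels given explicitly
f2 <- factor(response,
             levels = c(1, 2, 3),
             labels = c("Low", "Medium", "High"))
f2
table(f2)

# An ordinal variable can be declared with ordered = TRUE
f3 <- factor(response, levels = c(1, 2, 3),
             labels = c("Low", "Medium", "High"), ordered = TRUE)
f3[1] < f3[2]   # comparisons are meaningful for ordered factors
```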

Result:

Thus, we have done Implementation of various operations on matrix, array and factors in R.
Experiment No. 4

Aim: To implement data frames and perform various operations on them in R

Theory:

A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.

• Data frames are tabular data objects.

• A Data frame is a list of vectors of equal length.

• Data frame in R is used for storing data tables.

Characteristics of a data frame:

1. The column names should be non-empty.

2. The row names should be unique.

3. The data stored in a data frame can be of numeric, factor or character type.

Create Data Frame

# Create the data frame.

emp.data <- data.frame(
emp_id = c(1:5),
emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
salary = c(623.3,515.2,611.0,729.0,843.25),
start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
"2015-03-27")),
stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)

When we execute the above code, it produces the following result –

emp_id emp_name salary start_date
1 1 Rick 623.30 2012-01-01
2 2 Dan 515.20 2013-09-23
3 3 Michelle 611.00 2014-11-15
4 4 Ryan 729.00 2014-05-11
5 5 Gary 843.25 2015-03-27
Get the Structure of the Data Frame

The structure of the data frame can be seen by using str() function.

# Create the data frame.

emp.data <- data.frame(
emp_id = c (1:5),
emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
salary = c(623.3,515.2,611.0,729.0,843.25),

start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",

"2015-03-27")),
stringsAsFactors = FALSE
)
# Get the structure of the data frame.
str(emp.data)

When we execute the above code, it produces the following result –

'data.frame': 5 obs. of 4 variables:

$ emp_id : int 1 2 3 4 5
$ emp_name : chr "Rick" "Dan" "Michelle" "Ryan" ...
$ salary : num 623 515 611 729 843
$ start_date: Date, format: "2012-01-01" "2013-09-23" "2014-11-15" "2014-05-11" ...

Summary of Data in Data Frame

The statistical summary and nature of the data can be obtained by applying summary()
function.

# Create the data frame.

emp.data <- data.frame(
emp_id = c (1:5),
emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
salary = c(623.3,515.2,611.0,729.0,843.25),

start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",

"2015-03-27")),
stringsAsFactors = FALSE
)
# Print the summary.
print(summary(emp.data))
When we execute the above code, it produces the following result −

emp_id emp_name salary start_date
Min. :1 Length:5 Min. :515.2 Min. :2012-01-01
1st Qu.:2 Class :character 1st Qu.:611.0 1st Qu.:2013-09-23
Median :3 Mode :character Median :623.3 Median :2014-05-11
Mean :3 Mean :664.4 Mean :2014-01-14
3rd Qu.:4 3rd Qu.:729.0 3rd Qu.:2014-11-15
Max. :5 Max. :843.2 Max. :2015-03-27

Extract Data from Data Frame:

# Extract Specific columns.

result <- data.frame(emp.data$emp_name,emp.data$salary)

print(result)

When we execute the above code, it produces the following result −

emp.data.emp_name emp.data.salary
1 Rick 623.30
2 Dan 515.20
3 Michelle 611.00
4 Ryan 729.00
5 Gary 843.25

# Extract first two rows.

result <- emp.data[1:2,]

print(result)
When we execute the above code, it produces the following result −

emp_id emp_name salary start_date
1 1 Rick 623.3 2012-01-01
2 2 Dan 515.2 2013-09-23

# Extract 3rd and 5th row with 2nd and 4th column.

result <- emp.data[c(3,5),c(2,4)]

print(result)

When we execute the above code, it produces the following result −

emp_name start_date
3 Michelle 2014-11-15
5 Gary 2015-03-27

Expand Data Frame

A data frame can be expanded by adding columns and rows.

1. Add Column

Just add the column vector using a new column name.

# Add the "dept" column.

emp.data$dept <- c("IT","Operations","IT","HR","Finance")

v <- emp.data

print(v)

When we execute the above code, it produces the following result –

emp_id emp_name salary start_date dept

1 1 Rick 623.30 2012-01-01 IT
2 2 Dan 515.20 2013-09-23 Operations
3 3 Michelle 611.00 2014-11-15 IT
4 4 Ryan 729.00 2014-05-11 HR
5 5 Gary 843.25 2015-03-27 Finance

2. Add Row
To add more rows permanently to an existing data frame, we need to bring in the new rows
in the same structure as the existing data frame and use the rbind() function.

In the example below we create a data frame with new rows and merge it with the existing
data frame to create the final data frame.

# Create the second data frame

emp.newdata <- data.frame(
emp_id = c (6:8),
emp_name = c("Rasmi","Pranab","Tusar"),
salary = c(578.0,722.5,632.8),
start_date = as.Date(c("2013-05-21","2013-07-30","2014-06-17")),
dept = c("IT","Operations","Finance"),
stringsAsFactors = FALSE
)

# Bind the two data frames.

emp.finaldata <- rbind(emp.data,emp.newdata)
print(emp.finaldata)
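
The requirement that new rows share the structure of the existing data frame can be illustrated with a smaller sketch. The data frames first, extra, and mismatch below are hypothetical and are not part of the employee example.

```r
# Two small hypothetical data frames with identical column names and types
first <- data.frame(id = 1:2, dept = c("IT", "HR"), stringsAsFactors = FALSE)
extra <- data.frame(id = 3:4, dept = c("Finance", "IT"), stringsAsFactors = FALSE)

combined <- rbind(first, extra)
nrow(combined)   # rows from both data frames
str(combined)

# rbind() stops with an error when the column names differ
mismatch <- data.frame(id = 5L, department = "IT")
result <- tryCatch(rbind(first, mismatch),
                   error = function(e) "names do not match")
result
```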

Conclusion:

Thus, the Implementation and various operations on data frames are performed in R.
Experiment No. 5

Aim: To Create Sample (Dummy) Data in R and perform data manipulation with R

Theory:

This covers how to execute most frequently used data manipulation tasks with R. It includes
various examples with datasets and code. It gives you a quick look at several functions used
in R.

Drop data frame columns by name:

DF <- data.frame(x=1:10, y=10:1, z=rep(5,10), a=11:20)

# drop multiple columns by name

drops <- c("x","z")

DF[ , !(names(DF) %in% drops)]

# OR keep the columns you want

keeps <- c("y", "a")

DF[keeps]

# DF itself is unchanged unless the result is assigned back

DF

Order function for sort:

d3=data.frame(roll=c(2,4,6,3,1,5),

name=c('a','b','c','d','e','e'),

marks=c(44,55,22,33,66,77))

> d3

d3[order(d3$roll),]

d3[with(d3,order(roll)),]

Subsets:
roll=c(1:5)
names=c(letters[1:5])
marks=c(12,33,44,55,66)
d4=data.frame(roll,names,marks)
sub1=subset(d4,marks>33 & roll>4)
sub1
sub1=subset(d4,marks>33 & roll>4,select = c(roll,names))
sub1

Drop factor levels in a subsetted data frame:

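# stringsAsFactors = TRUE is needed in R 4.0 and later for letters to be a factor
df <- data.frame(letters=letters[1:5], numbers=1:5, stringsAsFactors=TRUE)

df
levels(df$letters)
sub2=subset(df,numbers>3)
sub2
levels(sub2$letters) # unused levels are still present
sub2$letters=factor(sub2$letters)
levels(sub2$letters) # only the levels that remain in the subset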

Rename Columns in R
colnames(d3)[colnames(d3)=="roll"]="ID"

Sorting a vector
x= sample(1:50)
x = sort(x, decreasing = TRUE)
The function sort() is used for sorting a one-dimensional vector. It cannot sort the rows of a
data frame or matrix; use order() for that, as in the order example earlier in this experiment.

Dealing with missing data

We assume mydata as a data frame which is already available.

Number of missing values in a variable
colSums(is.na(mydata))
Number of missing values in a row
rowSums(is.na(mydata))
List rows of data that have missing values
mydata[!complete.cases(mydata),]
Creating a new dataset without missing data
mydata1 <- na.omit(mydata)
Convert a value to missing
mydata[mydata$Q1==999,"Q1"] <- NA
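
The steps above can be tried on a small data frame. The sketch below builds a hypothetical mydata in which 999 codes "no answer"; the column names Q1 and Q2 and all values are assumptions for illustration. The recoding uses which(), because the comparison mydata$Q1 == 999 yields NA for rows where Q1 is already missing, and a logical index containing NA may not be accepted in a data frame assignment.

```r
# Hypothetical survey data; 999 is used as a code for "no answer"
mydata <- data.frame(Q1 = c(4, 999, 2, NA),
                     Q2 = c(1, 3, NA, NA))

# Recode 999 as NA; which() skips rows where Q1 is already NA
mydata$Q1[which(mydata$Q1 == 999)] <- NA

colSums(is.na(mydata))              # missing values per column
rowSums(is.na(mydata))              # missing values per row
mydata[!complete.cases(mydata), ]   # rows with at least one NA
mydata1 <- na.omit(mydata)          # keep only complete rows
mydata1
```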
Experiment No. 6

Write an R program to take input from the user (name and age) and display the values. Also
print the version of R installation.

R Programming Code :

name = readline(prompt="Input your name: ")

age = readline(prompt="Input your age: ")
print(paste("My name is",name, "and I am",age ,"years old."))
print(R.version.string)

Sample Output:

Input your name:
Input your age:
[1] "My name is and I am years old."
[1] "R version 3.4.4 (2018-03-15)"

Result:
Thus, the program is executed successfully.
Experiment No. 7

Write an R program to create a sequence of numbers from 20 to 50 and find the mean of
numbers from 20 to 60 and sum of numbers from 51 to 91.

R Programming Code :

print("Sequence of numbers from 20 to 50:")

print(seq(20,50))
print("Mean of numbers from 20 to 60:")
print(mean(20:60))
print("Sum of numbers from 51 to 91:")
print(sum(51:91))

Sample Output:

[1] "Sequence of numbers from 20 to 50:"
[1] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
[26] 45 46 47 48 49 50
[1] "Mean of numbers from 20 to 60:"
[1] 40
[1] "Sum of numbers from 51 to 91:"
[1] 2911

Result:
Thus, the program is executed successfully.
Experiment No. 8

Write an R program to create three vectors a, b, c with 3 integers. Combine the three vectors to
become a 3×3 matrix where each column represents a vector. Print the content of the matrix.

R Programming Code :

a<-c(1,2,3)
b<-c(4,5,6)
c<-c(7,8,9)
m<-cbind(a,b,c)
print("Content of the said matrix:")
print(m)

Sample Output:

[1] "Content of the said matrix:"
a b c
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

Result:
Thus, the program is executed successfully.
Experiment No. 9

Write an R program to concatenate two given matrices with the same number of columns but different rows.

R Programming Code :

x = matrix(1:12, ncol=3)
y = matrix(13:24, ncol=3)
print("Matrix-1")
print(x)
print("Matrix-2")
print(y)
result = dim(rbind(x,y))
print("After concatenating two given matrices:")
print(result)

Sample Output:

[1] "Matrix-1"
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
[1] "Matrix-2"
[,1] [,2] [,3]
[1,] 13 17 21
[2,] 14 18 22
[3,] 15 19 23
[4,] 16 20 24
[1] "After concatenating two given matrices:"
[1] 8 3
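
The program above reports only the dimensions of the combined matrix. As a possible variation, the sketch below keeps the result of rbind() itself so that the stacked rows can be printed; the name combined is introduced here for illustration.

```r
x <- matrix(1:12, ncol = 3)
y <- matrix(13:24, ncol = 3)

# rbind() stacks y below x; both must have the same number of columns
combined <- rbind(x, y)
print(combined)

dim(combined)   # 8 rows, 3 columns
```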

Result:
Thus, the program is executed successfully.
Experiment No. 10

Write an R program to create a data frame from four given vectors.

R Programming Code :

name = c('Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin',
'Jonas')
score = c(12.5, 9, 16.5, 12, 9, 20, 14.5, 13.5, 8, 19)
attempts = c(1, 3, 2, 3, 2, 3, 1, 1, 2, 1)
qualify = c('yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes')
print("Original data frame:")
print(name)
print(score)
print(attempts)
print(qualify)
df = data.frame(name, score, attempts, qualify)
print(df)

Sample Output:

[1] "Original data frame:"
[1] "Anastasia" "Dima" "Katherine" "James" "Emily" "Michael"
[7] "Matthew" "Laura" "Kevin" "Jonas"
[1] 12.5 9.0 16.5 12.0 9.0 20.0 14.5 13.5 8.0 19.0
[1] 1 3 2 3 2 3 1 1 2 1
[1] "yes" "no" "yes" "no" "no" "yes" "yes" "no" "no" "yes"
name score attempts qualify
1 Anastasia 12.5 1 yes
2 Dima 9.0 3 no
3 Katherine 16.5 2 yes
4 James 12.0 3 no
5 Emily 9.0 2 no
6 Michael 20.0 3 yes
7 Matthew 14.5 1 yes
8 Laura 13.5 1 no
9 Kevin 8.0 2 no
10 Jonas 19.0 1 yes
Result:
Thus, the program is executed successfully.
Experiment No. 11

Write an R program to sort a given data frame by multiple columns.

R Programming Code :

exam_data = data.frame(
name = c('Anastasia', 'Amsa', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew'),
score = c(12.5, 9, 16.5, 12, 9, 20, 14.5),
attempts = c(1, 3, 2, 3, 2, 3, 1),
qualify = c('yes', 'no', 'yes', 'no', 'no', 'yes', 'yes')
)
print("Original dataframe:")
print(exam_data)
print("dataframe after sorting 'name' and 'score' columns:")
exam_data = exam_data[with(exam_data, order(name, score)), ]
print(exam_data)

Sample Output:
[1] "Original dataframe:"
name score attempts qualify
1 Anastasia 12.5 1 yes
2 Amsa 9.0 3 no
3 Katherine 16.5 2 yes
4 James 12.0 3 no
5 Emily 9.0 2 no
6 Michael 20.0 3 yes
7 Matthew 14.5 1 yes
[1] "dataframe after sorting 'name' and 'score' columns:"
name score attempts qualify
2 Amsa 9.0 3 no
1 Anastasia 12.5 1 yes
5 Emily 9.0 2 no
4 James 12.0 3 no
3 Katherine 16.5 2 yes
7 Matthew 14.5 1 yes
6 Michael 20.0 3 yes
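
In the data above every name is unique, so the second sort key never comes into play. The sketch below uses a small hypothetical data frame with tied scores to show how a second column breaks ties, and how a minus sign reverses the order of a numeric key.

```r
# Hypothetical data with repeated scores
exam <- data.frame(
  name = c("Emily", "Amsa", "James", "Dima"),
  score = c(9, 9, 12, 12),
  stringsAsFactors = FALSE
)

# Highest score first; ties are broken alphabetically by name
sorted <- exam[order(-exam$score, exam$name), ]
sorted
```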
Result:
Thus, the program is executed successfully.
Experiment No. 12

Write an R program to count the number of NA values in a data frame column.

R Programming Code :

exam_data = data.frame(
name = c('Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin',
'Jonas'),
score = c(12.5, 9, 16.5, 12, 9, 20, 14.5, 13.5, 8, 19),
attempts = c(1, NA, 2, NA, 2, NA, 1, NA, 2, 1),
qualify = c('yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes')
)
print("Original dataframe:")
print(exam_data)
print("The number of NA values in attempts column:")
print(sum(is.na(exam_data$attempts)))

Sample Output:
[1] "Original dataframe:"
name score attempts qualify
1 Anastasia 12.5 1 yes
2 Dima 9.0 NA no
3 Katherine 16.5 2 yes
4 James 12.0 NA no
5 Emily 9.0 2 no
6 Michael 20.0 NA yes
7 Matthew 14.5 1 yes
8 Laura 13.5 NA no
9 Kevin 8.0 2 no
10 Jonas 19.0 1 yes
[1] "The number of NA values in attempts column:"
[1] 4

Result:
Thus, the program is executed successfully.
Experiment No. 13

Bar Plot
There are two types of bar plots, horizontal and vertical, which represent data points as horizontal
or vertical bars whose lengths are proportional to the value of the data item. They are generally
used to plot a continuous variable against a categorical variable. By setting the horiz parameter to
TRUE or FALSE, we get horizontal or vertical bar plots respectively.

13. Write an R program to create a simple bar plot of four subjects’ marks.
marks = c(70, 95, 80, 74)
barplot(marks, main = "Comparing marks of 4 subjects",
xlab = "Subject", ylab = "Marks",
names.arg = c("English", "Science", "Math.", "Hist."),
col = "darkred",
horiz = FALSE)
Output:
> marks = c(70, 95, 80, 74)
> barplot(marks, main = "Comparing marks of 4 subjects",
+ xlab = "Subject",
+ ylab = "Marks",
+ names.arg = c("English", "Science", "Math.", "Hist."),
+ col = "darkred",
+ horiz = FALSE)

Result:

Thus, the bar chart is created successfully.

Experiment No. 14

14. Write an R program to create a simple bar plot for ozone concentration in air with the “airquality”
dataset.
# Horizontal Bar Plot for
# Ozone concentration in air
barplot(airquality$Ozone,
main = 'Ozone Concentration in air',
xlab = 'ozone levels', horiz = TRUE)
# Vertical Bar Plot for
# Ozone concentration in air
barplot(airquality$Ozone, main = 'Ozone Concentration in air',
ylab = 'ozone levels', col = 'blue', horiz = FALSE)

Result:

Thus, the bar chart is created successfully.

Experiment No. 15

A histogram is like a bar chart in that it uses bars of varying height to represent the distribution of data.
However, in a histogram the values are grouped into consecutive intervals called bins. Continuous
values are grouped and displayed in these bins, whose size can be varied.
For a histogram, the parameter xlim can be used to specify the interval within which all values
are to be displayed.
Another parameter, freq, shows the frequency (count) of the values in each bin on the y-axis when
set to TRUE. When set to FALSE, probability densities are shown on the y-axis, such
that the total area of the histogram adds up to one.
Histograms are used in the following scenarios:
• To check whether the distribution of the data is even and symmetric.
• To identify deviations from expected values.
15. Write an R program to create a histogram of the maximum daily temperature with the “airquality”
dataset.

hist(airquality$Temp, main = "La Guardia Airport's\nMaximum Temperature (Daily)",
xlab = "Temperature (Fahrenheit)",
xlim = c(50, 125), col = "yellow",
freq = TRUE)
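
To see the numbers behind the plot, hist() can be called with plot = FALSE, which returns the bins without drawing anything. This is a minimal sketch; the object name h is an assumption made for illustration.

```r
# Compute the histogram without drawing it
h <- hist(airquality$Temp, plot = FALSE)

h$breaks    # bin boundaries
h$counts    # number of observations in each bin (shown when freq = TRUE)
h$density   # densities (shown when freq = FALSE)

# The bar areas (density times bin width) add up to one
sum(h$density * diff(h$breaks))
```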

Result:

Thus, the histogram is created successfully.

Experiment No. 16

Box Plot
The statistical summary of the given data is presented graphically using a boxplot. A boxplot
depicts information like the minimum and maximum data point, the median value, first and third
quartile, and interquartile range.

Box Plots are used for:

• To give a comprehensive statistical description of the data through a visual cue.
• To identify outlier points, which lie far outside the inter-quartile range of the data (by default, more than 1.5 times that range beyond the box).

16. Write an R program to create a boxplot for the variable “Wind” with the “airquality” dataset.

# Box plot for average wind speed

data(airquality)

boxplot(airquality$Wind, main = "Average wind speed\nat La Guardia Airport",
xlab = "Miles per hour", ylab = "Wind",
col = "orange", border = "brown",
horizontal = TRUE, notch = TRUE)
# Multiple Box plots, each representing
# an Air Quality Parameter
boxplot(airquality[, 1:4],
main = 'Box Plots for Air Quality Parameters')
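
The values that a box plot draws can also be inspected numerically with boxplot.stats(). The sketch below is an illustration only; the names stats, q, fence_low, and fence_high are introduced here and are not part of the manual.

```r
# Summary values and outliers used to draw the box plot
stats <- boxplot.stats(airquality$Wind)

stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out     # points drawn individually as outliers

# Outliers lie more than 1.5 times the box height beyond the box
q <- stats$stats
fence_low  <- q[2] - 1.5 * (q[4] - q[2])
fence_high <- q[4] + 1.5 * (q[4] - q[2])
all(stats$out < fence_low | stats$out > fence_high)
```

The hinges are close to the first and third quartiles, so the box height approximates the inter-quartile range.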

Result:

Thus, the boxplot is created successfully.

No ratings yet
Module 1-1
38 pages
2 Undefined
No ratings yet
2 Undefined
86 pages
Introduction To R
No ratings yet
Introduction To R
34 pages
R Project
0% (1)
R Project
25 pages
Computing With R
No ratings yet
Computing With R
20 pages
All v2 Basic Statistics Using R
No ratings yet
All v2 Basic Statistics Using R
241 pages
WINSEM2021-22 MAT2001 ELA VL2021220501462 Reference Material I 04-01-2022 1. Introduction of R Language - I
No ratings yet
WINSEM2021-22 MAT2001 ELA VL2021220501462 Reference Material I 04-01-2022 1. Introduction of R Language - I
15 pages
1. About R Language
No ratings yet
1. About R Language
15 pages
SEE_R_Practical_Dhara
No ratings yet
SEE_R_Practical_Dhara
57 pages
Unit 2 Notes - Data Analysis Using r
No ratings yet
Unit 2 Notes - Data Analysis Using r
19 pages
S24_STATS10_LAB1-1
No ratings yet
S24_STATS10_LAB1-1
8 pages
Intro To Data Science Lecture 3
No ratings yet
Intro To Data Science Lecture 3
18 pages
RStudio Exercices
No ratings yet
RStudio Exercices
8 pages
Notes For R Tool
No ratings yet
Notes For R Tool
74 pages
It Workshop Lab File
No ratings yet
It Workshop Lab File
39 pages
R Programming Slides
No ratings yet
R Programming Slides
73 pages
Untitled
No ratings yet
Untitled
59 pages
KD Lab - 1 Introductions To R
No ratings yet
KD Lab - 1 Introductions To R
12 pages
Basic-coding-syntax-and-structure-in-R---version-2
No ratings yet
Basic-coding-syntax-and-structure-in-R---version-2
19 pages
Getting Started in R
No ratings yet
Getting Started in R
39 pages
R Programming
No ratings yet
R Programming
59 pages
In R programming pdf
No ratings yet
In R programming pdf
72 pages
r Studio Manual
No ratings yet
r Studio Manual
61 pages
Basics PDF
No ratings yet
Basics PDF
21 pages
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
From Everand
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
Ginno
No ratings yet
Python for Data Science: Data Science Mastery by Nikhil Khan, #1
From Everand
Python for Data Science: Data Science Mastery by Nikhil Khan, #1
Nikhil Khan
No ratings yet
PRMO - Geometry - Centroid
No ratings yet
PRMO - Geometry - Centroid
45 pages
12th_Probability
No ratings yet
12th_Probability
18 pages
Chapter 12 - Introduction To Three Dimensional Geometry Revision Notes
No ratings yet
Chapter 12 - Introduction To Three Dimensional Geometry Revision Notes
7 pages
Problem 10.26:: E E E, Note
No ratings yet
Problem 10.26:: E E E, Note
8 pages
9709_w24_ms_43
No ratings yet
9709_w24_ms_43
19 pages
WWW - Madeeasy.in Admin UploadDocument ExamSol ME GATE201416Mor
No ratings yet
WWW - Madeeasy.in Admin UploadDocument ExamSol ME GATE201416Mor
19 pages
Expt 1_Curve Fitting
No ratings yet
Expt 1_Curve Fitting
29 pages
Introduction To Management I 1006
No ratings yet
Introduction To Management I 1006
23 pages
Basic Calculus
100% (3)
Basic Calculus
292 pages
Month 4 Lessons 18 - 23 D
No ratings yet
Month 4 Lessons 18 - 23 D
48 pages
Tutorial 1 Discrete Probability Distribution: STA408: Statistics For Science and Engineering
No ratings yet
Tutorial 1 Discrete Probability Distribution: STA408: Statistics For Science and Engineering
6 pages
The Nyquist Plot A Frequency Response Analysis Technique
No ratings yet
The Nyquist Plot A Frequency Response Analysis Technique
33 pages
Garner March 23 Lesson Plan
No ratings yet
Garner March 23 Lesson Plan
4 pages
Matrices and Determinants Scribed
No ratings yet
Matrices and Determinants Scribed
2 pages
Quest - Potential Energy and Energy Conservation
No ratings yet
Quest - Potential Energy and Energy Conservation
9 pages
Orientation of Runway: The Runway Is Usually Oriented in The Direction of The Prevailing Winds
No ratings yet
Orientation of Runway: The Runway Is Usually Oriented in The Direction of The Prevailing Winds
20 pages
Limits and Continuity Worksheet(MNS)
No ratings yet
Limits and Continuity Worksheet(MNS)
7 pages
Class Test
100% (1)
Class Test
15 pages
MBA - V20PBBA02 - EA2252001010148 - G V N Selvavindhan Vaither
No ratings yet
MBA - V20PBBA02 - EA2252001010148 - G V N Selvavindhan Vaither
4 pages
DLL Mathematics 5 q1 w3
0% (1)
DLL Mathematics 5 q1 w3
9 pages
Pelton Wheel Experiment
No ratings yet
Pelton Wheel Experiment
7 pages
Models For Insulation Aging Under Electrical and Thermal Multistress
No ratings yet
Models For Insulation Aging Under Electrical and Thermal Multistress
12 pages
Delhi Public School Bangalore North ACADEMIC SESSION 2022-2023 Class - Viii
No ratings yet
Delhi Public School Bangalore North ACADEMIC SESSION 2022-2023 Class - Viii
28 pages
Adnan Moon-2010-The Apollonian Circles and Isodynamic Points-14p
No ratings yet
Adnan Moon-2010-The Apollonian Circles and Isodynamic Points-14p
14 pages
CHE456 Syllabus Fall2013
No ratings yet
CHE456 Syllabus Fall2013
6 pages
Prestressing Manual - Stresing
No ratings yet
Prestressing Manual - Stresing
3 pages
Quantitative Strategic Planning Matrix
50% (2)
Quantitative Strategic Planning Matrix
13 pages
HANDOUT
No ratings yet
HANDOUT
13 pages