0% found this document useful (0 votes)

18 views40 pages

Cs3361 Datascience Lab Record

manual

Uploaded by

814723104177

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

18 views40 pages

Cs3361 Datascience Lab Record

manual

Uploaded by

814723104177

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 40

SRM TRP ENGINEERING COLLEGE

IRUNGALUR, TIRUCHIRAPALLI– 621 105

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CS3361- DATA SCIENCE LABORATORY

Name

Roll No.

Reg. No.
SRM TRP ENGINEERING COLLEGE
IRUNGALUR, TIRUCHIRAPALLI-621 105.

RECORD NOTE BOOK

This is to certify that this Practical work titled CS3361 DATA SCIENCE LABORATORY record of
work done by Mr./Ms. of

III Semester, Department of Computer Science and Engineering during the academic year
2024-2025.

Faculty in-charge Head of the Department

Submitted for the University Practical Examination on …………………..

Internal Examiner External Examiner

TABLE OF CONTENTS

Ex. Page
Date Exercise No Marks Sign
No.
Download, Install and Explore The Features Of Numpy,
1 Scipy, Jupyter, Statsmodels and Pandas Packages

2 Working With Numpy Arrays

3 Working with pandas data frames

Reading data from text files, Excel and the web and
4 exploring various commands for doing descriptive
analytics on the Iris data set.

5 Use the diabetes data set from UCI and Pima Indians
Diabetes data set for performing the following:

5(a) Univariate analysis: Frequency, Mean, Median, Mode,

Variance, Standard Deviation, Skewness and Kurtosis.

5(b) Bivariate analysis: Linear and logistic regression

modeling

5(c) Multiple Regression analysis

Apply and explore various plotting functions on UCI
6 data sets.

6(a) Normal curves

6(b) Density and contour plots

6(c) Correlation and scatter plots

6(d) Histograms

6(e) Three dimensional plotting

7 Visualizing Geographic Data with Basemap

SRM TRP Engineering College, Trichy
Department of Computer Science and Engineering
Vision of the Institute

To carve the youth as dynamic, competent, valued and knowledgeable Technocrats through
research, innovation and entrepreneurial development for accomplishing the global expectations.

Mission of the Institute

M1: To inculcate academic excellence in engineering education to create talented professionals

M2: To promote research in basic sciences and applied engineering among faculty and students to
fulfill the societal expectations.

M3: To enhance the holistic development of students through meaningful interaction with industry
and academia.

M4: To foster the students on par with sustainable development goals thereby contributing to the
process of nation building

M5: To nurture and retain conducive lifelong learning environment towards professional
excellence.

Vision of the Department

To be recognized as Centre of Excellence for innovation and research in computer science and
engineering through the futuristic technologies by developing technocrats with ethical values to
serve the society at global level.

Mission of the Department

M1: To develop quality and technically competent computer professionals

through excellence in academics.
M2: To encouraging the faculty and students towards research and
development with advanced tools and technologies.

M3: To enhance industry institute interaction to build a strong technical

expertise among the students.
M4: To inculcate leadership skills with ethical behaviors and social
consciousness within the students.
M5: To nurture professional empowerment among students through continuous Learning.

Program Educational Objectives (PEO's)

The graduate of Computer Science and Engineering will have

PEO1: Ability to analyze and get solutions in the field of Computer Science and Engineering
through application of fundamental knowledge of Mathematics, Science and Electronics
(Preparation).
PEO2: Innovative ideas, methods and techniques thereby rendering expertise to the industrial
and societal needs in an effective manner and will be a competent computer/software
engineer (Core Competency).

PEO3: Good and broad knowledge with interpersonal skills so as to comprehend, analyze, design
and create novel products and solutions for real-time applications (Breadth).

PEO4: Professional with ethical values to develop leadership, effective communication skills and
teamwork to excel in career. (Professionalism)

PEO5: Strive to learn continuously and update their knowledge in the specific fields of computer
science & engineering for the societal growth. (Learning environment).

Program Outcomes (PO'S)

PO1: Engineering knowledge: Apply the basic knowledge of science, mathematics and
engineering fundamentals in the field of Computer Science and Engineering to solve complex
engineering problems.

PO2: Problem analysis: Ability to use basic principles of mathematics, natural sciences, and
engineering sciences to Identify, formulate, review research literature and analyze Computer
Science and engineering problems.
PO3: Design/development of solutions: Ability to design solutions for complex Computer
Science and engineering problems and basic design system to meet the desired needs within
realistic constraints such as manufacturability, durability, reliability, sustainability and economy
with appropriate consideration for the public health, safety, cultural, societal, and environmental
considerations.
PO4: Conduct investigations of complex problems: Ability to execute the experimental
activities using research-based knowledge and methods including analyze, interpret the data and
results with valid conclusion.

PO5: Modern tool usage: Ability to use state of the art of techniques, skills and modern
engineering tools necessary for engineering practice to satisfy the needs of the society with an
understanding of the limitations.

PO6: The Engineer and Society: Ability to apply reasoning informed by the contextual
knowledge to assess the impact of Computer Science and engineering solutions in legal, health,
cultural, safety and societal context and the consequent responsibilities relevant to the professional
engineering practice.

PO7: Environment and sustainability: Ability to understand the professional responsibility and
accountability to demonstrate the need for sustainable development globally in Computer
Science domain with consideration of environmental effect.

PO8: Ethics: Ability to understand and apply ethical principles and commitment to address the
professional ethical responsibilities of an engineer.

PO9: Individual and team work: Ability to function efficiently as an individual or as a group
member or leader in a team in multidisciplinary environment.

PO10: Communication: Ability to communicate, comprehend and present effectively with

engineering community and the society at large on complex engineering activities by receiving
clear instructions for preparing effective reports, design documentation and presentations.

PO11: Project management and finance: Ability to acquire and demonstrate the knowledge of
contemporary issues related to finance and managerial skills in one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.

PO12: Life-long learning: Ability to recognize and adapt to the emerging field of application in
engineering and technology by developing self-confidence for lifelong learning process.

Program Specific Outcome (PSO's)

The graduates of Bachelor of Engineering in Computer Science and Engineering Programme will
be able to:
PSO1:Use Data structures, Data management, Networking, System software, Data science with
high end programming skills to design and implement automation in various domains of emerging
technologies.
PSO2: Apply engineering knowledge in project development with the end products and services
in the field of hardware and software platform to accomplish the industry expectations.
SRM TRP ENGINEERING COLLEGE
IRUNGALUR, TRICHY

COURSE OUTCOMES: At the end of this course, the students will be able to:
CO1: Make use of the python libraries for data science
CO2: Make use of the basic Statistical and Probability measures for data science.
CO3: Perform descriptive analytics on the benchmark data sets.
CO4: Perform correlation and regression analytics on standard data sets
CO5: Present and interpret data using visualization packages in Python.

CO’s-PO’s &PSO’s MAPPING

Ex. No:1 DOWNLOAD, INSTALL AND EXPLORE THE FEATURES
Date: OF NUMPY, SCIPY, JUPYTER, STATSMODELS AND
PANDAS PACKAGES

INTRODUCTION – SOFTWARE INSTALLATION

Software required

Spyder IDE.

What is Spyder?

Spyder is an open- source cross- platform IDE.

Written completely in Python
Also called as Scientific Python Development IDE.

Features of Spyder

Syntax highlight
Availability of breakponts
Run configuration
Automatic colon insertion after if, while, etc..
Support all ipython commands.
Inline display for graphics produced using Matplotlib.
Also provides features such as help, file, explorer, find files and so on.

SPYDER IDE Installation

Comes as a default implementation along with the Anaconda python distribution.

Step 1:
Go to Anaconda website https://www.anaconda.com.

Step 2: Click get started and click on download

option.

Step 3: Choose the version that is suitable for your OS and click on
download.

Step 4: Complete the Setup and Click on

Finish.

Step 5:
Launch Sypder from the Anaconda Navigator.
2) Working With Numpy Arrays

import numpy as np
a=np.array([[1,2,3],[4,5,6]])
b=np.array([[10,11,12],[13,14,15]])
c= a + b
print(c)
[[11 13 15]
[17 19 21]]

import numpy as np
a= np.array([[1,2,3],[4,5,6]])
b= 3 * a
print(b)
[[ 3 6 9]
[12 15 18]]

import numpy as np
i=np.eye(4)
print(i)

[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]

import numpy as np
a=np.array([[1,2,3],[4,5,6], [7,8,9]])
b=np.array([[2,3,4],[5,6,7],[8,9,10]])
c= a@b
print(c)

[[ 36 42 48]
[ 81 96 111]
[126 150 174]]

import numpy as np
a =np.array([[1,2,3],[4,5,6], [7,8,9]])
b = a.T
print(b)
[[1 4 7]
[2 5 8]
[3 6 9]]

import numpy as np
a =np.array([[2.5, 3.8, 1.5],[4.7, 2.9, 1.56]])
b = a.astype('int')
print(b)
[[2 3 1]
[4 2 1]]
import numpy as np
a1 =np.array([[1,2,3],[4,5,6]])
a2 = np.array([[7,8,9],[10,11,12]])
c = np.hstack((a1, a2))
print(c)

[[ 1 2 3 7 8 9]

[ 4 5 6 10 11 12]]

import numpy as np
a = np.array([[1,2],[3,4], [5,6]])
b = np.array([[7,8],[9,10], [10,11]])
c = np.vstack((a, b))
print(c)

[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]
[10 11]]

import numpy as np
list = [x for x in range(0, 101, 2)]
a=np.array(list)
print(a)
[ 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70
72 74 76 78 80 82 84 86 88 90 92 94 96 98 100]

import numpy as np
a =np.full((2, 3), 5)
print(a)

[[5 5 5]
[5 5 5]]

import numpy as np
a = np.array ([[1, 4, 2], [3, 4, 6],[0, -1, 5]])
print (np.sort(a, axis = None))
print (np.sort(a, axis = 1))
print (np.sort(a, axis = 0))

[-1 0 1 2 3 4 4 5 6]
[[ 1 2 4]
[ 3 4 6]
[-1 0 5]]
[[ 0 -1 2]
[ 1 4 5]
[ 3 4 6]]
3) Working With Numpy Arrays

import pandas as pd
df =pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []

import pandas as pd
data =[1,2,3,4,5]
df =pd.DataFrame(data)
print (df)

0
0 1
1 2
2 3
3 4
4 5

import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print (df)
Name Age
0 Alex 10
1 Bob 12
2 Clarke 13

import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve','Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print (df)
Name Age
0 Tom 28
1 Jack 34
2 Steve 29
3 Ricky 42

import pandas as pd
d = { 'one' :pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' :pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print (df)
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
import pandas as pd
data ={ 'Name':['Tom', 'Jack', 'Steve','Ricky'],'Age':[28,34,29,42]}
df =pd.DataFrame(data)
print (df)
df_sorted = df.sort_values(by='Name')
print ("Sorted data frame…")
print(df_sorted)

Name Age
0 Tom 28
1 Jack 34
2 Steve 29
3 Ricky 42
Sorted data frame…
Name Age
1 Jack 34
3 Ricky 42
2 Steve 29
0 Tom 28

import pandas as pd
d = { 'one' :pd.Series([1, 2, 3],
index=['a', 'b', 'c']),
'two' :pd.Series([1, 2, 3, 4],
index=['a', 'b', 'c', 'd'])}
df =pd.DataFrame(d)
print(df [ 'one'])
a 1.0
b 2.0
c 3.0
d NaN
Name: one, dtype: float64

import pandas as pd
d = { 'one' :pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' :pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print(df)
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
import pandas as pd
d = { 'one' :pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' :pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' :pd.Series([10,20,30], index=['a','b','c'])}
df = pd.DataFrame(d)
print ("Deleting the first column using DEL function:")
del df['one']
print(df)

Deleting the first column using DEL function:

two three
a 1 10.0
b 2 20.0
c 3 30.0
d 4 NaN

import pandas as pd
d = { 'one' :pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' :pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print
df.loc['b']
one 2.0
two 2.0
Name: b, dtype: float64

import pandas as pd
df =pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 =pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df =df.append(df2)
print (df)

a b
0 1 2
1 3 4
0 5 6
1 7 8

import pandas as pd
df =pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 =pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df =df.append(df2) # Drop rows with label 0 df =
df.drop(0)
print (df)
a b
0 1 2
1 3 4
0 5 6
1 7 8
4) Reading data from text files, Excel and the web and
exploring various commands for doing descriptive
analytics on the Iris data set.

import pandas as pd
df=pd.read_csv("iris_csv.csv")
df.head()
df.shape
df.info()
df.describe()
df.isnull().sum()
df.value_counts("class")
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
# Column Non-Null Count Dtype

0 sepallength 150 non-null float64

1 sepalwidth 150 non-null float64
2 petallength 150 non-null float64
3 petalwidth 150 non-null float64
4 class 150 non-null object

dtypes: float64(4), object(1)

memory usage: 6.0+ KB
class
Iris-setosa 50
Iris-versicolor 50
Iris-virginica 50
dtype: int64
5a) Univariate analysis: Frequency, Mean, Median, Mode,
Variance, Standard Deviation, Skewness and Kurtosis.

import pandas as pd
import numpy as np
import statistics as st
df=pd.read_csv("diabetes_csv.csv")
print(df.shape)
print(df.info())
print('MEAN:\n',df.mean())
print('MEDIAN:\n',df.median())
print('MODE:\n',df.mode())
print('STANDARD DEVIATION\n',df.std())
print('VARIANCE:\n:',df.var())
print('SKEWNESS:\n:',df.skew())
print('KURTOSIS|n',df.kurtosis())
df.describe

Output:
(768, 9)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
# Column Non-Null Count Dtype

0 preg 768 non-null int64

1 plas 768 non-null int64
2 pres 768 non-null int64
3 skin 768 non-null int64
4 insu 768 non-null int64
5 mass 768 non-null float64
6 pedi 768 non-null float64
7 age 768 non-null int64
8 class 768 non-null object
dtypes: float64(2), int64(6), object(1)
memory usage: 54.1+ KB
None
MEAN:
preg 3.845052
plas 120.894531
pres 69.105469
skin 20.536458
insu 79.799479
mass 31.992578
pedi 0.471876
age 33.240885
dtype: float64
MEDIAN:
preg 3.0000
plas 117.0000
pres 72.0000
skin 23.0000
insu 30.5000
mass 32.0000
pedi 0.3725
age 29.0000
dtype: float64
MODE:
preg plas pres skin insu mass pedi age class
0 1.0 99 70.0 0.0 0.0 32.0 0.254 22.0 tested_negative
1 NaN 100 NaN NaN NaN NaN 0.258 NaN NaN
STANDARD DEVIATION
preg 3.369578

plas 31.972618
pres 19.355807
skin 15.952218
insu 115.244002
mass 7.884160
pedi 0.331329
age 11.760232
dtype: float64

VARIANCE:
: preg 11.354056
plas 1022.248314
pres 374.647271
skin 254.473245
insu 13281.180078
mass 62.159984
pedi 0.109779
age 138.303046
dtype: float64
SKEWNESS:
: preg 0.901674
plas 0.173754
pres -1.843608
skin 0.109372
insu 2.272251
mass -0.428982
pedi 1.919911
age 1.129597
dtype: float64
KURTOSIS|n preg 0.159220
plas 0.640780
pres 5.180157
skin -0.520072
insu 7.214260
mass 3.290443
pedi 5.594954
age 0.643159
dtype: float64
pedi age class
0 6 148 72 35 0 33.6 0.62 50 tested_positiv
7 e
1 1 85 66 29 0 26.6 0.35 31 tested_negativ
1 e
2 8 183 64 0 0 23.3 0.67 32 tested_positiv
2 e
3 1 89 66 23 94 28.1 0.16 21 tested_negativ
7 e
4 0 137 40 35 168 43.1 2.28 33 tested_positiv
8 e
.. ... ... ... ... ... ... ... ... ...
763 10 101 76 48 180 32.9 0.17 63 tested_negativ
1 e
764 2 122 70 27 0 36.8 0.34 27 tested_negativ
0 e
765 5 121 72 23 112 26.2 0.24 30 tested_negativ
5 e
766 1 126 60 0 0 30.1 0.34 47 tested_positiv
9 e
767 1 93 70 31 0 30.4 0.31 23 tested_negativ
5 e

[768 rows x 9 columns]>

5b) Bivariate analysis: Linear and logistic regression
modeling
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn import datasets

from sklearn.model_selection import train_test_split
from sklearn import linear_model
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score

# Load the diabetes dataset

diabetes = datasets.load_diabetes()

# Put the data into a DataFrame

df = pd.DataFrame(diabetes['data'], columns=diabetes['feature_names'])x
= df
y = diabetes['target']

# Split the data into training and testing sets

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3,

# Initialize the Linear Regression model

model = linear_model.LinearRegression()

# Train the model

model.fit(x_train, y_train)

# Predict the test set results

y_pre = model.predict(x_test)

# Cross Validation Scores

scores = cross_val_score(model, x, y, scoring="neg_mean_squared_error", c
rmse_scores = np.sqrt(-scores).mean()
print('Cross validation RMSE:', rmse_scores)

# Checking predictions accuracy by r2 Score

r2 = r2_score(y_test, y_pre)
print('r^2:', r2)

# Calculating Root Mean Square Error mse

= mean_squared_error(y_test, y_pre)rmse =
np.sqrt(mse)
print('RMSE:', rmse)

# Getting Weights and Intercept of Model

print("Weights:", model.coef_)
print("\nIntercept:", model.intercept_)
Output:
Cross validation RMSE: 54.40468149952541
r^2: 0.45767579788519963
RMSE: 58.00932552866432
Weights: [ -8.02358048 -308.83941066 583.63743356 299.99074281 -360.66
454462
95.11692608 -93.03587104 118.15977759 662.11309186 26.07805489]

Intercept: 153.72032548545178
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn import datasets

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import r2_score, mean_squared_error

# Load the diabetes dataset

diabetes = datasets.load_diabetes()

# Print the keys to find the content of data

print(diabetes.keys())

# Put the data into a DataFrame

df = pd.DataFrame(diabetes['data'], columns=diabetes['feature_names'])
x = df
y = diabetes['target']

# Split the data into training and testing sets

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3,

# Build Logistic Regression Model

model = LogisticRegression()
model.fit(x_train, y_train)

# Prediction of test set result of the Prepared Model

y_pre = model.predict(x_test)

# Checking predictions accuracy by r2 Score

r2 = r2_score(y_test, y_pre)
print('r^2:', r2)

# Calculating Root Mean Square Error

mse = mean_squared_error(y_test, y_pre)
rmse = np.sqrt(mse)
print('RMSE:', rmse)

Output:
dict_keys(['data', 'target', 'frame', 'DESCR', 'feature_names', 'data_fil
ename', 'target_filename', 'data_module'])
r^2: -0.44401265478624397
RMSE: 94.65723681369009
5c) Multiple Regression analysi

import matplotlib.pyplot as plt

import numpy as np
from sklearn import datasets, linear_model, metrics
import pandas as pd

# Read the CSV file

df = pd.read_csv('diabetes.csv')

# Extract relevant
df = pd.read_csv('diabetes.csv')

# Extract relevant features and target

data = df[['Age', 'Glucose', 'BMI', 'BloodPressure', 'Pregnancies']] # C
target = df[['Outcome']]

print(data) print(target)

# Define feature matrix (X) and response vector (y)

X = data
y = target

# Split X and y into training and testing sets

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,

# Create linear regression object

reg = linear_model.LinearRegression()

# Train the model using the training sets

reg.fit(X_train, y_train)

# Predictions on the test set

y_predict = reg.predict(X_test)

# Regression coefficients
print('Coefficients:', reg.coef_)

# Variance score: 1 means perfect prediction

print('Variance score: {}'.format(reg.score(X_test, y_test)))

# Checking predictions accuracy by r2 Scorefrom

sklearn.metrics import r2_score print('r^2:',
r2_score(y_test, y_predict))

# Calculating Root Mean Square Error

from sklearn.metrics import mean_squared_errormse
= mean_squared_error(y_test, y_predict) rmse =
np.sqrt(mse)
print('RMSE:', rmse)
Output:
Age Glucose BMI BloodPressure Pregnancies
0 50 148 33.6 72 6
1 31 85 26.6 66 1
2 32 183 23.3 64 8
3 21 89 28.1 66 1
4 33 137 43.1 40 0
.. ... ... ... ... ...
763 63 101 32.9 76 10
764 27 122 36.8 70 2
765 30 121 26.2 72 5
766 47 126 30.1 60 1
767 23 93 30.4 70 1

[768 rows x 5 columns]

Outcome
0 1
1 0
2 1
3 0
4 1
.. ...
763 0
764 0
765 0
766 1
767 0

[768 rows x 1 columns]

Coefficients: [[ 0.00362921 0.0057603 0.01359201 -0.0022797 0.019033
24]]
Variance score: 0.3119613858813981
r^2: 0.3119613858813981
RMSE: 0.3958061749043919
6a) Normal curves

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
f,ax=plt.subplots(figsize=(10,6))
x=df['Age']
ax=sns.distplot(x,bins=10)
plt.show()

Output:
6b) Density and contour plots

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
f,ax=plt.subplots(figsize=(10,6))
x=df['Age']
X=pd.Series(x,name="Age variable")
ax=sns.kdeplot(x,shade=True,color='r')
plt.show()

Output:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
f,ax=plt.subplots(figsize=(8,6))
ax=sns.countplot(x='ChestPain',data=df)
plt.show()

Output:
6c) Correlation and scatter plots

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
sns.pairplot(data=df)

Output:
<seaborn.axisgrid.PairGrid at 0x1a1534e2130>
6d) Histograms

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
plt.figure(figsize=(20,10))
sns.heatmap(df.corr(),annot=True,cmap='terrain')

Output:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
f,ax=plt.subplots(figsize=(8,6))
ax=sns.scatterplot(x="Age",y="RestBP",data=df)
6e) Three dimensional plotting

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
df.hist(figsize=(12,12),layout=(5,3))

Output:
array([[<AxesSubplot:title={'center':'Unnamed: 0'}>,
<AxesSubplot:title={'center':'Age'}>,
<AxesSubplot:title={'center':'Sex'}>],
[<AxesSubplot:title={'center':'RestBP'}>,
<AxesSubplot:title={'center':'Chol'}>,
<AxesSubplot:title={'center':'Fbs'}>],
[<AxesSubplot:title={'center':'RestECG'}>,
<AxesSubplot:title={'center':'MaxHR'}>,
<AxesSubplot:title={'center':'ExAng'}>],
[<AxesSubplot:title={'center':'Oldpeak'}>,
<AxesSubplot:title={'center':'Slope'}>,
<AxesSubplot:title={'center':'Ca'}>],
[<AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>]], dtype=object)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv("Heart.csv")
fig = plt.figure()
ax = plt.axes(projection='3d')
x=df["Age"]
x=pd.Series(x,name="Age variable")
y=df["Sex"]
y=pd.Series(y,name="Sex variable")
z=df["Chol"]
z=pd.Series(z,name="Cholestrol Variable")
ax.plot3D(x,y,z,'green')
ax.set_title('3D line plot Heart disease dataset')
plt.show()

Output:

I
7) Visualizing Geographic Data with Basemap

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.feature import NaturalEarthFeature

plt.figure(figsize=(8, 8))

# Create a Cartopy Orthographic projection

ax = plt.axes(projection=ccrs.Orthographic(central_latitude=50, central_l

# Add natural features such as coastlines, land, and ocean

ax.add_feature(NaturalEarthFeature('physical', 'land', '50m', edgecolor='
ax.add_feature(NaturalEarthFeature('physical', 'ocean', '50m', edgecolor=

# Set the title

plt.title('Orthographic Projection')

# Show the plot

plt.show()

Output:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.feature import NaturalEarthFeature

# Create a Cartopy Plate Carrée projection

fig, ax = plt.subplots(figsize=(6, 6), subplot_kw={'projection': ccrs.Pla

# Add natural features such as coastlines, land, and ocean

ax.add_feature(NaturalEarthFeature('physical', 'land', '50m', edgecolor='
ax.add_feature(NaturalEarthFeature('physical', 'ocean', '50m', edgecolor=

# Plot Seattle
seattle_lon, seattle_lat = -122.3, 47.6
ax.plot(seattle_lon, seattle_lat, 'ok', markersize=5)
ax.text(seattle_lon, seattle_lat, ' Seattle', fontsize=12, transform=ccrs

# Set the title

plt.title('Plate Carrée Projection')

# Show the plot

plt.show()

Output:
In [16]: %matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from itertools import chain

def draw_map(ax, scale=0.2):

ax.set_global()
ax.add_feature(cfeature.LAND, edgecolor='black', facecolor='lightgray
ax.add_feature(cfeature.COASTLINE, edgecolor='black', linewidth=0.5)
ax.add_feature(cfeature.BORDERS, linestyle='-', linewidth=0.5)

# Draw parallels and meridians

ax.gridlines(linestyle='-', alpha=0.3, color='white', draw_labels=Fal

# Create a figure and axes with PlateCarree projection

fig, ax = plt.subplots(subplot_kw={'projection': ccrs.PlateCarree()}, fig

# Draw the map

draw_map(ax)

# Show the plot

plt.show()

Output:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from itertools import chain

def draw_map(ax, scale=0.2):

# Draw parallels and meridians

ax.gridlines(linestyle='-', alpha=0.3, color='white', draw_labels=Fal

# Create a figure and axes with Mollweide projection

fig, ax = plt.subplots(subplot_kw={'projection': ccrs.Mollweide()}, figsi

# Draw the map

draw_map(ax)

# Show the plot

plt.show()

Output:
Content Beyond Syllabus

8. Use NumPy to implement a simple image processing algorithm.

Aim:
To Use NumPy to implement a simple image processing algorithm.

Procedure:
Step 1: Install Required Libraries
Make sure you have NumPy, Matplotlib, and Pillow installed. If not, you can install them
using:
pip install numpy matplotlib Pillow

Step 2: Import Libraries

Step 3: Load the Image

Step 4: Convert Image to NumPy Array

Convert the image to a NumPy array for processing:

Step 5: Display the Original Image

Use Matplotlib to display the original image:

Step 6: Grayscale Conversion

Convert the RGB image to grayscale using NumPy

Program:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Load the image

image_path = "example_image.jpg"
image = Image.open(image_path)

# Convert the image to a NumPy array

image_array = np.array(image)

# Display the original image

plt.figure(figsize=(8, 4))
plt.subplot(1, 3, 1)
plt.imshow(image_array)
plt.title('Original Image')

# Grayscale conversion
gray_image = np.mean(image_array, axis=-1, keepdims=True)
# Display the grayscale image
plt.subplot(1, 3, 2)
plt.imshow(np.squeeze(gray_image), cmap='gray')
plt.title('Grayscale Image')

# Contrast adjustment (increase contrast)

adjusted_image = np.clip((gray_image - 100) * 1.5 + 100, 0, 255).astype(np.uint8)

# Display the adjusted image

plt.subplot(1, 3, 3)
plt.imshow(np.squeeze(adjusted_image), cmap='gray')
plt.title('Adjusted Image')

# Show the plots

plt.tight_layout()
plt.show()

Output:

Result:
Thus NumPy is used to implement a simple image processing algorithm and executed
Successfully.
9. Use NumPy to implement a simple algorithm for image classification.

Aim:
To Use NumPy to implement a simple algorithm for image classification.

Procedure:
Step 1: Install Required Libraries
Make sure you have NumPy installed:

Step 2: Import Libraries

Step 3: Generate or Load a Dataset

For simplicity, let's generate a synthetic dataset. Replace this step with loading your dataset:

Step 4: Split the Dataset

Split the dataset into training and testing sets:

Step 5: Define the Model

Define a simple linear classifier:

Step 6: Training
Train the model using simple gradient descent. Replace this with a more sophisticated
training algorithm for a real-world scenario:

Step 7: Make Predictions

Make predictions on the test set:

Step 8: Evaluate Accuracy

Evaluate the accuracy of the model:

Program:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate synthetic dataset

np.random.seed(42)
num_samples = 1000
image_size = 28 * 28 # Assuming 28x28 pixel images
num_classes = 2

X = np.random.rand(num_samples, image_size)
y = np.random.randint(0, num_classes, size=num_samples)

# Split the dataset

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the model

class SimpleClassifier:
def __init__(self, input_size, output_size):
self.weights = np.random.rand(input_size, output_size)
self.bias = np.zeros(output_size)

def predict(self, X):

return np.dot(X, self.weights) + self.bias

# Training
def train(model, X, y, learning_rate=0.001, epochs=100):
for epoch in range(epochs):
predictions = model.predict(X)
loss = np.mean((predictions - y.reshape(-1, 1))**2)

gradient_weights = 2 * np.dot(X.T, predictions - y.reshape(-1, 1)) / len(y)

gradient_bias = 2 * np.mean(predictions - y.reshape(-1, 1))

model.weights -= learning_rate * gradient_weights

model.bias -= learning_rate * gradient_bias

if epoch % 10 == 0:
print(f"Epoch {epoch}, Loss: {loss}")

# Initialize the model

model = SimpleClassifier(input_size=image_size, output_size=num_classes)

# Train the model

train(model, X_train, y_train)

# Make predictions
test_predictions = np.argmax(model.predict(X_test), axis=1)

# Evaluate accuracy
accuracy = accuracy_score(y_test, test_predictions)
print(f"Accuracy: {accuracy}")
Output:

Result:
Thus NumPy is used to implement a simple algorithm for image classification and executed
Successfully.

Summary For Cit 202
No ratings yet
Summary For Cit 202
17 pages
Boot Loader
No ratings yet
Boot Loader
19 pages
FPU Member Workbook 4th Edition Printer Friendly
100% (2)
FPU Member Workbook 4th Edition Printer Friendly
84 pages
Eald Bilingual Dictionary Bengali
No ratings yet
Eald Bilingual Dictionary Bengali
96 pages
Cs3361- Data Science Lab Record_pdf (1)
No ratings yet
Cs3361- Data Science Lab Record_pdf (1)
69 pages
Naan Mudhalvan Project Report-2
No ratings yet
Naan Mudhalvan Project Report-2
46 pages
欢迎来到monash大学作业封面页面！
100% (2)
欢迎来到monash大学作业封面页面！
8 pages
Complete Syllabus R2022
0% (1)
Complete Syllabus R2022
626 pages
Final Modified LAB Manal Image Processing and Computer Vision Copy
No ratings yet
Final Modified LAB Manal Image Processing and Computer Vision Copy
70 pages
DM LAB MANUAL
No ratings yet
DM LAB MANUAL
72 pages
CS3352 Foundations of Data Science
No ratings yet
CS3352 Foundations of Data Science
27 pages
SITXMGT004 Student Assessment Tasks
No ratings yet
SITXMGT004 Student Assessment Tasks
30 pages
CNS_LAB.pdf
No ratings yet
CNS_LAB.pdf
103 pages
Modernism and Postmodernism in Education
100% (2)
Modernism and Postmodernism in Education
2 pages
D_Machine Learning Lab Manual 2024-25.Docx
No ratings yet
D_Machine Learning Lab Manual 2024-25.Docx
52 pages
RISHABH ML FILE
No ratings yet
RISHABH ML FILE
102 pages
IV Sem BCSL404 DAA Lab Manual-compressed (1)
No ratings yet
IV Sem BCSL404 DAA Lab Manual-compressed (1)
78 pages
Automobile Engineering
No ratings yet
Automobile Engineering
19 pages
20EI271 - SIMULATION LABORATORY Finalized 24 25
No ratings yet
20EI271 - SIMULATION LABORATORY Finalized 24 25
82 pages
DATA MINING using PYTHON
No ratings yet
DATA MINING using PYTHON
37 pages
OOSE LAB manual final
No ratings yet
OOSE LAB manual final
194 pages
R Programming Manual 24-25
No ratings yet
R Programming Manual 24-25
58 pages
EDA Lab Record
No ratings yet
EDA Lab Record
45 pages
TDS-EN-Contite Waterbars - Rev 006-Oct18
100% (1)
TDS-EN-Contite Waterbars - Rev 006-Oct18
2 pages
fa1a8fe33ea455ddf446167f3ca33d5d
No ratings yet
fa1a8fe33ea455ddf446167f3ca33d5d
137 pages
June 2019 Internship Impact Assessment Report
No ratings yet
June 2019 Internship Impact Assessment Report
26 pages
Data Science[1]
No ratings yet
Data Science[1]
60 pages
2022-23 NMMS Exam Circular
No ratings yet
2022-23 NMMS Exam Circular
14 pages
ACM Catalogue: Specifications Catalogue of The ACM Cars, All Models and Types. Check Also ACM Timeline Catalogue
No ratings yet
ACM Catalogue: Specifications Catalogue of The ACM Cars, All Models and Types. Check Also ACM Timeline Catalogue
2 pages
CS3361 LAB MANUAL FINAL DS 2023-24 ODD
No ratings yet
CS3361 LAB MANUAL FINAL DS 2023-24 ODD
12 pages
DMML Lab
No ratings yet
DMML Lab
35 pages
3130702 Data Structure Lab Manual
No ratings yet
3130702 Data Structure Lab Manual
104 pages
DSBA Manual 2025
No ratings yet
DSBA Manual 2025
77 pages
APznzabE783TWm5PnmnZmyqlGmiUEdbJFKKdhbU5kfZJdaXqkCbDMUET6fHn52VeIAHBjWb3yyxWC0s_vedvJwFmDOhu8F0pXFqlRhuaRw...s-qiyolcyk7MV1Rz2bTnmGPhPRKHhqgjpjO9GQHZmWcy_5-FQT1J3H213pCjd6oz938_c8GmybJDHy9GcpYVoc-ZKEYOlSDTUxgRHhmb1OaxyDReGNAK4ySaWh.pdf
No ratings yet
APznzabE783TWm5PnmnZmyqlGmiUEdbJFKKdhbU5kfZJdaXqkCbDMUET6fHn52VeIAHBjWb3yyxWC0s_vedvJwFmDOhu8F0pXFqlRhuaRw...s-qiyolcyk7MV1Rz2bTnmGPhPRKHhqgjpjO9GQHZmWcy_5-FQT1J3H213pCjd6oz938_c8GmybJDHy9GcpYVoc-ZKEYOlSDTUxgRHhmb1OaxyDReGNAK4ySaWh.pdf
45 pages
ML File Fnail Merged
No ratings yet
ML File Fnail Merged
82 pages
final file
No ratings yet
final file
38 pages
OCS353 - Data Science Manual-FULL
No ratings yet
OCS353 - Data Science Manual-FULL
64 pages
CAF 3 Autumn 2024
No ratings yet
CAF 3 Autumn 2024
3 pages
A1-TEF Structure
No ratings yet
A1-TEF Structure
10 pages
Multiband, Multistandard Transmitter Design Using The RF DAC
No ratings yet
Multiband, Multistandard Transmitter Design Using The RF DAC
7 pages
R2023-CSE-Curriculum & Syllabus Batch 2024-2025
No ratings yet
R2023-CSE-Curriculum & Syllabus Batch 2024-2025
69 pages
In The Merry Old Land of Oz
No ratings yet
In The Merry Old Land of Oz
12 pages
Ca2 Ai
No ratings yet
Ca2 Ai
4 pages
CS3361 - Data Science Lab Manual-1
No ratings yet
CS3361 - Data Science Lab Manual-1
65 pages
ICT-Year 10 - Test - Unit 4 Assessment 4
No ratings yet
ICT-Year 10 - Test - Unit 4 Assessment 4
6 pages
ADAA lab manual final
No ratings yet
ADAA lab manual final
45 pages
LecturePlan_CS201_21CSH-471 (1)
No ratings yet
LecturePlan_CS201_21CSH-471 (1)
8 pages
Zamioculcas - ZZ Plant - Aroid Plant - Philippine Medicinal Herbs - Alternative Medicine
0% (1)
Zamioculcas - ZZ Plant - Aroid Plant - Philippine Medicinal Herbs - Alternative Medicine
5 pages
ML Lab Manual Simplified
No ratings yet
ML Lab Manual Simplified
40 pages
Python Lab Manual (4)
No ratings yet
Python Lab Manual (4)
21 pages
Skills Enhancement Plan: No. Descriptions Book/Ebook/Online/Webinar/Website Title
No ratings yet
Skills Enhancement Plan: No. Descriptions Book/Ebook/Online/Webinar/Website Title
4 pages
3160714 Data Mining. Vbv Jjj Ldce Vgec (1) - Copy
No ratings yet
3160714 Data Mining. Vbv Jjj Ldce Vgec (1) - Copy
43 pages
II Year Syllabus (2023-24)
No ratings yet
II Year Syllabus (2023-24)
57 pages
1 to 5 and 9
No ratings yet
1 to 5 and 9
38 pages
Bcs 452 Amit Singh 69
No ratings yet
Bcs 452 Amit Singh 69
20 pages
BCS-452-1 (1)
No ratings yet
BCS-452-1 (1)
10 pages
Python - Lab - Manual - B.Tech. - Sem-2 Aneek
No ratings yet
Python - Lab - Manual - B.Tech. - Sem-2 Aneek
40 pages
Syllabus Data Science 2020
No ratings yet
Syllabus Data Science 2020
93 pages
Machine Learning Lab (R20a0590)
No ratings yet
Machine Learning Lab (R20a0590)
81 pages
Wolverine WVR-8C
No ratings yet
Wolverine WVR-8C
1 page
20SMT-460_Statistical Methods Using R
No ratings yet
20SMT-460_Statistical Methods Using R
11 pages
DM lab manual
No ratings yet
DM lab manual
26 pages
DRAM Technology
No ratings yet
DRAM Technology
16 pages
Where Do People Find: Hope?
No ratings yet
Where Do People Find: Hope?
8 pages
DAL Lab File
No ratings yet
DAL Lab File
38 pages
Python Lab Manual Final
100% (6)
Python Lab Manual Final
88 pages
DataVisualizationUsingPython LAB MANUAL
No ratings yet
DataVisualizationUsingPython LAB MANUAL
47 pages
DL Lab Manual Student
No ratings yet
DL Lab Manual Student
6 pages
bk-12hk Brochure A PDF
No ratings yet
bk-12hk Brochure A PDF
2 pages
(R18A0584) Data Structures Lab Manual
No ratings yet
(R18A0584) Data Structures Lab Manual
104 pages
Experiment List. DSPYL
No ratings yet
Experiment List. DSPYL
10 pages
K500 PROG2 vs. AK500Pro vs. AK500 Plus vs. AK500+ Mercedes Key Programmer: A
No ratings yet
K500 PROG2 vs. AK500Pro vs. AK500 Plus vs. AK500+ Mercedes Key Programmer: A
6 pages
Geethanjali College of Engineering and Technology (Ugc Autonomous Institution)
No ratings yet
Geethanjali College of Engineering and Technology (Ugc Autonomous Institution)
34 pages
BCS-452-1
No ratings yet
BCS-452-1
10 pages
Internship
No ratings yet
Internship
22 pages
Python Lab Mannual
No ratings yet
Python Lab Mannual
20 pages
Certificate of Registration: Information Security Management System - ISO/IEC 27001:2013
No ratings yet
Certificate of Registration: Information Security Management System - ISO/IEC 27001:2013
1 page
LecturePlan CS201 20SMP-460
No ratings yet
LecturePlan CS201 20SMP-460
5 pages
Biotech8 DLL October 10
No ratings yet
Biotech8 DLL October 10
4 pages
CSB4231 Python Programming Laboratory
No ratings yet
CSB4231 Python Programming Laboratory
82 pages
Sap HR Faq
No ratings yet
Sap HR Faq
36 pages
AIML Lab vision,mission,course assessment matrix
No ratings yet
AIML Lab vision,mission,course assessment matrix
3 pages
Updated_Lab_File_BCS351
No ratings yet
Updated_Lab_File_BCS351
7 pages
BDA Final Lab Manual
100% (1)
BDA Final Lab Manual
56 pages
LecturePlan BI519 23CSP-201
No ratings yet
LecturePlan BI519 23CSP-201
7 pages
Python Programming & Data Science Lab Manual
No ratings yet
Python Programming & Data Science Lab Manual
25 pages
Mahindra Value Delivery System
No ratings yet
Mahindra Value Delivery System
5 pages
DVP Manual
No ratings yet
DVP Manual
37 pages
AITS 2223 FT X JEEA Paper 2
No ratings yet
AITS 2223 FT X JEEA Paper 2
12 pages
Machine Learning Mastery for Engineers
From Everand
Machine Learning Mastery for Engineers
Abdellatif Sadeq
No ratings yet