Numpy
How to install?
• On Windows, run pip install numpy in the Command Prompt.
• On Ubuntu, run sudo apt install python3-numpy in a terminal.
2-Dimensional Array:
>>> import numpy as np
>>> M = np.array([[1, 2, 3], [4, 5, 6]])
>>> M
array([[1, 2, 3],
       [4, 5, 6]])
[The array is displayed like a matrix with 2 rows and 3 columns.]
>>> M.ndim
2
>>> M.size
6
>>> M.shape # [Shape of the 2D array as a tuple]
(2, 3)
Shape, Reshape, Size and Resize of arrays:
>>> A = np.array([2, 4, 6, 8, 10, 12])
>>> A.shape
(6,)
[1D array with 6 elements.]
# Convert to a 2D array (2 × 3)
>>> B = A.reshape(2, 3)
>>> B
array([[ 2,  4,  6],
       [ 8, 10, 12]])
>>> A.reshape(3, 2)
array([[ 2,  4],
       [ 6,  8],
       [10, 12]])
>>> A
array([ 2,  4,  6,  8, 10, 12])
Thus, reshape returns a new array object and leaves the shape of the original array unchanged.
>>> A.size
6
>>> B.size
6
>>> A.resize(2, 3)
>>> A
array([[ 2,  4,  6],
       [ 8, 10, 12]])
Thus, resize changes the original array in place (the method itself returns None).
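The difference between the two can be sketched in a short script. This is a minimal illustration, not part of the original notes; the variable names b, c, d and e are chosen here for the example. It also shows two related features: passing -1 to reshape so that NumPy infers one dimension, and the function np.resize, which (unlike the method) returns a new array and repeats the data when the new shape is larger.

```python
import numpy as np

a = np.array([2, 4, 6, 8, 10, 12])

# reshape returns a new array object; a keeps its 1D shape.
b = a.reshape(2, 3)
print(a.shape, b.shape)

# -1 asks NumPy to infer that dimension from the total number of elements.
c = a.reshape(3, -1)
print(c.shape)

# The function np.resize returns a new array and repeats the data
# when the requested shape needs more elements than a has.
d = np.resize(a, (2, 4))
print(d)

# The method resize works in place on the array it is called on.
e = a.copy()
e.resize(2, 3)
print(e.shape)
```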
# 1 Range of numbers
np.arange(start, stop, step)
# Examples
>>> np.arange(2, 10, 3)
array([2, 5, 8])
# Default step = 1
>>> np.arange(2, 10)
array([2, 3, 4, 5, 6, 7, 8, 9])
# Default start = 0
>>> np.arange(5)
array([0, 1, 2, 3, 4])
>>> np.arange(0.2, 2, 0.4)
array([0.2, 0.6, 1. , 1.4, 1.8])
>>> np.arange(5, 10, 2, dtype='f')
array([5., 7., 9.], dtype=float32)
# 2 Linear space
np.linspace(start, stop, num)    # num = number of elements
# Examples
>>> np.linspace(10, 20, 5)
array([10. , 12.5, 15. , 17.5, 20. ])
Note: Step size = (20 – 10)/4 = 2.5 [5 points give 4 gaps]
>>> np.linspace(10, 20, 5, endpoint=True)     # endpoint=True is the default
array([10. , 12.5, 15. , 17.5, 20. ])
>>> np.linspace(10, 20, 5, endpoint=False)    # step size = (20 – 10)/5 = 2
array([10., 12., 14., 16., 18.])
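The step size does not have to be worked out by hand: np.linspace can return it through the retstep argument. A minimal sketch (the variable names are chosen here for illustration):

```python
import numpy as np

# retstep=True makes linspace return a pair: (array, step size).
pts, step = np.linspace(10, 20, 5, retstep=True)
print(pts)
print(step)

# Without the endpoint, the same interval is split into 5 gaps instead of 4.
pts_open, step_open = np.linspace(10, 20, 5, endpoint=False, retstep=True)
print(pts_open)
print(step_open)
```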
Concatenating arrays:
1D:
>>> a = np.array([10, 20, 30])
>>> b = np.array([40, 50, 60])
>>> np.concatenate((a, b))
array([10, 20, 30, 40, 50, 60])
2D:
>>> x = np.array([[1, 2], [3, 4]])
>>> x
array([[1, 2],
       [3, 4]])
>>> y = np.array([[5, 6], [7, 8]])
>>> y
array([[5, 6],
       [7, 8]])
>>> np.concatenate((x, y), axis=1)
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])
>>> np.concatenate((x, y), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
# By default axis = 0
>>> np.concatenate((x, y))
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
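Some other NumPy functions:
# Mean, Median
# [y is not defined in these notes; the 3 × 3 array below is an assumed example that reproduces the outputs shown.]
>>> y = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
>>> np.mean(y, axis=0)
array([40., 50., 60.])
[Also available as the method y.mean(0).]
>>> np.mean(y, 1)
array([20., 50., 80.])
>>> np.mean(y)
50.0
>>> np.median(y, 0)
array([40., 50., 60.])
>>> np.median(y, 1)
array([20., 50., 80.])
>>> np.median(y)
50.0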
# Sorting
>>> x = [-2, 0, 3, 1, 10, 5, -7]
# Ascending order
>>> np.sort(x)
array([-7, -2, 0, 1, 3, 5, 10])
# Descending order
>>> np.sort(x)[::-1]
array([10, 5, 3, 1, 0, -2, -7])
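# Transpose and element-wise arithmetic
# [a1 and a2 are not defined in these notes; the values below are recovered from the sum and difference shown.]
>>> a1 = np.arange(6).reshape(2, 3)           # [[0 1 2] [3 4 5]]
>>> a2 = np.arange(6, 18, 2).reshape(2, 3)    # [[ 6  8 10] [12 14 16]]
>>> print(a1.T)
# [[0 3]
#  [1 4]
#  [2 5]]
>>> print(a1 + a2)
# [[ 6  9 12]
#  [15 18 21]]
>>> print(a1 - a2)
# [[ -6  -7  -8]
#  [ -9 -10 -11]]
>>> print(a1 * a2)
# [[ 0  8 20]
#  [36 56 80]]
>>> print(a1 / a2)
# [[0.         0.125      0.2       ]
#  [0.25       0.28571429 0.3125    ]]
>>> print(a1**a2)
# [[           0            1         1024]
#  [      531441    268435456 152587890625]]
# Matrix product
# [np.dot needs the number of columns of the first array to equal the number of rows of the second.
#  The output below is consistent with a 2 × 2 a1 and a 2 × 3 a2; these values are assumed.]
>>> a1 = np.arange(4).reshape(2, 2)           # [[0 1] [2 3]]
>>> a2 = np.arange(6).reshape(2, 3)           # [[0 1 2] [3 4 5]]
>>> print(np.dot(a1, a2))
# [[ 3  4  5]
#  [ 9 14 19]]
>>> print(a1.dot(a2))
# [[ 3  4  5]
#  [ 9 14 19]]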
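# Inverse of a matrix
# [a is not defined in these notes; the matrix below is the one whose inverse matches the output shown.]
>>> a = np.array([[2, 5], [1, 3]])
>>> print(np.linalg.inv(a))
# [[ 3. -5.]
#  [-1.  2.]]
# For a singular matrix, np.linalg.inv raises an error.
>>> a_singular = np.array([[0, 0], [1, 3]])
>>> print(a_singular)
# [[0 0]
#  [1 3]]
>>> print(np.linalg.inv(a_singular))
# LinAlgError: Singular matrix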
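In a program, the error for a singular matrix can be caught instead of letting it stop the script. The sketch below is one possible way to do this; safe_inverse is a hypothetical helper name introduced here, not a NumPy function.

```python
import numpy as np

def safe_inverse(m):
    """Return the inverse of m, or None if m is singular (hypothetical helper)."""
    try:
        return np.linalg.inv(m)
    except np.linalg.LinAlgError:
        return None

a = np.array([[2.0, 5.0], [1.0, 3.0]])
a_singular = np.array([[0.0, 0.0], [1.0, 3.0]])

inv_a = safe_inverse(a)
print(inv_a)

# A matrix times its inverse should give the identity (up to rounding).
print(np.allclose(a @ inv_a, np.eye(2)))

# The singular matrix has no inverse, so the helper returns None.
print(safe_inverse(a_singular))
```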
np.empty((3, 3))
# Returns a new array of the given shape and type without initializing its entries (the values are arbitrary).
np.identity(4)
# Returns the identity matrix of the given order.
np.eye(4)
# Returns a 2-D array with ones on the diagonal and zeros elsewhere.
np.ones((3, 4))
# Returns an array of the given shape with all elements 1.
np.zeros((3, 3))
# Returns an array of the given shape with all elements 0.
np.diag((1, 2, 3))
# Returns a 2-D array with the given diagonal elements and zeros elsewhere.
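A short script makes the differences between these constructors visible. This is a minimal sketch; the extra arguments used here (k for np.eye, dtype for np.zeros) are optional and shown only as illustrations.

```python
import numpy as np

I4 = np.identity(4)                    # 4 × 4 identity matrix
E = np.eye(2, 3, k=1)                  # 2 × 3 array, ones on the diagonal shifted up by 1
ones = np.ones((3, 4))                 # 3 × 4 array of 1.0
zeros = np.zeros((3, 3), dtype=int)    # 3 × 3 array of integer 0
D = np.diag((1, 2, 3))                 # 3 × 3 array with 1, 2, 3 on the diagonal

print(E)
print(D)

# Given a 2-D array, np.diag extracts the diagonal instead of building one.
print(np.diag(D))
```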
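# flatten() and ravel()
>>> A = np.array([[1, 2], [2, 3], [3, 4]])
>>> print(A.flatten())
# [1 2 2 3 3 4]
>>> print(A)
# [[1 2]
#  [2 3]
#  [3 4]]
>>> print(A.ravel())
# [1 2 2 3 3 4]
>>> print(A)
# [[1 2]
#  [2 3]
#  [3 4]]
Thus, neither ravel() nor flatten() changes the shape of the original array. The difference is
that flatten() always returns a copy, while ravel() returns a view of the original data whenever
possible, so writing to the result of ravel() can change the original array.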
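How the results of the two methods relate to the memory of the original array can be checked with np.shares_memory. A minimal sketch (the names flat_copy and flat_view are chosen here for illustration):

```python
import numpy as np

A = np.array([[1, 2], [2, 3], [3, 4]])

flat_copy = A.flatten()    # always an independent copy
flat_view = A.ravel()      # a view here, because A is stored contiguously

print(np.shares_memory(A, flat_copy))
print(np.shares_memory(A, flat_view))

flat_copy[0] = 100         # does not touch A
flat_view[1] = 200         # writes through to A[0, 1]
print(A)
```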
Some problems:
Write a program to add two 2-dimensional NumPy arrays where the
elements of the arrays are user given.
import numpy as np
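The rest of the program is not included in these notes. One possible solution is sketched below, assuming the user types each row as space-separated numbers; read_matrix and add_user_matrices are hypothetical helper names introduced for this example. The read_line parameter defaults to the built-in input, and the demonstration at the end passes canned answers so that the script runs without a keyboard.

```python
import numpy as np

def read_matrix(rows, cols, name, read_line=input):
    """Read `rows` lines of `cols` space-separated numbers (hypothetical helper)."""
    data = []
    for i in range(rows):
        values = read_line(f"{name}, row {i + 1}: ").split()
        if len(values) != cols:
            raise ValueError(f"expected {cols} values, got {len(values)}")
        data.append([float(v) for v in values])
    return np.array(data)

def add_user_matrices(read_line=input):
    """Ask for a common shape, read two matrices, and return their sum."""
    rows = int(read_line("Number of rows: "))
    cols = int(read_line("Number of columns: "))
    A = read_matrix(rows, cols, "A", read_line)
    B = read_matrix(rows, cols, "B", read_line)
    return A + B

# Demonstration with canned answers in place of keyboard input.
canned = iter(["2", "2", "1 2", "3 4", "10 20", "30 40"])
result = add_user_matrices(read_line=lambda prompt: next(canned))
print(result)
```

For interactive use, call print(add_user_matrices()) with no argument and type the values at the prompts.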
What is Pandas?
Pandas is an open-source, BSD-licensed Python library providing high-performance, easy-to-use data
structures and data analysis tools for the Python programming language.
Why Pandas?
Pandas simplifies many of the time-consuming, repetitive tasks involved in working with data
frames, such as:
• Importing datasets - available in the form of spreadsheets, comma-separated values (CSV) files, and more.
• Data cleansing - dealing with missing values and representing them as NaN, NA, or NaT.
• Size mutability - columns can be added to and removed from DataFrame and higher-dimensional objects.
• Data normalization - normalizing the data into a suitable format for analysis.
• Reshaping and pivoting of datasets - datasets can be reshaped and pivoted as needed.
• Efficient manipulation and extraction - manipulating and extracting specific parts of large datasets using label-based slicing, indexing, and subsetting.
• Statistical analysis - performing statistical operations on datasets.
• Data visualization - visualizing datasets to uncover insights.
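import numpy as np
import pandas as pd
data = np.array([1,2,3,4])
s = pd.Series(data, index=[0,2,5,8], dtype=int, copy=False)
print(s)
s.iloc[0] = 6
print(s)
# 0    6
# 2    2
# 5    3
# 8    4
# dtype: int64
print(data)
# [6 2 3 4]
[With copy=False the Series can share memory with data, so the change made through s also
appears in data. Whether the memory is actually shared depends on the pandas version and its
copy-on-write setting.]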
How to retrieve elements of a Series?
import pandas as pd
s = pd.Series([1,2,3,4,5], index=['a','b','c','d','e'])
# Retrieve a single element
print(s[['b']])
# b    2
# dtype: int64
# Retrieve multiple elements
print(s[['a','c','d']])
# a    1
# c    3
# d    4
# dtype: int64
# A label that is not in the index raises a KeyError
print(s[['f']])
# KeyError
DataFrame
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns.
A pandas DataFrame can be created using the following constructor:
pandas.DataFrame(data, index, columns, dtype, copy)
A pandas DataFrame can be created using various inputs like:
• Lists
• Dictionaries
• Series
• NumPy ndarrays
• Another DataFrame
Creating DataFrame:
# Import the pandas library under the alias pd
import pandas as pd
df = pd.DataFrame()
print(df)
# Output:
# Empty DataFrame
# Columns: []
# Index: []
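Beyond the empty case, a DataFrame can be built from the inputs listed earlier. The sketch below shows three of them; the column names and values are made up for illustration.

```python
import pandas as pd

# From a list of rows, with the column names supplied separately.
df_list = pd.DataFrame([["Alex", 10], ["Bob", 12]], columns=["Name", "Age"])
print(df_list)

# From a dictionary: the keys become column names, index gives the row labels.
df_dict = pd.DataFrame({"Name": ["Tom", "Jack"], "Age": [28, 34]},
                       index=["r1", "r2"])
print(df_dict)

# From a dictionary of Series: rows are aligned on the index labels,
# and labels missing from one Series are filled with NaN.
df_series = pd.DataFrame({
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([10, 20], index=["a", "b"]),
})
print(df_series)
```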
Matplotlib
1. Introduction
When we want to convey some information to others, there are several ways to do so. The
process of conveying the information with the help of plots and graphics is called Data
Visualization.
• In Python, we will use the basic data visualization tool Matplotlib.
• It can create popular visualization types: line plot, scatter plot, histogram, bar chart,
error charts, pie chart, box plot, and many more types of plot.
• It can export visualizations to all of the common formats like PDF, SVG, JPG, PNG,
BMP etc.
How to install?
• On Windows, run pip install matplotlib in the Command Prompt.
• On Ubuntu, run sudo apt install python3-matplotlib in a terminal.
How to import Matplotlib?
We can import Matplotlib as follows:
import matplotlib
Most of the time, we work with the pyplot interface of Matplotlib, so we import it as
follows:
import matplotlib.pyplot as plt
# Code for line plot using matplotlib
# Changing line colors
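The code for the two items above did not survive in these notes. A minimal sketch of a line plot with different line colors is given below; the data (sine and cosine curves) and the labels are assumptions made for the example. Colors can be given by name through the color argument or as part of a short format string such as 'g:' (green, dotted).

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), color="blue", label="sin(x)")                   # solid blue line
ax.plot(x, np.cos(x), color="red", linestyle="--", label="cos(x)")    # dashed red line
ax.plot(x, np.sin(x - 1), "g:", label="sin(x - 1)")                   # dotted green line
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line plot with different colors")
ax.legend()
```

Calling plt.show() afterwards displays the figure when the script is run interactively. The pyplot shortcut plt.plot(...) accepts the same arguments as ax.plot(...).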
Scatter plot:
Here the points are represented individually with a dot or a circle. With plt.plot(), all points
share one marker style; it cannot be varied from point to point.
Scatter Plot with plt.plot():
We have used plt.plot() to produce line plots. We can use the same function to
produce scatter plots as follows:
X = np.linspace(0, 10, 30)
y = np.sin(X)
plt.plot(X, y, 'o', color='black')
Markers:
Some markers:
"." point
"," pixel
Histogram:
Histogram charts are a graphical display of frequencies. They
are represented as bars. They show what portion of the dataset
falls into each category, usually specified as non-overlapping
intervals. These categories are called bins.
The plt.hist() function can be used to plot a simple histogram.
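A minimal sketch of a histogram with plt.hist() follows. The data are random numbers generated only for this example (with a fixed seed so the figure is reproducible), and the choice of 10 bins is arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)                       # fixed seed for a reproducible example
data = rng.normal(loc=0.0, scale=1.0, size=1000)     # 1000 made-up sample values

plt.figure()
# plt.hist returns the count in each bin, the bin edges, and the drawn bars.
counts, bin_edges, patches = plt.hist(data, bins=10, color="steelblue",
                                      edgecolor="black")
plt.xlabel("value")
plt.ylabel("frequency")
plt.title("Histogram with 10 bins")

print(counts)
print(bin_edges)
```

With 10 bins there are 11 bin edges, and the counts add up to the number of data points.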
Bar Chart:
Bar charts display rectangular bars either in vertical or
horizontal form. Their length is proportional to the values they
represent. They are used to compare two or more values.
We can plot a bar chart using the plt.bar() function.
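A minimal sketch of a vertical bar chart with plt.bar() follows; the category labels and values are made up for the example.

```python
import matplotlib.pyplot as plt

labels = ["A", "B", "C", "D"]     # made-up categories
values = [5, 7, 3, 8]             # made-up values, one per category

plt.figure()
bars = plt.bar(labels, values, color="orange")
plt.xlabel("category")
plt.ylabel("value")
plt.title("Bar chart")

# The height of each bar equals the value it represents.
heights = [bar.get_height() for bar in bars]
print(heights)
```

For horizontal bars, plt.barh(labels, values) can be used in the same way.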
Pie chart:
Pie charts are circular representations, divided into sectors. The sectors are also
called wedges. The arc length of each sector is proportional to the quantity we
are describing. It is an effective way to represent information when we are
interested mainly in comparing the wedge against the whole pie, instead of
wedges against each other.
Matplotlib provides the plt.pie() function to plot pie charts from an array X.
Wedges are created proportionally, so that each value x of array X generates a
wedge proportional to x/sum(X).
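A minimal sketch of a pie chart with plt.pie() follows; the values and labels are made up for the example. The fractions x/sum(X) are also computed directly so they can be compared with the percentages drawn on the wedges.

```python
import matplotlib.pyplot as plt

X = [35, 25, 25, 15]                               # made-up quantities
labels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.figure(figsize=(5, 5))
plt.pie(X, labels=labels, autopct="%1.0f%%")       # autopct prints each wedge's share
plt.axis("equal")                                  # keep the pie circular
plt.title("Pie chart")
ax = plt.gca()

# Each wedge covers the fraction x / sum(X) of the full circle.
fractions = [x / sum(X) for x in X]
print(fractions)
```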