0% found this document useful (0 votes)

2 views43 pages

21284254 Python Module 5

This document covers the basics of data processing using Python libraries such as NumPy, Pandas, and Matplotlib. It details the functionalities of NumPy for numerical data manipulation, including array creation, arithmetic operations, and linear algebra, as well as how to work with Pandas for data manipulation and CSV file handling. Additionally, it introduces Matplotlib for data visualization, explaining how to create plots and customize them with labels and legends.

Uploaded by

jocktmpx4oxc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views43 pages

21284254 Python Module 5

Uploaded by

jocktmpx4oxc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 43

MODULE V – Data Processing

NumPy - Basics, Creating arrays, Arithmetic, Slicing, Matrix Operations, Random numbers.
Plotting and visualization. Matplotlib - Basic plot, Ticks, Labels, and Legends. Working with
CSV files. – Pandas - Reading, Manipulating, and Processing Data. Introduction to Micro
services using Flask.

13/06/23
Tuesday

Numpy: It provides the data structures, algorithms, and library glue needed
for most scientific applications involving numerical data in Python.

Pandas: It provides high-level data structures and functions designed to make working
with structured or tabular data fast, easy, and expressive.

Matplolib: It is the most popular Python library for producing plots and other two-
dimensional data visualizations.

NumPy - Basics, Creating arrays, Arithmetic, Slicing, Matrix Operations, Random numbers

* NumPy, short for Numerical Python, is one of the most important foundational packages for
numerical computing in Python.

*It has fast array-processing capabilities and is used in data analysis as a container for data to be
passed between algorithms and libraries. Most computational packages providing scientific
functionality use NumPy’s array objects for data exchange.

* One of the reasons NumPy is so important for numerical computations in Python is because it is
designed for efficiency on large arrays of data

- NumPy internally stores data in a contiguous block of memory, independent of other built-in
Python objects

- NumPy operations perform complex computations on entire arrays without the need for Python
for loops

In short numpy has following features:

* ndarray: an efficient multidimensional array providing fast array-oriented arithmetic operations

and flexible broadcasting capabilities.

* Mathematical functions for fast operations on entire arrays of data without having to write loops.

* Tools for reading/writing array data to disk and working with memory-mapped files.

* Linear algebra, random number generation, and Fourier transform capabilities.

* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

The NumPy ndarray: A Multidimensional Array Object

* N-dimensional array object, or ndarray, is a fast, flexible container for large datasets in Python.

* An ndarray is a generic multidimensional container for homogeneous data; that is, all of the
elements must be the same type

* Illustration : Import NumPy and generate a small array of random data

import numpy as np
data = np.random.randn(2, 3)
data

* Creating ndarrays

- The easiest way to create an array is to use the array function. This accepts any
sequence-like object (including other arrays) and produces a new NumPy array containing the
passed data.

* Nested
sequences, like a list of equal-length lists, will be converted into a multidimensional array
* Python infers the shape of the array from the data
* ndim and shape attributes show number of dimensions and size of the arrays

* To know the data type of the array use dtype metadata object;

* In addition to np.array , there are a number of other functions for creating new arrays.

EX:
- zeros and ones create arrays of 0s or 1s, respectively, with a given length or shape.

- empty creates an array without initializing its values to any particular value.

- arange is an array-valued version of the built-in Python range function

- To create a higher dimensional array with these methods, pass a tuple for the shape
Data Types for ndarrays

* The data type or dtype is a special object containing the information (or metadata, data about data)
the ndarray needs to interpret a chunk of memory as a particular type of data.

* You can explicitly convert or cast an array from one dtype to another using ndarray’s astype
method

* Calling astype always creates a new array (a copy of the data), even if the new dtype is the same
as the old dtype.

* If we convert floating-point numbers to be of integer dtype, the decimal part will be truncated

* If casting were to fail for some reason (like a string that cannot be converted to float64 ), a
ValueError will be raised.
Arithmetic with NumPy Arrays

* Arrays are important because they enable you to express batch operations on data
without writing any for loops. NumPy users call this vectorization.

* Any arithmetic operations between equal-size arrays applies the operation element-wise.

* Illustration

* Arithmetic operations with scalars propagate the scalar argument to each element in
the array.
* Comparisons between arrays of the same size yield boolean arrays:

* Operations between differently sized arrays is called broadcasting

14/6/2023
Wednesday

Basic Indexing and Slicing

* Here we discuss about the different ways you may want to select a subset of your data or
individual elements.

* One-dimensional arrays are simple and act similar to python lists

Illustration 1

* As you can see in the above illustration 1, if you assign a scalar value to a slice, as in arr[5:8] = 12
, the value is propagated (or broadcasted ) to the entire selection.

* Note: Array slices are views on the original array. This means that the data is not copied, and any
modifications to the view will be reflected in the source array.

Illustration 2

* First create a slice of arr and then change values in arr_slice

- when the values in arr_slice are changed, the mutations are reflected in the original array arr :

* The “bare” slice [:] will assign to all values in an array:

* If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the
array—
for example, arr[5:8].copy()

* Two-dimensional array: In a 2d array the elements at each index are no longer scalars, but rather
one-dimensional arrays:

* To select individual elements, you can pass a comma-separated list of indices.

So the following expressions are equivalent:

* 3 dimensional array : 2 × 2 × 3 array arr3d

- Note: In multidimensional arrays, if you omit later indices, the returned object will be a
lower dimensional ndarray consisting of all the data along the higher dimensions.

* Both scalar values and arrays can be assigned to arr3d[0]

* arr3d[1, 0] gives you all of the values whose indices start with (1, 0), forming a 1-dimensional
array
Indexing with Slices

* 1-D array Slicing

* 2 -D array slicing

arr2d[:2] => select the first two rows of arr2d .

* You can pass multiple slices just like you can pass multiple indices
* You can mix integer indexes and slices

- To select the second row but only the first two columns:

- To select the third column but only the first two rows

Transposing Arrays

* Transposing is a special form of reshaping that similarly returns a view on the under‐
lying data without copying anything. Arrays have the transpose method and also the
special T attribute.
Linear Algebra

* Linear algebra, like matrix multiplication, decompositions, determinants, and other

square matrix math, is an important part of any array library.

* In numpy there is a function dot , both an array method and a function in the numpy
namespace, for matrix multiplication:
* numpy.linalg has a standard set of matrix decompositions and things like inverse
and determinant.
26/06/2023
Monday

Pandas - Reading, Manipulating, and Processing Data.

* pandas: It contains data structures and data manipulation tools designed to make data cleaning
and analysis fast and easy in Python. Panda’s high-level data structures and functions make,
working with structured or tabular data fast, easy, and expressive.

* The primary objects in pandas are the DataFrame , a tabular, column-oriented data structure with
both row and column labels, and the Series , a one-dimensional labeled array object.

* pandas adopts significant parts of NumPy’s idiomatic style of array-based computing, especially
array-based functions and a preference for data processing without for loops

* pandas blends the high-performance, array-computing ideas of NumPy with the flexible data
manipulation capabilities of spreadsheets and relational databases (such as SQL).

- It provides sophisticated indexing functionality to make it easy to reshape, slice

and dice, perform aggregations, and select subsets of data.

* The biggest difference between Panda and Numpy is that pandas is designed for working with
tabular or heterogeneous data. NumPy, by contrast, is best suited for working with homogeneous
numerical array data.

* Importing Panda

import pandas as pd
from pandas import Series, DataFrame

Introduction to pandas Data Structures

* Series
A Series is a one-dimensional array-like object containing a sequence of values (of similar types
to NumPy types) and an associated array of data labels, called its index.
* Since we did not specify an index for the data, a default one consisting of the integers 0 through N
- 1 (where N is the length of the data) is created.

* You can get the array representation and index object of the Series via its values and index
attributes, respectively

* To create a Series with an index identifying each data point with a label

* You can use labels in the index when selecting single values or a set of values
DataFrame

* A DataFrame represents a rectangular table of data and contains an ordered collection of

columns, each of which can be a different value type (numeric, string,boolean, etc.).

* The DataFrame has both a row and column index; it can be thought of as a dict of Series all
sharing the same index.
Under the hood, the data is stored as one or more two-dimensional blocks rather than a list, dict,
or some other collection of one-dimensional arrays.

* There are many ways to construct a DataFrame, though one of the most common is from a dict of
equal-length lists or NumPy arrays:

* The resulting DataFrame will have its index assigned automatically as with Series, and
the columns are placed in sorted order:

* For large DataFrames, the head method selects only the first five rows
* If you specify a sequence of columns, the DataFrame’s columns will be arranged in
that order

* A column in a DataFrame can be retrieved as a Series either by dict-like notation or

by attribute:
* Rows can also be retrieved by position or name with the special loc attribute

* Columns can be modified by assignment

* When you are assigning lists or arrays to a column, the value’s length must match the
length of the DataFrame. If you assign a Series, its labels will be realigned exactly to
the DataFrame’s index, inserting missing values in any holes

Data Loading, Storage, and File Formats

Accessing data is a necessary first step for using most of the tools. pandas features a number of
functions for reading tabular data as a DataFrame object.

* All these functions, are meant to convert text data into a DataFrame.
Working with CSV: CSV - (Comma Separated value)

* CSV (comma-separated value) files are a common file format for transferring and storing data.

- It is a simple text file, following a few formatting conventions

-*CSV is a standard for storing tabular data in text format, where commas are used to separate the
different columns, and newlines (carriage return / press enter) used to separate rows. Typically, the
first row in a CSV file contains the names of the columns for the data.

- The ability to read, manipulate, and write data to and from CSV files using Python is a key skill to
master for any data scientist or business analysis.

* A CSV file is a file with a “.csv” file extension, e.g. “data.csv”, “super_information.csv”. The
“CSV” in this case lets the computer know that the data contained in the file is in “comma separated
value” format

* A “CSV” file, that is, a file with a “csv” filetype, is a basic text file. Any text editor such as
NotePad on windows or TextEdit on Mac, can open a CSV file and show the contents.

* You can create a text file in a text editor, save it with a .csv extension, and open that file in Excel
or Google Sheets to see the table form.

* Pandas is the most popular data manipulation package in Python, and DataFrames are the Pandas
data type for storing tabular 2D data.

CSV Data Loading using Pandas

* Accessing data is a necessary first step for most of the programs.

* Pandas provide function to read csv file

Reading data from CSV files

* Let ex1.csv be a csv file with following data

* We can use
read_csv to read
it into a
DataFrame

* A file may not always have a header. Consider such a CSV file , ex2.csv

* You can allow pandas to assign default column names, or you can specify names yourself:

* Suppose you wanted the message column to be the index of the returned DataFrame. You can
either indicate you want the column at index 4 or named 'message' using the index_col argument:
* You can skip the first, third, and fourth rows of a file with skiprows

pd.read_csv('examples/ex2.csv', skiprows=[0, 2, 3])

* Some frequently used options in pandas.read_csv and pandas.read_table. (It has more than 50
options)
Reading Text Files in Pieces

* When processing very large files or figuring out the right set of arguments to correctly process a
large file, you may only want to read in a small piece of a file or iterate through smaller chunks of
the file.
* If you want to only read a small number of rows (avoiding reading the entire file), specify that
with nrows :

Writing Data to Text Format

* Data can also be exported to a delimited format. Let’s consider one of the CSV files
read before

* Using DataFrame’s to_csv method, we can write the data out to a comma-separated
file. Missing values appear as empty strings in the output.
* Missing values appear as empty strings in the output. You might want to denote them
by some other sentinel value:
(import sys: writing to sys.stdout so it prints the text result to the console)

* both the row and column labels can be disabled

* You can also write only a subset of the columns, and in an order of your choosing:
Q1. Read a CSV file named courses.csv using pandas and do the following tasks:

a) Display its contents:

b)Set courses Column as Index

c) skip first-row

# Skip first few rows

df = pd.read_csv('courses.csv', header=None, skiprows=0)
print(df)

#Yields below output

# 0 1 2 3
#0 Pandas 20000 35 Days 1000
#1 Java 15000 NaN 800
#2 Python 15000 30 Days 500
#3 PHP 18000 30 Days 800

d) Read CSV by Ignoring Column Names

By default, it considers the first row from excel as a header and used it as
DataFrame column names. In case you wanted to consider the first row
from excel as a data record use header=None param and use names
param to specify the column names

e)Load only Selected Columns

f) Set DataTypes to Columns
27/06/23
Tuesday

Plotting and Visualization

* Making informative visualizations (sometimes called plots) is one of the

most important tasks in data analysis.

* matplotlib is a desktop plotting package designed for creating (mostly

two dimensional) publication-quality plots.

* To import

import matplotlib.pyplot as plt

* Draw a simple line plot

Figures and Subplots

* Plots in matplotlib reside within a Figure object. You can create a new
figure with plt.figure :

In [16]: fig = plt.figure()

- In IPython, an empty plot window will appear.

* You can’t make a plot with a blank figure. You have to create one or
more subplots using add_subplot :

In [17]: ax1 = fig.add_subplot(2, 2, 1)

Colors, Markers, and Line Styles

* Matplotlib’s main plot function accepts arrays of x and y coordinates and

optionally a string abbreviation indicating color and line style.

-For example, to plot x versus y with green dashes, you would execute:

ax.plot(x, y, 'g--')

-The same plot could also have been expressed more explicitly as:

ax.plot(x, y, linestyle='--', color='g')

* Line plots can additionally have markers to highlight the actual data
points. The marker can be part of the style string, which must have color
followed by marker type and line style.
* This could also have been written more explicitly as:

plot(randn(30).cumsum(), color='k', linestyle='dashed', marker='o')

Ticks, Labels, and Legends

* The pyplot interface, designed for interactive use, consists of methods

like xlim, xticks , and xticklabels . These control the plot range, tick
locations, and tick labels, respectively.

* They can be used in two ways:

1. Called with no arguments returns the current parameter value (e.g.,

plt.xlim() returns the current x-axis plotting range)

2. Called with parameters sets the parameter value (e.g., plt.xlim([0, 10]) ,
sets the x-axis range to 0 to 10)

Setting the title, axis labels, ticks, and ticklabels

* Eg:Create a simple figure and plot a random walk

Axis Labels and Ticks

To change the x-axis ticks, it’s easiest to use set_xticks and set_xticklabels .

* The set_xticks() and set_yticks() function takes a list object as argument. The
elements in the list denote the positions on corresponding action where ticks will be
displayed.

- set_xticks instructs matplotlib where to place the ticks along the data range. By
default these locations will also be the labels.

- we can set any other values (other than the default tick values) as the labels using
set_xticklabels :

* Similarly, labels corresponding to tick marks can be set by set_xlabels() and

set_ylabels() functions respectively.
- The rotation option sets the x tick labels at a 30-degree rotation. Lastly, set_xlabel
gives a name to the x-axis and set_title the subplot title

Title

* set_title() gives a title the plot

Adding legends

* ax.legend() or plt.legend() to automatically create a legend.

Demonstration 1
Demonstration 2

Matplotlib - Simple Plot

* Display a simple line plot of angle in radians vs. its sine value in Matplotlib.
- we need an array of numbers to plot
x = np.arange(0, math.pi*2, 0.05)

This ndarray object serves as values on x axis of the graph.

Corresponding sine values of angles in x to be displayed on y axis are obtained by :

y = np.sin(x)
- values from two arrays are plotted using the plot() function.
Matplotlib – Bar Plot
* A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars
with heights or lengths proportional to the values that they represent. The bars can be plotted
vertically or horizontally.
Demonstration
A simple example of the Matplotlib bar plot is given below. It shows the number of students
enrolled for various courses offered at an institute
Matplotlib – Pie Chart

* Pie Chart can only display one series of data. Pie charts show the size of items
(called wedge) in one data series, proportional to the sum of the items. The data
points in a pie chart are shown as a percentage of the whole pie.

Demonstration

* Following code uses the pie() function to display the pie chart of the list of students
enrolled for various computer language courses. The proportionate percentage is
displayed inside the respective wedge with the help of autopct parameter which is set
to %1.2f%.
Matplotlib - Scatter Plot

* Scatter plots are used to plot data points on horizontal and vertical axis in the attempt to show
how much one variable is affected by another. Each row in the data table is represented by a marker
the position depends on its values in the columns set on the X and Y axes. A third variable can be
set to correspond to the color or size of the markers, thus adding yet another dimension to the plot.

Demonstration

* The script below plots a scatter diagram of grades range vs grades of boys and girls in two
different colors.
Python Flask

* Flask is a web framework, it’s a Python module that lets you develop web applications easily.

-A Web Application Framework or a simply a Web Framework represents a collection of libraries

and modules that enable web application developers to write applications without worrying about
low-level details such as protocol, thread management, and so on.

* Flask is a web application framework written in Python. It was developed by Armin Ronacher,
who led a team of international Python enthusiasts called Poocco. Flask is based on the Werkzeg
WSGI toolkit and the Jinja2 template engine. Both are Pocco projects.

* It’s has a small and easy-to-extend core: it’s a microframework that doesn’t include an ORM
(Object Relational Manager) or such features.

* It does have many cool features like url routing, template engine. It is a WSGI web app
framework.

Flask Components

* WSGI
The Web Server Gateway Interface (Web Server Gateway Interface, WSGI) has been used as a
standard for Python web application development. WSGI is the specification of a common interface
between web servers and web applications.
* Werkzeug
Werkzeug is a WSGI toolkit that implements requests, response objects, and utility functions. This
enables a web frame to be built on it. The Flask framework uses Werkzeg as one of its bases.
* jinja2
jinja2 is a popular template engine for Python. A web template system combines a template with a
specific data source to render a dynamic web page.

* Create the “Hello World” app

* If you want to develop on your local computer, you can do so easily. Save this program as
server.py and run it with python server.py.

* It then starts a web server which is available only on your computer. In a web browser open
localhost on port 5000 (the url) and you’ll see “Hello World” show up.

* It’s a microframework, but that doesn’t mean your whole app should be inside one single Python
file. You can and should use many files for larger programs, to handle complexity.
Micro means that the Flask framework is simple but extensible

20214254 Python Module 2
No ratings yet
20214254 Python Module 2
73 pages
Unit-V Python_BCC402
No ratings yet
Unit-V Python_BCC402
20 pages
PI Manual Logger 2014 R3 Data Collector Guide
100% (1)
PI Manual Logger 2014 R3 Data Collector Guide
82 pages
05 NumPy - Arrays and Vectorized Computation
No ratings yet
05 NumPy - Arrays and Vectorized Computation
47 pages
20934254 Python Module 3
No ratings yet
20934254 Python Module 3
55 pages
19674254 Python Module 1
No ratings yet
19674254 Python Module 1
55 pages
DAY6 Pandas Seaborn
No ratings yet
DAY6 Pandas Seaborn
97 pages
B14_LT2_07_Numpy Matplotlib Pandas
No ratings yet
B14_LT2_07_Numpy Matplotlib Pandas
101 pages
20944254 Python Module 4
No ratings yet
20944254 Python Module 4
37 pages
Module 3
No ratings yet
Module 3
98 pages
HMX7001 Analysis of Data Using SPSS - Advanced Level
No ratings yet
HMX7001 Analysis of Data Using SPSS - Advanced Level
97 pages
M3-Introduction to Numpy and Pandas
No ratings yet
M3-Introduction to Numpy and Pandas
55 pages
Numpy Operations
No ratings yet
Numpy Operations
55 pages
Numpy User
No ratings yet
Numpy User
659 pages
Git Log
No ratings yet
Git Log
52 pages
UNIT-4
No ratings yet
UNIT-4
62 pages
Lecture 8 Applications of Data Mining
No ratings yet
Lecture 8 Applications of Data Mining
16 pages
What is NumPy
No ratings yet
What is NumPy
11 pages
Lecture 4-Python-NumPy Hadi Updated
No ratings yet
Lecture 4-Python-NumPy Hadi Updated
44 pages
CSE-III SEM - VIII SEM PRINT SYLLABUS
No ratings yet
CSE-III SEM - VIII SEM PRINT SYLLABUS
181 pages
Unit 3
No ratings yet
Unit 3
37 pages
Further 3&4 Solutions
No ratings yet
Further 3&4 Solutions
223 pages
Python My Notes
No ratings yet
Python My Notes
8 pages
Unit 1
No ratings yet
Unit 1
170 pages
SQL Scenario Based Questions-1
No ratings yet
SQL Scenario Based Questions-1
25 pages
Introduction To Numpy: Aniruddh Kadam Reg No-12109237 Lovely Professional University
100% (1)
Introduction To Numpy: Aniruddh Kadam Reg No-12109237 Lovely Professional University
84 pages
dbms an dw syllabus
No ratings yet
dbms an dw syllabus
2 pages
Lesson 2
No ratings yet
Lesson 2
30 pages
unit 1 part b information technology class 10th
No ratings yet
unit 1 part b information technology class 10th
27 pages
Arshad Resume_2025 (1)
No ratings yet
Arshad Resume_2025 (1)
2 pages
Ontology 101 An Introduction To Knowledge Representation & Ontology
No ratings yet
Ontology 101 An Introduction To Knowledge Representation & Ontology
141 pages
Python Presentation 3
No ratings yet
Python Presentation 3
44 pages
Numpy
No ratings yet
Numpy
32 pages
Lab5 n01530481
No ratings yet
Lab5 n01530481
5 pages
UNIT-03_Numpy (1)
No ratings yet
UNIT-03_Numpy (1)
49 pages
Advanced DBMS 2023
No ratings yet
Advanced DBMS 2023
49 pages
CIS 207 Oracle - Database Programming and SQL Homework: # 13 Key
No ratings yet
CIS 207 Oracle - Database Programming and SQL Homework: # 13 Key
5 pages
Tuba, Benguet: Jump To Navigation Jump To Search
No ratings yet
Tuba, Benguet: Jump To Navigation Jump To Search
14 pages
CSE488_Lab3_Numpy
No ratings yet
CSE488_Lab3_Numpy
14 pages
Unit 3
No ratings yet
Unit 3
42 pages
SAP Basis exam questions
No ratings yet
SAP Basis exam questions
19 pages
Harshita Loganathan Resume
No ratings yet
Harshita Loganathan Resume
1 page
LT2 - 07 - Numpy Matplotlib Pandas
No ratings yet
LT2 - 07 - Numpy Matplotlib Pandas
101 pages
CS330 Software Engineering: Software Requirements Specifications Document
No ratings yet
CS330 Software Engineering: Software Requirements Specifications Document
5 pages
Satish Dangi
No ratings yet
Satish Dangi
13 pages
3.1 Structured Query Language (SQL) : Unit-Iii
No ratings yet
3.1 Structured Query Language (SQL) : Unit-Iii
18 pages
A Parallel SPH Implementation On Multi-Core Cpus: (1981), Number 0 Pp. 1-12
No ratings yet
A Parallel SPH Implementation On Multi-Core Cpus: (1981), Number 0 Pp. 1-12
12 pages
Numpy
No ratings yet
Numpy
14 pages
Numpy Tutorial
No ratings yet
Numpy Tutorial
19 pages
NUMPY
No ratings yet
NUMPY
33 pages
45B AIML Practical1.1
No ratings yet
45B AIML Practical1.1
57 pages
Lecture 2 - NumPy I
No ratings yet
Lecture 2 - NumPy I
12 pages
Module3 Advance Pythonlibraries
No ratings yet
Module3 Advance Pythonlibraries
53 pages
numpyintro-pdf
No ratings yet
numpyintro-pdf
17 pages
Numpy
No ratings yet
Numpy
64 pages
Python Sem v Portion 2
No ratings yet
Python Sem v Portion 2
29 pages
Tutorial 2
No ratings yet
Tutorial 2
9 pages
Num Py
No ratings yet
Num Py
31 pages
Data Science Lab Manual
No ratings yet
Data Science Lab Manual
42 pages
p
No ratings yet
p
27 pages
Numpy Basics
No ratings yet
Numpy Basics
66 pages
RAW Data
No ratings yet
RAW Data
22 pages
Python Unit 3
No ratings yet
Python Unit 3
38 pages
Lecture 2 - NumPy I
No ratings yet
Lecture 2 - NumPy I
11 pages
Data Visualization1
No ratings yet
Data Visualization1
52 pages
Unit - Iii
No ratings yet
Unit - Iii
79 pages
1Z0-771
No ratings yet
1Z0-771
4 pages
Module Numpy
No ratings yet
Module Numpy
67 pages
AU14D08-Now I Have AESEv3
No ratings yet
AU14D08-Now I Have AESEv3
44 pages
Print
No ratings yet
Print
296 pages
Python 5th Sem
No ratings yet
Python 5th Sem
33 pages
Lecture 2 - NumPy I
No ratings yet
Lecture 2 - NumPy I
12 pages
Numpy & Pandas
No ratings yet
Numpy & Pandas
13 pages
Numpy Python
No ratings yet
Numpy Python
36 pages
DV Lab2 Updated
No ratings yet
DV Lab2 Updated
12 pages
PYTHON UNIT-5 Part-B
No ratings yet
PYTHON UNIT-5 Part-B
3 pages
1 - Numpy
No ratings yet
1 - Numpy
1 page
NumPy and Pandas
No ratings yet
NumPy and Pandas
72 pages
Lab 2, Python Numpy - LUMS
No ratings yet
Lab 2, Python Numpy - LUMS
4 pages
Data Science Handwritten Notes - 3
No ratings yet
Data Science Handwritten Notes - 3
26 pages
Numpy: Usage For Data Analysis Operations
No ratings yet
Numpy: Usage For Data Analysis Operations
20 pages
Introduction To Numpy Pandas and Matplotlib
No ratings yet
Introduction To Numpy Pandas and Matplotlib
2 pages
Workplqce
No ratings yet
Workplqce
116 pages
Numpy Matplot
No ratings yet
Numpy Matplot
14 pages
Inventory Management System DBMS Project
No ratings yet
Inventory Management System DBMS Project
27 pages
NUMPY Basics: Computation and File I/O Using Arrays
No ratings yet
NUMPY Basics: Computation and File I/O Using Arrays
9 pages
Lab-3 AI
No ratings yet
Lab-3 AI
21 pages
11 M-Way Search Trees
No ratings yet
11 M-Way Search Trees
33 pages
python-notes-BCC-302 (Unit - 05)
No ratings yet
python-notes-BCC-302 (Unit - 05)
25 pages
NumPy Notes
No ratings yet
NumPy Notes
13 pages
Azure Continuum
No ratings yet
Azure Continuum
4 pages
Data Guard
No ratings yet
Data Guard
3 pages
Understanding How PeopleCode Events Work
No ratings yet
Understanding How PeopleCode Events Work
14 pages
The Numpy Pocketbook: Essentials on the Go
From Everand
The Numpy Pocketbook: Essentials on the Go
Silas Meadowlark
No ratings yet
Design And Analysis Of Algorithm
From Everand
Design And Analysis Of Algorithm
Bhupendra Mandloi
No ratings yet

Uploaded by

Uploaded by

MODULE V – Data Processing

In short numpy has following features:

* ndarray: an efficient multidimensional array providing fast array-oriented arithmetic operations

* Linear algebra, random number generation, and Fourier transform capabilities.

The NumPy ndarray: A Multidimensional Array Object

* Illustration : Import NumPy and generate a small array of random data

- arange is an array-valued version of the built-in Python range function

* Operations between differently sized arrays is called broadcasting

Basic Indexing and Slicing

* One-dimensional arrays are simple and act similar to python lists

* First create a slice of arr and then change values in arr_slice

* The “bare” slice [:] will assign to all values in an array:

* To select individual elements, you can pass a comma-separated list of indices.

So the following expressions are equivalent:

* 3 dimensional array : 2 × 2 × 3 array arr3d

* Both scalar values and arrays can be assigned to arr3d[0]

* 1-D array Slicing

arr2d[:2] => select the first two rows of arr2d .

* Linear algebra, like matrix multiplication, decompositions, determinants, and other

Pandas - Reading, Manipulating, and Processing Data.

- It provides sophisticated indexing functionality to make it easy to reshape, slice

Introduction to pandas Data Structures

* A DataFrame represents a rectangular table of data and contains an ordered collection of

* A column in a DataFrame can be retrieved as a Series either by dict-like notation or

* Columns can be modified by assignment

Data Loading, Storage, and File Formats

- It is a simple text file, following a few formatting conventions

CSV Data Loading using Pandas

* Accessing data is a necessary first step for most of the programs.

* Pandas provide function to read csv file

* Let ex1.csv be a csv file with following data

pd.read_csv('examples/ex2.csv', skiprows=[0, 2, 3])

Writing Data to Text Format

* both the row and column labels can be disabled

a) Display its contents:

b)Set courses Column as Index

# Skip first few rows

#Yields below output

d) Read CSV by Ignoring Column Names

e)Load only Selected Columns

Plotting and Visualization

* Making informative visualizations (sometimes called plots) is one of the

* matplotlib is a desktop plotting package designed for creating (mostly

import matplotlib.pyplot as plt

* Draw a simple line plot

In [16]: fig = plt.figure()

- In IPython, an empty plot window will appear.

In [17]: ax1 = fig.add_subplot(2, 2, 1)

Colors, Markers, and Line Styles

* Matplotlib’s main plot function accepts arrays of x and y coordinates and

ax.plot(x, y, linestyle='--', color='g')

plot(randn(30).cumsum(), color='k', linestyle='dashed', marker='o')

Ticks, Labels, and Legends

* The pyplot interface, designed for interactive use, consists of methods

* They can be used in two ways:

1. Called with no arguments returns the current parameter value (e.g.,

Setting the title, axis labels, ticks, and ticklabels

* Eg:Create a simple figure and plot a random walk

* Similarly, labels corresponding to tick marks can be set by set_xlabels() and

* set_title() gives a title the plot

* ax.legend() or plt.legend() to automatically create a legend.

Matplotlib - Simple Plot

This ndarray object serves as values on x axis of the graph.

Corresponding sine values of angles in x to be displayed on y axis are obtained by :

-A Web Application Framework or a simply a Web Framework represents a collection of libraries

* Create the “Hello World” app

You might also like