Data Analytics using Python

Module-4
Web Scraping And Numerical Analysis

Prepared by

ROOPA H M
Assistant Professor
RNS INSTITUTE OF TECHNOLOGY

Roopa H M, Asst. Professor, MCA, RNSIT

Topics to be studied

• Data Acquisition by Scraping web applications

• Submitting a form

• Fetching web pages

• Downloading web pages through form submission

• CSS Selectors.

• NumPy Essentials: The NumPy

Need for Web Scraping

• Suppose you want to get some information from a website.

• Say it is an article from a news site. What will you do?
• The first thing that may come to mind is to copy and paste the
information into your local storage.
• But what if you want a large amount of data on a daily basis, and as quickly
as possible?
• In such situations, copy and paste will not work, and that’s where you’ll
need web scraping.

Web Scraping

• Web scraping is a technique used to extract data from websites. It involves
fetching and parsing HTML content to gather information.

• The main purpose of web scraping is to collect and analyze data from
websites for various applications, such as research, business intelligence, or
creating datasets.

• Developers use tools and libraries like BeautifulSoup (for Python), Scrapy, or
Puppeteer to automate the process of fetching and parsing web data.

Python Libraries

• requests
• Beautiful Soup
• Selenium

Requests

• The requests module allows you to send HTTP requests using Python.
• The HTTP request returns a Response Object with all the response
data (content, encoding, status, etc).
• Install requests with pip install requests

Python script to make a simple HTTP GET request
import requests
# Specify the URL you want to make a GET request to
url = "https://www.w3schools.com"
# Make the GET request
response = requests.get(url)
# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Print the content of the response
    print("Response content:")
    print(response.text)
else:
    # Print an error message if the request was not successful
    print(f"Error: {response.status_code}")

Python script to make GET, POST, PUT and DELETE requests
import requests
# Specify the base URL
base_url = "https://jsonplaceholder.typicode.com"
# GET request
get_response = requests.get(f"{base_url}/posts/1")
print(f"GET Response:\n{get_response.json()}\n")
# POST request
new_post_data = {
    'title': 'New Post',
    'body': 'This is the body of the new post.',
    'userId': 1
}
post_response = requests.post(f"{base_url}/posts", json=new_post_data)
print(f"POST Response:\n{post_response.json()}\n")
# PUT request (Update the post with ID 1)
updated_post_data = {
    'title': 'Updated Post',
    'body': 'This is the updated body of the post.',
    'userId': 1
}
put_response = requests.put(f"{base_url}/posts/1", json=updated_post_data)
print(f"PUT Response:\n{put_response.json()}\n")
# DELETE request (Delete the post with ID 1)
delete_response = requests.delete(f"{base_url}/posts/1")
print(f"DELETE Response:\nStatus Code: {delete_response.status_code}")

Implementing Web Scraping in Python with BeautifulSoup

There are mainly two ways to extract data from a website:

• Use the API of the website (if it exists). Ex. Facebook Graph API
• Access the HTML of the webpage and extract useful information/data
from it.
Ex. WebScraping
Steps involved in web scraping

• Send an HTTP request to URL

• Parse the data which is accessed

• Navigate and search the parse tree that we created

BeautifulSoup

• It is an incredible tool for pulling out information from a webpage.

• It can extract tables, lists, and paragraphs, and you can apply filters to
pull specific information from web pages.

• BeautifulSoup does not fetch the web page for us, so we use requests to
download it.

• Install it with pip install beautifulsoup4
BeautifulSoup

from bs4 import BeautifulSoup

# parsing the document

soup = BeautifulSoup('''<h1>Knowx Innovations Pvt Ltd</h1>''', "html.parser")

print(type(soup))
Tag Object

• Tag object corresponds to an XML or HTML tag in the original document.

• This object is usually used to extract a tag from the whole HTML document.

• Beautiful Soup is not an HTTP client, which means that to scrape online websites
you first have to download them using the requests module and then pass
them to Beautiful Soup for parsing.
• Accessing a tag by name (for example soup.b) returns only the first matching tag if your document has multiple tags with the same name.
from bs4 import BeautifulSoup
# Initialize the object with an HTML page
soup = BeautifulSoup('''
<html>
<b>RNSIT</b>
<b> Knowx Innovations</b>
</html>
''', "html.parser")
# Get the first <b> tag
tag = soup.b
# Print the tag
print(tag)
# Print the type of the tag object
print(type(tag))
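• A tag has many methods and attributes. Its two most important features are
its name and its attributes.
• Name: The name of the tag can be accessed through the ‘.name’ suffix.
• Attributes: The key-value pairs written inside the opening tag, such as class or id. They are accessed by treating the tag like a dictionary, for example tag["class"].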
# Import Beautiful Soup
from bs4 import BeautifulSoup
# Initialize the object with an HTML page
soup = BeautifulSoup('''
<html>
<b>Knowx Innovations</b>
</html>
''', "html.parser")
# Get the tag
tag = soup.b
# Print the output
print(tag.name)
# changing the tag
tag.name = "Strong"
print(tag)
from bs4 import BeautifulSoup
# Initialize the object with an HTML page
soup = BeautifulSoup('''
<html>
<b class="RNSIT" name="knowx">Knowx Innovations</b>
</html>
''', "html.parser")
# Get the tag
tag = soup.b
print(tag["class"])
# modifying class
tag["class"] = "ekant"
print(tag)
# delete the class attributes
del tag["class"]
print(tag)
• An attribute such as class can hold multiple values. When such an attribute is accessed by key, Beautiful Soup returns its values as a list.

# Import Beautiful Soup

from bs4 import BeautifulSoup
# Initialize the object with an HTML page
# soup for multi_valued attributes
soup = BeautifulSoup('''
<html>
<b class="rnsit knowx">Knowx Innovations</b>
</html>
''', "html.parser")
# Get the tag
tag = soup.b
print(tag["class"])
• NavigableString Object: A string corresponds to a bit of text within a tag. Beautiful Soup uses
the NavigableString class to contain these bits of text

from bs4 import BeautifulSoup

soup = BeautifulSoup('''
<html>
<b>Knowx Innovations</b>
</html>
''', "html.parser")
tag = soup.b
# Get the string inside the tag
string = tag.string
print(string)
# Print the output
print(type(string))
Find the Siblings of the tag

• previous_sibling is used to find the previous element of the given element

• next_sibling is used to find the next element of the given element

• previous_siblings is used to find all previous elements of the given element

• next_siblings is used to find all following elements of the given element
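
The sibling attributes can be tried on a small fragment. This is a minimal sketch; the HTML string and variable names are made up for illustration, and the tags are written without whitespace between them so that every sibling is a tag rather than a whitespace string.

```python
from bs4 import BeautifulSoup

# Hypothetical three-item list used only for this illustration
html = "<ul><li>one</li><li>two</li><li>three</li></ul>"
soup = BeautifulSoup(html, "html.parser")

middle = soup.find_all("li")[1]

prev_text = middle.previous_sibling.text   # element just before "two"
next_text = middle.next_sibling.text       # element just after "two"

first = soup.find("li")
following = [sib.text for sib in first.next_siblings]  # every later sibling

print(prev_text, next_text, following)
```

With real pages, the whitespace between tags usually shows up as sibling strings too, so the result often needs filtering.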

descendants generator

• descendants generator is provided by Beautiful Soup

• The .contents and .children attribute only consider a tag’s direct children
• The descendants generator is used to iterate over all of the tag’s children,
recursively.
Example for descendants generator
from bs4 import BeautifulSoup
# Create the document
doc = "<body><b> <p>Hello world<i>innermost</i></p> </b><p> Outer text</p></body>"
# Initialize the object with the document
soup = BeautifulSoup(doc, "html.parser")
# Get the body tag
tag = soup.body
# .contents is a list of the direct children
for content in tag.contents:
    print(content)
# .children iterates over the same direct children
for child in tag.children:
    print(child)
# .descendants visits children, grandchildren, and so on
for descendant in tag.descendants:
    print(descendant)
Searching for and Extracting Specific Tags with Beautiful Soup

• Python BeautifulSoup – find all classes

# Import Module
from bs4 import BeautifulSoup
import requests
# Website URL
URL = 'https://www.python.org/'
# class list set
class_list = set()
# Page content from Website URL
page = requests.get( URL )
# parse html content
soup = BeautifulSoup( page.content , 'html.parser')
# get all tags
tags = {tag.name for tag in soup.find_all()}
# iterate all tags
for tag in tags:
    # find all elements with this tag name
    for i in soup.find_all( tag ):
        # if the element has a class attribute
        if i.has_attr( "class" ):
            if len( i['class'] ) != 0:
                class_list.add(" ".join( i['class']))
print( class_list )

Find a particular class
html_doc = """<html><head><title>Welcome to geeksforgeeks</title></head>
<body>
<p class="title"><b>Geeks</b></p>
<p class="body">This is an example to find a particular class</p>
</body>
"""
# import module
from bs4 import BeautifulSoup
# parse html content
soup = BeautifulSoup( html_doc , 'html.parser')
# Finding by class name
c=soup.find( class_ = "body")
print(c)
Search by text inside a tag

Steps involved for searching the text inside the tag:

• Import module
• Pass the URL
• Request page
• Specify the tag to be searched
• To search by the text inside a tag, compare the tag's string attribute with the target text.
• The string attribute returns the text inside a tag.
• While iterating over the matched tags, check each tag's text against the target text.
• Print the tags whose text matches.
from bs4 import BeautifulSoup
import requests
# sample web page
sample_web_page = 'https://www.python.org'
# call get method to request that page
page = requests.get(sample_web_page)
# with the help of beautifulSoup and html parser create soup
soup = BeautifulSoup(page.content, "html.parser")
child_soup = soup.find_all('strong')
#print(child_soup)
text = """Notice:"""
# search for the tags whose text is the same as the given text
for i in child_soup:
    if i.string == text:
        print(i)
IMPORTANT POINTS

• BeautifulSoup provides several methods for searching for tags based on their contents,
such as find(), find_all(), and select().
• The find_all() method returns a list of all tags that match a given filter, while the find()
method returns the first tag that matches the filter.
• You can use the string keyword argument (named text in Beautiful Soup versions before 4.4.0) to search for tags that contain specific text.
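
The difference between these search methods can be seen on a small inline document. This is an illustrative sketch, not part of the original slides: the HTML and variable names are invented, and it assumes Beautiful Soup 4.4.0 or later, where the text filter is passed as `string`.

```python
from bs4 import BeautifulSoup

# Hypothetical document used only for this illustration
html = """
<div>
  <p class="intro">Hello</p>
  <p class="note">Notice:</p>
  <p class="note">Another note</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

first_p = soup.find("p")                       # first match only
all_notes = soup.find_all("p", class_="note")  # list of every match
by_css = soup.select("p.note")                 # same elements via a CSS selector
exact = soup.find_all("p", string="Notice:")   # filter on the tag's text

print(first_p.text, len(all_notes), len(by_css), exact[0]["class"])
```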
Select method

• The select method in BeautifulSoup (bs4) is used to find all elements in a
parsed HTML or XML document that match a specific CSS selector.

• CSS selectors are patterns used to select and style elements in a
document.

• The select method allows you to apply these selectors to navigate and
extract data from the parsed document easily.
CSS Selector

• Id selector (#)
• Class selector (.)
• Universal Selector (*)
• Element Selector (tag)
• Grouping Selector(,)
CSS Selector

• Id selector (#) :The ID selector targets a specific HTML element based on its unique
identifier attribute (id). An ID is intended to be unique within a webpage, so using the ID
selector allows you to style or apply CSS rules to a particular element with a specific ID.
#header {
color: blue;
font-size: 16px;
}
• Class selector (.) : The class selector is used to select and style HTML elements based on
their class attribute. Unlike IDs, multiple elements can share the same class, enabling
you to apply the same styles to multiple elements throughout the document.
.highlight {
background-color: yellow;
font-weight: bold;
}
CSS Selector

• Universal Selector (*) :The universal selector selects all HTML elements on the webpage.
It can be used to apply styles or rules globally, affecting every element. However, it is
important to use the universal selector judiciously to avoid unintended consequences.
*{
margin: 0;
padding: 0;
}
• Element Selector (tag) : The element selector targets all instances of a specific HTML
element on the page. It allows you to apply styles universally to elements of the same
type, regardless of their class or ID.
p{
color: green;
font-size: 14px;
}
• Grouping Selector(,) : The grouping selector allows you to apply the same styles to
multiple selectors at once. Selectors are separated by commas, and the styles specified
will be applied to all the listed selectors.
h1, h2, h3 {
font-family: 'Arial', sans-serif;
color: #333;
}

• These selectors are fundamental to CSS and provide a powerful way to target and style
different elements on a webpage.

Creating a basic HTML page
<!DOCTYPE html>
<html>
<head>
<title>Sample Page</title>
</head>
<body>
<div id="content">
<h1>Heading 1</h1>
<p class="paragraph">This is a sample paragraph.</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
<a href="https://example.com">Visit Example</a>
</div>
</body>
</html>
Scraping example using CSS selectors
from bs4 import BeautifulSoup
# Read the sample page, assumed to be saved locally as web.html
with open("web.html") as f:
    html = f.read()
soup = BeautifulSoup(html, 'html.parser')
# 1. Select by tag name
heading = soup.select('h1')
print("1. Heading:", heading[0].text)
# 2. Select by class
paragraph = soup.select('.paragraph')
print("2. Paragraph:", paragraph[0].text)
# 3. Select by ID
div_content = soup.select('#content')
print("3. Div Content:", div_content[0].text)
# 4. Select by attribute
link = soup.select('a[href="https://example.com"]')
print("4. Link:", link[0]['href'])
# 5. Select all list items
list_items = soup.select('ul li')
print("5. List Items:")
for item in list_items:
    print("-", item.text)
Selenium
• Selenium is an open-source testing tool, which means it can be downloaded
from the internet free of charge.

• Selenium is a functional testing tool and is also compatible with
non-functional testing tools.

• Install it with pip install selenium

Steps in form filling

• Import the webdriver from selenium

• Create driver instance by specifying browser

• Find the element

• Send the values to the elements

• Use click function to submit

Webdriver

• WebDriver is a powerful tool for automating web browsers.

• It provides a programming interface for interacting with web browsers and
performing various operations, such as clicking buttons, filling forms,
navigating between pages, and more.

• WebDriver supports multiple programming languages

from selenium import webdriver
Creating Webdriver instance

• You can create the instance of webdriver by using class webdriver and a browser which
you want to use
• Ex: driver = webdriver.Chrome()
• Browsers:
– webdriver.Chrome()
– webdriver.Firefox()
– webdriver.Edge()
– webdriver.Safari()
– webdriver.Opera()
– webdriver.Ie()
Find the element

• First, load the page that contains the form using the get() function

• To find an element, use find_element() and specify one of the
following locator types
– XPATH
– CSS Selector
from selenium import webdriver
import time
from selenium.webdriver.common.by import By
# Create a new instance of the Chrome driver
driver = webdriver.Chrome()
driver.maximize_window()
# Navigate to the form page
driver.get('https://www.confirmtkt.com/pnr-status')
# Give the page a few seconds to load
time.sleep(3)
# Locate form elements
pnr_field = driver.find_element(By.NAME, "pnr")
submit_button = driver.find_element(By.CSS_SELECTOR, '.col-xs-4')
# Fill in form fields
pnr_field.send_keys('4358851774')
# Submit the form
submit_button.click()
Downloading web pages through form submission
from selenium import webdriver
import time
from selenium.webdriver.common.by import By
# Create a new instance of the Chrome driver
driver = webdriver.Chrome()
driver.maximize_window()
# Navigate to the form page
driver.get('https://www.confirmtkt.com/pnr-status')
# Give the page a few seconds to load
time.sleep(3)
# Locate form elements
pnr_field = driver.find_element(By.NAME, "pnr")
submit_button = driver.find_element(By.CSS_SELECTOR, '.col-xs-4')
# Fill in form fields
pnr_field.send_keys('4358851774')
# Submit the form
submit_button.click()
# Wait for the result to load before reading it
time.sleep(3)
welcome_message = driver.find_element(By.CSS_SELECTOR,".pnr-card")
# Print or use the scraped values
print(type(welcome_message))
html_content = welcome_message.get_attribute('outerHTML')
# Print the HTML content
print("HTML Content:", html_content)
# Close the browser
driver.quit()
A Python Integer Is More Than Just an Integer
Every Python object is simply a cleverly disguised C structure, which contains
not only its value, but other information as well.

X = 10000

X is not just a “raw” integer. It’s actually a pointer to a compound C
structure, which contains several values.

Difference between C and Python Variable

A Python List Is More Than Just a List

Because of Python’s dynamic typing, we can even create heterogeneous lists:
In the special case that all variables are of the same type, much of this information is
redundant: it can be much more efficient to store data in a fixed-type array. The
difference between a dynamic-type list and a fixed-type (NumPy-style) array is
illustrated in Figure.
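
A short sketch of the contrast described above, using made-up values: the list holds four objects of four different types, while the NumPy array holds one fixed type for every element.

```python
import numpy as np

# A list can mix types because each item is a full Python object
mixed = [True, "2", 3.0, 4]
mixed_types = [type(item).__name__ for item in mixed]

# A NumPy array stores a single fixed type for all elements
fixed = np.array([1, 2, 3, 4])

print(mixed_types)   # one type name per list item
print(fixed.dtype)   # one dtype for the whole array
```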

Fixed-Type Arrays in Python
• Python offers several different options for storing data in efficient, fixed-type data
buffers. The built-in array module can be used to create dense arrays of a
uniform type:

While Python’s array object provides efficient storage of array-based data, NumPy adds to
this efficient operations on that data.
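
As a minimal illustration of the built-in array module (the variable name is arbitrary), the type code `'i'` declares that every element is a signed integer:

```python
import array

# Type code 'i' means signed integer; every element must be that type
numbers = array.array('i', range(10))
print(numbers)
```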

Creating Arrays from Python Lists
import numpy as np

NumPy is constrained to arrays that all contain the same type. If types do not match, NumPy will upcast if possible

If we want to explicitly set the data type of the resulting array, we can use the dtype keyword:
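
The three cases above can be sketched as follows; the values and variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])                    # all integers: integer array
upcast = np.array([3.14, 4, 2, 3])                  # mixed int/float: upcast to float
as_float = np.array([1, 2, 3, 4], dtype='float32')  # type set explicitly with dtype

print(ints.dtype, upcast.dtype, as_float.dtype)
```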

Creating Arrays from Python Lists
• NumPy arrays can explicitly be multidimensional; here’s one way of initializing a
multidimensional array using a list of lists:
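
One possible sketch, with arbitrary starting values: each inner list becomes one row of the resulting two-dimensional array.

```python
import numpy as np

# Each inner list becomes one row of the 2-D array
grid = np.array([list(range(i, i + 3)) for i in [2, 4, 6]])
print(grid)
```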

Creating Arrays from Scratch
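
The slide content for this heading did not survive extraction. As a hedged sketch of what the topic usually covers, these are some standard NumPy routines that build arrays without starting from a Python list; the shapes and values are arbitrary.

```python
import numpy as np

zeros = np.zeros(10, dtype=int)       # length-10 integer array of zeros
ones = np.ones((3, 5), dtype=float)   # 3x5 floating-point array of ones
filled = np.full((3, 5), 3.14)        # 3x5 array filled with 3.14
stepped = np.arange(0, 20, 2)         # 0, 2, 4, ..., 18
spaced = np.linspace(0, 1, 5)         # 5 evenly spaced values from 0 to 1
identity = np.eye(3)                  # 3x3 identity matrix
```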

NumPy Standard Data Types
• While constructing an array, you can specify them using a string:

• Or using the associated NumPy object:
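
Both ways of naming a data type give the same result, as this small sketch shows (the array length is arbitrary):

```python
import numpy as np

by_string = np.zeros(10, dtype='int16')   # type given as a string
by_object = np.zeros(10, dtype=np.int16)  # type given as a NumPy object

print(by_string.dtype == by_object.dtype)
```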

The Basics of NumPy Arrays

We’ll cover a few categories of basic array manipulations here:
• Attributes of arrays
Determining the size, shape, memory consumption, and data types of arrays
• Indexing of arrays
Getting and setting the value of individual array elements
• Slicing of arrays
Getting and setting smaller subarrays within a larger array
• Reshaping of arrays
Changing the shape of a given array
• Joining and splitting of arrays
Combining multiple arrays into one, and splitting one array into many

NumPy Array Attributes

• Each array has attributes
ndim (the number of dimensions)
shape (the size of each dimension)
size (the total size of the array)

Write a Python program that creates an m×n integer array and prints its attributes using
NumPy.
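
One possible solution sketch. The dimensions `m` and `n`, the value range, and the seed are assumptions chosen for illustration; the original program on the slide is not available.

```python
import numpy as np

m, n = 3, 4  # hypothetical dimensions
rng = np.random.default_rng(seed=0)
arr = rng.integers(0, 10, size=(m, n))  # m x n array of integers in [0, 10)

print("ndim:    ", arr.ndim)       # number of dimensions
print("shape:   ", arr.shape)      # size of each dimension
print("size:    ", arr.size)       # total number of elements
print("dtype:   ", arr.dtype)      # data type of the elements
print("itemsize:", arr.itemsize, "bytes")  # size of one element
print("nbytes:  ", arr.nbytes, "bytes")    # total size of the array
```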

Output:

Array Indexing: Accessing Single Elements

In a multidimensional array, you access items using a comma-separated tuple of indices:
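
A small sketch of tuple indexing on an example array (the values are arbitrary):

```python
import numpy as np

x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

top_left = x2[0, 0]        # row 0, column 0
last_in_row2 = x2[2, -1]   # negative indices count from the end

print(top_left, last_in_row2)
```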

You can also modify values using any of the above index notation:

NumPy arrays have a fixed type. This means, for example, that if you attempt to insert a floating-point value
to an integer array, the value will be silently truncated.
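
The truncation is easy to miss, so here is a minimal demonstration with made-up values:

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])  # integer array
x1[0] = 3.14159                    # the fractional part is dropped silently
print(x1)
```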

Array Slicing: Accessing Subarrays
One-dimensional subarrays

Multidimensional subarrays
Subarray dimensions can even be reversed together:
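
A sketch of multidimensional slicing on an example array (values chosen arbitrarily); each dimension gets its own start:stop:step slice, separated by commas.

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

corner = x2[:2, :3]             # first two rows, first three columns
every_other = x2[:, ::2]        # all rows, every other column
reversed_both = x2[::-1, ::-1]  # rows and columns reversed together

print(reversed_both)
```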

Accessing array rows and columns

Subarrays as no-copy views

Now if we modify this subarray, we’ll see that the original array is changed! Observe:
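
A minimal sketch of this behavior, using an example array with arbitrary values:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

x2_sub = x2[:2, :2]   # a view onto x2's data, not a copy
x2_sub[0, 0] = 99     # writing through the view...
print(x2[0, 0])       # ...also changes the original array
```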

Creating copies of arrays
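
When an independent subarray is needed, the copy() method can be used. This is an illustrative sketch with arbitrary values:

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

x2_sub_copy = x2[:2, :2].copy()  # explicit copy with its own data
x2_sub_copy[0, 0] = 42           # modifying the copy...
print(x2[0, 0])                  # ...leaves the original untouched
```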

Reshaping of Arrays
Another useful type of operation is reshaping of arrays. The most flexible way of doing this
is with the reshape() method. For example, if you want to put the numbers 1 through 9 in a
3×3 grid, you can do the following:
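
A sketch of the reshape described above:

```python
import numpy as np

# Numbers 1 through 9 arranged in a 3x3 grid
grid = np.arange(1, 10).reshape((3, 3))
print(grid)
```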

• Note that for this to work, the size of the initial array must match the size of the
reshaped array.

• Where possible, the reshape method will use a no-copy view of the initial array, but
with noncontiguous memory buffers this is not always the case.

Another common reshaping pattern is the conversion of a one-dimensional array into a
two-dimensional row or column matrix.

• Reshaping can be done with the reshape method, or more easily by making use of the
newaxis keyword within a slice operation.
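
Both approaches side by side, as a small sketch with an arbitrary three-element array:

```python
import numpy as np

x = np.array([1, 2, 3])

row_reshape = x.reshape((1, 3))   # row vector via reshape
row_newaxis = x[np.newaxis, :]    # row vector via newaxis
col_reshape = x.reshape((3, 1))   # column vector via reshape
col_newaxis = x[:, np.newaxis]    # column vector via newaxis

print(row_newaxis.shape, col_newaxis.shape)
```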

Array Concatenation and Splitting
• Concatenation of arrays

• Concatenating more than two arrays at once:
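
A sketch of one-dimensional concatenation with made-up values; np.concatenate takes a list (or tuple) of arrays as its first argument.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = [99, 99, 99]

two = np.concatenate([x, y])       # join two arrays
three = np.concatenate([x, y, z])  # join more than two at once

print(two, three)
```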

np.concatenate can also be used for two-dimensional arrays
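
For two-dimensional input the axis argument selects the direction of the join. A minimal sketch with an arbitrary grid:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

by_rows = np.concatenate([grid, grid])          # along axis 0 (the default)
by_cols = np.concatenate([grid, grid], axis=1)  # along axis 1

print(by_rows.shape, by_cols.shape)
```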

For working with arrays of mixed dimensions, it can be clearer to use the np.vstack
(vertical stack) and np.hstack (horizontal stack) functions:
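
An illustrative sketch of both functions, using arbitrary values:

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

stacked_v = np.vstack([x, grid])    # 1-D array stacked on top of a 2-D array

col = np.array([[99],
                [99]])
stacked_h = np.hstack([grid, col])  # extra column appended on the right

print(stacked_v)
print(stacked_h)
```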

Splitting of arrays
• The opposite of concatenation is splitting, which is implemented by the functions np.split,
np.hsplit, and np.vsplit. For each of these, we can pass a list of indices giving the split points:

N split points lead to N + 1 subarrays.
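
A sketch of the three splitting functions with made-up data; two split points give three pieces, one split point gives two.

```python
import numpy as np

x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])     # split before index 3 and before index 5

grid = np.arange(16).reshape((4, 4))
upper, lower = np.vsplit(grid, [2])  # split rows: top two and bottom two
left, right = np.hsplit(grid, [2])   # split columns: left two and right two

print(x1, x2, x3)
```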

Computation on NumPy Arrays: Universal Functions
• NumPy is important in the Python data science world because it provides an easy and
flexible interface to optimized computation with arrays of data.

• Computation on NumPy arrays can be very fast, or it can be very slow. The key to making
it fast is to use vectorized operations, generally implemented through NumPy’s universal
functions (ufuncs).

• NumPy’s ufuncs can be used to make repeated calculations on array elements much
more efficient.

The Slowness of Loops

Each time the reciprocal is computed, Python first examines the object’s type and does a
dynamic lookup of the correct function to use for that type. If we were working in
compiled code instead, this type specification would be known before the code
executes and the result could be computed much more efficiently.

• For many types of operations, NumPy provides a convenient interface into this kind of
statically typed, compiled routine. This is known as a vectorized operation.

• This vectorized approach is designed to push the loop into the compiled layer that
underlies NumPy, leading to much faster execution.
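
To make the contrast concrete, here is a sketch that computes reciprocals both ways. The function name `compute_reciprocals` and the input values are illustrative assumptions; the point is that both versions give the same result, while the vectorized form leaves the looping to NumPy's compiled code.

```python
import numpy as np

def compute_reciprocals(values):
    """Loop version: one Python-level operation per element."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.array([6, 1, 4, 4, 8])

looped = compute_reciprocals(values)
vectorized = 1.0 / values  # same computation as a single vectorized operation

print(np.allclose(looped, vectorized))
```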

• Looking at the execution time for our big array, we see that it completes orders of
magnitude faster than the Python loop:

Introducing UFuncs
• Vectorized operations in NumPy are implemented via ufuncs, whose main purpose is
to quickly execute repeated operations on values in NumPy arrays.

• Ufuncs are extremely flexible: before, we saw an operation between a scalar and an
array, but we can also operate between two arrays:
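
A minimal sketch of an element-wise operation between two arrays of the same length (values are arbitrary):

```python
import numpy as np

a = np.arange(5)      # 0, 1, 2, 3, 4
b = np.arange(1, 6)   # 1, 2, 3, 4, 5

ratio = a / b         # element-by-element division
print(ratio)
```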

• ufunc operations are not limited to one-dimensional arrays—they can act on
multidimensional arrays as well:
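
For instance, raising 2 to every element of a small 3×3 array (an illustrative sketch):

```python
import numpy as np

x = np.arange(9).reshape((3, 3))
powers = 2 ** x   # applied to every element of the 2-D array
print(powers)
```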

Exploring NumPy’s UFuncs
• Ufuncs exist in two flavors:
― unary ufuncs, which operate on a single input
― binary ufuncs, which operate on two inputs.

• We’ll see examples of both these types of functions here, covering:

— Array arithmetic
— Absolute value
— Trigonometric functions
— Exponents and logarithms

Array arithmetic
• NumPy’s ufuncs feel very natural to use because they make use of Python’s native
arithmetic operators. The standard addition, subtraction, multiplication, and division can
all be used:
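
A short sketch of the four standard operators applied to an example array:

```python
import numpy as np

x = np.arange(4)      # 0, 1, 2, 3

added = x + 5         # addition
subtracted = x - 5    # subtraction
multiplied = x * 2    # multiplication
divided = x / 2       # true division, gives floats

print(added, subtracted, multiplied, divided)
```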

• There is also a unary ufunc for negation, a ** operator for exponentiation, and a % operator for modulus:

All of these arithmetic operations are simply convenient wrappers around
specific functions built into NumPy; for example, the + operator is a wrapper for
the add function.
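
A sketch of these operators, together with a check that + and np.add give the same result (values are arbitrary):

```python
import numpy as np

x = np.arange(4)             # 0, 1, 2, 3

negated = -x                 # unary negation
squared = x ** 2             # exponentiation
remainder = x % 2            # modulus
same_as_plus = np.add(x, 2)  # the function behind the + operator

print(np.array_equal(same_as_plus, x + 2))
```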

Absolute value
• NumPy also understands Python’s built-in absolute value function, abs. The
corresponding NumPy ufunc is np.absolute, which is also available under the alias
np.abs:
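
The three spellings give the same result, as this small sketch with arbitrary values shows:

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])

builtin = abs(x)        # Python's built-in abs works on arrays
ufunc = np.absolute(x)  # the NumPy ufunc
alias = np.abs(x)       # shorter alias for the same ufunc

print(ufunc)
```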

Trigonometric functions
• NumPy provides a large number of useful ufuncs, and some of the most useful for the
data scientist are the trigonometric functions.
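
A brief sketch using a few angles between 0 and π; the results are accurate to machine precision, so values that should be zero may print as very small numbers.

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)   # 0, pi/2, pi

sines = np.sin(theta)
cosines = np.cos(theta)
inverse = np.arcsin([-1, 0, 1])    # inverse functions are available too

print(sines, cosines, inverse)
```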

Exponents and logarithms
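
The slide content for this heading did not survive extraction. As a hedged sketch of the topic, these are standard NumPy ufuncs for exponentials and logarithms, applied to arbitrary example values:

```python
import numpy as np

x = np.array([1, 2, 3])
e_x = np.exp(x)            # e ** x
two_x = np.exp2(x)         # 2 ** x
three_x = np.power(3, x)   # 3 ** x

y = np.array([1, 2, 4, 10])
ln = np.log(y)             # natural logarithm
log2 = np.log2(y)          # base-2 logarithm
log10 = np.log10(y)        # base-10 logarithm

print(two_x, three_x, log10)
```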

Advanced Ufunc Features
A few specialized features of ufuncs are:
• Specifying output
• Aggregates
• Outer products

Specifying output
• For large calculations, it is sometimes useful to be able to specify the array where the
result of the calculation will be stored. Rather than creating a temporary array, you can
use this to write computation results directly to the memory location where you’d
like them to be. For all ufuncs, you can do this using the out argument of the function:
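
A minimal sketch of the out argument; the array names and sizes are arbitrary.

```python
import numpy as np

x = np.arange(5)
y = np.empty(5)            # pre-allocated array that will receive the result

np.multiply(x, 10, out=y)  # result is written directly into y
print(y)
```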

The out argument also works with array views. For example, we can write the results of a computation to every other element of a specified array:

If we had instead written y[::2] = 2 ** x, this would have resulted in the creation of
a temporary array to hold the results of 2 ** x
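
A sketch of writing into a view, here the even-indexed positions of a ten-element array (sizes are arbitrary):

```python
import numpy as np

x = np.arange(5)
y = np.zeros(10)

# y[::2] is a view of every other element; the powers of 2 are written into it
np.power(2, x, out=y[::2])
print(y)
```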

Aggregates
• For binary ufuncs, there are some interesting aggregates that can be computed directly
from the object. The reduce method of any ufunc can do this.

• A reduce method repeatedly applies a given operation to the elements of an array until
only a single result remains.
• For example, calling reduce on the add ufunc returns the sum of all elements in the
array:

Similarly, calling reduce on the multiply ufunc results in the product of all array elements:

If we’d like to store all the intermediate results of the computation, we can instead use the accumulate method:

Note that for these particular cases, there are dedicated NumPy functions to compute the results
(np.sum, np.prod, np.cumsum, np.cumprod)
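
A sketch of reduce and accumulate on the numbers 1 through 5, compared against the dedicated functions:

```python
import numpy as np

x = np.arange(1, 6)                          # 1, 2, 3, 4, 5

total = np.add.reduce(x)                     # sum of all elements
product = np.multiply.reduce(x)              # product of all elements
running_sum = np.add.accumulate(x)           # keeps each intermediate sum
running_prod = np.multiply.accumulate(x)     # keeps each intermediate product

print(total, product, running_sum, running_prod)
```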
Outer products
• Finally, any ufunc can compute the output of all pairs of two different inputs using the
outer method. This allows you, in one line, to do things like create a multiplication table:
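
A sketch of the multiplication table mentioned above, for the numbers 1 through 5:

```python
import numpy as np

x = np.arange(1, 6)
table = np.multiply.outer(x, x)  # table[i, j] == x[i] * x[j]
print(table)
```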

Broadcasting
Broadcasting in NumPy is a powerful mechanism that allows for the arithmetic operations on arrays of
different shapes and sizes, without explicitly creating additional copies of the data. It simplifies the
process of performing element-wise operations on arrays of different shapes, making code more
concise and efficient.

Here are the key concepts of broadcasting in NumPy:

• Shape Compatibility: Broadcasting is possible when the dimensions of the arrays involved are
compatible. Dimensions are considered compatible when they are equal or one of them is 1. NumPy
automatically adjusts the shape of smaller arrays to match the shape of the larger array during the
operation.
• Rules of Broadcasting: For broadcasting to occur, the sizes of the dimensions must either be the
same or one of them must be 1. If the sizes are different and none of them is 1, then broadcasting is
not possible, and NumPy will raise a ValueError.
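
The compatibility rule can be checked with two small cases. This is an illustrative sketch; the shapes are chosen so that the first pair is compatible and the second is not.

```python
import numpy as np

# Compatible: shapes (3, 1) and (3,) broadcast to (3, 3)
a = np.ones((3, 1))
b = np.arange(3)
ok_shape = (a + b).shape

# Incompatible: trailing dimensions 2 and 3 differ and neither is 1
m = np.ones((3, 2))
v = np.arange(3)
try:
    m + v
    failed = False
except ValueError:
    failed = True

print(ok_shape, failed)
```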

• Automatic Replication: When broadcasting, NumPy automatically replicates the smaller array along the
necessary dimensions to make it compatible with the larger array. This replication is done without actually
creating multiple copies of the data, which helps in saving memory.
Example:
Suppose you have a 2D array A of shape (3, 1) and a 1D array B of shape (3,). Broadcasting allows you to
add these arrays directly: NumPy treats B as a row of shape (1, 3), stretches A across three columns and
B down three rows, and produces a result of shape (3, 3).

import numpy as np
A = np.array([[1], [2], [3]])
B = np.array([4, 5, 6])
result = A + B # Broadcasting occurs here

Output:
array([[5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])

scraping
No ratings yet
scraping
6 pages
Cheat Sheet Collection
100% (1)
Cheat Sheet Collection
15 pages
Subject: Informatics Practices (Code-065) Class - XII
No ratings yet
Subject: Informatics Practices (Code-065) Class - XII
11 pages
web scraping using python
No ratings yet
web scraping using python
18 pages
Api and data structure
No ratings yet
Api and data structure
3 pages
Beautifulsoup: Web Scraping With Python
No ratings yet
Beautifulsoup: Web Scraping With Python
43 pages
Notes for Web Scraping - BeautifulSoup-3903
No ratings yet
Notes for Web Scraping - BeautifulSoup-3903
6 pages
DAP_4_module
No ratings yet
DAP_4_module
45 pages
Web Scraping and Data Collection CheatSheet 1731972399
No ratings yet
Web Scraping and Data Collection CheatSheet 1731972399
10 pages
Machine Learning Training Program - March 24
No ratings yet
Machine Learning Training Program - March 24
8 pages
Download
No ratings yet
Download
4 pages
Lesson 4 Unstructured Data
No ratings yet
Lesson 4 Unstructured Data
20 pages
Web Scarpping
No ratings yet
Web Scarpping
4 pages
PDF Document 2
No ratings yet
PDF Document 2
24 pages
Introduction to Web Crawling chapter -13
No ratings yet
Introduction to Web Crawling chapter -13
3 pages
Learning SciPy For Numerical and Scientific Computing - Second Edition - Sample Chapter
No ratings yet
Learning SciPy For Numerical and Scientific Computing - Second Edition - Sample Chapter
21 pages
(NEW) Beyond Technical Analysis With Python - A C - Hayden Van Der Post-Dual-Translated
67% (3)
(NEW) Beyond Technical Analysis With Python - A C - Hayden Van Der Post-Dual-Translated
262 pages