Data Exploration & Visualization - Unit 2
Importing Matplotlib – Simple line plots – Simple scatter plots – visualizing errors – density and contour plots –
Histograms – legends – colors – subplots – text and annotation – customization – three dimensional plotting -
Geographic Data with Basemap - Visualization with Seaborn.
IMPORTING MATPLOTLIB
• Matplotlib is a low-level graph plotting library for Python that serves as a visualization utility. It was
created by John D. Hunter.
• Matplotlib is mostly written in Python; a few segments are written in C, Objective-C and JavaScript for
platform compatibility.
• It is used for data visualization and graphical plotting. Matplotlib is not part of the Python standard library,
so it must be installed before use (for example with pip install matplotlib).
Pyplot
Most Matplotlib utilities lie under the pyplot submodule, which is usually imported under the plt
alias:
import matplotlib.pyplot as plt
Example
Draw a straight line through the points (1,2), (2,4), (3,6) and (4,8):
Line charts are used to show the relationship between two variables, X and Y, plotted on different axes.
In this example, a simple line chart is generated using NumPy to define the data values. The x-values are evenly
spaced points, and the y-values are calculated as twice the corresponding x-values.
# importing the required libraries
import matplotlib.pyplot as plt
import numpy as np
# define data values
x = np.array([1, 2, 3, 4])
y = x*2
plt.plot(x, y)
plt.show()
The line style can be adjusted using the linestyle keyword:
• plt.plot(x, y, linestyle='solid')
• plt.plot(x, y, linestyle='dashed')
• plt.plot(x, y, linestyle='dashdot')
• plt.plot(x, y, linestyle='dotted');
# For short, you can use the following codes:
• plt.plot(x, y, linestyle='-') # solid
• plt.plot(x, y, linestyle='--') # dashed
• plt.plot(x, y, linestyle='-.') # dashdot
• plt.plot(x, y, linestyle=':'); # dotted
If you would like to be extremely terse, these linestyle and color codes can be combined into a single non-
keyword argument to the plt.plot() function:
• plt.plot(x, y, '-g') # solid green
• plt.plot(x, y, '--c') # dashed cyan
• plt.plot(x, y, '-.k') # dashdot black
• plt.plot(x, y, ':r'); # dotted red
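As a minimal sketch of how these spellings relate, the snippet below draws the same dashed style three ways and reads the style back from each line. The sample data from np.linspace and the vertical offsets are assumptions added for illustration; they are not part of the notes.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data: 50 evenly spaced points between 0 and 10.
x = np.linspace(0, 10, 50)

plt.figure()
# plt.plot() returns a list of Line2D objects; unpack the single line each time.
(by_name,) = plt.plot(x, x + 0, linestyle='dashed')  # full style name
(by_code,) = plt.plot(x, x + 1, linestyle='--')      # short code
(by_fmt,) = plt.plot(x, x + 2, '--c')                # format string: style + color

# Read back the style Matplotlib stored for each of the three lines.
stored_styles = [line.get_linestyle() for line in (by_name, by_code, by_fmt)]
print(stored_styles)
```

Call plt.show() afterwards to display the figure. The three lines should look identical apart from their offset and color.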
Example
Shorter syntax (ls is an abbreviation of linestyle):
plt.plot(x, y, ls = ':')
ADJUSTING THE PLOT: AXES LIMITS
Matplotlib does a decent job of choosing default axis limits for your plot, but sometimes finer control is
useful. The most basic way to adjust the axis limits is to use the plt.xlim() and plt.ylim() functions (here x is
an array of sample points, for example np.linspace(0, 10, 1000)):
• plt.plot(x, np.sin(x))
• plt.xlim(-1, 11)
• plt.ylim(-1.5, 1.5);
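The calls above can be put together into a runnable sketch. The x array built with np.linspace is an assumption, since the notes do not show how x is defined for the sine curve. Calling plt.xlim() or plt.ylim() with no arguments returns the current limits, which is a quick way to check the result.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data: 1000 points between 0 and 10.
x = np.linspace(0, 10, 1000)

plt.figure()
plt.plot(x, np.sin(x))

# Widen the view slightly beyond the data on both axes.
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5)

# With no arguments, these functions return the current (low, high) limits.
x_limits = plt.xlim()
y_limits = plt.ylim()
print(x_limits, y_limits)
```

Call plt.show() afterwards to display the figure.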
LABELING PLOTS
Titles and axis labels are the simplest labels to add to a plot; pyplot provides functions to set them quickly:
• plt.plot(x, np.sin(x))
• plt.title("A Sine Curve")
• plt.xlabel("x")
• plt.ylabel("sin(x)");
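A self-contained version of the labeling calls might look like the sketch below. The x array is an assumed set of sample points, and plt.gca() is used only to read the labels back from the current axes.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data: 1000 points between 0 and 10.
x = np.linspace(0, 10, 1000)

plt.figure()
plt.plot(x, np.sin(x))
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)")

# The labels are stored on the current axes and can be read back from it.
ax = plt.gca()
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```

Call plt.show() afterwards to display the labeled figure.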
LINE WIDTH
You can use the keyword argument linewidth, or the shorter lw, to change the width of the line.
The value is a floating-point number, in points:
• plt.plot(x, y, linewidth = 20.5)
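To illustrate that linewidth and lw are interchangeable, the sketch below draws one thin and one thick line and reads the widths back. The data values reuse the earlier line-chart example; the vertical offset on the second line is an assumption added so the two lines do not overlap.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4])
y = x * 2

plt.figure()
# Same setting spelled two ways: the long keyword and its short alias.
(thin,) = plt.plot(x, y, linewidth=1.5)
(thick,) = plt.plot(x, y + 4, lw=20.5)

# Widths are stored as floats, measured in points.
print(thin.get_linewidth(), thick.get_linewidth())
```

Call plt.show() afterwards to compare the two widths on screen.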
LEGENDS
When several lines share one set of axes, give each line a name with the label keyword of plt.plot() and
then call:
plt.legend();
The plt.legend() function keeps track of the line style and color, and matches these with
the correct label. More information on specifying and formatting plot legends can be found in the plt.legend
docstring.
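A minimal sketch of a legend in use is shown below. It assumes two curves, sin(x) and cos(x), on sample points from np.linspace; the curves and the label texts are illustrative choices, not taken from the notes.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data: 1000 points between 0 and 10.
x = np.linspace(0, 10, 1000)

plt.figure()
# The label keyword names each line; plt.legend() collects those names.
plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')
legend = plt.legend()

# The legend holds one text entry per labeled line, in plotting order.
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

Call plt.show() afterwards to display the figure. Lines plotted without a label are left out of the legend.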
SIMPLE SCATTER PLOTS
• With Pyplot, you can use the scatter() function to draw a scatter plot.
• The scatter() function plots one dot for each observation. It needs two arrays of the same length, one for
the values on the x-axis and one for the values on the y-axis.
COMPARE PLOTS
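Several datasets can be compared by drawing them on the same figure: every plt.scatter() call made before
plt.show() adds its points to the same axes.
Example
Draw the first dataset; a further plt.scatter() call before plt.show() would add a second dataset to the same
figure:
import matplotlib.pyplot as plt
import numpy as np
# day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()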
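The sketch below shows what a two-dataset comparison could look like. The day-one arrays come from the example above; the day-two arrays are made-up values used only for illustration, and the variable names x1, y1, x2, y2 are likewise assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Day one: the age and speed of 13 cars (values from the notes).
x1 = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y1 = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# Day two: HYPOTHETICAL values, invented for this sketch only.
x2 = np.array([3, 6, 10, 14, 1])
y2 = np.array([101, 92, 83, 79, 108])

plt.figure()
# Each call adds one set of dots to the same axes, in a new default color.
first = plt.scatter(x1, y1, label='day one')
second = plt.scatter(x2, y2, label='day two (made-up data)')
plt.legend()

# The axes now hold two separate point collections.
n_datasets = len(plt.gca().collections)
print(n_datasets)
```

Call plt.show() afterwards to display both datasets together.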
COLORS
You can set your own color for each scatter plot with the color or the c argument:
Example
Set your own color of the markers:
…………………….
plt.scatter(x, y, color = 'hotpink')
………………………
plt.scatter(x, y, color = '#88c999')
plt.show()
COLOR EACH DOT
• You can set a specific color for each dot by passing an array of colors as the value of the c argument.
• Prefer the c argument for this: unlike color, c also accepts an array of numbers that are mapped to colors
through a colormap.
Example
Set a different color for each marker:
…………..
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])
plt.scatter(x, y, c=colors)
plt.show()
COLORMAP
Matplotlib has a number of built-in colormaps. A colormap is like a list of colors, where each color
corresponds to a value in a numeric range; in the example here the values run from 0 to 100.
• The colormap used here is called 'viridis'. It runs from a purple color at the lowest value (0) up to a
yellow color at the highest value (100).
• You can specify the colormap with the keyword argument cmap, in this case 'viridis', which is one of the
built-in colormaps available in Matplotlib.
• In addition, you have to create an array of values (here from 0 to 100), one value for each point in the
scatter plot. By default, scatter() scales the colormap to the smallest and largest values in this array:
Example
Create a color array, and specify a colormap in the scatter plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.show()
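The 0-to-100 range is a property of the color array in this example rather than of the colormap itself: by default scatter() rescales the values in c so that the smallest maps to the start of the colormap and the largest to the end. The sketch below writes that rescaling out by hand with matplotlib.colors.Normalize and adds a colorbar; the variable names fractions and rescaled are assumptions introduced for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])

plt.figure()
points = plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar(points)  # adds a scale showing which color matches which value

# The default scaling written out: smallest value -> 0.0, largest -> 1.0.
norm = Normalize(vmin=colors.min(), vmax=colors.max())
fractions = np.asarray(norm(colors))

# The same data on a 0-to-1 scale lands on the same colormap positions.
rescaled = colors / 100
rescaled_norm = Normalize(vmin=rescaled.min(), vmax=rescaled.max())
fractions_rescaled = np.asarray(rescaled_norm(rescaled))
print(fractions)
```

Call plt.show() afterwards to display the figure. Because only the relative position of each value matters, multiplying or dividing the whole color array by a constant leaves the picture unchanged.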
ALPHA
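You can adjust the transparency of the dots with the alpha argument, which takes a value between 0 (fully
transparent) and 1 (fully opaque). The example below also sets the marker sizes with the s argument; just as
with colors, make sure the array of sizes has the same length as the arrays for the x- and y-axis:
Example
Set your own size for the markers and make them semi-transparent: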
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])
plt.scatter(x, y, s=sizes, alpha=0.5)
plt.show()