This notebook will cover the following topics:
The Jupyter notebook of this story is available here in the visualization part.
Introduction
Visualizing data effectively is an important step toward understanding a problem. A good visualization can reveal patterns that are hard to spot in raw numbers. Although seaborn is increasingly popular, it is built on top of Matplotlib. So, we first explore the functionality of Matplotlib, and we’ll work with seaborn later.
1. Basic Matplotlib
Let’s import the required packages for this notebook.
The versions of the packages I used to prepare this notebook are
Matplotlib: 3.5.1
NumPy: 1.22.2
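A minimal sketch of the imports used throughout this notebook, together with a quick check of the installed versions. The alias names `plt` and `np` are the usual conventions; your version numbers may differ from the ones listed above.

```python
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Print the installed versions to compare with the ones listed above
print("Matplotlib:", matplotlib.__version__)
print("NumPy:", np.__version__)
```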
Note: In the early part of this notebook, the plots are not very well organized. This is on purpose, to show the value of the object-oriented use of Matplotlib. After introducing the object-oriented method, the figures get awesome! 🙂
1.1 Simple Matplotlib
plt.plot()
This is the simplest thing we can do with Matplotlib. To get the most out of Matplotlib, though, we should know the package better. First, let’s see the hierarchical relationship between the three objects of Matplotlib.
Now, we should familiarize ourselves with the anatomy of axes and axis in Matplotlib.
Let’s have our first plot.
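Here is a minimal sketch of a first plot. The sine-wave data and the names `x1` and `y1` are assumptions chosen for illustration. In a script (outside Jupyter), you would add `plt.show()` at the end to display the figure.

```python
import matplotlib.pyplot as plt
import numpy as np

# Sample data (an assumption for illustration): one period of a sine wave
x1 = np.linspace(0, 2 * np.pi, 50)
y1 = np.sin(x1)

lines = plt.plot(x1, y1)  # plt.plot returns a list of Line2D objects
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.title("First plot")
```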
Let’s set the color of lines, markers, line style, and line width.
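As a sketch, these properties can be passed to `plt.plot()` as keyword arguments. The data and the particular color, marker, and style below are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 20)
y = np.sin(x)

plt.figure()
# color, marker, linestyle, and linewidth are all keyword arguments of plt.plot
(line,) = plt.plot(x, y, color="tab:red", marker="o", linestyle="--", linewidth=2)
```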
Here is the list of available colors in Matplotlib
and the list of available markers
We can also use emoji as a marker
1.2 Subplots
We can create a figure with subplots using plt.subplot(n_row, n_col, number).
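A minimal sketch of two side-by-side subplots, assuming sine and cosine curves as sample data. Each call to `plt.subplot()` makes the given panel the current one, so the following `plt.plot()` draws into it.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig = plt.figure()
plt.subplot(1, 2, 1)  # 1 row, 2 columns, first panel
plt.plot(x, np.sin(x))
plt.title("sin(x)")

plt.subplot(1, 2, 2)  # 1 row, 2 columns, second panel
plt.plot(x, np.cos(x))
plt.title("cos(x)")

plt.tight_layout()  # reduce overlap between panels
```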
We can also have one plot over multiple subplots!
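One way to sketch this is to mix two grid layouts in the same figure: the top row uses a 2x2 grid, while the bottom plot uses a 2x1 grid and therefore spans the full width. The data is an assumption for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig = plt.figure()
top_left = plt.subplot(2, 2, 1)   # 2x2 grid, top-left cell
plt.plot(x, np.sin(x))
top_right = plt.subplot(2, 2, 2)  # 2x2 grid, top-right cell
plt.plot(x, np.cos(x))
wide_ax = plt.subplot(2, 1, 2)    # 2x1 grid, second cell: the whole bottom row
plt.plot(x, np.sin(x) * np.cos(x))
```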
Text and annotation
To add more details to your plots, texts and annotations can be added.
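A small sketch using `plt.text()` for free text and `plt.annotate()` for a label with an arrow. The coordinates, wording, and the `arc3` connection style are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

plt.figure()
plt.plot(x, np.sin(x))

# Plain text placed at data coordinates (x=0.5, y=-0.5)
txt = plt.text(0.5, -0.5, "y = sin(x)", fontsize=12, color="tab:blue")

# Annotation: the arrow points at xy, the label sits at xytext
ann = plt.annotate(
    "maximum",
    xy=(np.pi / 2, 1.0),
    xytext=(3.0, 0.8),
    arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=0.3"),
)
```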
You can see here different types of connection styles
Changing the font and color between plots
Figure size
An important option in using Matplotlib is setting figure size in inches.
plt.figure(figsize=(width, height))
Default values: [6.4, 4.8]
plt.figure(figsize=(8, 5))
plt.plot(x1, y1)
We can also specify the dots per inch (DPI), the background color, and the edge color of the figure
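A sketch of these options, assuming the same kind of `x1`, `y1` data as before. Note that the edge color only becomes visible when the figure's `linewidth` is greater than zero.

```python
import matplotlib.pyplot as plt
import numpy as np

x1 = np.linspace(0, 2 * np.pi, 50)
y1 = np.sin(x1)

fig = plt.figure(
    figsize=(8, 5),         # width and height in inches
    dpi=100,                # dots per inch
    facecolor="lightgray",  # figure background (not the plotting area)
    edgecolor="tab:blue",   # color of the figure border
    linewidth=4,            # border width; needed for the edge to show
)
plt.plot(x1, y1)
```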
However, if you want to set the background color of the plotting area itself, you need finer control, which is easier to get if you use Matplotlib in an object-oriented manner.
1.3 Object-oriented method
Let’s level up the quality of our figures by using the object-oriented interface of Matplotlib
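A minimal sketch of the object-oriented style: `plt.subplots()` returns a Figure and an Axes, and all further customization goes through methods of the Axes object. The data, colors, and labels are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")

ax.set_facecolor("whitesmoke")  # background of the plotting area
ax.set_xlabel("x", fontsize=12)
ax.set_ylabel("y", fontsize=12)
ax.set_title("Object-oriented Matplotlib", fontsize=14, fontweight="bold")
ax.grid(True, linestyle=":")
ax.legend()
```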
Reference for Text properties
We can also use ax.set_xlim() and ax.set_ylim() to limit the axes.
We use alpha in a plot to make the line transparent.
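A short sketch combining axis limits and transparency. The limits and alpha values are arbitrary; `alpha` ranges from 0 (fully transparent) to 1 (fully opaque).

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), alpha=0.3, linewidth=4, label="alpha=0.3")
ax.plot(x, np.cos(x), alpha=1.0, label="alpha=1.0")

ax.set_xlim(0, np.pi)    # show only the first half of the x range
ax.set_ylim(-1.2, 1.2)   # leave a little room above and below the curves
ax.legend()
```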
Save figures
Figures can be saved as
https://gist.github.com/da06dcf4c1334ed2b38e78573da3aa4a
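For reference, a minimal sketch of saving a figure with `fig.savefig()`. The file name and the temporary-directory location are hypothetical; any writable path works, and the file extension decides the output format.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Hypothetical output location for illustration
out_path = Path(tempfile.gettempdir()) / "sine_wave.png"

# dpi sets the resolution; bbox_inches="tight" trims surrounding whitespace
fig.savefig(out_path, dpi=150, bbox_inches="tight")
```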
2. Different types of plot
Here we learn about different types of plots using Matplotlib
We can also change the style of plots using the following syntax
plt.style.use(style_name)
I use the default style for the rest of this notebook, but you can find your favorite style here.
plt.style.use('default')
2.1 Scatter plot
Scatter plots are used to see the relationship between two variables.
Syntax
matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)
We can set a color for each sample based on their category.
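One way to sketch this is to pass an array of category codes to `c` and let a colormap assign the colors. The random data and the three integer categories are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=60)
y = rng.normal(size=60)
category = rng.integers(0, 3, size=60)  # three hypothetical classes: 0, 1, 2

fig, ax = plt.subplots()
# c takes one value per sample; cmap maps those values to colors
sc = ax.scatter(x, y, c=category, cmap="viridis", s=40, alpha=0.8)

# legend_elements() builds one legend entry per distinct value of c
legend = ax.legend(*sc.legend_elements(), title="Category")
```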
We can change the labels in the legend using set_text().
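A small sketch of relabeling legend entries after the legend has been created. The original labels and the replacement names are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

fig, ax = plt.subplots()
ax.scatter(rng.normal(0, 1, 30), rng.normal(0, 1, 30), label="a")
ax.scatter(rng.normal(2, 1, 30), rng.normal(2, 1, 30), label="b")
legend = ax.legend()

# get_texts() returns the legend's Text objects in display order
for text, new_label in zip(legend.get_texts(), ["Group A", "Group B"]):
    text.set_text(new_label)
```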
Plot XKCD
Sometimes you might be interested in trying out the XKCD style!
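A minimal sketch using `plt.xkcd()` as a context manager, so the hand-drawn look applies only inside the `with` block. Matplotlib may warn about missing comic-style fonts and fall back to a default font; the plot is still drawn.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Everything created inside this block uses the XKCD sketch style
with plt.xkcd():
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))
    ax.set_title("XKCD-style sine wave")
```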
2.2 Bar plot
Bar plots are used to compare a numerical variable across the levels of a categorical variable.
Syntax
matplotlib.pyplot.bar(x, height, width=0.8, bottom=None, *, align='center', data=None, **kwargs)
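A minimal sketch of a bar plot in the object-oriented style. The category names and values are invented for illustration.

```python
import matplotlib.pyplot as plt

categories = ["a", "b", "c", "d"]  # hypothetical categories
values = [23, 17, 35, 29]          # hypothetical values

fig, ax = plt.subplots()
bars = ax.bar(categories, values, width=0.6, color="tab:green")
ax.set_xlabel("Category")
ax.set_ylabel("Value")
```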
2.3 Histogram
Histograms are used to show how often values occur in each range (bin) of the data.
Syntax
matplotlib.pyplot.hist(x, bins=None, range=None, density=False, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, *, data=None, **kwargs)
Let’s fake up some data
letter = 'a b c d'
random_letter1 = np.random.choice(letter.split(), 70)
random_letter2 = np.random.choice(letter.split(), 50)
Hist for more than one variable
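In this case, the histtype argument is important, because it controls how the datasets are drawn relative to each other (for example, side by side with 'bar' or stacked with 'barstacked').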
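A sketch comparing two values of `histtype` for two datasets. Note that it uses numeric normal samples (an assumption made here to keep the example simple) rather than the letter data above.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(0, 1, 500)    # hypothetical dataset 1
b = rng.normal(1, 1.5, 300)  # hypothetical dataset 2

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# 'bar': the bars of the two datasets are drawn side by side in each bin
axes[0].hist([a, b], bins=20, histtype="bar", label=["a", "b"])
axes[0].set_title("histtype='bar'")
axes[0].legend()

# 'barstacked': the bars are stacked, so the returned counts are cumulative
counts, edges, _ = axes[1].hist([a, b], bins=20, histtype="barstacked", label=["a", "b"])
axes[1].set_title("histtype='barstacked'")
axes[1].legend()
```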
2D histogram
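A minimal sketch of a 2D histogram with `hist2d()`, which counts how many points fall into each cell of a grid. The correlated random data is an assumption for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(scale=0.5, size=1000)  # loosely correlated with x

fig, ax = plt.subplots()
# h holds the counts per cell; im is the image used for the colorbar
h, xedges, yedges, im = ax.hist2d(x, y, bins=30, cmap="Blues")
fig.colorbar(im, ax=ax, label="count")
```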
2.4 Pie chart
Pie charts show the proportion of each category as a slice of a circle.
Syntax
matplotlib.pyplot.pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6, shadow=False, labeldistance=1.1, startangle=0, radius=1, counterclock=True, wedgeprops=None, textprops=None, center=(0, 0), frame=False, rotatelabels=False, *, normalize=True, data=None)
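A short sketch of a pie chart with one slice pulled out and percentage labels. The sizes and labels are invented for illustration.

```python
import matplotlib.pyplot as plt

sizes = [40, 30, 20, 10]       # hypothetical shares
labels = ["a", "b", "c", "d"]  # hypothetical category names

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    sizes,
    labels=labels,
    explode=(0.1, 0, 0, 0),  # pull the first slice away from the center
    autopct="%1.1f%%",       # print each share as a percentage
    startangle=90,
)
ax.axis("equal")  # equal aspect ratio keeps the pie circular
```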
2.5 Box Plot
A boxplot is used to display the distribution of data based on a five-number summary:
- minimum
- the first quartile (Q1)
- median
- the third quartile (Q3)
- maximum
Box plots can help us to find outliers.
Syntax
matplotlib.pyplot.boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, bootstrap=None, usermedians=None, conf_intervals=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None, manage_ticks=True, autorange=False, zorder=None, *, data=None)
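A minimal sketch of a box plot for two groups. The random data and group names are assumptions; the tick labels are set separately on the axes to keep the example independent of how the labeling keyword is named in your Matplotlib version.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)
data = [rng.normal(0, 1, 200), rng.normal(1, 2, 200)]  # two hypothetical groups

fig, ax = plt.subplots()
# The returned dict holds the drawn artists: boxes, medians, whiskers, fliers, ...
bp = ax.boxplot(data, showmeans=True)
ax.set_xticklabels(["Group A", "Group B"])
```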
2.6 Violin plot
A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data.
Syntax
matplotlib.pyplot.violinplot(dataset, positions=None, vert=True, widths=0.5, showmeans=False, showextrema=True, showmedians=False, quantiles=None, points=100, bw_method=None, *, data=None)
As a simple example, I put violin and box plots next to each other for a better comparison.
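The comparison could be sketched as follows, with the same data drawn as violins on the left and boxes on the right. The three random samples are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)
data = [rng.normal(0, std, 200) for std in (1, 2, 3)]  # three hypothetical samples

fig, (ax_violin, ax_box) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)

vp = ax_violin.violinplot(data, showmedians=True)
ax_violin.set_title("Violin plot")

bp = ax_box.boxplot(data)
ax_box.set_title("Box plot")
```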
3. Images with Matplotlib
We can read an image using the imread() function and plot it using the imshow() function.
# Read an image
image = plt.imread('./img/cat.jpeg')
# Show the image
plt.imshow(image)
plt.axis('off');  # Turn off the axis
A loaded color image is a 3D array whose third dimension holds the RGB (Red, Green, Blue) channel values.
# Show the green part
plt.imshow(image[:, :, 1])
plt.axis('off');
Different color scales can be picked for images using cmap. Also, we can set the minimum and maximum values of the color scale using vmin and vmax.
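A sketch of the effect of `cmap`, `vmin`, and `vmax`, using a synthetic 2D array (an assumption, so the example does not depend on the cat image). The same data looks quite different once the colormap and color limits change.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic 2D field for illustration: a Gaussian bump with values in (0, 1]
xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
z = np.exp(-(xx**2 + yy**2))

fig, axes = plt.subplots(1, 2, figsize=(9, 4))

# Default color limits: the full data range
im1 = axes[0].imshow(z, cmap="viridis")
fig.colorbar(im1, ax=axes[0])

# Custom limits: values above 0.5 all get the top color
im2 = axes[1].imshow(z, cmap="coolwarm", vmin=0, vmax=0.5)
fig.colorbar(im2, ax=axes[1])
```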
So picking a good colormap and color limit is very important if you need to compare the values of two 2D variables.
4. Animation using Matplotlib
To make an animation using Matplotlib, you need to import the FuncAnimation class from its animation module:
from matplotlib.animation import FuncAnimation
4.1 Live graph with Matplotlib
Here is an example of live plotting. There is a text file in the repository called stock.txt. By changing this file, the plot gets updated automatically.
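A minimal sketch of how such a live plot could work with `FuncAnimation`. The file format (one `x,y` pair per line) and the temporary-directory location are assumptions; the real stock.txt in the repository may look different. The `update` function re-reads the file on every tick, so edits to the file show up in the plot.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Assumed format: one "x,y" pair per line (the real stock.txt may differ)
data_file = Path(tempfile.gettempdir()) / "stock.txt"
data_file.write_text("1,5\n2,7\n3,4\n4,8\n")

fig, ax = plt.subplots()
(line,) = ax.plot([], [], marker="o")


def update(frame):
    """Re-read the file and redraw the line with its current contents."""
    xs, ys = [], []
    for row in data_file.read_text().splitlines():
        if row.strip():  # skip empty lines
            x, y = row.split(",")
            xs.append(float(x))
            ys.append(float(y))
    line.set_data(xs, ys)
    ax.relim()           # recompute the data limits
    ax.autoscale_view()  # rescale the axes to fit the new data
    return (line,)


# Call update() every 1000 ms; keep a reference so the animation stays alive
ani = FuncAnimation(fig, update, interval=1000, cache_frame_data=False)
```

In a script, `plt.show()` starts the event loop that drives the animation; in a notebook, an interactive backend is needed for live updates.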
By changing stock.txt, we should see the changes live on the plot.
The file stock.txt and more notebooks about Python are available in the following repository.
Thanks for reading, and for your feedback.