Learn how to create a histogram in matplotlib, a powerful data visualization library in Python that underlies other libraries, such as seaborn. The following 8 examples show matplotlib histograms with different levels of customization.
The data used in these examples includes the heights and weights of Summer and Winter Olympic athletes from 1896 to 2016.
Basic histogram: plt.hist(data, x, bins)
The following two examples create the same basic histogram. The plt.hist() function can take an array of data as x, or it can take a DataFrame (or another object with named columns) as data, in which case x is a string naming the column to plot.
Example 1: plt.hist(x)
import matplotlib.pyplot as plt
# Basic histogram 1
plt.hist(x = df["Height"])
plt.show()
Example 2: plt.hist(data, x)
# Basic histogram 2
plt.hist(data = df, x = "Height")
plt.show()
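To check that the two call styles above are interchangeable, here is a minimal sketch on a small made-up DataFrame (the values are placeholders, not the Olympic data). It relies on plt.hist() returning the bin counts and bin edges alongside the drawn bars.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the athlete data
df = pd.DataFrame({"Height": [160, 165, 170, 170, 175, 180, 185, 190]})

# Style 1: pass the column itself as x
counts_a, edges_a, _ = plt.hist(x=df["Height"])
plt.close()

# Style 2: pass the DataFrame as data and name the column in x
counts_b, edges_b, _ = plt.hist(data=df, x="Height")
plt.close()

# Both calls should bin the same values in the same way
same_counts = np.array_equal(counts_a, counts_b)
same_edges = np.array_equal(edges_a, edges_b)
print(same_counts, same_edges)
```

Remove the matplotlib.use("Agg") line and call plt.show() instead of plt.close() to see the figures in an interactive session.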
Example 3: plt.hist(bins = n)
If you want to adjust the binning of a histogram, you can set the argument bins equal to the number of bins you would like in your chart.
# Histogram 3, bins
plt.hist(x = df["Height"], bins = 50)
plt.show()
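As a rough illustration of what an integer bins value does, the sketch below uses synthetic heights (an assumption, generated with NumPy rather than taken from the Olympic data) and inspects the values that plt.hist() returns: one count per bin and one more edge than there are bins.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic heights in cm; placeholder data for illustration only
rng = np.random.default_rng(0)
heights = rng.normal(loc=175, scale=10, size=1000)

counts, edges, patches = plt.hist(x=heights, bins=50)
plt.close()

# 50 bins means 50 counts and 51 edges
print(len(counts), len(edges))

# With an integer, the bins are equal-width between the min and max of the data
widths = np.diff(edges)
equal_width = np.allclose(widths, widths[0])
print(equal_width)
```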
Example 4: plt.hist(bins = lst)
Alternatively, you can provide a list of values, which will determine the cutoff points for each bin. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. In the example below, based on the list bins = [150, 200, 250, 300], the bins are as follows: [150, 200), [200, 250), and [250, 300], where the last bin includes both 250 and 300.
bins = [150, 200, 250, 300]
# Histogram 4, bins
plt.hist(x = df["Height"], bins = bins)
plt.show()
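The edge behavior described above can be verified on a handful of hand-picked values. This is a small sketch with made-up numbers chosen to sit exactly on the bin edges, plus two values outside the range to show that they are not counted.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up values: 140 and 310 fall outside the bins; the rest sit on or near edges
heights = [140, 150, 199, 200, 250, 300, 310]
bins = [150, 200, 250, 300]

counts, edges, _ = plt.hist(x=heights, bins=bins)
plt.close()

# [150, 200) holds 150 and 199
# [200, 250) holds 200
# [250, 300] holds 250 and 300 (the last bin is closed on both ends)
print(counts.tolist())
```

Note that the counts add up to 5, not 7: values below the first edge or above the last edge are left out of the histogram.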
Advanced histograms: multiple datasets, plt.hist(x = [var1, var2]), color, labels, orientation, histtype
Example 5: histogram with multiple datasets
If you have two variables that you would like to plot on the same histogram, you can do so by passing a list to the argument x, as in the example below. If you do so, you may also want to color the bars differently and create a legend, using the following:
color: takes in a list or array of colors
label: takes in a list or array of labels
plt.legend(): adds a legend to the plot
# Histogram with 2 datasets
plt.hist(x = [df["Height"], df["Weight"]], bins = 25, color = ["lightskyblue", "lightgreen"], label = ["Height", "Weight"])
plt.legend(loc = "upper left")
plt.show()
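To see how plt.hist() treats a list of datasets, the following sketch uses two synthetic arrays (assumed stand-ins for height and weight, deliberately of different lengths) and looks at the returned values. Both datasets share one set of bin edges, and each gets its own row of counts and its own set of bars.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic placeholder data, not the Olympic dataset
rng = np.random.default_rng(1)
heights = rng.normal(loc=175, scale=10, size=500)
weights = rng.normal(loc=70, scale=12, size=400)

counts, edges, containers = plt.hist(
    x=[heights, weights],
    bins=25,
    color=["lightskyblue", "lightgreen"],
    label=["Height", "Weight"],
)
legend = plt.legend(loc="upper left")

counts = np.asarray(counts)                      # one row of counts per dataset
legend_labels = [t.get_text() for t in legend.get_texts()]
print(counts.shape, len(edges), len(containers), legend_labels)
plt.close()
```

Because the bin edges span the combined range of both datasets, variables on very different scales (such as height in cm and weight in kg) each occupy only part of the x-axis.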
Example 6: histogram, horizontal orientation
In addition to the arguments we've used in previous examples, this one uses the orientation = "horizontal" option, which draws the bars horizontally so that the counts run along the x-axis and the bins along the y-axis.
# Histogram with 2 datasets
plt.hist(x = [df["Height"], df["Weight"]], bins = 25, color = ["lightskyblue", "lightgreen"], label = ["Height", "Weight"], orientation = "horizontal")
plt.legend(loc = "upper right")
plt.show()
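One way to confirm what orientation = "horizontal" changes is to inspect the drawn rectangles. In this sketch (a single made-up dataset, for simplicity), the length of each bar along the x-axis equals the count for its bin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up heights for illustration
heights = [160, 165, 170, 170, 175, 180, 185, 190]

counts, edges, bars = plt.hist(x=heights, bins=5, orientation="horizontal")

# Each bar is a Rectangle; in a horizontal histogram its width is the bin count
bar_lengths = [bar.get_width() for bar in bars]
lengths_match_counts = np.allclose(bar_lengths, counts)
print(lengths_match_counts)
plt.close()
```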
Example 7: histogram colored by group
In the following example, we've created a grouped histogram by splitting the data into two sets, one for the heights of summer athletes and one for the heights of winter athletes. We then pass both sets to plt.hist() with the same color and label arguments used in the previous examples.
Note: bins that overlap for winter and summer athletes are displayed side-by-side by default.
# Histogram grouped by category
summer = df[df["Season"] == "Summer"]["Height"]
winter = df[df["Season"] == "Winter"]["Height"]
plt.hist(x = [summer, winter], bins = 25, color = ["gold", "lightskyblue"], label = ["Summer Olympics", "Winter Olympics"])
plt.legend(loc = "upper left")
plt.show()
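If side-by-side bars are not what you want, one common alternative (not part of the examples in this post) is to draw each group with its own plt.hist() call, using shared bin edges and a partly transparent fill so the groups overlap. The sketch below assumes a tiny made-up DataFrame with the same Season and Height column names; np.histogram_bin_edges computes the shared edges.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the athlete data
df = pd.DataFrame({
    "Season": ["Summer"] * 6 + ["Winter"] * 4,
    "Height": [160, 165, 170, 175, 180, 185, 168, 172, 178, 182],
})
summer = df[df["Season"] == "Summer"]["Height"]
winter = df[df["Season"] == "Winter"]["Height"]

# Shared edges computed from all heights, so both groups use identical bins
edges = np.histogram_bin_edges(df["Height"], bins=5)

# alpha makes the bars semi-transparent so overlapping regions stay visible
summer_counts, _, _ = plt.hist(x=summer, bins=edges, color="gold",
                               alpha=0.5, label="Summer Olympics")
winter_counts, _, _ = plt.hist(x=winter, bins=edges, color="lightskyblue",
                               alpha=0.5, label="Winter Olympics")
plt.legend(loc="upper left")
plt.close()
```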
Example 8: different histogram types
Lastly, if you are unsatisfied with bars, the plt.hist() function comes with a few prepackaged types that you can use:
histtype: type of histogram; options are 'bar', 'barstacked', 'step', and 'stepfilled'
# Step histogram grouped by category
summer = df[df["Season"] == "Summer"]["Height"]
winter = df[df["Season"] == "Winter"]["Height"]
plt.hist(x = [summer, winter], bins = 25, color = ["gold", "lightskyblue"], label = ["Summer Olympics", "Winter Olympics"], histtype = "step")
plt.legend(loc = "upper left")
plt.show()
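To compare the four histtype options at a glance, the sketch below draws each one in its own subplot. It uses synthetic summer and winter heights (assumed placeholder data) and the Axes-level ax.hist() method, which accepts the same arguments as plt.hist().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic placeholder data, not the Olympic dataset
rng = np.random.default_rng(2)
summer = rng.normal(loc=176, scale=10, size=600)
winter = rng.normal(loc=174, scale=9, size=400)

histtypes = ["bar", "barstacked", "step", "stepfilled"]
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# One subplot per histtype, same data and styling in each
for ax, histtype in zip(axes.flat, histtypes):
    ax.hist(
        x=[summer, winter],
        bins=25,
        color=["gold", "lightskyblue"],
        label=["Summer Olympics", "Winter Olympics"],
        histtype=histtype,
    )
    ax.set_title(histtype)

axes.flat[0].legend(loc="upper left")
fig.tight_layout()
titles = [ax.get_title() for ax in axes.flat]
plt.close(fig)
```

To keep the figure, call fig.savefig with a filename of your choice before closing it.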