In this post, we'll review seaborn's catplot() function, which creates several kinds of plots to help you analyze and understand the relationships between continuous and categorical variables.
For this tutorial, we used a museum dataset from Kaggle. The dataset includes information about museum type, location, income, revenue, and more. Read on to learn more about how to customize your own categorical plot.
seaborn has a variety of other functions to create plots for categorical data. One of the main benefits of the sns.catplot() function is that you can access several different kinds of plots, from boxplots and violin plots to swarm plots and bar plots, with just one function. In addition, sns.catplot() returns a FacetGrid object, which you can then tune easily to create beautiful and functional visualizations.
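To make the FacetGrid point concrete, here is a minimal sketch. It assumes seaborn 0.11 or later, and it uses a tiny made-up DataFrame (toy_df, with invented values) rather than the museum dataset; only the column names mirror the ones used in this post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up stand-in for the museum data; the values are invented for illustration
toy_df = pd.DataFrame({
    "Museum Type": ["Art", "Art", "History", "History", "Zoo", "Zoo"],
    "Income": [120.0, 340.0, 80.0, 150.0, 900.0, 760.0],
})

g = sns.catplot(x="Museum Type", y="Income", kind="box", data=toy_df)

# catplot() hands back a FacetGrid, so grid-level methods are available
grid_type = type(g).__name__
g.set_axis_labels("Type of museum", "Income (made-up units)")
g.despine(left=True)

# With a single facet, the underlying matplotlib Axes is reachable as g.ax
x_label = g.ax.get_xlabel()
```

Because the return value is a grid rather than a single matplotlib Axes, methods such as set_axis_labels() apply to every facet at once, which matters more once the grid has several panels.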
We'll create four different plots using the
catplot() function. You can create these plots using some of seaborn's other built-in functions as well.
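As a point of comparison, the sketch below draws a strip plot and a box plot with seaborn's axes-level functions stripplot() and boxplot() instead of catplot(). The DataFrame is a made-up stand-in (toy_df), not the museum dataset, and the side-by-side layout is just one way to arrange the two plots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the museum data; the values are invented for illustration
toy_df = pd.DataFrame({
    "Museum Type": ["Art", "Art", "History", "History", "Zoo", "Zoo"],
    "Income": [120.0, 340.0, 80.0, 150.0, 900.0, 760.0],
})

# Axes-level functions draw onto a matplotlib Axes that you create and pass in
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
strip_ax = sns.stripplot(x="Museum Type", y="Income", data=toy_df, ax=left)
box_ax = sns.boxplot(x="Museum Type", y="Income", data=toy_df, ax=right)

# Each call returns the Axes it drew on, not a FacetGrid
returns_axes = (strip_ax is left) and (box_ax is right)
```

The trade-off is that the axes-level functions leave figure layout to you, while catplot() manages the figure and any faceting itself.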
Plot #1: default--jittered strip plot
# Import libraries, set theme
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme()

# Create default catplot(), which is a jittered strip plot
# (df is assumed to already hold the museum dataset)
ax1 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", data = df)

# Remove xtick labels by passing an empty list of labels
ax1.set_xticklabels([])

# Pin the legend's upper left corner just outside the upper right of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')
The default plot,
ax1, used the following arguments.
x: the column name with x-axis variable
y: the column name with y-axis variable
hue: the column name containing a categorical variable for coloring the data
data: DataFrame containing dataset
To improve the aesthetics, we used the set_xticklabels() method to remove the tick labels, and placed a legend just outside the upper right corner of the plot so the long labels stay readable.
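The legend call can look contradictory at first, since it passes loc = 'upper left' yet the legend ends up to the right of the plot. The sketch below, which uses plain matplotlib and a placeholder line rather than the museum data, illustrates how the two arguments combine: loc picks which corner of the legend box is pinned, and bbox_to_anchor says where that corner goes in axes coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="demo line")  # placeholder data for illustration only

# loc names the corner of the legend box that gets pinned;
# bbox_to_anchor is the pin position in axes coordinates
# (x = 1 is the right edge of the axes, y = 1.02 is just above the top edge)
legend = ax.legend(bbox_to_anchor=(1, 1.02), loc="upper left")

fig.canvas.draw()  # lay out the figure so the extents can be measured
legend_box = legend.get_window_extent()
axes_box = ax.get_window_extent()

# The legend's left edge sits at or beyond the right edge of the axes
legend_is_right_of_axes = legend_box.x0 >= axes_box.x1
```

Placing the legend outside the axes this way keeps it from covering data points, at the cost of needing extra room on the right side of the figure.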
Plot #2: violin plot--kind = "violin"
With kind = "violin", you can create a violin plot, similar to what seaborn's violinplot() function produces.
# Create a violin plot using kind = "violin"
ax2 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", kind = "violin", data = df)

# Remove xtick labels by passing an empty list of labels
ax2.set_xticklabels([])

# Pin the legend's upper left corner just outside the upper right of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')
Plot #3: box plot
# Create a boxplot using kind = "box"
ax3 = sns.catplot(x = "State (Administrative Location)", y = "Income", kind = "box", data = df)
Plot #4: default, with 2 categorical variables
Next, we used the
hue argument on a different categorical variable to look at the relationship between state, museum type, and income.
# Create default catplot(), which is a jittered strip plot
ax4 = sns.catplot(x = "State (Administrative Location)", y = "Income", hue = "Museum Type", data = df)
FacetGrid properties with seaborn's catplot()
Lastly, we leveraged the
FacetGrid properties to create a 2x2 grid of plots--one for each museum type of interest.
# Create series of boxplots, create a 2 x 2 set of grids using col and col_wrap arguments
ax5 = sns.catplot(x = "State (Administrative Location)", y = "Income", kind = "box",
                  col = "Museum Type", col_wrap = 2, data = df)

# Shrink the facet titles
ax5.set_titles(size = 8)
For each of the 4 plots, the state is on the x-axis, and income is on the y-axis. To create the 2x2 grid, we used the following arguments. All other arguments are the same as plot #3 from the above section.
col: sets the categorical variable that will determine the faceting of the grid, in this case, "Museum Type"
col_wrap: sets the number of columns in the facet grid, in this case, 2 columns
To improve readability, we also used the set_titles() method to shrink the font size of the titles for each chart.
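The faceting arguments can be tried out on a small scale with the sketch below. It assumes seaborn 0.11 or later and uses a made-up DataFrame (toy_df) with four invented museum types and two states, so the numbers mean nothing; the point is only to show what col, col_wrap, and set_titles() do to the grid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up stand-in for the museum data: 4 types x 2 states x 3 invented incomes
museum_types = ["Art", "History", "Science", "Zoo"]
states = ["CA", "NY"]
rows = [
    {
        "Museum Type": museum_type,
        "State (Administrative Location)": state,
        "Income": 100.0 * (i + 1) + 10.0 * j + k,
    }
    for i, museum_type in enumerate(museum_types)
    for j, state in enumerate(states)
    for k in range(3)
]
toy_df = pd.DataFrame(rows)

# One facet per museum type, wrapped onto a new row after every 2 columns
g = sns.catplot(x="State (Administrative Location)", y="Income", kind="box",
                col="Museum Type", col_wrap=2, data=toy_df)
g.set_titles(size=8)  # applies the smaller font to every facet title

n_facets = len(g.axes.flat)
facet_names = sorted(g.col_names)
title_sizes = {ax.title.get_fontsize() for ax in g.axes.flat}
```

With four levels in the col variable and col_wrap = 2, the facets fill two rows of two, which is the 2x2 layout described above. A col variable with more levels would simply add more rows.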
There are a number of other ways to tune and use seaborn's versatile
catplot() function. Good luck, and happy coding!