In this post, we’ll review seaborn’s catplot() function, which creates several kinds of plots for analyzing the relationships between continuous and categorical variables.
For this tutorial, we used a museum dataset from Kaggle. The dataset includes information about museum type, location, income, revenue, and more. Read on to learn more about how to customize your own categorical plot.
Note: seaborn has a variety of other functions to create plots for categorical data. One of the main benefits of the sns.catplot() function is that you can access several different kinds of plots, from boxplots and violin plots to swarm plots and bar plots, with just one function. In addition, sns.catplot() returns a FacetGrid object, which you can then tune easily to create beautiful and functional visualizations.
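As a rough illustration of the note above, the sketch below builds a small synthetic DataFrame (random values standing in for the museum data; only the column names are borrowed from this post) and checks what catplot() hands back. The names toy_df and g are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the museum data: the values are random, not real
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({
    "Museum Type": np.repeat(["Art", "History", "Science"], 30),
    "Income": rng.lognormal(mean=12, sigma=1, size=90),
})

# catplot() is figure-level: it returns a FacetGrid, not a matplotlib Axes
g = sns.catplot(data=toy_df, x="Museum Type", y="Income")
print(type(g).__name__)

# FacetGrid methods return the grid itself, so tuning calls can be chained
g.set_axis_labels("Museum type", "Income (synthetic)").set(yscale="log")
```

Because the result is a grid rather than a single Axes, the underlying matplotlib objects are reached through attributes such as g.ax (for a single facet) or g.axes.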
seaborn's catplot() Examples
We'll create four different plots using the catplot() function. You can create these plots using some of seaborn's other built-in functions as well: sns.stripplot(), sns.violinplot(), sns.boxplot(), etc.
Plot #1: default--jittered strip plot
# Import libraries, set theme
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme()
# df is assumed to be a pandas DataFrame that already holds the museum dataset
# Create default catplot(), which is a jittered strip plot
ax1 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", data = df)
# Remove xtick labels
ax1.set_xticklabels([])
# Anchor the legend's upper-left corner just outside the top right of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')

The default plot, ax1, used the following arguments.
x: the column name with the x-axis variable
y: the column name with the y-axis variable
hue: the column name containing a categorical variable for coloring the data
data: the DataFrame containing the dataset
To improve the aesthetics, we used the set_xticklabels() method to remove the tick labels, and anchored a legend just outside the upper right of the plot so the long labels stay readable.
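For comparison, here is a minimal sketch of a different way to handle long labels, not the approach used in this post: rotating the tick labels instead of removing them, and repositioning the grid's own legend with sns.move_legend() instead of plt.legend(). The data is synthetic, and the "Region" column is a hypothetical second variable added only so that the grid has a legend to move.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data: random values, plus a hypothetical "Region" column
rng = np.random.default_rng(1)
museum_types = ["Art Museum", "History Museum", "Science & Technology Museum"]
regions = ["East", "West"]
toy_df = pd.DataFrame({
    "Museum Type": np.repeat(museum_types, 40),
    "Region": np.tile(regions, 60),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

g = sns.catplot(data=toy_df, x="Museum Type", y="Income", hue="Region")

# With no labels passed, set_xticklabels() restyles the existing labels
g.set_xticklabels(rotation=30, ha="right")

# Move the grid's own legend to the upper-right corner of the figure
sns.move_legend(g, "upper right")
```

Rotation keeps the category names on the axis, which matters when hue encodes a different variable than x.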
Plot #2: violin plot
By setting kind = "violin", you can create a violin plot, similar to the sns.violinplot() function.
# Create a violin plot using kind = "violin"
ax2 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", kind = "violin", data = df)
# Remove xtick labels
ax2.set_xticklabels([])
# Anchor the legend's upper-left corner just outside the top right of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')

Plot #3: box plot
By setting kind = "box", you can create a box plot, similar to the sns.boxplot() function. There are a number of other plots you can create. Check out the seaborn documentation to learn more.
# Create a boxplot using kind = "box"
ax3 = sns.catplot(x = "State (Administrative Location)", y = "Income", kind = "box", data = df)

Plot #4: default, with 2 categorical variables
Next, we used the hue argument on a different categorical variable to look at the relationship between state, museum type, and income.
# Create default catplot(), which is a jittered strip plot
ax4 = sns.catplot(x = "State (Administrative Location)", y = "Income", hue = "Museum Type", data = df)
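When hue is a second categorical variable, the points for each hue level overlap within a category by default. A minimal sketch, assuming synthetic data with the same column names as this post, shows the dodge option, which seaborn passes through to the underlying strip plot to separate the hue levels side by side.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the museum data: the values are random, not real
rng = np.random.default_rng(3)
states = ["CA", "NY", "TX"]
museum_types = ["Art", "History"]
toy_df = pd.DataFrame({
    "State (Administrative Location)": np.repeat(states, 40),
    "Museum Type": np.tile(museum_types, 60),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

# dodge=True draws each hue level in its own strip within every state
g = sns.catplot(
    data=toy_df,
    x="State (Administrative Location)",
    y="Income",
    hue="Museum Type",
    dodge=True,
)
```

With many states on the x-axis, dodging can get crowded, so whether it helps depends on how many hue levels the data has.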

Leveraging FacetGrid properties with seaborn's catplot()
Lastly, we leveraged the FacetGrid properties to create a 2x2 grid of plots, one for each museum type of interest.
# Create a series of boxplots in a 2 x 2 grid using the col and col_wrap arguments
ax5 = sns.catplot(x = "State (Administrative Location)", y = "Income",
                  kind = "box", col = "Museum Type", col_wrap = 2, data = df)
# Shrink the font size of the facet titles
ax5.set_titles(size = 8)

For each of the 4 plots, the state is on the x-axis, and income is on the y-axis. To create the 2x2 grid, we used the following arguments. All other arguments are the same as plot #3 from the above section.
col: sets the categorical variable that determines the faceting of the grid, in this case, "Museum Type"
col_wrap: sets the number of columns in the facet grid before wrapping to a new row, in this case, 2 columns
To improve readability, we also used the set_titles() method to shrink the font size of the titles for each chart.
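The faceting arguments above can be tried on synthetic data as well. The sketch below assumes four made-up museum types and three states with random incomes. It also passes a "{col_name}" template to set_titles(), an optional extra that is not used in the code above, to shorten each facet title to just the museum type.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the museum data: the values are random, not real
rng = np.random.default_rng(4)
museum_types = ["Art", "History", "Science", "Zoo"]
states = ["CA", "NY", "TX"]
toy_df = pd.DataFrame({
    "Museum Type": np.repeat(museum_types, 30),
    "State (Administrative Location)": np.tile(states, 40),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

# Four museum types wrapped at two columns gives a 2 x 2 grid of box plots
g = sns.catplot(
    data=toy_df,
    x="State (Administrative Location)",
    y="Income",
    kind="box",
    col="Museum Type",
    col_wrap=2,
)

# Title each facet with the museum type only, in a smaller font
g.set_titles("{col_name}", size=8)
```

When col_wrap is set, g.axes is a flat array of facets rather than a 2D grid, so iterating with g.axes.flat works in either case.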
There are a number of other ways to tune and use seaborn's versatile catplot() function. Good luck, and happy coding!