# Categorical plots with seaborn catplot()

Einblick Content Team - January 11th, 2023

In this post, we'll review seaborn's catplot() function, which creates several kinds of plots for analyzing the relationship between a continuous variable and one or more categorical variables.

For this tutorial, we used a museum dataset from Kaggle. The dataset includes information about museum type, location, income, revenue, and more. Read on to learn how to customize your own categorical plot.
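The code in this post refers to a DataFrame named df. If you don't have the Kaggle file at hand, a small synthetic stand-in is enough to follow along. The sketch below is an assumption for illustration only: the column names match the ones used in this post, but the museum types, states, and income values are made up and are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical category values; the real dataset uses different labels
museum_types = ["Art", "History", "Science", "Zoo"]
states = ["CA", "NY", "TX"]

# Ten rows for every (museum type, state) combination
rows = [(t, s) for t in museum_types for s in states for _ in range(10)]
df = pd.DataFrame(rows, columns=["Museum Type", "State (Administrative Location)"])

# Made-up, right-skewed income values
rng = np.random.default_rng(0)
df["Income"] = rng.lognormal(mean=12, sigma=1, size=len(df))

print(df.head())
```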

Note: seaborn has a variety of other functions to create plots for categorical data. One of the main benefits of the sns.catplot() function is that you can access several different kinds of plots, from box plots and violin plots to swarm plots and bar plots, with just one function. In addition, sns.catplot() returns a FacetGrid object, which you can then tune to create beautiful and functional visualizations.
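As a rough illustration of what "returns a FacetGrid" means in practice, the sketch below builds a box plot and then adjusts it through the returned object. The toy DataFrame, the axis label text, and the figure title are assumptions made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data with the same column names as the post
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Museum Type": np.repeat(["Art", "History", "Science", "Zoo"], 30),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

# catplot() is a figure-level function: it returns a FacetGrid, not an Axes
g = sns.catplot(data=df, x="Museum Type", y="Income", kind="box")

# Tune the plot through the FacetGrid
g.set_axis_labels("Museum type", "Income")
g.figure.suptitle("Income by museum type", y=1.02)

print(type(g).__name__)
```

The g.figure attribute assumes a reasonably recent seaborn release; older releases expose the same object as g.fig.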

## seaborn's catplot() Examples

We'll create four different plots using the catplot() function. You can create these plots using some of seaborn's other built-in functions as well--sns.stripplot(), sns.violinplot(), sns.boxplot(), etc.
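To make the relationship between catplot() and the dedicated functions concrete, here is a minimal sketch that draws the same strip plot both ways. The DataFrame is made-up stand-in data. The main practical difference is the return type: the axes-level function draws onto a matplotlib Axes you supply, while catplot() creates its own figure and returns a FacetGrid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data with the same column names as the post
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Museum Type": np.repeat(["Art", "History", "Science", "Zoo"], 30),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

# Axes-level: draws on an existing Axes and returns that Axes
fig, ax = plt.subplots()
ax_strip = sns.stripplot(data=df, x="Museum Type", y="Income", ax=ax)

# Figure-level: creates a new figure and returns a FacetGrid
g = sns.catplot(data=df, x="Museum Type", y="Income", kind="strip")

print(type(ax_strip).__name__, type(g).__name__)
```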

### Plot #1: default--jittered strip plot

```python
# Import libraries, set theme
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme()

# Create default catplot(), which is a jittered strip plot
# (df is the museum dataset loaded as a pandas DataFrame)
ax1 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", data = df)

# Remove xtick labels
ax1.set_xticklabels([])

# Place the legend just outside the upper right corner of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')
```

The default plot, ax1, used the following arguments.

• x: the column name with x-axis variable
• y: the column name with y-axis variable
• hue: the column name containing a categorical variable for coloring the data
• data: DataFrame containing dataset

To improve the aesthetics, we used the set_xticklabels() method to remove the tick labels, and placed a legend just outside the upper right corner of the plot so that the long museum type labels stay readable.
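The legend call can look contradictory at first, since loc = 'upper left' ends up putting the legend on the right. The sketch below uses plain matplotlib and made-up data to show the mechanics: bbox_to_anchor names a point in axes coordinates, and loc names which corner of the legend box is pinned to that point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up income values for four hypothetical museum types
rng = np.random.default_rng(0)
museum_types = ["Art", "History", "Science", "Zoo"]

fig, ax = plt.subplots()
for position, name in enumerate(museum_types):
    incomes = rng.lognormal(mean=12, sigma=1, size=30)
    ax.scatter([position] * len(incomes), incomes, label=name)

# Hide the x tick labels; the legend carries the category names instead
ax.tick_params(labelbottom=False)

# (1, 1.02) is just right of and slightly above the axes' top-right corner.
# loc="upper left" pins the legend's upper-left corner to that point,
# so the legend box sits outside the plot, to its right.
legend = ax.legend(bbox_to_anchor=(1, 1.02), loc="upper left")

# Leave room for the legend when saving
fig.savefig("legend_outside.png", bbox_inches="tight")
```

The output file name is arbitrary. Saving with bbox_inches="tight" keeps the outside legend from being cropped.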

### Plot #2: violin plot--kind = "violin"

By setting kind = "violin", you can create a violin plot, similar to the sns.violinplot() function.

```python
# Create a violin plot using kind = "violin"
ax2 = sns.catplot(x = "Museum Type", y = "Income", hue = "Museum Type", kind = "violin", data = df)

# Remove xtick labels
ax2.set_xticklabels([])

# Place the legend just outside the upper right corner of the plot
plt.legend(bbox_to_anchor = (1, 1.02), loc = 'upper left')
```

### Plot #3: box plot

By setting kind = "box", you can create a box plot, similar to the sns.boxplot() function. There are a number of other plots you can create. Check out the seaborn documentation to learn more.

```python
# Create a boxplot using kind = "box"
ax3 = sns.catplot(x = "State (Administrative Location)", y = "Income", kind = "box", data = df)
```
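As a quick way to compare the available plot types, the sketch below loops over several values of kind on made-up stand-in data. The list covers only some of the options; seaborn also accepts "boxen" and "count" (the latter takes only one of x or y).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data with the same column names as the post
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Museum Type": np.repeat(["Art", "History", "Science", "Zoo"], 30),
    "Income": rng.lognormal(mean=12, sigma=1, size=120),
})

kinds = ["strip", "swarm", "box", "violin", "bar", "point"]
grid_types = {}

for kind in kinds:
    # Same data and variables each time; only the plot type changes
    g = sns.catplot(data=df, x="Museum Type", y="Income", kind=kind)
    g.figure.suptitle(f'kind = "{kind}"', y=1.02)
    grid_types[kind] = type(g).__name__
    plt.close(g.figure)  # free the figure once we are done with it

print(grid_types)
```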

### Plot #4: default, with 2 categorical variables

Next, we used the hue argument on a different categorical variable to look at the relationship between state, museum type, and income.

```python
# Create default catplot(), which is a jittered strip plot, colored by museum type
ax4 = sns.catplot(x = "State (Administrative Location)", y = "Income", hue = "Museum Type", data = df)
```
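When hue names a different variable than x, points from several museum types share each state's column and can overlap. One option, sketched below on made-up data, is to pass dodge = True so that each hue level gets its own sub-column. In this case catplot() also draws a legend on its own, which you can reach through the returned FacetGrid. Whether dodging helps with the real dataset depends on how many museum types it contains.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data: ten rows per (museum type, state) combination
museum_types = ["Art", "History", "Science", "Zoo"]
states = ["CA", "NY", "TX"]
rows = [(t, s) for t in museum_types for s in states for _ in range(10)]
df = pd.DataFrame(rows, columns=["Museum Type", "State (Administrative Location)"])
rng = np.random.default_rng(0)
df["Income"] = rng.lognormal(mean=12, sigma=1, size=len(df))

# Strip plot with state on the x-axis and museum type as color,
# with the hue levels separated side by side within each state
g = sns.catplot(
    data=df,
    x="State (Administrative Location)",
    y="Income",
    hue="Museum Type",
    dodge=True,
)

# The automatically created legend is available on the grid
print(g.legend is not None)
```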

## Leveraging FacetGrid properties with seaborn's catplot()

Lastly, we leveraged the FacetGrid properties to create a 2x2 grid of plots--one for each museum type of interest.

```python
# Create series of boxplots, create a 2 x 2 set of grids using col and col_wrap arguments
ax5 = sns.catplot(x = "State (Administrative Location)", y = "Income",
                  kind = "box", col = "Museum Type", col_wrap = 2, data = df)
ax5.set_titles(size = 8)
```

For each of the 4 plots, the state is on the x-axis, and income is on the y-axis. To create the 2x2 grid, we used the following arguments. All other arguments are the same as plot #3 from the above section.

• col: sets the categorical variable that will determine the faceting of the grid, in this case, "Museum Type"
• col_wrap: sets the number of columns in the facet grid, in this case, 2 columns
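The sketch below shows how col and col_wrap interact, including one way to narrow the data down to the museum types of interest first. The data and the list of types are made up for this example. With four types left after filtering and col_wrap = 2, the grid has four facets arranged in two columns.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data: five hypothetical museum types, three states
museum_types = ["Art", "History", "Science", "Zoo", "Other"]
states = ["CA", "NY", "TX"]
rows = [(t, s) for t in museum_types for s in states for _ in range(10)]
df = pd.DataFrame(rows, columns=["Museum Type", "State (Administrative Location)"])
rng = np.random.default_rng(0)
df["Income"] = rng.lognormal(mean=12, sigma=1, size=len(df))

# Keep only the types we want a facet for (hypothetical selection)
types_of_interest = ["Art", "History", "Science", "Zoo"]
subset = df[df["Museum Type"].isin(types_of_interest)]

# One box plot per museum type, wrapping to a new row after 2 columns
g = sns.catplot(
    data=subset,
    x="State (Administrative Location)",
    y="Income",
    kind="box",
    col="Museum Type",
    col_wrap=2,
)

print(len(g.axes.flat), list(g.col_names))
```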

To improve readability, we also used the set_titles() method to shrink the font size of the titles for each chart.
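Besides the font size, set_titles() can also change the title text. By default each facet title reads like "Museum Type = Art"; the sketch below, on made-up data, uses the col_template argument to show only the category name, which is another way to keep long titles compact.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data: ten rows per (museum type, state) combination
museum_types = ["Art", "History", "Science", "Zoo"]
states = ["CA", "NY", "TX"]
rows = [(t, s) for t in museum_types for s in states for _ in range(10)]
df = pd.DataFrame(rows, columns=["Museum Type", "State (Administrative Location)"])
rng = np.random.default_rng(0)
df["Income"] = rng.lognormal(mean=12, sigma=1, size=len(df))

g = sns.catplot(
    data=df,
    x="State (Administrative Location)",
    y="Income",
    kind="box",
    col="Museum Type",
    col_wrap=2,
)

# Show only the category name in each facet title, in a smaller font
g.set_titles(col_template="{col_name}", size=8)

titles = [ax.get_title() for ax in g.axes.flat]
print(titles)
```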

There are a number of other ways to tune and use seaborn's versatile catplot() function. Good luck, and happy coding!
