Seaborn countplot() basics: x, y, order, and hue

Einblick Content Team - March 7th, 2023

As a data scientist, you deal with a lot of categorical data, from product types to subscription types, and you often need an intuitive plot to compare groups. In this post, we’ll review the basics of seaborn’s countplot() function. We’ll cover a few key arguments, including data, x, y, order, and hue, and walk through a plot example for each.

We’ve used a subset of a books dataset found on Kaggle.

Import and setup

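import pandas as pd
import seaborn as sns
sns.set_theme()

# Load the books dataset into a DataFrame (the file name is an assumption; use your own path)
df = pd.read_csv("books.csv")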

# Create publication_year column
df["publication_year"] = [int(date[-4:]) for date in df["publication_date"]]

# Subset the data to three authors and publication years from 1999 through 2003
selected_authors = ["Stephen King", "Orson Scott Card", "James Patterson"]
df = df[
    df["authors"].isin(selected_authors)
    & (df["publication_year"] > 1998)
    & (df["publication_year"] < 2004)
]

df.head()

Output:

Figure: Subset of book dataset, df.head() results

From the results of df.head() we can see there are 6 columns. We'll be focusing on authors and publication_year.
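
The year extraction above slices the last four characters of each date string. As a hedged alternative, the sketch below parses the strings as dates and reads off the year. It assumes a month/day/year layout, and the toy frame and its dates are made up for illustration rather than taken from the Kaggle file.

```python
import pandas as pd

# Toy stand-in for the books data; these dates are invented for illustration
toy = pd.DataFrame({"publication_date": ["9/16/2006", "11/1/2003", "4/27/1999"]})

# Parse the strings as dates (assumed month/day/year layout).
# errors="coerce" turns unparseable strings into missing values instead of raising.
parsed = pd.to_datetime(toy["publication_date"], format="%m/%d/%Y", errors="coerce")

# Take the year component of each parsed date
toy["publication_year"] = parsed.dt.year

print(toy)
```

Parsing the full date is a little more work than slicing, but it surfaces malformed dates as missing values instead of silently producing a wrong year.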

Basic count plot: sns.countplot(data, x or y)

sns.countplot() Example 1: x

# Example 1
sns.countplot(data = df, x = "authors")

Output:

Figure: seaborn countplot example, x argument

The most basic sns.countplot() example uses just two arguments:

  • data: the dataset, such as a DataFrame (here, df)
  • x: the name of the variable to plot on the x-axis (here, authors), which produces a count plot with vertical bars
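
To make the two arguments concrete, here is a minimal sketch on a made-up toy DataFrame (the author labels and rows are invented, not the Kaggle data). It reads the bar heights back from the returned Axes and compares them with pandas' value_counts(), which is one way to confirm that each bar is a count of rows per category.

```python
import pandas as pd
import seaborn as sns

# Made-up toy data standing in for the books DataFrame
toy = pd.DataFrame({"authors": ["King", "Card", "King", "Patterson", "King", "Card"]})

# countplot returns the matplotlib Axes it drew on
ax = sns.countplot(data=toy, x="authors")

# Read the height of every bar back from the Axes
heights = [int(bar.get_height()) for bar in ax.containers[0]]

# The same counts computed directly with pandas
counts = toy["authors"].value_counts()

print(heights)
print(counts.to_dict())
```

Each bar height should correspond to one entry in the value_counts() result, and the heights should sum to the number of rows.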

sns.countplot() Example 2: y

If you want a count plot with horizontal bars, you can simply use the y argument, rather than x.

# Example 2
sns.countplot(data = df, y = "authors")

Output:

Figure: seaborn countplot example, y argument

Advanced count plot: order and hue

For more advanced plots, you can control the order of the bars with the order argument, and split each bar by a second categorical variable, drawn in different colors, with the hue argument.

sns.countplot() Example 3: order

# Specify order of bars
year_order = df['publication_year'].value_counts().index
print(year_order)

# Example 3
sns.countplot(data = df, x = "publication_year", order = year_order)

Output:

Int64Index([2002, 2001, 2000, 2003, 1999], dtype='int64')
Figure: seaborn countplot example, order argument

In this case, we ordered the bars from the year with the most books to the year with the fewest, because value_counts() returns its counts in descending order by default.
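
The order argument accepts any list of category values, so frequency is only one option. The sketch below, on made-up toy data, builds three candidate orderings and plots the chronological one. The variable names are illustrative.

```python
import pandas as pd
import seaborn as sns

# Made-up toy data: 2002 appears three times, 2001 twice, 2003 once
toy = pd.DataFrame({"publication_year": [2002, 2001, 2002, 2003, 2002, 2001]})

# Most frequent year first (the approach used in Example 3)
by_count = toy["publication_year"].value_counts().index.tolist()

# Least frequent year first
by_count_asc = toy["publication_year"].value_counts(ascending=True).index.tolist()

# Chronological order
by_year = sorted(toy["publication_year"].unique().tolist())

# Bars are drawn left to right in the order given
ax = sns.countplot(data=toy, x="publication_year", order=by_year)
bar_heights = [int(bar.get_height()) for bar in ax.containers[0]]

print(by_count, by_count_asc, by_year, bar_heights)
```

Chronological order usually reads better for years, while frequency order is handy when the categories have no natural sequence.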

sns.countplot() Example 4: hue

Lastly, we'll use the hue argument to compare how many books each author published in each year.

# Example 4
sns.countplot(data = df, y = "authors", hue = "publication_year")

# Adjust legend placement
import matplotlib.pyplot as plt
plt.legend(bbox_to_anchor = (1.02, 1), loc = 'upper left', borderaxespad = 0)

Output:

Figure: seaborn countplot example, hue argument
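
A hue-split count plot draws one bar per combination of the two variables. As a rough numeric check, the sketch below tabulates those combinations with pd.crosstab on made-up toy data (the authors, years, and rows are invented) and then draws the matching plot.

```python
import pandas as pd
import seaborn as sns

# Made-up toy data standing in for the books DataFrame
toy = pd.DataFrame({
    "authors": ["King", "King", "Card", "King", "Card", "Patterson"],
    "publication_year": [2001, 2002, 2001, 2001, 2003, 2002],
})

# Rows are authors, columns are years, cells are the number of books:
# the same quantities the hue-split bars represent
table = pd.crosstab(toy["authors"], toy["publication_year"])
print(table)

# The corresponding plot, using the same arguments as Example 4
ax = sns.countplot(data=toy, y="authors", hue="publication_year")
```

Printing the table next to the plot makes it easy to read exact values, which can be hard to judge from bar lengths alone.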

