Effective graphs with seaborn scatterplot(): hue, style, palette, size

Einblick Content Team - March 8th, 2023

Effective visuals are a critical part of telling a complete data story, and scatter plots are one of the most fundamental graphs in your toolkit. In this post, we’ll walk through seaborn’s scatterplot() function and a few of its key arguments, including hue, style, palette, and size, that will help you create more compelling graphs.

Import and setup

import seaborn as sns
sns.set_theme()

We’ve used a subset of an Olympics dataset found on Kaggle. The dataset contains information about 120 years of Olympic athletes. We’ve filtered the data for athletes that earned a medal in the Summer Olympic Games, in the year 2000 or later.
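The filtering step described above is not shown in the post, so here is a minimal sketch of how it might look. The "Season" and "Year" column names are assumptions about the Kaggle file (only "Height", "Weight", "Sport", and "Medal" appear in this post), as is the idea that non-medalists have a missing "Medal" value. The helper name filter_medalists is hypothetical, and the sample rows are made up for illustration.

```python
import pandas as pd

# Tiny made-up sample standing in for the Kaggle file. The "Season" and "Year"
# column names are assumptions, and the values are illustrative only.
raw = pd.DataFrame({
    "Sport":  ["Swimming", "Fencing", "Swimming", "Ice Hockey", "Cycling"],
    "Height": [193.0, 178.0, 185.0, 180.0, 172.0],
    "Weight": [88.0, 70.0, 80.0, 85.0, 68.0],
    "Season": ["Summer", "Summer", "Summer", "Winter", "Summer"],
    "Year":   [2008, 1996, 2012, 2002, 2004],
    "Medal":  ["Gold", "Silver", None, "Bronze", "Bronze"],
})


def filter_medalists(frame: pd.DataFrame, first_year: int = 2000) -> pd.DataFrame:
    """Keep Summer Games medal winners from first_year onward (hypothetical helper)."""
    keep = (
        (frame["Season"] == "Summer")
        & (frame["Year"] >= first_year)
        & frame["Medal"].notna()  # assumes athletes without a medal have a missing value
    )
    return frame.loc[keep].reset_index(drop=True)


df = filter_medalists(raw)
```

With the real data, raw would instead come from reading the downloaded file with pd.read_csv().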

Basic scatter plot: sns.scatterplot(data, x, y)

sns.scatterplot(data = df, x = "Height", y = "Weight")

Output:

[Figure: Seaborn scatterplot() basic example]

We've stored the data in a DataFrame called df. Like many seaborn plotting functions, scatterplot() needs only three key arguments to create a basic plot: data, x, and y. These arguments take the dataset, the x-variable, and the y-variable respectively. The values for x and y are the names of the relevant columns in the dataset.
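One detail that helps with follow-up tweaks: scatterplot() returns the matplotlib Axes it draws on, so titles and axis labels can be set with the usual Axes methods. The sketch below uses a small made-up DataFrame, and the units in the labels (cm and kg) are an assumption about the dataset.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only; not taken from the Olympics dataset
df = pd.DataFrame({
    "Height": [160.0, 172.0, 185.0, 193.0],
    "Weight": [52.0, 68.0, 80.0, 88.0],
})

# scatterplot() draws onto a matplotlib Axes and returns it
ax = sns.scatterplot(data=df, x="Height", y="Weight")

# Standard matplotlib Axes methods handle the title and axis labels
ax.set_title("Height vs. weight")
ax.set_xlabel("Height (cm)")  # units are an assumption
ax.set_ylabel("Weight (kg)")  # units are an assumption
```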

Advanced scatter plots: hue, style, palette, and size

sns.scatterplot() Example: color (hue)

sns.scatterplot(data = df, x = "Height", y = "Weight", hue = "Medal")

Output:

[Figure: Seaborn scatterplot() hue example]

In the above plot, we used the hue argument to specify a column, "Medal", that dictates the color of the data points. Without further arguments, seaborn applies its default color palette and category order to the data points.
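If the default category order is not the one you want, the hue_order argument of scatterplot() lets you set it explicitly. Here is a minimal sketch with made-up data that puts the medals in Gold, Silver, Bronze order. The legend lookup at the end is only there to confirm the order.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only
df = pd.DataFrame({
    "Height": [165.0, 190.0, 178.0, 183.0, 170.0, 175.0],
    "Weight": [55.0, 86.0, 72.0, 79.0, 61.0, 69.0],
    "Medal":  ["Bronze", "Gold", "Silver", "Gold", "Bronze", "Silver"],
})

# Explicit category order for the colors and the legend
medal_order = ["Gold", "Silver", "Bronze"]

ax = sns.scatterplot(
    data=df, x="Height", y="Weight", hue="Medal", hue_order=medal_order
)

# Read the legend entries back to check the order that was applied
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```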

BONUS: seaborn scatterplots with Generative AI

If you're interested in generative AI and want to visualize your data faster than ever, check out our AI agent, Einblick Prompt, which can create complex, beautiful charts from as little as one sentence. In the canvas below, we built a comparable scatterplot with just one prompt. Check out how we did it below:

Using Generative AI in Einblick

  1. Open and fork the canvas
  2. Connect to your data
  3. Right-click anywhere in the canvas > Prompt
  4. Type in: "Plot a scatterplot of height and weight, using seaborn. Change the color by Sport (only include Gymnastics, Cycling, Swimming, Shooting, and Fencing.)"
  5. Run the code in Einblick's data notebook immediately

If you try out Prompt, let us know what you think!

sns.scatterplot() Example: marker shape (style)

sns.scatterplot(data = df, x = "Height", y = "Weight", hue = "Sport", style = "Sport")

Output:

[Figure: Seaborn scatterplot() style example plot]

Beyond the hue argument, you can use the style argument as well to change the appearance of data points based on a particular variable. In this case, we used the different sport categories stored in "Sport" to dictate the color (hue) and the shape (marker) of each data point.
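To choose the shapes yourself rather than accept seaborn's defaults, scatterplot() also accepts a markers argument, which can be a dictionary mapping each level of the style variable to a matplotlib marker code. The sketch below uses made-up data and sticks to filled markers, since mixing filled and unfilled markers in one plot may not be supported.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only
df = pd.DataFrame({
    "Height": [193.0, 185.0, 172.0, 176.0, 180.0, 168.0],
    "Weight": [88.0, 80.0, 68.0, 71.0, 74.0, 60.0],
    "Sport":  ["Swimming", "Swimming", "Cycling", "Cycling", "Fencing", "Fencing"],
})

# Map each sport to a filled matplotlib marker: circle, square, triangle
sport_markers = {"Swimming": "o", "Cycling": "s", "Fencing": "^"}

ax = sns.scatterplot(
    data=df, x="Height", y="Weight",
    hue="Sport", style="Sport", markers=sport_markers,
)

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```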

sns.scatterplot() Example: custom color palette (hue and palette)

If you have a particular color palette in mind that makes sense for your context (in this case, medal colors), you can create a dictionary that maps the category names in your hue variable to matplotlib color names. Then set the palette argument equal to your dictionary.

# Color palette dictionary
colors = {"Gold": "gold", "Silver": "grey", "Bronze": "darkgoldenrod"}

# Create scatterplot
sns.scatterplot(data = df, x = "Height", y = "Weight", hue = "Medal", style = "Medal", palette = colors)

Output:

[Figure: Seaborn scatterplot() palette example]
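A palette dictionary only works if it covers every category in the hue column. As far as we know, seaborn raises an error when a category has no entry. A quick check before plotting can catch this early. The helper below, check_palette, is a hypothetical name of our own; it uses matplotlib's is_color_like() to confirm each value is a color matplotlib recognizes. The sample DataFrame is made up.

```python
import pandas as pd
from matplotlib.colors import is_color_like

# Made-up values for illustration only
df = pd.DataFrame({"Medal": ["Gold", "Silver", "Bronze", "Gold"]})

colors = {"Gold": "gold", "Silver": "grey", "Bronze": "darkgoldenrod"}


def check_palette(frame, column, palette):
    """Return (categories missing from palette, keys whose value is not a valid color)."""
    levels = set(frame[column].dropna().unique())
    missing = sorted(levels - set(palette))
    invalid = sorted(key for key, value in palette.items() if not is_color_like(value))
    return missing, invalid


missing, invalid = check_palette(df, "Medal", colors)
```

Both lists come back empty when the dictionary is safe to pass as the palette argument.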

sns.scatterplot() Example: marker size (size)

In this last example, we create a numerical column to represent the value of a gold vs. silver vs. bronze medal at the Olympics, and use the size argument to control how large the markers are, giving end-users additional context about the data.

# Create new column that maps medal color to value
d = {"Gold": 3, "Silver": 2, "Bronze": 1}
df['Medal_Val'] = df['Medal'].map(d)

# Create scatter plot
sns.scatterplot(data = df[df["Sport"] == "Swimming"], x = "Height", y = "Weight", hue = "Medal", size = "Medal_Val", palette = colors)

Output:

[Figure: Seaborn scatterplot() size example]
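If the default marker sizes are too close together to tell apart, the sizes argument of scatterplot() accepts a (min, max) tuple that sets the range of marker areas. Here is a minimal sketch with made-up data. The 40 and 200 values are arbitrary choices, not recommendations.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only
df = pd.DataFrame({
    "Height": [170.0, 180.0, 188.0, 193.0],
    "Weight": [60.0, 75.0, 82.0, 90.0],
    "Medal":  ["Bronze", "Silver", "Gold", "Gold"],
})

# Same mapping as in the post: a higher number means a more valuable medal
df["Medal_Val"] = df["Medal"].map({"Gold": 3, "Silver": 2, "Bronze": 1})

# sizes=(min, max) sets the range of marker areas used for Medal_Val
ax = sns.scatterplot(
    data=df, x="Height", y="Weight", size="Medal_Val", sizes=(40, 200)
)

# The first collection on the Axes holds the plotted points
point_sizes = ax.collections[0].get_sizes()
```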

NOTE: you can mix-and-match any of the arguments we've talked about to create a highly customized graph.
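As one possible combination, the sketch below maps a different column to each argument: color by sport, marker shape by medal, and marker size by medal value. The data is made up for illustration.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only
df = pd.DataFrame({
    "Height": [193.0, 185.0, 172.0, 176.0, 188.0, 169.0],
    "Weight": [88.0, 80.0, 68.0, 71.0, 84.0, 62.0],
    "Sport":  ["Swimming", "Swimming", "Cycling", "Cycling", "Swimming", "Cycling"],
    "Medal":  ["Gold", "Silver", "Bronze", "Gold", "Bronze", "Silver"],
})
df["Medal_Val"] = df["Medal"].map({"Gold": 3, "Silver": 2, "Bronze": 1})

# Color by sport, marker shape by medal, marker size by medal value
ax = sns.scatterplot(
    data=df, x="Height", y="Weight",
    hue="Sport", style="Medal", size="Medal_Val",
)

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

Using three visual channels at once can get busy, so it is worth checking that the legend is still readable.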
