# Creating custom graphs with seaborn boxplot()

Einblick Content Team - March 17th, 2023

Different visualizations suit different kinds of information. Box plots, also called box-and-whisker plots, are particularly useful for comparing the distribution of a continuous variable across groups and for identifying outliers. In this post, we'll use seaborn's boxplot() function to create and customize box plots.

Open and fork the canvas below for all the example box plots. We use a subset of an Olympics dataset found on Kaggle to examine the distribution of heights among athletes participating in different sports.

## Import and setup

```python
import seaborn as sns

# Apply seaborn's default theme to all subsequent plots
sns.set_theme()
```

We've loaded in our dataset as a pandas DataFrame called df.
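If you don't have the dataset to hand, a stand-in DataFrame is enough to follow along. The sketch below is an assumption, not the post's data: it keeps the column names used in the examples (Height, Sport, Sex), but the list of sports, the F/M labels, and every height are randomly generated.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Olympics subset. The column names match the
# ones used in this post; every value below is randomly generated.
rng = np.random.default_rng(seed=0)
sports = ["Gymnastics", "Rowing", "Swimming", "Cycling"]
n = 200

df = pd.DataFrame({
    "Sport": rng.choice(sports, size=n),
    "Sex": rng.choice(["F", "M"], size=n),
    "Height": rng.normal(loc=175, scale=10, size=n).round(1),
})

print(df.head())
```

Because the heights are drawn from a single normal distribution, the resulting box plots will look similar across sports; swap in the real data to see meaningful differences.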

## Basic box plot: sns.boxplot(x or y)

The most basic box plot in seaborn shows the distribution of one continuous variable. You only need one argument: x or y. The argument you choose determines the orientation of the box plot: x draws it horizontally, and y draws it vertically.

NOTE: Alternatively, you can pass the entire DataFrame to the data argument, and then give x or y the column name as a string.
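As a minimal sketch of that alternative, the snippet below passes a DataFrame through data and names the column as a string. The small df here is made up so the snippet runs on its own; with your own data you would pass the real DataFrame instead.

```python
import pandas as pd
import seaborn as sns

# Made-up heights so the snippet is self-contained (not the Olympics data)
df = pd.DataFrame({"Height": [150, 160, 165, 170, 172, 175, 180, 185, 210]})

# Pass the DataFrame once via `data`, then refer to the column by name
ax = sns.boxplot(data=df, x="Height")
```

This form tends to read more cleanly once several columns are involved, since each argument is just a column name.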

```python
sns.boxplot(x=df["Height"])  # x gives a horizontal box plot
```

Output: [Figure: horizontal box plot of Height]

```python
sns.boxplot(y=df["Height"])  # y gives a vertical box plot
```

Output: [Figure: vertical box plot of Height]
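The points drawn beyond the whiskers are what a box plot flags as outliers. With seaborn's default setting of whis=1.5, the whiskers extend to the furthest data points that lie within 1.5 times the interquartile range of the box edges. A minimal sketch of the same rule in pandas, using a made-up list of heights rather than the Olympics data:

```python
import pandas as pd

# Made-up heights for illustration only
heights = pd.Series([150, 160, 165, 170, 172, 175, 180, 185, 210], name="Height")

q1 = heights.quantile(0.25)   # lower edge of the box
q3 = heights.quantile(0.75)   # upper edge of the box
iqr = q3 - q1                 # interquartile range

# Values outside these bounds are drawn as individual outlier points
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = heights[(heights < lower) | (heights > upper)]
print(outliers.tolist())  # [210]
```

Running the same check on a real column is a quick way to list the rows behind the points you see outside the whiskers.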

## Advanced box plots: comparing groups

### Comparing groups: sns.boxplot(x and y, hue)

To compare the distribution of a continuous variable across different categories of data, you can combine the x, y, and hue arguments. Setting just x and y, where one is continuous and the other is categorical, is sufficient to draw one box per category. If you would also like a legend, it is easiest to set the hue argument to the same categorical variable.

```python
import matplotlib.pyplot as plt

# One box per sport; hue colors each sport, and dodge=False keeps the boxes centered
sns.boxplot(data=df, x="Height", y="Sport", hue="Sport", dodge=False)
```

Output: [Figure: horizontal box plots of Height, one per Sport]

If you would like to compare groups within groups, you can set hue to a different categorical variable than the one already present in the x and y dimensions, as below.

```python
# Within each sport, draw separate boxes for each value of Sex
sns.boxplot(data=df, x="Height", y="Sport", hue="Sex")
```

Output: [Figure: horizontal box plots of Height by Sport, split by Sex]

## BONUS: seaborn boxplots with Generative AI

If you want to visualize your data faster than ever, check out our AI agent, Einblick Prompt, which can create complex, beautiful charts from as little as one sentence. In the below canvas, we used generative AI to build a comparable set of boxplots. Check out how we did it below:

### Using Generative AI in Einblick

1. Open and fork the canvas
2. Right-click anywhere in the canvas > Prompt
3. Type in: "Use the seaborn library to plot boxplots, x = Height, y = Sport, only include Gymnastics, Rowing, Swimming, and Cycling."
4. Run the code in Einblick's data notebook immediately
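For comparison, here is a hand-written sketch of what that request amounts to in pandas and seaborn. It is not the code Prompt generates, which we don't reproduce here, and the small df is made up so the snippet runs on its own.

```python
import pandas as pd
import seaborn as sns

# Made-up rows standing in for the Olympics subset (values are illustrative)
df = pd.DataFrame({
    "Sport": ["Gymnastics", "Rowing", "Swimming", "Cycling",
              "Judo", "Rowing", "Swimming", "Gymnastics"],
    "Height": [155, 190, 185, 178, 172, 193, 188, 160],
})

# Keep only the four sports named in the prompt
keep = ["Gymnastics", "Rowing", "Swimming", "Cycling"]
subset = df[df["Sport"].isin(keep)]

# One horizontal box per remaining sport
ax = sns.boxplot(data=subset, x="Height", y="Sport")
```

Filtering before plotting keeps the chart limited to the categories you care about, rather than drawing every sport and cropping afterwards.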

If you try out Prompt, let us know what you think!