Bar plots, sometimes bucketed with count plots, are a great way of visualizing aggregate statistics across different groups. In this post, we'll provide a few example prompts that can be used to generate different bar plots using Prompt AI. For the purpose of this post, we'll be using a Goodreads dataset from Kaggle.
book_data.head()
All the examples are available in a shared canvas that also includes example prompts for other data visualizations and use cases. You can simply open and fork the canvas below.
Prompt 1: Basic bar plot
Plot the distribution of publisher
One great thing about Prompt is that it can infer from ambiguous terms like "distribution" what kind of chart would be most appropriate. Since the publisher column is categorical, Prompt creates a bar chart rather than a histogram.
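To make the categorical-versus-numeric distinction concrete, here is a minimal sketch of the kind of check that leads to a bar chart for one column and a histogram for another. This is not how Prompt works internally; `suggest_chart_kind` and `toy_books` are hypothetical names made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def suggest_chart_kind(column: pd.Series) -> str:
    """Hypothetical helper: suggest 'hist' for numeric columns, 'bar' otherwise."""
    return "hist" if is_numeric_dtype(column) else "bar"


# Toy stand-in for the book data (made-up values, not the Goodreads dataset)
toy_books = pd.DataFrame({
    "publisher": ["Publisher A", "Publisher B", "Publisher A"],
    "average_rating": [4.1, 3.9, 4.3],
})

publisher_kind = suggest_chart_kind(toy_books["publisher"])    # text column
rating_kind = suggest_chart_kind(toy_books["average_rating"])  # numeric column

# For a categorical column, the bar heights are simply the counts per category
publisher_counts = toy_books["publisher"].value_counts()
```

For a categorical column like publisher, the "distribution" is just the count of rows per category, which is why `value_counts()` shows up in the generated code.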
# PROMPT: Plot the distribution of publisher
# Import the necessary libraries
import matplotlib.pyplot as plt
# Plot the distribution of publishers using a bar plot for the subset_book_data dataframe
plt.figure(figsize=(10, 6))
subset_book_data['publisher'].value_counts().plot(kind='bar')
plt.title('Distribution of Publishers (Book Data)')
plt.xlabel('Publisher')
plt.ylabel('Frequency')
plt.xticks(rotation=45)
plt.show()
Prompt 2: Change orientation
If you want to change the orientation of the bar plot, you can simply select Prompt > Change this cell, and try the query below.
Make the bar plot horizontal
# CHANGE: Make the bar plot horizontal
# Import the necessary libraries
import matplotlib.pyplot as plt
# Create a horizontal bar plot of publishers using the 'publisher' column from the 'subset_book_data' dataframe
subset_book_data['publisher'].value_counts().plot(kind='barh')
# Set the title and labels for the plot
plt.title('Count of Books Published by Each Publisher')
plt.xlabel('Count')
plt.ylabel('Publisher')
# Display the plot
plt.show()
Prompt 3: (Change library) and add error bars
If you want to switch plotting libraries (for example, to seaborn), you can just add on to the original prompt.
Create bar plot of publisher and average rating
Use seaborn, and add error bars.
After getting the initial plot, I saw the chart didn't show much difference in average rating across publishers, so I wanted to add error bars to see if there was a noticeable difference in the spread of ratings.
# PROMPT: Create a bar plot of publisher and average rating
# CHANGE: Use seaborn, and add error bars.
# Import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt
# Group the 'subset_book_data' dataframe by 'publisher' and calculate the average rating for each publisher
average_rating_by_publisher = subset_book_data.groupby('publisher')['average_rating'].mean()
# Calculate the standard deviation of the average rating for each publisher
std_dev_by_publisher = subset_book_data.groupby('publisher')['average_rating'].std()
# Create a bar plot using seaborn with error bars
sns.barplot(x=average_rating_by_publisher.index, y=average_rating_by_publisher, yerr=std_dev_by_publisher)
plt.xlabel('Publisher')
plt.ylabel('Average Rating')
plt.title('Average Rating by Publisher')
plt.xticks(rotation=90)
plt.show()
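To show what the error bars add, here is a minimal sketch on made-up data in which two publishers have the same average rating but very different spreads. It uses pandas' built-in plotting rather than seaborn, and `toy_books` is a hypothetical stand-in for `subset_book_data`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for subset_book_data (made-up values)
toy_books = pd.DataFrame({
    "publisher": ["Publisher A"] * 3 + ["Publisher B"] * 3,
    "average_rating": [3.8, 4.0, 4.2, 3.0, 4.0, 5.0],
})

# Mean and sample standard deviation of the ratings, per publisher
summary = toy_books.groupby("publisher")["average_rating"].agg(["mean", "std"])

# Bar heights are the means; error bars extend one standard deviation each way
ax = summary["mean"].plot(kind="bar", yerr=summary["std"], capsize=4)
ax.set_xlabel("Publisher")
ax.set_ylabel("Average Rating")
ax.set_title("Average Rating by Publisher (error bars: 1 standard deviation)")
plt.tight_layout()
```

Both bars have the same height here, so the error bars are the only visual cue that the second publisher's ratings vary much more. In a notebook, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the chart.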
Prompt 4: Grouped (or stacked) bar chart
If you want to change the color based on a categorical variable, you can create a grouped or stacked bar chart. I prefer grouped bar charts so that was my target. If you prefer stacked bar charts, you can simply ask Prompt to change it.
Subset for books published between 2005 and 2010
Create a bar plot of publisher and average rating, colored by publication year
Use seaborn instead of matplotlib
TIP: Some Python libraries handle certain kinds of chart customization better than others. When I hit an error after the first prompt, I asked Prompt to switch to seaborn to see if that would fix it (and I prefer the seaborn library over matplotlib).
Before I get the plot I want, I'll first ask Prompt to filter my data for me. Prompt can also clean and transform data.
# PROMPT: Subset for books published between 2005 and 2010
# Filter the 'subset_book_data' dataframe for books published between 2005 and 2010
subset_book_data_filtered = subset_book_data[(subset_book_data['publication year'] >= 2005) & (subset_book_data['publication year'] <= 2010)].copy()
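As an aside, the same kind of range filter can also be written with pandas' `Series.between`, which includes both endpoints by default. The sketch below uses a made-up `toy_books` dataframe; the `.copy()` call is my own addition, so that reassigning a column on the filtered result later does not trigger a `SettingWithCopyWarning`.

```python
import pandas as pd

# Toy stand-in for subset_book_data (made-up values)
toy_books = pd.DataFrame({
    "title": ["Book 1", "Book 2", "Book 3", "Book 4"],
    "publication year": [2004, 2005, 2010, 2011],
})

# between() is inclusive on both ends by default, so 2005 and 2010 are kept
in_range = toy_books["publication year"].between(2005, 2010)

# .copy() makes the result an independent dataframe rather than a view-like slice
toy_books_filtered = toy_books[in_range].copy()

# Safe to modify a column now, as the later plotting step does with astype(int)
toy_books_filtered["publication year"] = toy_books_filtered["publication year"].astype(int)
```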
Once the data was subsetted, I asked Prompt to create the chart.
# PROMPT: Create a bar plot of publisher and average rating, colored by publication year
# CHANGE: Use seaborn instead of matplotlib
# Import the necessary libraries for data visualization
import seaborn as sns
import matplotlib.pyplot as plt
# Convert the 'publication year' column to integer type
subset_book_data_filtered['publication year'] = subset_book_data_filtered['publication year'].astype(int)
# Create a bar plot using the 'subset_book_data_filtered' dataframe
plt.figure(figsize=(12, 6))
sns.barplot(x=subset_book_data_filtered['publisher'], y=subset_book_data_filtered['average_rating'], hue=subset_book_data_filtered['publication year'])
plt.xlabel('Publisher')
plt.ylabel('Average Rating')
plt.title('Publisher vs Average Rating (Colored by Publication Year)')
plt.xticks(rotation=45)
plt.legend(title='Publication Year')
plt.show()
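If you would rather see the stacked variant, the difference comes down to a single argument when plotting with pandas. The sketch below is an illustration on made-up data, not Prompt's output. Stacking makes the most sense for quantities that add up, so it counts books per publisher and year instead of averaging ratings; `toy_books` and `book_counts` are hypothetical names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for subset_book_data_filtered (made-up values)
toy_books = pd.DataFrame({
    "publisher": ["Publisher A", "Publisher A", "Publisher A", "Publisher B", "Publisher B"],
    "publication year": [2005, 2005, 2006, 2005, 2006],
})

# One row per publisher, one column per year, each cell a count of books
book_counts = pd.crosstab(toy_books["publisher"], toy_books["publication year"])

fig, (ax_grouped, ax_stacked) = plt.subplots(1, 2, figsize=(12, 5))

# Grouped: one bar per year, side by side within each publisher
book_counts.plot(kind="bar", ax=ax_grouped, title="Grouped")

# Stacked: the same bars piled on top of each other, so the total height is the publisher's total
book_counts.plot(kind="bar", stacked=True, ax=ax_stacked, title="Stacked")

for ax in (ax_grouped, ax_stacked):
    ax.set_xlabel("Publisher")
    ax.set_ylabel("Number of Books")
plt.tight_layout()
```

Grouped bars make it easier to compare years within a publisher, while stacked bars make it easier to compare publisher totals.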