In the ever-evolving landscape of data analysis and scientific exploration, the ability to represent data visually is a skill every data analyst and scientist needs. Among data visualization tools, line plots stand out for tracing trends and revealing patterns over time. With the advent of generative AI, it has never been easier to create them. Since line plots are highly effective at displaying temporal data, we used a sample of movie data from Kaggle spanning 1980 to 2020.
From the full data set, we sampled movies that were categorized as Action, Comedy, Drama, and Horror films.
The full examples can be found in the embedded canvas below, which you can open and fork. All of the natural language prompts used are provided in an accompanying Markdown cell to the left of each graph, along with any modifications made.
Prompt 1: Basic line plot
Create a lineplot of the number of movies released over time.
NOTE: replace movies with whatever variable you wish to count over time.
To create the below plot, we didn't have to do any preprocessing of the data, as Prompt grouped everything as needed for us.
# PROMPT: Create a lineplot of the number of movies released over time.

# Import the necessary libraries
import matplotlib.pyplot as plt

# Group the 'movies_data' dataframe by year using the 'released_date' column
movies_by_year = movies_data.groupby(movies_data['released_date'].dt.year)

# Count the number of movies released each year
movies_count = movies_by_year.size()

# Create a line plot using the year as the x-axis and the count of movies as the y-axis
plt.plot(movies_count.index, movies_count.values)
plt.xlabel('Year')
plt.ylabel('Number of Movies Released')
plt.title('Number of Movies Released Over Time')
plt.show()
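The generated code relies on `released_date` already being a datetime column, since the `.dt.year` accessor only works on datetime data. If the column were loaded as plain strings, a conversion step would be needed first. Here is a minimal sketch of that step, assuming a small hypothetical stand-in for `movies_data` (the rows below are made up for illustration and are not from the Kaggle dataset):

```python
import pandas as pd

# Hypothetical stand-in for movies_data; the real dataset is not reproduced here.
movies_data = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "released_date": ["1980-06-20", "1980-12-05", "1981-07-10", "not a date"],
})

# Convert strings to datetimes; unparseable values become NaT instead of raising.
movies_data["released_date"] = pd.to_datetime(
    movies_data["released_date"], errors="coerce"
)

# Group by release year; rows with a missing date are left out of the groups.
movies_count = movies_data.groupby(movies_data["released_date"].dt.year).size()
print(movies_count)
```

Because one date is missing after conversion, the year values come out as floats rather than integers, which is worth keeping in mind when labeling the x-axis.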
This is a great first plot, but there is a lot more information that a simple line plot can show, and a lot more ways you may want to customize your plot. For example, you may want to:
- Use a different visualization library
- Color parts of the graph differently based on a different variable
- Add markers and change style of the line
Einblick Prompt can handle all of these aesthetic changes easily.
Prompt 2: Data cleaning
Make a new table of counts of movies, by genre, released every year
Use seaborn to create a line plot of movies released every year.
TIP: Prompt can not only create beautiful charts, it can also clean your data and handle tedious preprocessing steps for you.
# PROMPT: Make a new table of counts of movies, by genre, released every year

# Import the necessary libraries
import pandas as pd

# Group the 'movies_sub' dataframe by 'genre' and 'year' and count the number of movies in each group
movie_counts = movies_sub.groupby(['genre', 'year']).size().reset_index(name='count')

# Print the new table
display(movie_counts)
      genre    year  count
0    Action  1980.0     12
1    Action  1981.0     17
2    Action  1982.0     22
3    Action  1983.0     30
4    Action  1984.0     31
...     ...     ...    ...
159  Horror  2016.0     11
160  Horror  2017.0     12
161  Horror  2018.0      9
162  Horror  2019.0      8
163  Horror  2020.0      2
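The `year` values in the table print as floats (`1980.0`), which typically happens when the source column contained missing values. If whole-number years are preferred on the axis, the column can be cast to integers. A small sketch, assuming a three-row excerpt shaped like `movie_counts` (the rows are copied from the printed table):

```python
import pandas as pd

# Small excerpt shaped like movie_counts, using rows from the printed table.
movie_counts = pd.DataFrame({
    "genre": ["Action", "Action", "Horror"],
    "year": [1980.0, 1981.0, 2020.0],
    "count": [12, 17, 2],
})

# A grouped table has no missing years (groupby drops missing keys by default),
# so casting the float years to integers is safe here.
movie_counts["year"] = movie_counts["year"].astype(int)
print(movie_counts)
```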
Then you can ask Prompt to create the line plot again. Because the table now has one row per genre for each year, seaborn aggregates the counts at each year: it plots their mean and draws an error band (a 95% confidence interval by default) that reflects the variation across genres.
# PROMPT: Use seaborn to create a line plot of movies released every year.

# Import the necessary libraries
import seaborn as sns

# Create a lineplot using seaborn
sns.lineplot(data=movie_counts, x='year', y='count')
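To see what the line in this plot summarizes, it can help to compute the aggregate by hand. The sketch below uses a tiny table with illustrative values (not the real counts) and compares the per-year mean, which is what seaborn draws by default when a year has several rows, with the per-year total:

```python
import pandas as pd

# Illustrative values only; not the real counts from the dataset.
movie_counts = pd.DataFrame({
    "genre": ["Action", "Comedy", "Action", "Comedy"],
    "year": [1980, 1980, 1981, 1981],
    "count": [10, 20, 14, 30],
})

# With several rows per year, the default line follows the mean count per year.
mean_per_year = movie_counts.groupby("year")["count"].mean()

# The total number of releases per year is the sum across genres instead.
total_per_year = movie_counts.groupby("year")["count"].sum()

print(mean_per_year)
print(total_per_year)
```

If the total across genres is what you want to show, plotting a series like `total_per_year` gives a single line with no band, since each year then has exactly one value.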
Next, we can disaggregate the plot to show different groups using color, and more.
Prompt 3: Customize line color
Now that the data has been cleaned properly, we can easily create an interesting line plot where we can compare releases by genre.
Using seaborn, create a lineplot of the movies released every year, color by genre.
# PROMPT: Using seaborn, create a lineplot of the movies released every year, color by genre.

# Import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Create a lineplot using seaborn
sns.lineplot(data=movie_counts, x='year', y='count', hue='genre')

# Display the plot
plt.show()
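The `hue='genre'` argument works because `movie_counts` is in long form, with one row per genre and year. Some plotting workflows expect wide form instead, with one column per genre. A sketch of that reshape, assuming a small table of illustrative values rather than the real counts:

```python
import pandas as pd

# Illustrative values only; not the real counts from the dataset.
movie_counts = pd.DataFrame({
    "genre": ["Action", "Comedy", "Action", "Comedy"],
    "year": [1980, 1980, 1981, 1981],
    "count": [10, 20, 14, 30],
})

# Reshape to wide form: one row per year, one column per genre.
wide_counts = movie_counts.pivot(index="year", columns="genre", values="count")
print(wide_counts)
```

A wide table like this can be handed to `sns.lineplot(data=wide_counts)` or plotted with `wide_counts.plot()`, both of which draw one line per column.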
Prompt 4: Customize style and legend, add markers
If you want more distinction between the groups, you can add in markers, and change up the style of the lines. Additionally, you may want to move the legend to make the chart more aesthetically pleasing. We achieved the below results via three separate prompts issued to the above chart.
To do this, we selected Prompt > Change this cell, then typed in the following queries in succession.
Change the line style by genre
Change the marker by genre
Move the legend off the chart
# CHANGE: Change the line style by genre
# CHANGE: Change the marker by genre
# CHANGE: Move the legend off the chart

# Import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Create a lineplot using seaborn with different line styles and markers for each genre
sns.lineplot(data=movie_counts, x='year', y='count', hue='genre', style='genre', markers=True)

# Move the legend off the chart
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# Display the plot
plt.show()
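One thing to watch for when a legend sits outside the axes: it can get cut off when the figure is saved to a file. The sketch below shows one common way around that, saving with `bbox_inches="tight"`. It assumes plain matplotlib, illustrative data, and an in-memory buffer standing in for a real file path:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Illustrative values only; not the real counts from the dataset.
years = [1980, 1981, 1982]
counts_by_genre = {"Action": [10, 14, 18], "Comedy": [20, 30, 25]}

fig, ax = plt.subplots()
for genre, counts in counts_by_genre.items():
    ax.plot(years, counts, marker="o", label=genre)

# Place the legend just outside the right edge of the axes.
ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left")

# bbox_inches="tight" expands the saved area to include the outside legend.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
plt.close(fig)
print(len(buffer.getvalue()), "bytes written")
```

To write an actual image, a file name such as a hypothetical `movies_by_genre.png` can be passed to `savefig` in place of the buffer.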
Now that we've covered a few examples on how to leverage generative AI to create beautiful line charts, you can also check out our other examples on histograms and scatter plots.