Einblick Prompt vs. ChatGPT Code Interpreter

Becca Weng - August 28th, 2023

In the third installment of our series covering popular AI code generators and assistants, I’ll be covering ChatGPT’s Code Interpreter. The brainchild of OpenAI, the company that brought generative AI to the forefront of everyone’s minds, the Code Interpreter is a plugin available within the ChatGPT chat interface for ChatGPT Plus users. In this article, I’ll give a brief overview of the product, compare it with Einblick’s AI agent, Prompt, in a feature comparison chart, and walk through in-depth use-cases.

If you’re interested in Jupyter AI or GitHub Copilot, check out our earlier articles.

What is ChatGPT Code Interpreter

I decided to ask ChatGPT itself what the difference was between the original ChatGPT and its new Code Interpreter functionality, to which it said the following:

The key point is that ChatGPT is still not the same as, say, a Python environment or an IDE. You cannot write or edit code directly, but you can ask ChatGPT to write and run code for you. You can see how this plays out in the use-cases outlined below.

What is Einblick Prompt

Einblick is an AI-native data notebook that can write and debug code, create beautiful charts, build and refine models, and much more. Central to Einblick's functionality is Einblick Prompt, a context-aware AI agent designed specifically for data tasks. Prompt can set up and execute entire data workflows with as little as one sentence. As it runs, Prompt restates the request, and lists out the steps it will execute to reach the intended goal. Leveraging its context-awareness, Prompt can discern specifics like column names and dataset titles. If necessary, Prompt will ask the user follow-up questions. If the outcome doesn't meet your expectations, Prompt can modify the current cell or fix any syntax issues. From there, you have the freedom to keep building out your workflow manually or with Prompt’s help.

Featuring a distinct 2-D canvas design, Einblick offers a seamless experience for users to work with Python, SQL, and interactive components, like Tables, Charts, and Filters. All Einblick users can access Prompt at no extra cost just by clicking anywhere in an Einblick notebook, or by selecting an existing cell, and adding a Prompt query.

Prompt vs. ChatGPT Code Interpreter Feature Comparison

Feature | Einblick Prompt | ChatGPT Code Interpreter
Interface with model in a data notebook
Generates executable code in code cells | | ❌* (See Results in ChatGPT section)
Directly fixes and debugs code
User can manually edit code
Built for data tasks
Context-aware
Can auto-run code
Generates commented code
Installation required | N/A | N/A
Prompt persistence | Dropdown of last 5 prompts run in notebook | Prompts available by scrolling through chat history
Chat interface

Installation and getting started

Both Einblick and ChatGPT are web-based apps, as are the additional AI features I’m covering in this article, so there is no installation required. However, in Einblick, Prompt is available to all Einblick users, including those on the free tier. Unfortunately, the ChatGPT Code Interpreter is currently only available to those with a ChatGPT Plus subscription, which costs $20/month.

Using Einblick Prompt vs. ChatGPT Code Interpreter

Results in Einblick

To use Prompt, a user can click anywhere in an Einblick workspace, or click the Prompt icon at the bottom of another cell. Then the user simply types in their query, and hits Enter. Prompt will generate the code directly in Einblick. There’s even an “auto-run” option so that you can see the results immediately. You can actually view all of the code in this canvas below. Just open and fork it!

Since Einblick is a data notebook on a canvas, any code generated by Prompt is run directly in a coding environment. As such, once in Einblick, users have all of the benefits of something like a Jupyter notebook, plus much more. Prompt generates the code, and then you can start coding directly in Einblick, or you can continue building out the rest of your workflow using Prompt. Einblick supports multimodal workflows, so you can code in SQL, Python, or augment your work using interactive components like Charts, Filters, and Tables. You can even collaborate with stakeholders and use Einblick as a presentation layer for your work.

Results in ChatGPT

While ChatGPT’s code interpreter provides you with the full code and results, the technical aspects are automatically hidden under a button that says “Finished working.” To see the code, you just click “Show work.” But ChatGPT will automatically show any charts or plots that you ask it to create.

Although you can ask ChatGPT to change something in the code, there is no direct way of manually editing code in the code interpreter. The copy code button is useful, but ultimately, no one is looking to write tons of code in ChatGPT. The majority of the output from ChatGPT is interpretation or plain English descriptions of what it did, and what it could mean. The emphasis here is still on the “conversational” aspect of working with ChatGPT.

Use-case: Data cleaning and EDA (exploratory data analysis)

For this analysis and app comparison, we used the wine dataset from sklearn. This dataset includes 178 observations, and 13 variables describing different wines, as well as a target variable, which is the kind of wine it is (groups 0, 1, and 2).
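The article does not show how the data was loaded, so the following is only a minimal sketch of one way to get an equivalent dataframe. It assumes scikit-learn and pandas are installed, and it reuses the name `joined_data` from the Prompt output further down purely for consistency.

```python
from sklearn.datasets import load_wine

# as_frame=True returns a Bunch whose .frame attribute is a pandas DataFrame
# holding the 13 feature columns plus a 'target' column.
# The variable name 'joined_data' mirrors the article's code; it is an assumption here.
joined_data = load_wine(as_frame=True).frame

n_rows, n_cols = joined_data.shape
print(f"{n_rows} observations, {n_cols - 1} variables plus the target")
print(sorted(joined_data["target"].unique()))
```

The later sketches in this article repeat this loading step so that each one runs on its own.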

Prompt 1: Summary statistics

Show summary statistics

Einblick Prompt Output

# Task: Show summary statistics

# Subtask: Use the `describe()` function on the `joined_data` dataframe to show summary statistics.

joined_data.describe()

Output:

        alcohol     malic_acid  ...  proline      target
count   178.000000  178.000000  ...  178.000000   178.000000
mean    13.000618   2.336348    ...  746.893258   0.938202
std     0.811827    1.117146    ...  314.907474   0.775035
min     11.030000   0.740000    ...  278.000000   0.000000
25%     12.362500   1.602500    ...  500.500000   0.000000
50%     13.050000   1.865000    ...  673.500000   1.000000
75%     13.677500   3.082500    ...  985.000000   2.000000
max     14.830000   5.800000    ...  1680.000000  2.000000

ChatGPT Output

ChatGPT's Code Interpreter used a similar line of code, but what you notice first is all of the text it generates to describe that code. If you click the "Show work" button, you'll see the same code as from Einblick Prompt.

Prompt 2: Basic countplot

Plot the distribution of target

Einblick Prompt Output

import seaborn as sns

sns.countplot(data=joined_data, x='target')
plt.show()

Output:

ChatGPT Output

import matplotlib.pyplot as plt
import seaborn as sns

# Set the style of the visualization
sns.set(style="whitegrid")

# Create a countplot of the target values
plt.figure(figsize=(8,6))
sns.countplot(x='target', data=wine_data)
plt.title('Distribution of Target Values')
plt.xlabel('Target')
plt.ylabel('Count')
plt.show()

While ChatGPT generated the same graph, it also offered some interpretation.
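If you want the numbers behind the countplot rather than the picture, a small sketch along these lines should work. It is not output from either tool; it assumes the wine data is loaded from scikit-learn under the name `joined_data`.

```python
from sklearn.datasets import load_wine

# Load the wine data as a DataFrame (name chosen to match the article's code)
joined_data = load_wine(as_frame=True).frame

# Count how many wines fall in each target class, ordered by class label
class_counts = joined_data["target"].value_counts().sort_index()
print(class_counts)
```

The three counts are the bar heights in the countplot, which makes it easy to check how balanced the classes are before modeling.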

Prompt 3: Feature selection plots

Plot the distribution of each variable, colored by target
Create boxplots of each variable, colored by target

In order to determine which of the 13 variables to include, I wanted to create some plots that showed correlation with the target variable.

Einblick Prompt Output

import seaborn as sns
import matplotlib.pyplot as plt

sns.pairplot(data=joined_data, hue='target')
plt.show()

Output:

At first, Prompt used the sns.pairplot() function, which is effective for showing the pairwise relationships between all variables. With 13 variables, though, the resulting grid is hard to read, so I asked Prompt to switch to box plots, which are easier to compare side by side while still capturing the distribution of each variable.

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")

# Create a 4 x 4 grid of subplots
fig, axes = plt.subplots(4, 4, figsize=(12, 12))

# Iterate over each variable and plot the boxplot in the corresponding subplot
variables = ['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
for i, variable in enumerate(variables):
    row = i // 4
    col = i % 4
    sns.boxplot(data=joined_data, x='target', y=variable, ax=axes[row, col])

plt.tight_layout()
plt.show()

Output:

Initially Prompt had created the box plots one after another, which became cumbersome to scroll through. I was easily able to ask Prompt to "Arrange the boxplots in a 4 x 4 grid."
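One side effect of a 4 x 4 grid is that it has 16 panels for 13 variables, so the last three stay empty. The sketch below is my own variation, not something Prompt generated: it assumes seaborn, matplotlib, and scikit-learn are available, and it hides the unused panels with `set_visible(False)`.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_wine

joined_data = load_wine(as_frame=True).frame

# Every column except the target is a feature to plot (13 in total)
variables = [col for col in joined_data.columns if col != "target"]

fig, axes = plt.subplots(4, 4, figsize=(12, 12))
flat_axes = axes.flatten()

# One box plot per variable, grouped by target class
for ax, variable in zip(flat_axes, variables):
    sns.boxplot(data=joined_data, x="target", y=variable, ax=ax)

# Hide the panels that have no variable assigned to them
for ax in flat_axes[len(variables):]:
    ax.set_visible(False)

fig.tight_layout()
# In a notebook, follow with plt.show() to display the figure
```

Iterating over the flattened axes also avoids the manual row and column arithmetic.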

ChatGPT Output

# Setting up the plotting environment
plt.figure(figsize=(20, 15))
sns.set_palette("pastel")

# Number of columns in the dataset (excluding target)
num_columns = wine_data.shape[1] - 1

# Plotting the distribution of each variable colored by target
for i, column in enumerate(wine_data.columns[:-1], 1):
    plt.subplot(4, 4, i)
    sns.histplot(data=wine_data, x=column, hue='target', element='step', common_norm=False)
    plt.title(f'Distribution of {column}')
    plt.tight_layout()

plt.show()

Although the histograms are somewhat effective, in this case, box plots seemed to provide a clearer view of the data. It was easy to change the plot type by just continuing the conversation with ChatGPT.

Use-case: Building ML models

Prompt 1: Build a logistic regression model

Subset for alcohol, flavonoids, color intensity, and target
Predict target

Lastly, I wanted to actually build and evaluate a predictive model.

Einblick Prompt Output

# Subset the 'joined_data' dataframe to include only the 'alcohol', 'flavanoids', 'color_intensity', and 'target' columns

subset_data = joined_data[['alcohol', 'flavanoids', 'color_intensity', 'target']]
!pip install scikit-learn

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Split the data into features (X) and target (y)
X = subset_data[['alcohol', 'flavanoids', 'color_intensity']]
y = subset_data['target']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize a machine learning model (Logistic Regression)
model = LogisticRegression()

# Fit the model to the training data
model.fit(X_train, y_train)

# Make predictions on the testing data
y_pred = model.predict(X_test)
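The cell above stops at `y_pred`. As a quick numeric check before plotting anything, a sketch like the following computes held-out accuracy. It is not part of Prompt's output: the data loading step and the `max_iter=1000` setting are my assumptions, the latter added only as a precaution against convergence warnings on unscaled features.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

joined_data = load_wine(as_frame=True).frame

# Same three features and target as in the article
X = joined_data[["alcohol", "flavanoids", "color_intensity"]]
y = joined_data["target"]

# Same split settings as the Prompt-generated code
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter raised from the default (an assumption, see the note above)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Fraction of test wines whose class was predicted correctly
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy on {len(y_test)} held-out wines: {accuracy:.3f}")
```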

ChatGPT Output

It seems that just like Prompt, if ChatGPT wants more information or confirmation about a choice from the user, it will simply ask, and the user can type in more information.

For some reason, the Markdown syntax did not render properly, but otherwise, the model worked well.

Prompt 2: Create a confusion matrix

Plot the confusion matrix of the model

Einblick Prompt Output

import matplotlib.pyplot as plt
import seaborn as sns

# Create the confusion matrix
confusion_matrix = confusion_matrix(y_test, y_pred)

# Plot the confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(confusion_matrix, annot=True, cmap='Blues', fmt='g')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

Output:

ChatGPT Output

from sklearn.metrics import confusion_matrix
import seaborn as sns

# Compute confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)

# Plot the confusion matrix
plt.figure(figsize=(8,6))
sns.heatmap(conf_matrix, annot=True, fmt='g', cmap='Blues', 
            xticklabels=['Target 0', 'Target 1', 'Target 2'], 
            yticklabels=['Target 0', 'Target 1', 'Target 2'])
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()
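One detail worth noticing when comparing the two cells: the Einblick Prompt version assigns its result to a variable named `confusion_matrix`, which replaces the imported function of the same name in the notebook's namespace. The cell works the first time, but running it again would raise a TypeError because the name now points at an array. ChatGPT's `conf_matrix` avoids this. A minimal sketch with made-up labels (not the wine model's predictions) and a hypothetical variable name `cm`:

```python
from sklearn.metrics import confusion_matrix

# Tiny hand-made labels, purely illustrative
y_true = [0, 1, 2, 2, 1]
y_hat = [0, 2, 2, 2, 1]

# Store the result under a different name so the function stays usable
cm = confusion_matrix(y_true, y_hat)
print(cm)

# The function can still be called again, e.g. when re-running the cell
cm_again = confusion_matrix(y_true, y_hat)
```

Rows are the true classes and columns the predicted ones, which matches the axis labels used in both plots above.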

Prompt 3: Interpreting the model

Plot coefficients for each target class

Einblick Prompt Output

# Extract the feature importances from the trained Logistic Regression model
feature_importances = model.coef_

# Create a dataframe with the feature importances and their corresponding names
importance_df = pd.DataFrame({'Feature': X_test.columns, 'Class 0': feature_importances[0], 'Class 1': feature_importances[1], 'Class 2': feature_importances[2]})

# Sort the dataframe by the magnitude of the feature importances
importance_df = importance_df.set_index('Feature')
importance_df = importance_df.T

# Create a bar plot to visualize the feature importances
importance_df.plot(kind='bar', figsize=(10, 6))
plt.xlabel('Target Class')
plt.ylabel('Importance')
plt.title('Feature Importances for Each Target Class')
plt.xticks(rotation=0)
plt.show()

Output:

ChatGPT Output

# Extract the coefficients
coefficients = log_reg.coef_

# Plotting the coefficients as feature importances
plt.figure(figsize=(12, 8))
for i, label in enumerate(['Target 0', 'Target 1', 'Target 2']):
    plt.plot(coefficients[i], label=f'Class {label}', marker='o')

plt.xticks(range(X_train.shape[1]), X_train.columns, rotation=45)
plt.title('Feature Coefficients for Each Target Class')
plt.xlabel('Features')
plt.ylabel('Coefficient Value')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
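Both plots rely on the layout of `coef_`: for a three-class logistic regression in scikit-learn it is a 2-D array with one row per class and one column per feature. The sketch below labels that array so the values can be read as a table. It is my own illustration: for brevity it fits on all rows rather than on the training split, and `max_iter=1000` is an assumption, so the numbers will not exactly match either tool's plot.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression

joined_data = load_wine(as_frame=True).frame
X = joined_data[["alcohol", "flavanoids", "color_intensity"]]
y = joined_data["target"]

# Fit on all rows for brevity (the article fits on a training split)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# coef_ has shape (n_classes, n_features): one row per class, one column per feature
coef_df = pd.DataFrame(
    model.coef_,
    index=[f"Class {label}" for label in model.classes_],
    columns=X.columns,
)
print(coef_df.round(3))
```

Calling `coef_df.plot(kind="bar")` on this table would give a grouped bar chart similar in spirit to the Einblick Prompt output.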

Fixing Errors and Changing Code

Code Interpreter

The Code Interpreter will automatically try to fix its own errors, but it can take a bit to diagnose what the problem is, particularly if the session has reset itself. If you want to change anything about the code, you can just continue “chatting” with ChatGPT, and ask it to change particular things, or use specific functions or libraries.

Einblick Prompt

Screenshot highlighting fix with prompt button in Einblick

If you ever hit an error message in Einblick, whether while running code that Prompt generated or while running code that you have written manually, a “Fix with Prompt” button will appear. Once you click the button, Prompt will debug the code, explain how it fixed the issue, and will also add a comment where the code was changed so you can track the updates. If you’re unhappy with the fix, you can also undo what Prompt did.

If you just want to change a cell, for example by switching the color palette or using a different plot type, you can click “Change the above cell” instead of “Add new cells below.” Prompt can adjust the cell without many context clues. For example, “use box plots instead” will change whatever plot type is in the cell to a box plot.

Conclusion

ChatGPT’s Code Interpreter adds a more reliable technical element to a now-beloved app. The code it generated was fairly reliable, and the explanations are helpful for those less familiar with technical concepts. That being said, ChatGPT is only an accessory to other, more robust tools. The explanations can help those new to data analytics and data science, but are a bit unnecessary if you're only looking for the code generation aspect of the app. The copy code button is useful, but it also indicates that ChatGPT is more of an external tool than something meant to stand alone in a data team’s toolkit.

Since Prompt is embedded into a full-fledged data notebook, data analysts and data scientists who use it gain access to tools tailored to their work. Prompt has specific features, such as the “Fix with Prompt” button, explanations for any code changes, and context-awareness, that speed up every part of the data workflow. Prompt is a great tool for users who are familiar with statistics and machine learning concepts, who are looking to get results quickly, and who want to leverage the 2-D space for prototyping and experimentation. The ability to bring stakeholders and colleagues into the platform also makes it easier to present and share work, creating a space for important business conversations.

About

Einblick is an AI-native data science platform that provides data teams with an agile workflow to swiftly explore data, build predictive models, and deploy data apps. Founded in 2020, Einblick was developed based on six years of research at MIT and Brown University. Einblick is funded by Amplify Partners, Flybridge, Samsung Next, Dell Technologies Capital, and Intel Capital. For more information, please visit www.einblick.ai and follow us on LinkedIn and Twitter.