ChatGPT & Python: A Potent Combination

Paul Yang - July 25th, 2023

If you’re looking for a “how-to” guide on making API calls to OpenAI / ChatGPT using Python, you’re probably looking for this short guide.

Knowing how to code is great. Actually writing code is a slog.

This is not a controversial opinion: nearly everyone benefits from using ChatGPT for Python. If a madman locked a senior data scientist in a room, and the password to the door was the syntax for rotating x-axis labels on a seaborn plot, they would be unlikely to survive.

It’s not so easy even if you know it. - Source
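For the record, here is one way out of that room. This is a minimal sketch with made-up data, and it uses plain matplotlib rather than seaborn so that it has fewer dependencies. Seaborn's axes-level plotting functions return a matplotlib Axes, so the same `tick_params` call should apply to the `ax` they hand back.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
months = ["January", "February", "March"]
sales = [3, 5, 2]

fig, ax = plt.subplots()
ax.bar(months, sales)

# The "password": rotate the x-axis tick labels by 45 degrees.
ax.tick_params(axis="x", labelrotation=45)
fig.tight_layout()

# Read the rotation back to confirm it was applied.
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
print(rotations)
```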

ChatGPT as Python assistant

ChatGPT helps with automated boilerplate code generation. Start with plain English descriptions or high-level instructions to generate complex code snippets. With a one-sentence request, you can get a block of code that removes nulls, splits your data into training and testing datasets, sets up a randomized search for XGBoost parameter tuning, and displays SHAP values.
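To give a sense of what that boilerplate looks like, here is a small sketch of only the first two steps, removing nulls and splitting into training and testing sets, using pandas alone. The data and column names are made up, and the tuning and SHAP steps are left out because they depend on additional packages.

```python
import pandas as pd

# Made-up dataset with one missing value.
df = pd.DataFrame({
    "feature": [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0],
    "label": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

# Step 1: drop rows that contain nulls.
clean = df.dropna()

# Step 2: an 80/20 split. Sample the training rows, keep the rest for testing.
train = clean.sample(frac=0.8, random_state=42)
test = clean.drop(train.index)

print(len(clean), len(train), len(test))
```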

Beyond the high-level boilerplate, syntax is always tedious and requires very precise “computer grammar” that humans just aren’t super good at. I would guess that most experienced data scientists are not much better than beginners at remembering the names of function arguments. Experience mostly shows itself in knowing how to compose the overall workflow and methodology, and how to read out results, not in regurgitating syntax perfectly.

And when you don’t know the syntax, documentation is a poor place to find it. This is why StackOverflow was the go-to for “how-tos” before ChatGPT, and still is. When `pd.read_csv()` has 46 possible parameters including 6 different ones to handle dates, I simply am not going to read (or even skim) documentation to figure out what I want to do.
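As a small illustration, here is the kind of answer you usually want from that documentation. It is a sketch with made-up CSV content, and it uses only `parse_dates`, one of the date-related parameters of `pd.read_csv()`.

```python
import io

import pandas as pd

# Made-up CSV content, held in memory so the example needs no file on disk.
csv_text = """order_id,order_date,amount
1,2023-07-01,19.99
2,2023-07-15,5.00
"""

# parse_dates asks pandas to convert the named column to datetimes on load.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["order_date"])

print(df.dtypes)
latest = df["order_date"].max()
```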

ChatGPT Interpreter can execute code

You can learn more at OpenAI’s blog.

As a nod to the power of actually being able to execute the code, ChatGPT now has a Python interpreter built in. This means that for certain data science tasks, including visualizing an uploaded file or doing arithmetic, it can execute the code it generates. You will get the actual output, rather than code to be executed elsewhere. (Of course if you’re building a large project, its home is likely not the ChatGPT web interface.)
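The idea of running generated code and returning its output can be sketched in a few lines. This is a toy illustration, not a description of how OpenAI runs code: the `generated_code` string stands in for model output, `run_generated` is a hypothetical helper, and `exec` provides no sandboxing, so it should never be used on untrusted code.

```python
import contextlib
import io

# Stand-in for a block of code that a model might have generated as text.
generated_code = """
values = [3, 1, 4, 1, 5]
print(sum(values) / len(values))
"""


def run_generated(code):
    """Execute a string of Python and return whatever it printed."""
    buffer = io.StringIO()
    namespace = {}  # fresh namespace so the snippet cannot touch our variables
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # unsafe for untrusted input; illustration only
    return buffer.getvalue()


output = run_generated(generated_code)
print(output)
```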

ChatGPT is still “just” generating text

As a brief bit of context, remember that ChatGPT on its own does not know how to do anything except generate text. It generates sentences one token (roughly a word or word fragment) at a time, choosing each one based on how probable it is to come next. By adding the code interpreter and running the code it generates (which is text!), ChatGPT is now able to accomplish real programming tasks and produce “real” outputs beyond blocks of text. Essentially, OpenAI gave ChatGPT arms in addition to a brain.
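To make the "one piece at a time" idea concrete, here is a toy sketch of greedy generation. The probability table is entirely made up, and a real model works over tokens with a learned network and usually samples rather than always taking the single most likely option.

```python
# Hypothetical next-word probabilities, invented for this example.
NEXT_WORD_PROBS = {
    "<start>": {"import": 0.6, "print": 0.4},
    "import": {"pandas": 0.7, "numpy": 0.3},
    "pandas": {"as": 0.9, "<end>": 0.1},
    "as": {"pd": 0.95, "np": 0.05},
    "pd": {"<end>": 1.0},
}


def generate(max_words=10):
    """Build a sentence by repeatedly taking the most probable next word."""
    words = []
    current = "<start>"
    for _ in range(max_words):
        candidates = NEXT_WORD_PROBS.get(current)
        if not candidates:
            break  # no known continuation for this word
        current = max(candidates, key=candidates.get)
        if current == "<end>":
            break
        words.append(current)
    return " ".join(words)


text = generate()
print(text)
```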

The Problem: ChatGPT is not a Data Science Notebook

Conceptually, combining the power of generative AI with code execution makes sense. However, in practice, ChatGPT was never going to replace Jupyter notebooks. Even after adding the ability to execute code, ChatGPT lacks the core features that enable end-to-end analysis.

The most obvious is that a chatbot is far from being a development environment. Version management, package installation, and secrets management are specialized features that are table stakes for development, and ChatGPT’s developers probably won’t handle them because it’s not their true market.

Also, we want help with code, but not every step calls for AI. Despite the hype surrounding AI, it won’t replace human beings doing data science and data analysis. As such, we desperately need the ability to break out of AI chatbot-mode and go back into Jupyter programming mode. Whether it’s a step that’s simpler to just code, or it’s fixing hallucinations, or it’s making refinements, I need human-first access to code too. Data science tasks require iterative processing with human input at each step.

Maybe most importantly, contextual knowledge across variables, datasets, and packages is a specialized skill set that ChatGPT does not natively have. Yes, there is a window of short-term memory that ChatGPT has access to: if you ask to build on the last block of code run, it will. However, if we have four SQL datasets that need to be individually cleaned, joined, and used together, this requires specific software engineering that, once again, is just not ChatGPT’s forte.
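One piece of that engineering is collecting what the workspace already contains and handing it to the model along with the request. The sketch below is an assumption about how such context could be assembled, not a description of any product's implementation. `describe_tables` and `build_prompt` are hypothetical helpers, and it uses two in-memory SQLite tables rather than four to stay short.

```python
import sqlite3


def describe_tables(conn):
    """Return one line per table listing its column names and types."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        col_text = ", ".join(f"{col[1]} {col[2]}" for col in cols)
        lines.append(f"{name}({col_text})")
    return "\n".join(lines)


def build_prompt(conn, question):
    """Prefix the user's request with the schema of every available table."""
    return f"Tables available:\n{describe_tables(conn)}\n\nTask: {question}"


# Made-up schema, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT);
CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL);
""")

prompt = build_prompt(conn, "Join orders to customers and sum total per customer.")
print(prompt)
```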

Einblick: the power of a Jupyter Notebook with the simplicity of ChatGPT

Einblick has created an AI-native data science environment, which integrates the benefits of ChatGPT directly into its canvas-based data notebooks.

Unlike ChatGPT, which allows users to ask programming questions but lacks integration with their workspace and contextual knowledge, Einblick Prompt seamlessly interacts with the canvas and even the user's database. It understands the existing content and can immediately test, edit, and share the code generated by large language models (LLMs) within the workspace.

Data science needs a canvas

  • 2D canvases that let you set up and organize analyses in a way that mirrors how data scientists actually explore data and build workflows
  • Mix modalities, and go straight from SQL to Python (or query Python dataframes with SQL)
  • Share and collaborate on multiplayer canvases

Ultimately, you want to focus on insight generation. By offloading repetitive and mundane coding tasks to generative AI, you can focus more on higher-level problem-solving and data analysis. Einblick can boost productivity by reducing the time spent on routine coding tasks and allowing you to spend more time on critical thinking.

Oh, and maybe the best part – Einblick costs less than a ChatGPT Plus subscription. Try for free. Just ask Einblick Prompt a few questions and see how it can generate smart answers for you. Prompt is available in every canvas, at no extra cost to users.

About

Einblick is an AI-native data science platform that provides data teams with an agile workflow to swiftly explore data, build predictive models, and deploy data apps. Founded in 2020, Einblick was developed based on six years of research at MIT and Brown University. Einblick is funded by Amplify Partners, Flybridge, Samsung Next, Dell Technologies Capital, and Intel Capital. For more information, please visit www.einblick.ai and follow us on LinkedIn and Twitter.