In a notebook, you can do a lot, from preprocessing data to EDA to tuning machine learning models, which is great! But notebooks demand a lot of upfront work that you, as a data scientist, must repeat every time you start analyzing data and building models. Then, you have to scroll through dozens of Python cells to compare models and visualizations.
As infuriating as Python notebooks can be, certain aspects of them are functional and convenient in today’s data science workspace. Einblick eliminates repetitive and mundane everyday tasks with an innately collaborative visual canvas and a remarkably fast progressive computation engine.
To that end, we have embedded some of the best aspects of data science notebooks into Einblick’s platform so that our users can import existing code, while simultaneously reaping the benefits of Einblick’s unique capabilities.
What’s New: Notebook Features
The new features equip data scientists with easy ways to continue existing work, collaborate with others, and move seamlessly from any Python notebook to Einblick’s innovative environment.
- Import any Python notebook into Einblick, and resume work immediately, without the limitations of a notebook
- Classic keyboard shortcut (Enter on PC) lets you run Python cells in a couple of keystrokes
- Einblick’s Python cells all run in the same kernel, allowing you to save and manipulate variables across cells
- Drag your cleaned dataframe directly onto the canvas as a table or chart from your Python cell, bridging the code and visual environment
- Share your entire workspace with non-technical stakeholders and discuss results directly on the canvas together
- Kick off and compare ML models from multiple AutoML runs at once, side by side
- Speed up exploratory data analysis by creating multiple visualizations and analytic operators from the same Python cell
- Utilize Einblick’s progressive computation engine with operators like AutoML
Get Started with Einblick
To leverage all of Einblick’s capabilities on an existing project, first open Einblick’s Main Menu. Once you’re there, you can import a notebook in one of two ways.
1) Import a Notebook
Option 1: Click the “Import Notebook” button in the upper right-hand corner
Option 2: Drag-and-drop a notebook into the Main Menu
Now that you’ve imported a notebook, Einblick has created a new canvas, pre-populated with all of the markdown and Python cells from your original notebook. The cells are still in the same order, but now you have the freedom of an expansive, collaborative, highly visual canvas.
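If you want to check what an import should carry over, you can list a notebook's cells yourself. The snippet below is a minimal sketch that relies only on the standard .ipynb JSON layout (a top-level "cells" list whose entries have a "cell_type" and a "source"). It is not Einblick's importer, and the helper name list_cells and the sample notebook are made up for illustration.

```python
import json


def list_cells(notebook_json: str) -> list[tuple[int, str, str]]:
    """Return (position, cell_type, first_line) for every cell, in file order."""
    notebook = json.loads(notebook_json)
    summary = []
    for position, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # The notebook format allows "source" to be one string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        first_line = text.splitlines()[0] if text.strip() else ""
        summary.append((position, cell["cell_type"], first_line))
    return summary


# Tiny hypothetical notebook, inlined so the example runs without a file.
sample_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Churn analysis\n", "Notes here."]},
        {"cell_type": "code", "metadata": {}, "execution_count": None, "outputs": [],
         "source": ["import pandas as pd\n", "df = pd.read_csv('churn.csv')"]},
    ],
})

for position, cell_type, first_line in list_cells(sample_notebook):
    print(position, cell_type, first_line)
```

To inspect a real notebook, read the file's text and pass it to the same function.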
2) Run your Python cells
We’ve added a familiar keyboard shortcut to Einblick’s functionality (Enter on PC) to run a Python cell.
You’ll also notice familiar square brackets on the left side of every Python cell in an Einblick canvas. Now you can track the order in which your cells have been run, regardless of their location on the canvas.
3) Grab your data
- Using an Einblick connector: Create a database connection, and then use the SQL operator to write your desired query directly onto the canvas.
- Working with a CSV: If you’ve already uploaded a CSV, simply find it using the “+” button under datasets. Otherwise, hit the “create a new dataset” button and upload it directly.
- Connect with an API or driver: If you’re pulling in data within the script, you should be all set already! Einblick supports installing the packages you need in order to pull in data from external sources. Just make sure that you have the %pip install line at the top of the cell, and that the data eventually ends up in a pandas dataframe.
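As a rough sketch of the API or driver option, the cell below turns a JSON response into a pandas dataframe. The payload and the helper name records_to_dataframe are invented for illustration; in a real cell the text would come from whatever client library you installed, and the %pip install line (shown here as a comment, since it is notebook magic rather than plain Python) would name that package.

```python
# In a notebook cell, install what you need first, for example:
# %pip install pandas
import json

import pandas as pd


def records_to_dataframe(payload: str) -> pd.DataFrame:
    """Parse a JSON array of records and flatten it into a dataframe."""
    records = json.loads(payload)
    return pd.json_normalize(records)


# Hypothetical API response, inlined so the example runs offline.
sample_payload = (
    '[{"id": 1, "city": "Boston", "sales": 120.5},'
    ' {"id": 2, "city": "Austin", "sales": 98.0}]'
)

df = records_to_dataframe(sample_payload)
print(df)
```

Once the data is in a dataframe, the rest of the workflow no longer depends on where it came from.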
4) Accelerate your data science workflow with Einblick’s operators
At Einblick, we’ve created operators, which are essentially pre-packaged bits of code that you can use, so you don’t have to Google syntax AGAIN. You can find a full list of our operators on the left side of every Einblick canvas. Here’s a shortlist of popular ones, grouped into categories:
- Data cleaning
  - Detect Outliers: uses the IsolationForest algorithm to detect outliers in a dataset using the specified columns
  - Remove Duplicate Rows: removes all (but one) duplicate rows from a dataframe
- Machine learning
  - AutoML: easily build accurate models (and drag each one out to compare pipelines)
- Statistical analysis
  - Key Driver Analysis: automatically discover fundamental patterns/drivers in your data
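For a sense of the syntax the two data cleaning operators save you from looking up, here is a hand-written approximation using pandas and scikit-learn. It is a sketch under assumptions, not Einblick's implementation: the sample data, the contamination value, and the random seed are all made up for illustration.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Made-up sensor readings with one repeated row and one extreme value.
raw = pd.DataFrame({
    "sensor": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "j", "k"],
    "reading": [10.0, 11.0, 9.0, 10.5, 12.0, 11.5, 9.5, 10.2, 11.1, 9.8, 9.8, 500.0],
})

# Remove duplicate rows: keep the first copy of each fully identical row.
deduped = raw.drop_duplicates().reset_index(drop=True)

# Detect outliers on the chosen columns with an isolation forest.
# contamination is the assumed share of outliers; 0.1 is an arbitrary choice here.
model = IsolationForest(contamination=0.1, random_state=0)
labels = model.fit_predict(deduped[["reading"]])  # -1 marks an outlier, 1 an inlier

outliers = deduped[labels == -1]
print(outliers)
```

The columns passed to fit_predict play the role of the operator's specified columns, so adding more of them changes which rows look unusual.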