Blog

Becca Weng - November 18th, 2022

In this post, we’ll be breaking down the what, why, and how of exploratory data analysis (EDA). We’ll start with a brief overview of what exploratory data analysis is, why it’s important, and how to approach it at a high level, and then we’ll dig into a specific example of EDA using a Goodreads dataset available on Kaggle. Throughout the example, we’ll cover some of the fundamental libraries you need to be successful as a data explorer.

Becca Weng - November 11th, 2022

Data cleaning as part of data preparation can involve many steps, tools, time, and resources. In this article, we’ll simplify the data cleaning process, and focus on how to clean data in Python using built-in packages and commands.

Einblick Content Team - November 3rd, 2022

Part of the data wrangling process is to cleanse, aggregate, or otherwise manipulate data in preparation for analysis, visualization, or storage in a database. Read on to learn more about data wrangling.

Einblick Content Team - November 1st, 2022

Whether it's data preparation or going in-depth on the best steps to take to transform your data into something actionable, we have you covered. With that in mind, let’s walk through a comprehensive review of data cleaning.

Einblick Content Team - October 20th, 2022

Alteryx is a popular data science and analytics automation software program, but it can be a bit expensive. You may be looking for alternatives and want to understand the marketplace a bit better before committing to a solution.

Einblick Content Team - October 5th, 2022

Data exploration is the process of analyzing datasets to find patterns and relationships, and is sometimes more formally referred to as exploratory data analysis (EDA). Learn more about data exploration techniques that will help you build predictive models and craft compelling narratives.

Einblick Content Team - September 29th, 2022

Churn analytics is the process of measuring and understanding the rate at which customers quit the product, site, or service. Churn analytics is critical for getting a performance overview, identifying improvements and understanding which channels are driving the most value.

Einblick Content Team - September 26th, 2022

Data transformation tools help standardize data formatting, apply business logic, and otherwise play the transform role in ETL (extract, transform, load) or ELT (extract, load, transform). These tools are used to provide a more consistent, uniform execution of data transformations, regardless of data source.

Einblick Content Team - September 21st, 2022

Data profiling is the process of examining data from various sources and collecting statistics or summaries about the data. This process can help you check if you have the right kind of data for your problem, as well as ensure data quality.

Becca Weng - September 19th, 2022

In this post, I will highlight the core paradoxes that Gartner introduced to help the data and analytics community unleash innovation and transform uncertainty. By unifying seemingly disparate concepts, Gartner’s summit created opportunities for new perspectives on age-old problems.

Becca Weng - September 13th, 2022

In a notebook, you can do a lot, from preprocessing data to EDA to tuning machine learning models, which is great! But in notebooks, there’s a lot of upfront work that you, as a data scientist, must do every time, both before and as you start analyzing data and building models.

Cynthia Leung - September 6th, 2022

We conducted a survey about the top challenges facing data scientists and data professionals across industries. Remarkably few of the responses were about model accuracy; many were about collaboration, process, and communication.

Cynthia Leung - August 2nd, 2022

Data science notebooks are powerful, flexible tools that data scientists use every day. But they are code-heavy, linear workflows that do not properly address data scientists' need for multi-stakeholder collaboration, reproducibility, fast iterative discovery, and the operational work of deployment. We explore a few ways notebooks fail data scientists here.

Paul Yang - June 1st, 2022

Historically, Machine Learning algorithms were a bit painful to use, and required tedious human intervention in order to tune hyperparameters. Recent innovations in AutoML mean that data scientists can now get better models in less time, by using new tools that support automatic exploration of how to assemble the best ML pipeline.

Tim Kraksa - March 27th, 2022

Low-code tools are revolutionizing businesses, enabling citizen developers to create new business applications that drive innovation. Now, the same thing is starting to happen for citizen data scientists.

Benedetto Buratti - February 1st, 2022

As organizations made data analytics a strategic priority, demand for data analysis outputs exceeded the supply of trained data scientists. To bridge the gap, no-code workflow platforms (KNIME, Alteryx…) were developed to make advanced data science easier and to open it up to wider audiences.

Paul Yang - November 1st, 2021

Move fast and break things, but still be data informed. Startups must tailor their data analytics practices to focus on delivering strategic insights quickly. These are a few observations we’ve made in our partnerships with startups, as Einblick helps lean organizations produce better analytics.

Paul Yang - October 6th, 2021

While code can accomplish everything, there is a set of repetitive operations where visual, no-code operators will help every data scientist. In that way, no-code operators are just the next logical extension of importing libraries.

Paul Yang - August 27th, 2021

Why have advancements in Machine Learning (ML) translated only imperfectly into better data-driven decision making? How can business line stakeholders and data scientists bridge the gap between quality analysis and executed changes?

Benedetto Buratti & Paul Yang - March 1st, 2021

In data science, there are many different versions of correctness. Accuracy itself can be highly misleading: we don't want accurate nuclear launch detection and we don't want accurate self-driving cars.

Paul Yang - January 26th, 2021

But it’s 2022 and it’s time to say goodbye to spreadsheets as the primary tool for data analysis. You should be able to work in a fast, collaborative space for business analysis, and harness innovations in AI/ML to quickly identify key drivers and even access predictive modeling.