Everyone Is Still Unhappy About Data in 2023
Your organization generates more data than ever, and we assume that you already agree that data-driven decision-making is critical. This is not another thinkpiece about the importance of data.
However, the principal problem for most organizations is that everyone is unhappy with their relationship with data:
- Data scientists and data engineers are expensive to hire.
- Shared-services data teams take a long time to produce outputs.
- Separating business or domain expertise from data science expertise leads to incomplete answers or the wrong key takeaway.
- Results are delivered in batches rather than through an iterative conversation focused on exploring a topic fully.
This is because there’s been a historically high barrier preventing the average insights consumer (imagine any marketing manager, for instance) from accessing advanced data insights. Data scientists needed two different skills:
- Understanding how to frame business questions in terms of the data that is accessible
- Knowing a second language (e.g. Python, R, or SAS)
To use an absurd analogy: if only Swedish speakers could play tennis, there would be far fewer tennis players. Yet data analysis has always worked this way: you had to learn a second language while also gaining the skills to build complex analyses.
Why No Code Didn’t Work (But GenAI Can)
But we are now in an exciting time, where the barrier to entry has been halved. Generative AI technologies have become expert at two things: writing text in response to a user's request, and translation. The knock-on effect is that they have become exceptionally proficient at translating what a user means into code. You can ask an LLM to write poetry like Shakespeare in Portuguese, and you can just as easily ask it to write Python to make a chart.
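To make that concrete, here is a minimal sketch of the kind of Python an LLM might return for a request like "total the sales by region and chart it." The data, the function names, and the text-based chart are all assumptions made for illustration (a real answer would more likely use a plotting library); this is not output from any particular tool.

```python
from collections import defaultdict

# Hypothetical sample data; in practice this would come from a file or database.
sales = [
    ("North", 120),
    ("South", 80),
    ("North", 60),
    ("West", 100),
    ("South", 40),
]


def total_by_region(rows):
    """Sum the sales amount for each region."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def text_bar_chart(totals, unit=20):
    """Render one line per region, largest first, with one '#' per `unit` of sales."""
    lines = []
    for region, amount in sorted(totals.items(), key=lambda kv: -kv[1]):
        bar = "#" * (amount // unit)
        lines.append(f"{region:<6} {bar} {amount}")
    return "\n".join(lines)


totals = total_by_region(sales)
print(text_bar_chart(totals))
```

The point is less the code itself than the fact that the request was a single English sentence, and the learner can read the result line by line to see what was done.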
Previous attempts at “citizen data science” and “no code” were not really helpful because all no code does is map the same technical language into buttons instead of text. It seems less frightening but is not actually any less complex. Building a pivot chart in Excel requires you to click between 8 and 12 buttons in the right order, and you can think of that as a language in and of itself.
Unfortunately, it’s not yet a “free lunch.” Folks will still have to learn what “F1 Score” means or know what data cleansing techniques exist. Generative AI solutions can make suggestions, but human domain expertise and knowledge are still additive. Importantly, though, we have greatly lowered the bar to start learning.
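For readers who have not met the term, the F1 score is the harmonic mean of precision (how many predicted positives were right) and recall (how many actual positives were found). The sketch below computes it from scratch under the assumption of a simple yes/no classifier; the function name and the example labels are made up for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for one positive class from two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 1 = customer churned, 0 = customer stayed.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
score = f1_score(actual, predicted)
print(f"F1 score: {score:.2f}")
```

In this example the model makes one false alarm and misses one real churner, so precision and recall are both 0.75 and the F1 score is 0.75 as well.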
If You Build It, Some of Them Will Come
By lowering the bar to start data science projects, two critical things happen.
First, more people will be willing to hop over the bar and start learning data science concepts. Simply installing Python, installing Jupyter Notebook, installing packages, and figuring out how to run everything is already tedious and far from straightforward. Once the bar is lower and English is all that is needed, eager learners can simply get started. Data science concepts can be tied to real business problems, and will be much less intimidating than coding for the first time.
Second, the time invested in learning data science starts paying dividends much sooner. Learners get wins sooner, which is always good for morale. And for the organization, this makes devoting resources easier to stomach, as it won’t be purely academic for 12 weeks. No, a 22-year-old marketer will not transform into an ML expert overnight. But they can start producing self-serve charting and correlation analysis after 30 minutes of concentrated learning. If you focus on the people on the team who are already eager to learn, and give them the space (and a safe opportunity to be wrong!), you will have “fairly good analysts” pretty quickly.
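As a rough idea of what a first "correlation analysis" can look like, the sketch below computes a Pearson correlation coefficient by hand. The weekly ad spend and signup figures are invented for illustration, and the helper name is an assumption; a learner would normally ask for this in plain English and get something similar back.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, left unnormalized because the factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures for a marketing team.
ad_spend = [10, 20, 30, 40, 50]
signups = [12, 25, 31, 48, 55]

r = pearson(ad_spend, signups)
print(f"Correlation between ad spend and signups: {r:.2f}")
```

A value close to 1 means the two series rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear relationship. It says nothing on its own about cause and effect, which is exactly where domain expertise comes in.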
And really, what’s the worst-case scenario for pushing teams towards learning data science concepts?
Stop Relying on Excel
Jumping off from that point, let’s take a detour to both praise and attack Excel very specifically (we all know that’s the primary business analytics tool everywhere).
While Excel may not excel in any single dimension, it has compensated by offering a versatile set of features that let users perform data transformations and analysis without coding, which makes it accessible to a wide range of people regardless of their technical expertise.
But Excel cannot handle large datasets, nor can it solve sophisticated problems. Now that you can accomplish the same things as Excel with a few English sentences, this is your wakeup call to stop using Excel for everything in your organization.
Try Einblick: We’re Early to the Game
There are other code-completion tools available (like GitHub Copilot) that help experienced programmers become more productive. But we’re the first AI-native data science platform. Code is given equal (or lesser) weight compared with conversational workflows with our AI agent, Einblick Prompt. Prompt can reason alongside your data teams to determine the right tool for the right task, and build out data workflows in as little as one sentence.
Einblick is an AI-native data science platform that provides data teams with an agile workflow to swiftly explore data, build predictive models, and deploy data apps. Founded in 2020, Einblick was developed based on six years of research at MIT and Brown University. Einblick is funded by Amplify Partners, Flybridge, Samsung Next, Dell Technologies Capital, and Intel Capital. For more information, please visit www.einblick.ai and follow us on LinkedIn and Twitter.