Over the last two decades the analytics space has changed drastically. Data has gone from scarce to superabundant, and the popularity of data analytics as a means of making better decisions has exploded. However, taking advantage of data is hard, as it requires technical expertise in data management, visualization, machine learning, and statistics, among other disciplines. This explains why we've seen so many new visual tools appearing on the market, all of them aiming to make data analytics more broadly accessible. To date, there have been three waves of tools aimed at democratizing analytics.
The first wave: finding answers to “what happened?”
Descriptive analytics centers on the question, What happened? Arguably, this is the foundation of any business. It's used to understand how sales have developed, to track manufacturing costs, and to monitor numerous other aspects of the past. Traditionally, descriptive analytics was done in the form of reports, and the tools to create those reports were cumbersome, requiring extensive knowledge of SQL.
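The kind of "what happened?" question such a report answers can be sketched in a few lines. The records and field names below are hypothetical; the aggregation mirrors what a SQL `GROUP BY` report or a BI dashboard tile would show.

```python
from collections import defaultdict

# Hypothetical sales records -- illustrative data only.
sales = [
    {"region": "East", "quarter": "Q1", "revenue": 120_000},
    {"region": "East", "quarter": "Q2", "revenue": 135_000},
    {"region": "West", "quarter": "Q1", "revenue": 90_000},
    {"region": "West", "quarter": "Q2", "revenue": 80_000},
]

# Total revenue per region -- the descriptive summary, equivalent to
# SELECT region, SUM(revenue) FROM sales GROUP BY region.
totals = defaultdict(int)
for row in sales:
    totals[row["region"]] += row["revenue"]

print(dict(totals))  # {'East': 255000, 'West': 170000}
```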
The advent of Business Intelligence (BI) tools made it significantly easier to understand past data. Broadly speaking, this change came over three generations of BI tools. The first generation moved users away from reports toward (interactive) dashboards and easy-to-use visual editors. The second generation lowered the barriers to entry even further by moving the software from on-premise applications that were hard to install to the cloud. The still-evolving third generation of BI tools, sometimes referred to as augmented analytics, aims to increase the self-service aspect of descriptive analytics by allowing users to ask what happened-type questions using natural language, among other things.
Over these three generations, BI tools grew in power and functionality, but their goal largely remained the same: create the best visualization of past data. Moreover, the manner in which users interacted with BI tools also hasn’t changed much. A single user creates a single visualization using various dialogues over a single dataset, then composes several of these visualizations into a dashboard so that others can view it.
However, before the user creates a visualization, data integration and cleaning are usually done with external tools, which sometimes come bundled with the BI tool. Unfortunately, separating cleaning and integration from visualization makes it hard for users to understand the assumptions underlying a visualization. While this can cause serious problems in some scenarios, it is less of an issue when dashboards are carefully curated by an expert for consumption by others. Moreover, as most users tend to consume dashboards rather than create them, the tools focus on a single user creating visualizations rather than allowing people to work together to make new discoveries.
The second wave: finding answers to “what might happen?”
While understanding what happened is key to any business, it's a backward-looking approach. Often of equal interest is the question, What might happen?, also known as predictive analytics. Here, techniques like forecasting models and technologies like machine learning (ML) are dominant. These technologies used to be the exclusive domain of highly trained statisticians and data scientists fluent in Python or R. More recently, however, the market has seen a broad range of new visual tools that seek to make predictive analytics more widely accessible. These tools, sometimes referred to as self-service ML or Data Science platforms, provide visual user interfaces for building models and/or creating entire machine learning pipelines.
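A minimal "what might happen?" sketch, using the simplest possible forecasting model: fit a linear trend to past quarterly revenue by ordinary least squares and extrapolate one quarter ahead. The revenue figures are hypothetical, and real predictive tools use far richer models than a straight line.

```python
# Hypothetical revenue for quarters 1..4.
history = [100.0, 110.0, 125.0, 131.0]

n = len(history)
xs = range(n)
x_mean = sum(xs) / n
y_mean = sum(history) / n

# Ordinary least-squares slope and intercept of the trend line.
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history)) / \
        sum((x - x_mean) ** 2 for x in xs)
intercept = y_mean - slope * x_mean

# Extrapolate to the next (fifth) quarter.
forecast = intercept + slope * n
print(round(forecast, 1))  # roughly 143.5
```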
Interestingly, the user interfaces of self-service ML/Data Science tools are often quite different from those of BI tools, as they aim to create the best possible model. Rather than dialogue-based interfaces, they are usually built on top of visual workflow engines, where individual operations are represented by boxes, which are then connected by the user to form an entire ML pipeline. The advantage of this interface is that it makes it easier to understand how the data "flows" from its source and raw format to the final model to eventually create a prediction. This is particularly important for ML, as different ways to clean and encode data can have a profound impact on the final accuracy of the model. (In fact, this is also true for descriptive analytics, but it is often less obvious.)
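At its core, such a visual workflow is a chain of named steps through which the data flows. The sketch below models each box as a function and the pipeline as their composition; the step names, records, and toy "model" are all hypothetical stand-ins for what a real workflow engine would run.

```python
def load(_):
    # Source box: raw records, some with missing values.
    return [{"size": 50, "price": 200},
            {"size": None, "price": 310},
            {"size": 80, "price": 305}]

def clean(rows):
    # Cleaning box: drop rows with missing fields -- one of many possible
    # choices, and exactly the kind of decision that affects final accuracy.
    return [r for r in rows if all(v is not None for v in r.values())]

def train(rows):
    # Model box: average price per unit size, a stand-in for a real learner.
    return sum(r["price"] / r["size"] for r in rows) / len(rows)

pipeline = [load, clean, train]

data = None
for step in pipeline:  # pressing "play" runs every box in order
    data = step(data)

model = data
print(round(model, 2))  # learned price-per-size coefficient
```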
The downside of workflow engines, however, is that they do not provide any immediate feedback or interactivity. The user has to press a "play" button after curating the pipeline, which then starts the computation of the composed workflow, and it might take hours until the first result is produced. While some tools try to overcome this issue by providing more immediate feedback for parts of the pipeline through specialized interfaces (e.g., for hyperparameter tuning), they do so at the cost of the user's ability to see and understand the whole process.
The third wave: finding answers to “what should we do?”
While the focus in the first and second waves is on What happened? and What might happen?, the real underlying question every organization wants to answer is, What should we do? There are cases where the right action can be found just by understanding the past (first wave). For example, after identifying the most under-performing salesperson, the right action might simply be to let them go. However, in other cases, finding the right action might require building a forecasting model (second wave). For example, a model can help to weigh the deals a salesperson might bring in over the next year. Other cases may require evaluating several scenarios and considering factors like the risk that the salesperson might take some clients with them. Tools that help in these situations include what-if analysis, optimization tools like constraint solvers, and other techniques.
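A toy version of this "what should we do?" question can be phrased as scenario evaluation: enumerate the candidate actions, score each one by expected value, and pick the best. All figures below (deal volume, salary, attrition risk, revenue at stake) are invented for illustration; real prescriptive tools would plug forecasts and constraints into a solver instead.

```python
# Hypothetical scenarios for the salesperson decision above.
scenarios = {
    "keep":   {"deals": 400_000, "salary": 150_000, "risk": 0.0, "at_stake": 0},
    "let_go": {"deals": 0,       "salary": 0,       "risk": 0.3, "at_stake": 500_000},
}

def expected_value(s):
    # Expected deals, minus salary, minus the expected revenue lost
    # if the salesperson leaves and takes clients along.
    return s["deals"] - s["salary"] - s["risk"] * s["at_stake"]

best = max(scenarios, key=lambda name: expected_value(scenarios[name]))
print(best, expected_value(scenarios[best]))  # keep 250000
```

A what-if analysis then amounts to varying the assumptions (e.g., the attrition risk) and re-running the evaluation to see when the best action flips.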
Unfortunately, none of these techniques, which were originally framed as prescriptive analytics, is easy to use, and so far they can only be found in highly specialized verticals.
We believe that making the best decision cannot rely on what-if analysis or linear solvers alone. Rather, finding the right action requires understanding the past (descriptive analytics), building models for the present and future (predictive analytics), and techniques to analyze different scenarios to optimize decisions (prescriptive analytics). It requires moving quickly between these modalities, as all three are needed to derive the best outcome. At the same time, finding actionable insights requires neither pixel-perfect dashboards nor the best possible model. Decisions usually need to be made in a timely fashion and are often just made once. The focus should therefore be on fast prototyping and making sure that everyone understands and signs off on the entire decision process, from data integration and cleaning, through modeling, to the final decision optimization.
Current analytics tools for descriptive or predictive analytics do not meet that need, nor can they be easily adjusted for it, as they were designed for a different goal: creating the best visualization or model. They make it hard to quickly iterate between data integration and cleaning, and between descriptive and predictive analytics, and none supports techniques for prescriptive analytics. Moreover, their user interfaces are not designed for prescriptive analytics. On one hand, BI tools focus on a single visualization at a time and make it hard to see the entire pipeline. On the other hand, ML workflow tools do not provide the immediate feedback needed to understand past data, and none supports the creation of what-if scenarios or the solving of optimization problems, as their user interaction paradigms are too restrictive.
Another area where current analytics tools fall short is that decisions are rarely made alone. Good decisions tend to be the result of collaboration: they are discussed, checked, modified, and discussed again. However, current tools are designed for a single user, who can only share the final results.
In conclusion, democratizing prescriptive analytics requires a fundamental rethinking of the way people interact with data. What is needed is a much more flexible and collaborative approach, with radically reimagined user interfaces that combine the data exploration commonly done with BI tools, the flexibility of visual ML workflow tools, and a seamless integration of prescriptive tools, like optimization and what-if analysis. Einblick is our attempt to do exactly that. Einblick is the first visual data computing platform, combining the best aspects of workflow-centric AI tools with visualization-centric BI tools. It is built from the ground up to foster collaboration, whether remote or in-person, to enable teams to make data-driven decisions. If you'd like to learn more about Einblick, we highly recommend watching our short demo video, or reach out to us directly.