Programming languages, as we know them, should not be conflated with programming code. Code exists to provide computers with instructions. Since the inception of computing devices, many versions of code have existed, including code composed of gears, discs, and wheels.
The first modern computers relied on inscrutable machine languages. But ever since Grace Hopper invented the compiler in the early 1950s, it has been possible to translate a human-written language into machine language before execution. As a result, compared to early Assembly, code has evolved to be more straightforward to write, easier to read and share with others, and more general purpose.
But as the modern computing era approaches its second century, it is time to radically rethink what code actually means. Does it have to be a formal language, where we have to incant public static void main(String[] args) and the compiler moans about misplaced parentheses? Or can you write in French, Chinese, Latin, or Esperanto instead? Perhaps code should be written in natural language: for me, in plain English sentences.
Large Language Models (LLMs) are the ultimate compiler
This has suddenly become achievable. With recent improvements, Large Language Models (LLMs) are now remarkably good at interpreting the semantic meaning of data and very proficient at translation. In short, they can consume the metadata of my project, understand my plain English request, and combine the two into executable code.
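To make "combine the two" a little more concrete, here is a minimal sketch of the first half of that pipeline: turning project metadata plus an English request into a single prompt. Everything in it is an assumption for illustration. The function names, the prompt wording, and the penguin column names are hypothetical, and the call to an actual LLM is deliberately left out.

```python
def describe_columns(columns: dict[str, str]) -> str:
    """Render column names and types as one bullet line per column."""
    return "\n".join(f"- {name}: {dtype}" for name, dtype in columns.items())


def build_prompt(table_name: str, columns: dict[str, str], request: str) -> str:
    """Combine project metadata and a plain English request into one prompt.

    The prompt wording is a hypothetical example, not a recommended template.
    """
    return (
        "You write Python code for a data project.\n"
        f"Table: {table_name}\n"
        "Columns:\n"
        f"{describe_columns(columns)}\n"
        f"Request: {request.strip()}\n"
        "Respond with runnable code only."
    )


# Hypothetical metadata for a penguin measurements table.
penguin_columns = {
    "species": "str",
    "bill_length_mm": "float",
    "bill_depth_mm": "float",
    "flipper_length_mm": "float",
}

prompt = build_prompt(
    "penguins",
    penguin_columns,
    "plot bill length and depth, and flipper length "
    "in a clustered bar chart by species, and add labels",
)
print(prompt)
```

The resulting string is what would be handed to a model. The point of the sketch is that the metadata does much of the work: because the prompt names the columns, the English request can stay loose.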
By removing the restriction that code can only be composed of arcane programming language syntax, we give many more people the ability to program in every domain. At its core, computer programming lets you build things, design things, and output things, and humans need a lot to muddle through day-to-day affairs, so let everyone build charts from big data, query databases, design marketing landing pages, automate tasks, etc. There is an abundance of well-defined operations that largely exist as “code-only” but can be precisely described in words. Words are, after all, still what programmers use to communicate with each other when describing deliverables and providing feedback. So why lock up the ability to create in the restrictive realm of programming languages when we can use natural language now?
NOTE: I’ll pause to acknowledge the hyperbole in saying that programming languages will be “killed”: natural language does not contain the precision needed to build every program. But consider that a small group of people still write in Assembly. You just happen to never write in Assembly, because you don’t need to. In the same way, for the 99% of people who currently cannot code, a natural language compiler eliminates the need to learn programming languages in order to achieve their goals.
If you’re not convinced that programming languages can evolve, let’s go on the longest short history of programming.
In principle, code was never even required for computers.
Computers are generally defined as devices capable of performing logical operations. It is widely accepted that the Chinese abacus, dating back to before 300 BCE, is a rudimentary computer that can compute sums beyond simple counting. In this case, flicking your fingers is the code.
Over the next two millennia, more computers were built for single purposes, such as the analog tide-predicting machine. Again, there was no programming language or code, though it was a robust calculator of complex math. Instead, the “code” was written in a system of pulleys and gears.
Modern Computers–and Evolving Language
As we jump forward in time, let us race through a series of developments that led to the programming languages of today. (There is a fascinating history here that you should read; many folks like The Dream Machine.)
1937: Arrival of Binary
Claude Shannon writes his seminal thesis relating Boolean algebra to electrical engineering. If you count “0” and “1” as language, then sure, I guess we could call this language. At this point, very few people could code and almost no one had a reason to.
**This program is not guaranteed to be correct–I used an LLM to translate this from another language.
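To see why relating Boolean algebra to circuits mattered, consider a minimal sketch of binary addition built from nothing but the logical operations XOR, AND, and OR. This is an illustration written for this post, not something taken from Shannon's thesis, and the function names are hypothetical.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Add two bits and return (sum_bit, carry_bit)."""
    return a ^ b, a & b  # XOR gives the sum bit, AND gives the carry


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add two bits plus an incoming carry by chaining two half adders."""
    partial_sum, carry_a = half_adder(a, b)
    sum_bit, carry_b = half_adder(partial_sum, carry_in)
    return sum_bit, carry_a | carry_b  # OR merges the two possible carries


def add_bits(x: list[int], y: list[int]) -> list[int]:
    """Add two equal-length bit lists, least significant bit first."""
    if len(x) != len(y):
        raise ValueError("bit lists must have the same length")
    result, carry = [], 0
    for a, b in zip(x, y):
        sum_bit, carry = full_adder(a, b, carry)
        result.append(sum_bit)
    result.append(carry)  # the final carry becomes the top bit
    return result


# 3 is [1, 1, 0] and 5 is [1, 0, 1] when written least significant bit first.
print(add_bits([1, 1, 0], [1, 0, 1]))  # [0, 0, 0, 1], which is 8
```

Each operator here corresponds to a simple switching circuit, which is roughly the bridge between logic and electrical engineering that the thesis describes.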
1940s: Assembly Languages
Programs are still machine-specific, but mnemonics such as “ADD” and “MOV”, together with memory locations, add a human-readable, human-writable layer of language on top of what was previously purely numeric.
1957: FORTRAN adopts the concept of a compiler
The compiler enabled developers to write code at a higher level of abstraction and have it automatically translated into machine language. This breakthrough opened up new possibilities for higher-level programming.
By this point, we can see a structure that better follows human logic and a simpler, readable syntax, very reminiscent of today’s most popular programming languages.
Hey, that’s not bad!
C: A Recent Turning Point
1972: C is released
C quickly became (and still is) one of the most popular programming languages. Its ability to compile for different systems, coupled with its general-purpose design and fairly high performance, makes it an all-around champion.
It’s not so far removed from the equivalent syntax in FORTRAN–again, adopting a fairly human-readable, human-writable syntax.
Various other languages come and some of them go (and some of them are Go). Python is now king, C is still queen, Java is a court jester–and though your mega-cap technology companies will occasionally force a million people to learn Swift, the most popular languages are cast from similar molds.
Today: natural language and programming converge
To hound the point on simplicity, access, and winning the software language wars, remember that our toy program in Python is just:
Despite Python’s weaknesses, it has become a favorite in both web design and data science, radically different domains, for its simplicity and lack of boilerplate overhead. But what if we wanted to show a bar chart?
It takes 14 lines of code for the bar chart. This example comes directly from the Matplotlib documentation:
width = 0.25  # the width of the bars
multiplier = 0
fig, ax = plt.subplots(layout='constrained')
for attribute, measurement in penguin_means.items():
    offset = width * multiplier
    rects = ax.bar(x + offset, measurement, width, label=attribute)
    ax.bar_label(rects, padding=3)
    multiplier += 1
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Length (mm)')
ax.set_title('Penguin attributes by species')
ax.set_xticks(x + width, species)
ax.legend(loc='upper left', ncols=3)
ax.set_ylim(0, 250)
plt.show()
Its cousin Seaborn is slightly more succinct:
sns.set_style("whitegrid")
sns.set_palette("colorblind")
ax = sns.barplot(data=df, ci="sd")
ax.set_ylabel('Length (mm)')
ax.set_title('Penguin attributes by species')
ax.set_ylim(0, 250)
ax.legend(title=None, ncol=3)
plt.show()
Here, the formal syntax is clear to read and logical to write, but what a pain! There’s no way I can remember each of these functions and arguments, and even with autocomplete, it takes a few minutes.
I would much rather write the sentence: “plot bill length and depth, and flipper length in a clustered bar chart by species, and add labels.” If you can believe it, by combining the power of LLMs and some straightforward English, we are able to generate this plot right away, with just that sentence.
And for the majority of the people who ever need to make a chart, this is entirely sufficient. Carl Sagan quipped “If you wish to make an apple pie from scratch you must first invent the universe.” And for years, we have just been okay with the notion that you ought to first learn to program to “properly” make a bar chart in Python. But why are we still clinging to this notion when we have better tools now?
Let everyone program!
My conclusion is simple: dream big about the impact of programming in a common language.
In 1450, Johannes Gutenberg's invention of the printing press revolutionized the way information was produced and disseminated. Prior to this, books were expensive and copied by hand. However, the printing press allowed for books to be produced more quickly and inexpensively, and in common languages such as German, English, and French.
Previously, only the priest read the Bible to you; now, you could read it yourself. But the critical impact was not that you became a priest! During the Protestant Reformation, religion, and language itself, became accessible.
This proliferation of religious and non-religious texts led to increased literacy and knowledge, and the democratization of information during the Reformation. Increased access to information and knowledge made possible by the printing press helped to usher in the Enlightenment and its scientific advancement, democratic reform, religious tolerance, and achievements in the humanities.
In a digital age, in a digital economy, the barrier of programming languages locks a great many people out of the power of programming. Not everyone will become a software engineer, and we don’t need everyone to be a software engineer! But giving billions more people access to the power of programming will show us all the creative and scientific things they might accomplish beyond writing software. Some might even go as far as to forecast a New Enlightenment.
That’s to say, I am very excited about the democratization of programming that is beginning to happen.