How Conversational Programming Will Democratize Computing
OpenAI’s GPT and other Large Language Models (LLMs) have released a kind of breathless energy from parts of the tech community who have invested their time at the bleeding edges of computing, and who recognize opportunities looking back at them. In my previous post, I looked soberly at how LLMs can help developers now, and the existing utility of Copilot. But what about the degree to which everyone else can build software? Looking past the lack of battle-hardened examples, there are now enough clues to see where all this is probably going.
Introducing Conversational Programming
First of all, a rather important naming issue. I will refer to conversational programming in this post, even though that term is already very familiar in another domain (CNC machine tooling software). Forcefully reclaiming it will not be trivial. However, I think the term “prompt engineering”, while popular with users of image creation services like Midjourney, seems to imply a technical approach that is the antithesis of a conversation.
So what do I mean by “conversational programming”? ChatGPT has shown us that we no longer require a “machine whisperer” to get a computer to do useful things. Conversational programming moves the focus away from the specialization and planning associated with classic software use, towards informality and reactive steps. The conversation between a crew member and the ship’s computer in Star Trek is the hidden mental model most geeks harbor and remains a useful lodestar. Whether through text or speech, computing will slowly disappear into the background — much like Homer Simpson into the hedge. But there are a number of rules that should guide the early measurement of these systems, and I’ve teased some of them out below.
Let’s Have a Chat
Conversations in real life start with a common or shared context. This helps us pare down the seemingly limitless possibilities of language. LLMs need this context for the same reasons; it’s very different from a blinking cursor on a system with no expectations of past and future. Most innovations will come within existing apps because their users already share a context. The word “chat” in “ChatGPT” is doing more work than you think: it prepares us for the idea of an informal conversation with a stranger. Plus, that computer in Star Trek shares the same circumstances as the crew.
Different Conversations, Same Result
Two people who explain the same thing in different ways must get the same results if we are to trust a system. Otherwise, we are just back to prompt engineering and machine whisperers. While this feels like it should be part of training data, we already know that ChatGPT has a habit of answering in different ways to similar input. If GPT systems have access to a set of commonly asked requests, then they can better reason about the likely meaning of similar requests. As ChatGPT is associated with one company at present, and we are not looking too hard at security and legal issues, this is a reasonable proposition.
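To make the “set of commonly asked requests” idea concrete, here is a minimal sketch in Python. It is a toy lexical matcher, not a description of how GPT works internally: a real system would more likely lean on the model itself or on embeddings. The catalogue, the intent names and the threshold are all assumptions invented for illustration.

```python
import re

# Hypothetical catalogue: canonical intent -> example phrasings people commonly use.
COMMON_REQUESTS = {
    "create_team_page": [
        "create a new team wiki page",
        "make a wiki page for my team",
    ],
    "rename_page": [
        "rename the team page",
        "change the title of the page",
    ],
}


def tokens(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap between the word sets of two phrases (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def canonical_intent(request, threshold=0.3):
    """Return the best-matching intent, or None if nothing is close enough."""
    best_intent, best_score = None, 0.0
    for intent, examples in COMMON_REQUESTS.items():
        for example in examples:
            score = similarity(request, example)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


# Two differently worded requests resolve to the same intent.
print(canonical_intent("Please create a team wiki page"))
print(canonical_intent("Make a new wiki page for the team"))
```

The point of the sketch is only the shape of the idea: differently worded requests are resolved to one shared meaning before any work is done, so the result no longer depends on who is asking or how.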
Not Standards, but Understood Objects
As well as an agreed context, there must be an agreement on the form of the unit of work, outcome or progress. But not in the old sense of a “standard”. As long as conversions exist, it takes little or no effort for an LLM to apply them and change the described form of an object to suit fixed specifications. This is because the GPT engine can track what an image, a calendar appointment, a document or a stone is while applying outward traits to them. The implication is that apps and organizations that already specialize in these model domains will receive pressure to make these areas available for autonomous requests. At the moment, LLMs are trained largely on what’s on the web, and that is the same for everyone. What we don’t want is corporate training that inculcates a brand or product as a base type.
Hi AI, It’s AI Calling
Talking of autonomous requests, to achieve their goals LLMs will launch background tasks that return with the required information. AutoGPT projects are trying to use APIs to connect to other LLMs and act as task management agents. What makes LLMs different from any other goal-orientated system is that GPT can analyze its own reasoning and even criticize the outcome.
How will people adapt their existing systems? This is the job of the ChatGPT plugin, and Simon Willison gives a good description of using it. He maintains Datasette, an established tool for exploring and publishing data:
“Building ChatGPT plugins, like so much involving Large Language Models, is both really easy and deceptively complicated. You give ChatGPT a short, human-ish language description of your plugin and how to use it, and a machine-readable OpenAPI schema with the details of the API. And that’s it! The language model figures out everything else. Datasette exposes a JSON API that speaks SQL. ChatGPT knows SQL already, so all my prompt needed to do was give it some hints.”
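The two pieces mentioned in that quote can be sketched roughly as follows. This is not Simon Willison’s actual plugin. The manifest field names follow the ChatGPT plugin manifest format as I understand it, so check them against OpenAI’s documentation. The URL, the names, the descriptions and the `/query.json` endpoint with its `sql` parameter are hypothetical placeholders.

```python
import json


def build_manifest(base_url):
    """The short, human-ish description of the plugin (all values hypothetical)."""
    return {
        "schema_version": "v1",
        "name_for_human": "Example Data Explorer",
        "name_for_model": "example_data",
        "description_for_human": "Ask questions about the example database.",
        # This text is the "hint" given to the model about how to use the API.
        "description_for_model": (
            "Run read-only SQL queries against the example database. "
            "Use SELECT statements only."
        ),
        "auth": {"type": "none"},
        "api": {"type": "openapi", "url": f"{base_url}/openapi.json"},
    }


def build_openapi(base_url):
    """A machine-readable OpenAPI description of one hypothetical SQL endpoint."""
    return {
        "openapi": "3.0.1",
        "info": {"title": "Example SQL API", "version": "v1"},
        "servers": [{"url": base_url}],
        "paths": {
            "/query.json": {
                "get": {
                    "operationId": "query",
                    "summary": "Execute a SQL query and return rows as JSON",
                    "parameters": [
                        {
                            "name": "sql",
                            "in": "query",
                            "required": True,
                            "schema": {"type": "string"},
                        }
                    ],
                    "responses": {"200": {"description": "Query results"}},
                }
            }
        },
    }


if __name__ == "__main__":
    base = "https://example.com"  # placeholder host
    print(json.dumps(build_manifest(base), indent=2))
    print(json.dumps(build_openapi(base), indent=2))
```

Notice how little of this is code in the traditional sense: most of the effort goes into the plain-language description, which is exactly the “really easy and deceptively complicated” part.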
The Human Stack
The scope of a conversation must mirror a human “mental stack”, not that of a computer. When I use a conventional Windows interface on my laptop, I am confronted with the computer’s file system, which is presented as folders and files. That effort is reversed in conversational programming: the LLM system has to work with my limited human cognitive faculties. This means creating things in response to requests, and reporting outcomes at the same level that I asked for them. Returning arcane error codes in response to requests will immediately break the conversation. We have already seen ChatGPT reflect on its errors, which means a conversation should retain its value for the user.
Midjourney responds to requests for new images by showing you four alternatives, and you can then further operate on them. The scrolling Discord log (Midjourney is commonly used within the Discord app) acts as a conventional “memory”, but otherwise I can work moment by moment.
A Conversation Example
Here is an example of me creating a wiki page via conversation with a GPT-enhanced corporate wiki server.
David: Create a new team wiki page
Wiki: Hi David. You don’t have permission to create pages in the company wiki space at the moment, but I’ll email the owner and request permission. Meanwhile, I’ll create a provisional page in your private area and publish it when permission is granted. There is an existing space team template, so I will use that. I’ll name your team “Team Amazing Spoon”; you can rename it before publishing. I’ve added a team member section and added you as the first member. I have emailed you the link to the provisional page.
David: Rename the team page “Team Awesome Teacup”
Wiki: I have renamed the page to “Team Awesome Teacup”. I will register the page in the teams directory when the page is published.
Note how the system deals with the first problem and handles required default information. It also understands standard patterns already in use. By removing the little pitfalls that make technical tasks so much less straightforward, the average employee can gain the confidence to create resources — without in any way degrading the existing process.
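The behavior in that exchange can be sketched as ordinary code. Everything here is hypothetical (the function, the `Outcome` type, the location strings), and a real GPT-enhanced wiki would not be a hand-written function like this. The sketch only shows the two design choices: a permission failure becomes a fallback plus a follow-up action rather than an error, and missing details are filled with defaults the user can change later.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DEFAULT_TEAM_NAME = "Team Amazing Spoon"  # placeholder default, as in the dialogue


@dataclass
class Outcome:
    title: str
    location: str
    published: bool
    messages: List[str] = field(default_factory=list)


def create_team_page(user: str, may_publish: bool,
                     title: Optional[str] = None) -> Outcome:
    """Create a team page, falling back to a provisional page if needed."""
    outcome = Outcome(
        title=title or DEFAULT_TEAM_NAME,
        location="company/teams" if may_publish else f"private/{user}",
        published=may_publish,
    )
    if may_publish:
        outcome.messages.append("I have created and published the team page.")
    else:
        # No error code: explain the problem and what happens next.
        outcome.messages.append(
            "You don't have permission to create pages in the company space, "
            "so I have requested it and created a provisional page in your "
            "private area."
        )
    if title is None:
        outcome.messages.append(
            f'I named your team "{outcome.title}"; you can rename it '
            "before publishing."
        )
    return outcome


result = create_team_page("david", may_publish=False)
for message in result.messages:
    print(message)
```

Every branch ends in something the user can read and act on, at the same level as the original request.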
The Democracy of Computing vs. Risks
The industrialization of LLMs is the only thing we can be reasonably sure about, because the investment has already been made. However, the rapid advancement of GPT systems will likely run aground in the same areas that other large-scale projects have in the past. The lack of collaboration between large competitors has eroded countless good ideas that depended on interoperability. Also, however slowly the law moves, it will catch up. How many autonomous decisions do you want based on Wikipedia articles?
Apps today already give people remarkable access to data, but that is normally in a personal capacity. Conversational programming will lead to a democratization of computing, in that more people will be able to take responsibility for a wider range of tasks — tasks we would have previously described as specialist. And that includes building systems that others will then use. But don’t expect many of these to reach the consumer market anywhere near as quickly as the progress of GPT systems might imply.