GitHub Copilot: A Powerful, Controversial Autocomplete for Developers
I’ve been covering the application development space for a long time and have seen a lot of breakthroughs. Some catch my eye more than others, and GitHub’s Copilot is one of those eye-catchers.
The technology is truly promising. “GitHub Copilot draws context from the code you’re working on, suggesting whole lines or entire functions,” GitHub CEO Nat Friedman explained in a blog post introducing the technology.
Copilot helps developers to quickly discover alternative ways to solve problems, write tests, and explore new APIs without having to tediously tailor a search for answers on sites like Stack Overflow and across the internet, Friedman said. And, as it is machine learning-based, it learns as you use it. “As you type, it adapts to the way you write code — to help you complete your work faster,” he noted.
The technology is now in technical preview and, so far, is getting great reviews — as well as a bit of pushback — from users both inside and outside the Microsoft/GitHub fold.
“I’m impressed by how GitHub Copilot seems to know exactly what I want to type next,” said Feross Aboukhadijeh, founder of Socket, a maker of privacy and security software, in a statement. “GitHub Copilot is particularly helpful when working on React components, where it makes eerily accurate predictions. GitHub Copilot has become an indispensable part of my programmer toolbelt.”
I’ve been testing #GitHubCopilot in Alpha for the past two weeks. Some of the code suggestions it comes up with are eerily good.
Here’s a thread with some examples that I found surprising. Will update with new examples over time. https://t.co/lD5xYEV76Z
— Feross (@feross) June 30, 2021
While hardly a neutral observer, given his affiliation, Alex Polozov, a senior researcher at Microsoft Research, tweeted, “Not exaggerating, Copilot will be in top-3 tech developments of 2020s.”
So stoked to finally discuss Copilot!
I’ve used it inside MSR for months, watched it evolve, and discussed collabs.
Not exaggerating, Copilot will be in top-3 tech developments of 2020s 🧵👇 https://t.co/aoQMfpSgtT
— Alex Polozov (@Skiminok) June 29, 2021
Out of the Microsoft/OpenAI deal
Friedman explained that “GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI.”
OpenAI Codex has a broad knowledge of how people use code and is significantly more capable than GPT-3 in code generation, in part, because it was trained on a data set that includes a much larger concentration of public source code, Friedman explained.
Some folks are concerned that it will generate code identical to code released under open source licenses that don’t allow derivative works, which a developer might then use unknowingly.
GitHub declined to make a spokesperson available for an interview for this story, directing me instead to the technical preview’s rather thorough FAQ. For instance, to my question about the data sources Copilot would be using, GitHub’s response was: “It has been trained on a selection of English language and source code from publicly available sources, including (but not limited to) code in public repositories on GitHub.”
But which ones?
“For sure they are using the GitHub repos. And absolutely the public ones,” said Ronald Schmelzer, an analyst at Cognilytica, which focuses on AI research and analysis. “Of course, the question is, are they also using the private GitHub repos. And with or without user consent? Perhaps additional sources could be Stack Overflow and other places where people post code for comment. But that’s of dubious nature in terms of quality.”
Moreover, because it was trained on publicly available source code and natural language, Copilot understands both programming and human languages. This enables developers to describe a task in English, and GitHub Copilot will then provide the corresponding code, the company said.
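As a rough illustration of that workflow, consider the following hypothetical sketch (not actual Copilot output): the developer writes only a descriptive comment, and the tool proposes an implementation beneath it. The `days_between` function is the kind of completion one might accept.

```python
# Hypothetical example of comment-driven completion. The developer types the
# comment; the function body below is the sort of code a tool like Copilot
# might suggest (illustrative only, not verified Copilot output).

# Compute the number of days between two ISO-formatted dates.
from datetime import date

def days_between(start: str, end: str) -> int:
    return abs((date.fromisoformat(end) - date.fromisoformat(start)).days)

print(days_between("2021-06-29", "2021-07-04"))  # 5
```

The point is that the natural-language comment, not the function signature, is what drives the suggestion.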
Supports Multiple Languages
And according to the tech preview page: “GitHub Copilot is currently only available as a Visual Studio Code extension. It works wherever Visual Studio Code works — on your machine or in the cloud on GitHub Codespaces. And it’s fast enough to use as you type.”
“Copilot looks like a potentially fantastic learning tool — for developers of all abilities,” said James Governor, an analyst at RedMonk. “It can remove barriers to entry. It can help with learning new languages, and for folks working on polyglot codebases. It arguably continues GitHub’s rich heritage as a world-class learning tool. It’s early days but AI-assisted programming is going to be a thing, and where better to start experiencing it than GitHub?”
Yeah, But Can It Scale?
Some observers see Copilot as useful for simple projects but maybe not ready for prime time.
“It’s a very interesting idea, and should work well for simple examples, but I’d be curious to see how well it will work for complex code problems,” Eric Newcomer, Chief Technology Officer of enterprise infrastructure software provider WSO2, said of Copilot in an interview.
“I’d be skeptical over the *length* of the autocompletion,” Guillermo Rauch said in a tweet. “A whole file? function body? etc. Gmail can’t write your entire email, but its autocompletion is undeniably useful. AI code autocompletion is here to stay.”
I’d be skeptical over the *length* of the autocompletion. A whole file? function body? etc.
Gmail can’t write your entire email, but its autocompletion is undeniably useful.
AI code autocompletion is here to stay.
— Guillermo Rauch (@rauchg) June 29, 2021
The issue of scale is a concern for GitHub, according to the tech preview FAQ: “If the technical preview is successful, our plan is to build a commercial version of GitHub Copilot in the future. We want to use the preview to learn how people use GitHub Copilot and what it takes to operate it at scale.”
GitHub spent the last year working closely with OpenAI to build Copilot. GitHub developers, along with some users inside Microsoft, have been using it every day internally for months.
Open Source Issues
Indeed, Rauch, founder of Vercel and creator of Next.js, cited in a tweet a statement from the Copilot tech preview FAQ page: “GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before.”
To that, Rauch simply typed: “The future.”
“GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before.”
— Guillermo Rauch (@rauchg) June 29, 2021
Rauch’s post is relevant in that it speaks directly to one of the knocks against Copilot noted earlier: that it may generate code identical to code released under open source licenses that don’t allow derivative works, which developers could then use unknowingly.
github copilot has, by their own admission, been trained on mountains of gpl code, so i’m unclear on how it’s not a form of laundering open source code into commercial works. the handwave of “it usually doesn’t reproduce exact chunks” is not very satisfying pic.twitter.com/IzqtK2kGGo
— eevee (@eevee) June 30, 2021
I’ve seen lots of ground-breaking and potentially game-changing technologies, but then what do I know? I’m not saying Copilot is necessarily ground-breaking, but it’s damned sure a breakthrough. It pushes us much farther along in what AI, and machine learning in particular, can do to help developers build software. When I first saw the capabilities of IBM’s Watson years ago — even as it “whupped” humans at Jeopardy in 2011 — I said this technology would someday help advance the process of creating software. It just took longer than I expected. But this stuff is not easy, and we’ve still got a ways to go.
For instance, in a tweet thread, Eddie Aftandilian, a researcher in the office of the CTO at GitHub, described how he once thought program synthesis “would never work.” After a stint at Google, though, he saw significant progress, and when he later came to GitHub and saw how far along the team was, he was “flabbergasted,” he said.
“It [Copilot] was especially useful to me as a newbie to both TypeScript and Python, which our code is written in,” Aftandilian said in a tweet. “I didn’t have to keep searching Stack Overflow to find the right way to do something in these unfamiliar languages; Copilot would suggest it without taking me out of flow.”
Meanwhile, Rauch said he shared Copilot with the Next.js team — which just became the 44th most popular project on GitHub — and everyone in the development team has adopted it and insists their productivity has increased.
Next.js team members send each other screenshots on Slack expressing disbelief about how good the Copilot suggestions are, he said. Yet, Rauch said he believes that developers have mostly trained themselves to react instantly to autocompletion and trust it because it’s traditionally just one word.
However, “This system [Copilot] might sometimes auto-expand an entire function body, which is amazing,” Rauch said. “But at the same time, as developers, we have to be careful to read what we’re adding to our programs.”
Rauch likens the situation to GitHub providing a way of creating an “inline pull request,” where the submitter is an AI and you’re constantly reviewing its proposals.
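That caution is easy to illustrate. Here is a hypothetical sketch (not real Copilot output) of a completion that looks correct at a glance but mishandles an edge case a careful reviewer would want to catch:

```python
# Hypothetical illustration of why suggested code needs review: a
# plausible-looking completion with a subtle edge-case bug, and the
# version a reviewer would actually accept.

def average_suggested(values):
    # plausible suggestion, but it raises ZeroDivisionError on an empty list
    return sum(values) / len(values)

def average_reviewed(values):
    # after review: handle the empty case explicitly
    return sum(values) / len(values) if values else 0.0

print(average_reviewed([]))         # 0.0
print(average_reviewed([2, 4, 6]))  # 4.0
```

The suggestion is not wrong so much as incomplete, which is exactly the kind of gap an inline-review habit catches.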
“Overall, I think it’s a quantum leap forward, and I can’t wait to learn more about what this technology makes possible, how it’s going to improve and what creativity it’ll unlock.”
Enough said. I couldn’t agree more.
Pushback on the Technology
However, as with any not-yet-fully-baked technology (it is in tech preview, after all), contrarians are poking holes in GitHub Copilot. Some are calling it old wine in new bottles, and others are saying it’s the beginning of the end of mass employment for programmers.
“In my opinion, the most interesting thing about Copilot is that it typically generates original code — that is, code that is not represented verbatim in the training data,” one such skeptic said. “People have to remember, however, that it is entirely unable to write creative code. Creativity — for now — is still in the hands of humans. So, is using Copilot pair programming? Only if you don’t mind that one of the two programmers isn’t creative.”
Stephen O’Grady, another RedMonk analyst, agreed, noting that Copilot is indeed the natural evolution of code generation. First, the GitHub team started with syntax autocomplete, moved on to code completion, and now to AI-based generative solutions trained on enormous bodies of public code, he noted.
“While it is, according to the outsized public reaction to it at least, a big deal, Copilot is very far — in my opinion — from the beginning of the end for programming jobs,” O’Grady told me. “It can only work off what’s existing — novel solutions will still require people.”
Indeed, much like Ruby on Rails once made developers more efficient by automatically generating a lot of boilerplate scaffolding for web projects, Copilot should save developers time by reducing or eliminating the need to reimplement basic building-block features, which in turn helps them move more quickly, O’Grady said.
More Developer Productivity, Better Code Quality
So, developer productivity is a goal for the Copilot technology and one that has been achieved by internal users, according to the FAQ. Better code quality is yet another goal, as is the broadening of the pool of developers — not shrinking it — by helping newbies learn to code and by enabling existing developers to learn new languages faster.
“We’ve known for a long time that the world does not have enough developers for the code that needs to be written,” said Holger Mueller, an analyst at Constellation Research. “It is good to see that concepts like developer velocity may now put an end to the era of ‘one developer and his toolchain’ by infusing AI into the process. While they may be initially skeptical, developers will embrace any help they can get from vendors, like today it’s GitHub, to make them more productive.”
Yes, but Holger, is Copilot ground-breaking or potentially game-changing?
“No, it is the overall industry trend,” Mueller said. “ML is out there and needs to be applied — as often and as well as possible. All the tools vendors are doing it.”
Copilot also is useful for the ever-important concept of testing software. The tech preview page insightfully notes that: “Tests are the backbone of any robust software engineering project. Import a unit test package and let GitHub Copilot suggest tests that match your implementation code.”
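To make that concrete, here is a hypothetical sketch of the workflow, with `slugify` standing in for the developer’s own implementation: import a unit test package such as `unittest`, and accept the kind of test cases the tool might suggest to match the function.

```python
# Hypothetical sketch of test suggestion (illustrative, not real Copilot
# output): the developer has written slugify; the TestSlugify cases below
# are the sort of tests a tool might propose to match it.

import unittest

def slugify(title: str) -> str:
    # the implementation the developer has already written
    return "-".join(title.lower().split())

class TestSlugify(unittest.TestCase):
    # tests matching the implementation above
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_extra_whitespace(self):
        self.assertEqual(slugify("  GitHub   Copilot "), "github-copilot")

# run the suite programmatically so the example is self-contained
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestSlugify)
)
```

The suggested tests are only as good as the implementation they mirror, so they still warrant the same review as any other generated code.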
Getting Its Money’s Worth
Meanwhile, Cognilytica’s Schmelzer says it is no surprise that Microsoft continues to leverage the billion-dollar investment it made in OpenAI, as OpenAI’s GPT-3 will probably make a starring or cameo appearance in many of Microsoft’s products as well as the products of the company’s acquisitions such as GitHub and LinkedIn.
“The use of GPT-3 to ‘fill in the blanks’ has been one of those cool use cases for Natural Language Generation (NLG), of which GPT-3 is one kind,” Schmelzer said. Yet, “OpenAI clearly has built an extension of their GPT-3 network specifically trained on the ‘bajillions’ of lines of code in GitHub to do the super-complete magic you see here.”
But is this groundbreaking, Ron?
Well, because the use of GPT-3 for suggested code completion has been around since GPT-3 was first unveiled, with many folks using it for HTML and other applications, it’s not a completely new idea, he said in an interview. GPT-3 was initially released in June of 2020.
“However, the embedding of this capability in IDEs and development suites will no doubt push the use of AI-based code suggestion up a new level to a ‘must-have,'” Schmelzer said. “Of course, we must take all this with lots of caveats.”
Whether or not that code is truly what is needed, relevant, without bugs, or applicable is up to the developer, he stated.
“Code suggestion should never be used blindly, especially when trained on the quantities of GitHub code, which might be of variable quality,” Schmelzer said.
That quality is key, as systems could end up built on weak foundations. In a sarcastic tweet, Grady Booch, chief scientist for software engineering at IBM Research, cited an observation from a blog post by Maxim Khailo, co-founder of Merit Capital, implying that using Copilot without fully vetting its results could be like building a house of cards.
“Now imagine your scaffolding itself is written mostly by Copilot,” the tweet reads. “Bugs will propagate in new ways, via systems that build systems.”