
The Pros (And Con) of Customizing Large Language Models

Tabnine has added a natural language chat to its AI assistant, giving developers access to a custom-trained developer bot from within the IDE.
Jul 3rd, 2023 10:08am by

Before the artificial intelligence collective destroys humanity, it’s going to have to settle a few things in its own house first. For instance: Which is better, a broadly trained large language model (LLM) or a custom-trained LLM?

Tabnine’s model is custom-trained, while its primary competitor, GitHub Copilot, is driven by the more broadly trained LLM GPT-4.

The company added chat functionality to its AI-powered assistant for developers last week. The chat assistant can write code and answer questions from the model, which is trained on fully permissive open source code, but it can also be custom-trained on an enterprise’s code base, said Brandon Jung, vice president of Ecosystem at Tabnine.

“Tabnine was the first AI code assistant on the planet,” Jung stressed to The New Stack in a recent interview. “Tabnine has a unique UI/UX, we use three LLMs at any one time (one local, one for longer completions and one for chat) and they’re purpose-built. … The second piece is we build our own models.”

The chat is available for free early access now but will be generally available to Pro and Enterprise customers by fall.

Custom Versus General Purpose

By custom-training a model on its own code, an enterprise can ensure the languages it wants are supported and that the model draws on the company’s own (we hope) secure code for its examples. The model is then run on premises or in the company’s virtual private cloud (VPC). Jung drops a few names: Samsung, the Israeli Army and Tesla run it on premises, while Accenture runs it in a VPC.

“I don’t mean this to be flippant, but students aren’t committing code that has to go to code base, [but] it’s not part of the software development piece, it’s just build and learn how to do it, so a Copilot’s a good fit,” he said. “But if you’re Samsung, you’re going to say I have some of the most valuable code on the planet, that I don’t want leaked, I don’t want to be in someone’s data. And I need to build a model that my thousands of developers are getting suggestions from my code base, not from some random open source somewhere.”

If a new developer wants to know how to use an API, the assistant will pull up the details from the company’s own code, rather than code for generic APIs, Jung said. It can also be used to document code that has poor documentation or none at all. Another differentiator for Tabnine is that the AI assistant integrates with popular IDEs, which means you don’t have to rely on a plugin being available.

“We may think general AI is magic, but at the end of the day it’s just machine learning,” he said. “It’s good data in, good data out, bad in and bad data out.”

Implications for the Frontend

For the frontend, custom training may not have the impact it will with backend systems, Jung acknowledged. But it can still be useful to get code that mirrors the specifications the development team has for its frontend coding, for instance.

“You also may have some very specific patterns for how you’re doing that front end, and you want that repeated. Citibank wants their frontend JavaScript to all look the same. So there’s still value in it,” he said.

There are tradeoffs in a custom-trained model: You do have to know where you’ll be using the model, Jung said. A pocket knife is a good general tool, but it’s not as effective as a torque wrench for a lug nut, he pointed out. If the bulk of the custom model is trained on backend languages, it might not be as good a fit for a JavaScript frontend.

That’s one of the mistakes Jung says underlie our understanding of generative AI.

“The idea of a universal model sits as the way people think about it — I don’t have to know anything, I throw everything in the black box, and it gives me magic answer,” he said. “When you get to a customized model, you have to know.”
