TNS
VOXPOP
Where are you using WebAssembly?
Wasm promises to let developers build once and run anywhere. Are you using it yet?
At work, for production apps
0%
At work, but not for production apps
0%
I don’t use WebAssembly but expect to when the technology matures
0%
I have no plans to use WebAssembly
0%
No plans and I get mad whenever I see the buzzword
0%
AI / Large Language Models

The First Thing to Tell an LLM

Technologist Adrian Cockcroft discusses how fine-tuning large language models starts with varying forms of prompt engineering.
Aug 31st, 2023 9:30am by
Featued image for: The First Thing to Tell an LLM

In a recent interview on The New Stack Makers, recognized technologist Adrian Cockcroft joined us for a conversation about large language models (LLM) and how fine-tuning them starts with varying forms of prompt engineering.

Suppose you are going to get programming advice from an LLM. In that case, you first tell it: “I want good programming advice,” said Cockcroft, known for his work as a cloud architect at Netflix and later Amazon Web Services. Cockcroft now enjoys his time as a semi-retired technologist.

The model then looks in its model for all the programs it has seen. “And you might say, ‘I want you to program, like you’re an expert programmer in this system, or you know, write me Java code, like your James Gosling, or something like that,” Cockcroft said. “And maybe it goes and finds code that he wrote or something like that.”

Which leads to how prompt engineering applies.

“So there are these things you can do with prompt engineering, where you’re setting up the conversation to try and bias the AI into a space where you want it to go,” Cockcroft said. “And these prompts are getting increasingly sophisticated with plugins and things where you’re building quite a lot of information, which is basically loaded into the model before you start using it.”

Cockcroft said that between prompt engineering and the training comes the idea of fine-tuning the model where it’s potentially too big to put in a prompt.

“For example, I’ve heard of people sort of signing up with OpenAI with an account at a corporate level, but the first thing they do is they feed all of their corporate information into it,” Cockcroft said.

Wiki pages, SharePoint documents, the corporate website — everything gets added.

“And it sort of sets up so that the AI model now understands your terminology, your domain, how to do things that your company, your internal processes,” Cockcroft said. “You’re effectively training the model, you’re fine-tuning the model to be somebody that understands your company. And then you go and start building something on top of that, or you’re asking questions or building whatever assistance you want. But it’s kind of got this extra level of information, where you’ve fine-tuned it.”

ChatGPT does some things well, but it’s more like the tasks that you can give it, Cockcroft said.

Cockcroft recently did some data analysis and used R, the programming language. He started asking about syntax, and ChatGPT began writing the code.

The demand for vector databases reflects the more resounding need to get better information from the LLMs, Cockcroft said. And that is pretty much why we can expect vectors, which have “a whole load of weights which embed the meanings of the words,” to serve as a store for doing fuzzy matches against the numbers that represent the vectors, Cockcroft said.

Group Created with Sketch.
THE NEW STACK UPDATE A newsletter digest of the week’s most important stories & analyses.