The First Thing to Tell an LLM
In a recent interview on The New Stack Makers, recognized technologist Adrian Cockcroft joined us for a conversation about large language models (LLM) and how fine-tuning them starts with varying forms of prompt engineering.
Suppose you are going to get programming advice from an LLM. In that case, you first tell it: “I want good programming advice,” said Cockcroft, known for his work as a cloud architect at Netflix and later Amazon Web Services. Cockcroft now enjoys his time as a semi-retired technologist.
The model then looks in its model for all the programs it has seen. “And you might say, ‘I want you to program, like you’re an expert programmer in this system, or you know, write me Java code, like your James Gosling, or something like that,” Cockcroft said. “And maybe it goes and finds code that he wrote or something like that.”
Which leads to how prompt engineering applies.
“So there are these things you can do with prompt engineering, where you’re setting up the conversation to try and bias the AI into a space where you want it to go,” Cockcroft said. “And these prompts are getting increasingly sophisticated with plugins and things where you’re building quite a lot of information, which is basically loaded into the model before you start using it.”
Cockcroft said that between prompt engineering and the training comes the idea of fine-tuning the model where it’s potentially too big to put in a prompt.
“For example, I’ve heard of people sort of signing up with OpenAI with an account at a corporate level, but the first thing they do is they feed all of their corporate information into it,” Cockcroft said.
Wiki pages, SharePoint documents, the corporate website — everything gets added.
“And it sort of sets up so that the AI model now understands your terminology, your domain, how to do things that your company, your internal processes,” Cockcroft said. “You’re effectively training the model, you’re fine-tuning the model to be somebody that understands your company. And then you go and start building something on top of that, or you’re asking questions or building whatever assistance you want. But it’s kind of got this extra level of information, where you’ve fine-tuned it.”
ChatGPT does some things well, but it’s more like the tasks that you can give it, Cockcroft said.
Cockcroft recently did some data analysis and used R, the programming language. He started asking about syntax, and ChatGPT began writing the code.
The demand for vector databases reflects the more resounding need to get better information from the LLMs, Cockcroft said. And that is pretty much why we can expect vectors, which have “a whole load of weights which embed the meanings of the words,” to serve as a store for doing fuzzy matches against the numbers that represent the vectors, Cockcroft said.