VMware’s Dev-Centered Approach to Pre-Trained Models and Generative AI

Who knows anything about pre-trained models? Who can explain how to use generative AI?
Developers, platform teams, infrastructure teams, security teams — no one is that advanced. Everyone knows just a smidgen about using pre-trained models or generative AI. Few know that a pre-trained model is a machine learning model that has already been trained on a large data set. (Disclosure: I referred to ChatGPT for that definition.)
ChatGPT will also tell you that these large data sets often get paired with natural language processing to find patterns in data and predict text, such as code. And generative AI? It’s far broader by definition. That’s what the Claude AI from Anthropic will tell you. It’s constructed using neural network techniques such as adversarial training, diffusion modeling, knowledge integration and so on. The focus is on modeling high-dimensional distributions of data and efficiently sampling from them.
It’s so early in the cycle that I feel compelled to define terms at the start of a story. And really, how people are getting started with these new, thought-provoking technologies was about all I could think about while hearing the AI-filled keynotes at VMware Explore late last month. So I looked in the conference’s session catalog and found a Spring developer workshop about generative AI.
My curiosity: What do these Java developers know about something that is so new? (Spring is the popular Java framework.)
In front of a full conference room, a VMware engineer started with the absolute basics. Before ChatGPT, the developer trained the model, the VMware engineer said. The “P” in GPT stands for pre-trained; for level setting, GPT stands for “generative pre-trained transformer.”
Pre-Trained Models
He conveyed that the pre-trained model makes all the difference. Generative AI makes development something even a non-programmer can do. And it can help a veteran developer code in ways that lower the risk. Code generation becomes a far simpler task. It makes software development a universal capability.
“And so this kind of transforms AI into being more of a general developer tool than sort of a very specialized area,” the instructor said. “So consequently, it’s going to be ubiquitous.”
But it’s not just about the pre-trained model and generative AI.
“It’s about the software ecosystem around what you’re doing, right?” he said. “How do you get data in and out? How do you make this accessible over the web, and do enterprise integration patterns? Integrating different components and data is all super relevant to creating an effective solution. And of course, you know, the large ecosystem that Spring has in projects, meaning that we can quickly pull together very compelling solutions in the AI space by bringing these components together.”
He went on to explain concepts about generative AI. What’s a model? What are the benefits of ChatGPT? What are its limitations? He explained prompts and the rise of prompt engineering. He discussed how prompts and Spring intersect. He explained how tokens work. He listed the Java Client APIs from Azure OpenAI, OpenAI, and Google Bard.
Then he introduced Spring AI, now in the Spring Experimental GitHub organization, inspired by the LangChain and LlamaIndex open source projects.
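To make the workshop’s starting point concrete, here is a minimal sketch of the kind of call those Java client APIs wrap: a prompt sent to an OpenAI-style chat completions endpoint over plain HTTP. The model name, the endpoint and the OPENAI_API_KEY environment variable are illustrative assumptions; much of Spring AI’s purpose is to hide exactly this plumbing behind its own abstractions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PromptDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical setup: an OpenAI-style chat completions endpoint and an API key
        // read from the environment. A framework client would hide these details.
        String apiKey = System.getenv("OPENAI_API_KEY");
        String body = """
            {
              "model": "gpt-3.5-turbo",
              "messages": [
                {"role": "system", "content": "You are a helpful coding assistant."},
                {"role": "user", "content": "Write a Java method that reverses a string."}
              ]
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The response is JSON; a real application would parse out choices[0].message.content.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```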
Developers Need the Basics
The introduction to generative AI speaks to what developers and operations teams need now: They want the basics. They need to learn the terminology, prompting tips, the role of queries, etc.
But how did VMware itself go about adopting pre-trained models? How has the company adapted to integrating them? And what has it done to make pre-trained models part of its Tanzu application stack?
For example, VMware developed a hub, a graph database, said Purnima Padmanabhan, senior vice president and general manager at VMware. It builds on Aria, a graph-based data store with a GraphQL API originally built for managing IT resources. The graph data store provides a visual overview of applications and environments.
The Path
The VMware team normalized its data. Normalization plays a crucial role when data gets used in a generative AI environment: it helps prevent issues such as bias, allows private data to be anonymized, resolves consistency issues in data formats, and deduplicates records for better efficiency.
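To give a rough idea of what that kind of normalization involves, here is a small Java sketch. The record shape, field names and masking rule are purely illustrative, not VMware’s actual pipeline: it trims and lowercases names into a consistent format, anonymizes an owner’s email, and drops duplicates.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NormalizeDemo {
    // A hypothetical record: a node name plus the email of the team that owns it.
    record NodeRecord(String nodeName, String ownerEmail) {}

    public static void main(String[] args) {
        List<NodeRecord> raw = List.of(
                new NodeRecord("  Web-Frontend-01 ", "alice@example.com"),
                new NodeRecord("web-frontend-01", "ALICE@EXAMPLE.COM"),   // duplicate with different casing
                new NodeRecord("db-primary", "bob@example.com"));

        // Normalize formats, anonymize private data, and deduplicate.
        Map<String, NodeRecord> normalized = new LinkedHashMap<>();
        for (NodeRecord r : raw) {
            String name = r.nodeName().trim().toLowerCase();            // consistent format
            String owner = anonymize(r.ownerEmail());                    // strip personal data
            normalized.putIfAbsent(name, new NodeRecord(name, owner));   // deduplicate by name
        }
        normalized.values().forEach(System.out::println);
    }

    // Replace the local part of an email so no individual is identifiable.
    static String anonymize(String email) {
        return "user@" + email.toLowerCase().substring(email.indexOf('@') + 1);
    }
}
```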
Once the data was normalized, the team modeled the topology and performed near real-time discovery of environments. They used traditional AI and machine learning techniques to look at big data and synthesize information out of it.
But when conversational AI exploded last December, Padmanabhan said they realized that the data normalization would pay off.
With normalization done, VMware applied a conversational interface to Tanzu Intelligent Services, which it launched at Explore, its annual user conference held last month in Las Vegas. The interface uses conversational AI to answer questions about the application: what’s problematic, which node is causing problems and so on.
“What kind of problem is it?” Padmanabhan said, characterizing what they could deduce. “Is it cost, or performance or security? And if it’s a cost problem, what do I need to do? How do I need to right size it? It pulls data not only from the database that we have and the queries that we have, but also from documentation from other sources.”
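A rough sketch of that pattern, with entirely hypothetical data sources, might look like the following: the application gathers context from its own data store and from documentation, then folds both into the prompt alongside the user’s question before handing it to the model.

```java
import java.util.List;

public class ConversationalDiagnosisDemo {

    // Hypothetical stand-ins for the real sources behind the conversational interface.
    static List<String> queryMetrics(String app) {
        return List.of("node-3 cpu_usage=0.96", "node-3 cost_per_hour=$4.20");
    }

    static List<String> searchDocs(String topic) {
        return List.of("Right-sizing guide: nodes above 90% CPU for 24h should be scaled out.");
    }

    public static void main(String[] args) {
        String question = "Is this a cost, performance or security problem, and what should I do?";

        // Assemble a prompt from database results and documentation snippets.
        String prompt = """
                Application metrics:
                %s

                Relevant documentation:
                %s

                Question: %s
                """.formatted(String.join("\n", queryMetrics("checkout-service")),
                              String.join("\n", searchDocs("right-sizing")),
                              question);

        System.out.println(prompt); // in a real system, this prompt goes to the chat model
    }
}
```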
Challenges
And herein lies the problem with generative AI adoption: most companies need a data infrastructure in place before they can even start using generative AI, yet they often hire machine learning technologists before prepping their enterprise environments.
Chris Albon, director of machine learning at the Wikimedia Foundation, wrote on X, the company formerly known as Twitter:
“A mistake I see WAY too many times is hiring some expensive ML expert before having the infrastructure there to support them. Then they spin their wheels being their own data engineer, data scientist, MLOps engineer, etc. etc. until they quit because they aren’t training models.”
“When people ask me, ‘How do I start with AI?’ you first need to know what is the problem you’re solving,” Padmanabhan said. “What is it that you want to simplify? What is the data that you’re going to look at?”
And that simplification of a problem takes time, Padmanabhan said. “It’s almost boring work that has to be done.”
Step one is identifying the problem, Padmanabhan said. Step two: identify the correct data set. Step three: find the model applicable to that data set. If a pre-trained model only needs fine-tuning, that’s much better than building a new one. Most of the time, a user does not need to build a model at all; they may simply fine-tune a pre-trained one.
Padmanabhan said that fine-tuning a model becomes a way to address data privacy: rather than handing over the data itself, the model gets trained on how to formulate a query.
“If I can train the model on how to formulate a query, I don’t need to give any data,” Padmanabhan said. “I just want to say, ‘Formulate this query.’”
VMware developed accelerators that connect with pre-trained models, Padmanabhan said. She said to think of it as a curated catalog of applications that integrates with pre-trained models “so it’s easy for developers to create their AI-enabled applications through the same process, through the same platform, that they do their other applications.”
The problem statement becomes, “What am I actually asking the model to do? And how can I give you the minimum data set possible?”
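Here is a minimal sketch of that idea, with a hypothetical chat client standing in for whatever SDK an application actually uses: the prompt shares only a schema and a question, the model returns a query, and the application runs that query against its own private data.

```java
public class QueryFormulationDemo {

    // A hypothetical stand-in for whatever chat client the application uses
    // (Spring AI, an Azure OpenAI SDK, etc.). Only the prompt matters here.
    interface ChatClient {
        String complete(String prompt);
    }

    public static void main(String[] args) {
        // Stubbed model reply so the example runs without any external service.
        ChatClient chat = prompt -> "SELECT node_name FROM nodes " +
                "WHERE cpu_usage < 0.2 ORDER BY cost_per_hour DESC;";

        // The model sees only the schema and the question, never the rows themselves.
        String prompt = """
                Given a table nodes(node_name TEXT, cpu_usage REAL, cost_per_hour REAL),
                write a single SQL query that lists the nodes most likely to be
                oversized for their workload. Return only the SQL.
                """;

        String sql = chat.complete(prompt);
        // The application executes the generated query against its own private database,
        // so no customer data ever leaves the environment.
        System.out.println("Running locally: " + sql);
    }
}
```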
Prompt engineering gets done up front, but the query formulation determines how much data needs to go into the fine-tuning of the model. VMware managed that process for its overall Kubernetes-based Tanzu application platform and its application services, which are part of its Cloud Foundry business.
“Okay, now that I got all these pieces, how do I do this at scale?” Padmanabhan said. “Because now I know, I have to have consistent APIs, I have to have a common workflow for ML workflow and ML flow engines.”
Cloud Foundry works with a concept called tiles, which provide a systematic approach: applications get packaged as tiles that allow developers to integrate third-party software. Padmanabhan said VMware developed models based on the tiles of popular applications to make them available.
VMware uses a template process for its app accelerator, Padmanabhan said. For example, a developer can tell the accelerator which programming language to use. The idea is to use an API to connect the accelerator to a service such as Spring AI; the accelerator may then connect to models through common APIs. The solution: a basic accelerator that allows for adjustments and fine-tuning.
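A bare-bones sketch of such a common API might look like the following. The interface and class names are invented for illustration; the point is that accelerator-generated code targets one interface while the backing model can be swapped without touching application logic.

```java
public class AcceleratorTemplateDemo {

    // The common API the generated application codes against. Names are illustrative.
    interface ModelClient {
        String generate(String prompt);
    }

    // One possible backing implementation; another could target Azure OpenAI or a
    // self-hosted model, selected by the accelerator's template parameters.
    static class OpenAiModelClient implements ModelClient {
        @Override
        public String generate(String prompt) {
            // In a real accelerator this would delegate to Spring AI or an HTTP client.
            return "stubbed response for: " + prompt;
        }
    }

    public static void main(String[] args) {
        ModelClient client = new OpenAiModelClient();
        System.out.println(client.generate("Summarize this deployment's recent errors."));
    }
}
```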
Big Code
In its considerations for software engineers, VMware chose to work with open source alternatives, which helps explain why it chose Hugging Face as a partner. Hugging Face provides a community for AI/ML projects with a deep focus on open source.
Together, Hugging Face and VMware announced SafeCoder, a coding assistant based upon StarCoder, which Hugging Face developed with ServiceNow through an open source project called BigCode that focuses on responsible training of large language models for coding applications.
SafeCoder, designed for the enterprise, is built with security and privacy as a first priority. It works on VMware infrastructure, whether on-premises, in the cloud or hybrid. It also works with “open source projects like Ray and Kubeflow to deploy AI services adjacent to their private datasets.”
The VMware team talked more about responsibility than speakers at any other tech conference I’ve attended. But it makes sense, considering how proprietary models pose legal and compliance issues.
“For VMware, our source code is our business,” said Chris Wolf, vice president of VMware AI Labs. “And it’s very important for us to make sure that we’re maintaining privacy and control of that data because that’s our business.”
VMware tuned SafeCoder against its private source code. The team looked at the code commits of its top-performing software engineers and used them as the dataset for fine-tuning. That gave VMware a quality code base for a new model, resulting in generated code that follows the company’s own style.
In a pilot, 80 VMware software engineers used SafeCoder, and more than 90 percent want to continue using it, Wolf said. Taking an open source route means more control over the direction of software development, as the company is not beholden to proprietary technology.