Google’s New TensorFlow Tools and Approach to Fine-Tuning ML
Today at Google I/O, the web giant’s annual developer conference, Google announced a bunch of new AI tools — including new tooling for the TensorFlow ecosystem, a new one-stop shop for developers called ML Hub, and upgrades to its cross-platform set of ML solutions called MediaPipe.
Ahead of the announcements, I conducted an email interview with Alex Spinelli, Vice President of Product Management for Machine Learning at Google.
The new tools for TensorFlow include KerasCV and KerasNLP (giving developers access to new “pre-trained” models), DTensor (for scaling via parallelism techniques), JAX2TF (a lightweight API for bringing models written in the JAX numerical framework into TensorFlow), and the TF Quantization API (which is “coming soon,” but will allow developers to build models that are “cost and resource efficient”).
State of Google’s LLMs
I asked Spinelli whether developers will be able to use any of the above tools with Google’s large language models (LLMs).
“In March, we announced that developers who are experimenting with AI can build on top of our language models using the PaLM API,” he replied. “As part of that announcement, we made an efficient model of PaLM available, in terms of size and capabilities, and we’ll add other sizes soon. The API also comes with an intuitive tool called MakerSuite, which lets developers quickly prototype ideas and, over time, will have features for prompt engineering, synthetic data generation and custom-model tuning — all supported by robust safety tools.”
Spinelli added that at I/O, Google will be opening up a “private preview” of the PaLM API, “so more developers can prototype directly on the web with MakerSuite or with the tools they know and love, with integrations in Firebase and Colab.”
Why Use TensorFlow and Not LLMs
PaLM is Google’s biggest LLM, at 540 billion parameters, but Google has a few other LLMs listed on the Stanford HELM index: Flan-T5 (11B), UL2 (20B), and T5 (11B). I asked Spinelli why a developer might want to use ML models via TensorFlow instead of Google’s LLMs. In other words, are there specific use cases that are best for TensorFlow?
He replied with three different use cases for TensorFlow ML:
- A developer wants to build their own model;
- A developer can solve a problem by using someone else’s model — either directly, or by fine-tuning it; and
- A developer can solve a problem by using a hosted large model — be it language, images, or a multi-modal combination of both.
On the first use case, Spinelli said a combination of TensorFlow and Keras (a Python deep learning library that runs on top of TensorFlow) was the best choice for building your own model. “They make it easy for you to define model architecture and train on your own data,” he said.
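As a rough illustration of that first use case, here is a minimal TensorFlow/Keras sketch of defining an architecture and training on your own data. The dataset and layer sizes are placeholders, not anything Google-specific.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Illustrative stand-in for "your own data": 100 samples, 8 features, binary label.
x_train = np.random.rand(100, 8).astype("float32")
y_train = np.random.randint(0, 2, size=(100, 1))

# Define the model architecture with the Keras Sequential API.
model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=16, verbose=0)

print(model.predict(x_train[:3], verbose=0).shape)  # (3, 1)
```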
TensorFlow and Keras are also the best choice when using someone else’s model, Spinelli told me.
“Many models (see Kaggle Models or tfhub.dev) have been created by other developers with extension via Transfer Learning in mind,” he continued. “TF [TensorFlow] makes it super simple for you to do this — for example, to take a model that’s great at recognizing generic images and retrain it to be excellent at spotting specific images, like diseases on an X-ray.”
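Spinelli’s transfer-learning example might look roughly like this in Keras: load a generic image backbone, freeze it, and bolt on a new head for the specialized task. The choice of MobileNetV2 and the two-class "disease / no disease" head are illustrative; note that in practice you would pass `weights="imagenet"` to get pre-trained weights, while `weights=None` keeps this sketch runnable offline.

```python
import tensorflow as tf
from tensorflow import keras

# Load a generic image backbone without its classification head.
# In practice pass weights="imagenet" for pre-trained weights;
# weights=None keeps this sketch runnable without a download.
base = keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights=None)
base.trainable = False  # freeze the generic feature extractor

# Add a new head for the specialized task (say, disease / no disease).
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(xray_images, xray_labels, epochs=5)  # trains only the new head
```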
As for using a hosted large model, Spinelli said that “We’re working to extend TF and Keras to make their high-level APIs useful for developers to access existing large-language or other generative models.”
Fine-Tuning in Google’s Models
There is mention of devs being able to train models with the new tools, but no mention of fine-tuning. TensorFlow’s own documentation defines fine-tuning as training “the weights of the top layers of the pre-trained model alongside the training of the classifier you added.”
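That TensorFlow-docs definition of fine-tuning can be sketched concretely: unfreeze only the top layers of a pre-trained base so their weights train alongside the added classifier. The base model, the "last 20 layers" cutoff, and `weights=None` (used to keep the sketch offline) are all illustrative choices.

```python
import tensorflow as tf
from tensorflow import keras

# A stand-in for a pre-trained base (weights=None keeps the sketch offline;
# a real workflow would load pre-trained weights).
base = keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights=None)

# Fine-tuning per the TensorFlow docs: unfreeze only the TOP layers of the
# base so their weights train alongside the classifier you added.
base.trainable = True
for layer in base.layers[:-20]:   # keep all but the last 20 layers frozen
    layer.trainable = False

model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(2, activation="softmax"),  # the added classifier
])
# A low learning rate avoids wrecking the pre-trained weights.
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")
```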
Fine-tuning is something that Meta offers with its LLaMA model, but no other provider of a big LLM currently offers access to its model’s weights. So I asked Spinelli whether anything in the new tools will help devs with fine-tuning.
“In its strictest sense, fine-tuning involves creating an entirely new instance of a model, but with some parts retrained for one’s specific scenario,” he replied. “However, when dealing with LLMs, you don’t usually do that, with the exception that you noted [LLaMA], because of the storage and costs involved.”
Spinelli claimed that developers can get the same overall effect as fine-tuning using what he called “prompt tuning” or “parameter-efficient tuning” (PET). He said that both can be done with MakerSuite. “You can also prompt tune and PET programmatically with the PaLM API,” he added.
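The article doesn’t detail how the PaLM API exposes these techniques, but the general idea behind prompt tuning can be sketched generically: keep the model’s weights frozen and train only a small set of “soft prompt” vectors prepended to the input embeddings. Everything below (the toy model, the vocabulary size, the prompt length) is illustrative and has nothing to do with Google’s actual API.

```python
import tensorflow as tf
from tensorflow import keras

VOCAB, DIM, PROMPT_LEN = 100, 32, 4

# A toy stand-in for a frozen language model: embeddings plus an output layer.
embed = keras.layers.Embedding(VOCAB, DIM)
body = keras.layers.Dense(VOCAB)
embed.trainable = False
body.trainable = False

class SoftPrompt(keras.layers.Layer):
    """Trainable prompt vectors prepended to the (frozen) input embeddings."""
    def build(self, input_shape):
        self.prompt = self.add_weight(
            shape=(PROMPT_LEN, DIM), initializer="random_normal", trainable=True)
    def call(self, x):
        batch = tf.shape(x)[0]
        tiled = tf.tile(self.prompt[None, :, :], [batch, 1, 1])
        return tf.concat([tiled, x], axis=1)

tokens = keras.Input(shape=(8,), dtype="int32")
logits = body(SoftPrompt()(embed(tokens)))
model = keras.Model(tokens, logits)
# Only the PROMPT_LEN x DIM soft-prompt weights are trainable;
# the rest of the model stays untouched, which is the "parameter
# efficient" part.
```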
With all that said, Spinelli noted there will be one exception to the “prompt tune” and PET approaches. With Cloud AI (part of the Google Cloud suite), he said, “you can fine-tune our code-generation model with your own codebase, and you’ll get a private VPC with that instance of our codegen model that you can use to generate code that is aware of your particular codebase as well as our general purpose one.”
An ML Hub
With all these new product announcements, Google clearly wants to become a hub for ML developers — similar to how it caters to web developers with regular browser, SEO and other web platform updates. The new front page for ML developers, ML Hub, is being positioned as a kind of portal to “enable developers to build bespoke ML solutions.” It will likely be similar to web.dev, Google’s front page for web developers.
Indeed, like Google’s web development tooling, there is something for everyone in Google’s newly expanded ML toolset — including ways to access those much larger, and trendier, generative models.