Google’s Generative AI Stack: An In-Depth Analysis
At the recently concluded Google I/O 2023 conference, the search giant unveiled its generative AI strategy. From Bard to Project Tailwind, generative AI dominated the conference. Google’s long-term investment in AI-related research led to the creation of powerful foundation models, which have become the core of its new products and services.
This article takes a closer look at Google’s generative AI strategy.
Foundation Models – The Secret Sauce
Foundation models are trained on large, publicly available datasets using self-supervised learning techniques. Once trained, they can be adapted to various use cases and scenarios without being retrained from scratch.
Four foundation models power Google’s generative AI stack:
PaLM 2: This is a large language model (LLM) trained on 100+ languages that can perform text processing, sentiment analysis, classification, and more. According to Google, the model can understand, generate, and translate nuanced text across various languages, including idioms, poems, and riddles. It can demonstrate logic and reasoning and even solve complex mathematical equations.
Codey: A foundation model that can be embedded in a software development kit (SDK) or application to enhance developer productivity through code generation and code completion. Codey has been fine-tuned on high-quality, permissively licensed code from external sources to improve its performance.
Imagen: This text-to-image foundation model lets organizations generate and customize studio-grade images. Developers can use this model to create or edit images.
Chirp: A foundation model trained to perform speech-to-text conversion. It supports various languages and can be used to generate captions and build voice assistance capabilities.
Bard – The ChatGPT Competitor from Google
Google Bard is a chatbot based on the PaLM 2 LLM. The current version of Bard is available in English, Japanese, and Korean, and can be accessed through the Google Bard website or through Google Assistant.
Bard, the new Chatbot from Google powered by PaLM 2
Soon, Bard’s responses will include images along with text, making interactions richer and more useful for users. It will also become possible to use images as input prompts, so that Bard can write captions or perform an image search.
Bard can also respond to code-related queries. It lets developers export the response to Google Colab or Replit code environments. Apart from code generation and explanations, Bard will also include citations with a link to the original source.
Duet AI – The AI-Powered Sidekick for Developers and Consumers
The foundation models are fine-tuned to assist developers and consumers in their day-to-day tasks. Duet AI is the brand that Google uses to identify the generative AI experiences infused into various products.
Duet AI for DevOps
When it comes to developers and operators, Google announced Duet AI-based services that are embedded into development environments and its cloud services.
Code Assistance: Google is going to ship plug-ins for popular IDEs such as VS Code and the JetBrains IDEs. These help developers generate code automatically based on comments and other forms of instructions. This capability competes with GitHub Copilot and other code completion products such as Amazon CodeWhisperer and Tabnine.
Cloud Workstations: Cloud Workstations are pre-configured development environments in the cloud that come with runtimes, frameworks, and IDEs approved by enterprises. Duet AI enables Cloud Workstations with code/boilerplate generation, code completion, and code explanation. It can even scan source code for security vulnerabilities and suggest appropriate fixes.
Cloud Console: Google is going to embed a chat window within the Google Cloud Console user interface through which operators can interact with the chatbot. Like Bard and ChatGPT, this chatbot can assist operators with steps needed to perform a specific task or a function related to managing the cloud.
Cloud Shell: Cloud Shell, the terminal window embedded within the browser, will get a chatbot similar to the one in Cloud Console. It can generate commands and scripts to automate a variety of DevOps and CloudOps-related tasks.
Duet AI for End Users
It’s a no-brainer that Google would extend the power of AI to its end-user products, such as Google Workspace. Soon, Docs, Sheets, Slides, and Meet will get a chatbot to assist users in generating, transcribing, and summarizing content. This integration between the foundation models and Google Workspace is meant to make end users more creative and productive.
Google also demonstrated Project Tailwind, an AI-first notebook based on the personal content stored in Google Drive, Workspace, and other assets. Currently, this experimental service is available only in the USA.
Duet AI for Low Code and No Code
AppSheet, Google’s low-code/no-code platform, is going to be integrated with Duet AI. With this, users can create intelligent business applications, connect their data, and build workflows into Google Workspace via natural language powered by PaLM 2.
Google also announced MakerSuite, a tool that lets developers start prototyping quickly and easily. They will be able to iterate on prompts, augment datasets with synthetic data, and easily tune custom models. When they are ready to move to code, MakerSuite will let them export the prompt as code to languages and frameworks, such as Python and Node.js.
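To make the idea of exporting a prompt as code more concrete, here is a minimal sketch in Python of what a few-shot prompt might look like once it is expressed as code. It does not reproduce MakerSuite’s actual export format; the `FewShotPrompt` class, its field names, and the sample reviews are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FewShotPrompt:
    """Hypothetical container for a prompt: an instruction plus worked examples."""

    instruction: str
    examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self, query: str) -> str:
        """Build the final prompt text that would be sent to a language model."""
        lines = [self.instruction, ""]
        for source, target in self.examples:
            lines.append(f"Input: {source}")
            lines.append(f"Output: {target}")
            lines.append("")
        # The new query goes last, with the output left open for the model.
        lines.append(f"Input: {query}")
        lines.append("Output:")
        return "\n".join(lines)


# Illustrative data only; a real prompt would use the developer's own examples.
prompt = FewShotPrompt(
    instruction="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("The battery lasts all day.", "positive"),
        ("The screen cracked within a week.", "negative"),
    ],
)
text = prompt.render("Setup was quick and painless.")
print(text)
```

Keeping the instruction and the examples as data makes it easy to iterate on a prompt in a tool first and then version it alongside application code.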
Search Generative Experience (SGE)
Google Search is going to be fundamentally changed through the infusion of generative AI. Search will become more contextual and efficient by analyzing the semantic meaning of the query. Google is also combining generative AI with its Shopping Graph, which has over 35 billion product listings, to deliver an immersive experience to users.
The combination of traditional search and generative AI will transform how users experience the web.
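As a rough illustration of what matching on semantic meaning can look like, the Python sketch below ranks a few product listings against a query using cosine similarity over vectors. This is a common general technique, not a description of how Google’s SGE works. The vectors are hand-made toy values standing in for learned embeddings, and the listing names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors standing in for learned embeddings (not real model output).
listings = {
    "commuter bike with fenders": [0.9, 0.1, 0.2],
    "mountain bike with suspension": [0.7, 0.6, 0.1],
    "waterproof hiking boots": [0.1, 0.8, 0.5],
}

# Pretend embedding of the query "bike for riding to work".
query_vector = [0.85, 0.15, 0.25]

# Rank listings from most to least similar to the query.
ranked = sorted(
    listings,
    key=lambda name: cosine_similarity(query_vector, listings[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(cosine_similarity(query_vector, listings[name]), 3))
```

The query shares no exact keywords with the top listing beyond "bike", yet the vector comparison still places the commuter bike first, which is the intuition behind semantic rather than purely keyword-based matching.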
Vertex AI and PaLM 2 API
Google is going to expose PaLM 2 LLM through a dedicated API endpoint. This is not available yet, but when it is ready, frontend and mobile developers can easily consume the API to build generative AI-based apps.
Vertex AI, Google’s ML PaaS in the cloud, is ready for generative AI. It has an updated Model Garden, which is a repository of foundation models such as PaLM 2, Imagen, and Chirp. Google is also bringing third-party foundation models, such as Stable Diffusion, to its cloud platform.
Generative AI Studio within Vertex AI acts as a playground to explore the API by tweaking various parameters and prompts. Developers can start with the Generative AI Studio before invoking the API or using the SDK.
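The article does not name the parameters that can be tweaked, but temperature and top-k are two sampling controls commonly exposed by LLM playgrounds. The Python sketch below shows, under that assumption, how such settings reshape the probabilities of candidate tokens. The function names and the toy scores are illustrative and are not taken from the Vertex AI SDK.

```python
import math


def apply_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs: list[float], k: int) -> list[float]:
    """Keep the k most likely tokens and renormalize; zero out the rest."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.1]  # toy scores for three candidate tokens

cool = apply_temperature(logits, 0.5)  # lower temperature: sharper distribution
warm = apply_temperature(logits, 2.0)  # higher temperature: flatter distribution
trimmed = top_k(warm, 2)               # drop the least likely token entirely

print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
print([round(p, 3) for p in trimmed])
```

A lower temperature concentrates probability on the most likely token, which tends to give more predictable output, while a higher temperature spreads it out and yields more varied responses.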
Finally, Gen App Builder is a new service that lets traditional developers unfamiliar with ML or AI build generative AI applications. Developers can use a combination of text and images to create applications that can search for information in documents, photos, and video content. This service enables them to build engaging customer interactions.
When we analyze Google’s generative AI strategy, it becomes clear that PaLM 2 is the foundation powering almost every service it announced.
PaLM 2 is helping Google compete with OpenAI and Microsoft. It is doing for Google what GPT did for OpenAI and Microsoft.