With Gemini Pro, Google Vies for Top Spot in GenAI Race
Last week, Google announced Gemini as its most sophisticated LLM to date; this week, the search giant opened up access to the Pro version. Gemini comes in three flavors: Ultra for very complex applications, Pro for individual developers and enterprises, and Nano for mobile (Android — Pixel 8 Pro) environments. Also, a specifically tuned version of Gemini Pro is in Bard, Google’s generative AI chatbot.
Gemini Pro is now available in preview through Google AI Studio, a new web-based development environment, and through Vertex AI, Google’s enterprise AI platform. Developers can start building applications with Gemini through these tools.
“Google certainly has the technical chops and AI infrastructure to build world-class generative AI models,” Mike Gualtieri, an analyst at Forrester Research, told The New Stack via email. “I would bet on Google Gemini to best OpenAI and most other models because of their vast repository of content that updates frequently and lower cost.”
Exposure to Developers
Google exposes Gemini Pro through a set of APIs that developers can use to access the model, and makes it available through two tools: Google AI Studio for individual developers and small development teams, and Vertex AI for enterprises with more complex and sophisticated needs, said Thomas Kurian, CEO of Google Cloud Platform, during a press briefing about the products.
“A developer can start in Google AI Studio, and if they have more needs, they can also move to Vertex after they’ve started with Google AI Studio,” he noted. Vertex AI features more than 130 models in its Model Garden.
Google AI Studio is a free tool that lets developers quickly develop prompts and then get an API key to use in their app development. Use is free for now, within limits, and paid usage will be competitively priced, Google said.
During the press briefing, Josh Woodward, vice president of Google Labs, demonstrated how to sign into Google AI Studio with a Google account and take advantage of the free quota that allows 60 requests per minute, which he said is 20 times more than other free offerings.
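An application that calls the API in a loop would need to stay under that 60-requests-per-minute quota. The following is a minimal client-side sketch, not Google's implementation: the class name `RequestThrottle` and its methods are hypothetical, and it assumes the quota behaves like a sliding 60-second window, which the briefing did not specify.

```python
import time
from collections import deque


class RequestThrottle:
    """Hypothetical client-side guard for a requests-per-window quota.

    Assumes a sliding window; the real server-side accounting may differ.
    """

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the logic can be tested without sleeping
        self._sent = deque()     # timestamps of requests still inside the window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self._clock())
```

A caller would check `seconds_until_allowed()`, sleep for the returned time if it is nonzero, send the request, and then call `record()`.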
Woodward also demonstrated how a developer can simply click on “Get code” to transfer their work to their integrated development environment of choice or use one of the quickstart templates available in Android Studio, Colab or Project IDX.
“We’ll be further fine-tuning it in the weeks and months ahead as we listen and learn from your feedback,” wrote Jeanine Banks, vice president and general manager, Developer X and DevRel at Google, and Burak Gokturk, Google’s vice president and general manager, Cloud AI and Industry Solutions, in a blog post about Gemini Pro.
“To help us improve product quality, when you use the free quota, your API and Google AI Studio input and output may be accessible to trained reviewers. This data is de-identified from your Google account and API key,” the post said.
Meeting Developers Where They Are
“One of the things that has been consistent about our strategy for developers is how we can meet them where they are,” Banks told The New Stack in an interview. “And doing that means wherever their experience lies with AI, whatever level of expertise they’ve had as developers.”
Moreover, Gemini Pro accepts text as input and generates text as output, and provides a dedicated Gemini Pro Vision multimodal endpoint that accepts text and imagery as input, with text output.
Today’s version comes with a 32K context window for text, and future versions will have a larger context window.
In addition, Kurian said, Gemini Pro comes with a range of features including function calling, embeddings, semantic retrieval and custom knowledge grounding, and chat functionality.
Gemini was trained using Google’s latest Tensor Processing Units (TPU) across multiple data centers and clusters to improve scale and resilience. However, details on the data centers and clusters were not disclosed.
Gemini is part of a broad AI hybrid computing infrastructure. Google introduced the fifth generation of its TPU, which provides significantly faster training for existing models, much better scale for next-generation models, and double the floating point operations per second (FLOPS) per chip, Kurian said.
Gemini and other Google models run on an ultrascale AI infrastructure, he said. It combines very sophisticated hardware, compute and different kinds of storage, interconnected by a very low-latency, very wide optical switching network, which allows for “extremely good” performance and throughput, he added.
“On top of this, we expose a managed software stack to make it easy for people building models and using our models to use this infrastructure,” Kurian said during the briefing. “It starts with two flavors of managed compute, Google Compute Engine and Google Kubernetes Engine.
“We offer a variety of different styles of workloads. If somebody wants to do a long training run, they can do batch, but if they want to split the work across multiple clusters in multiple data centers, we call that multihost. If they want to dynamically assemble slices of infrastructure from different systems, we call that multislice.”
“It’s important if you have the tools and you have the SDKs, it’s critically important to have the best-performing infrastructure,” Banks told TNS.
On the competitive front, developers want three things: powerful models, scalable infrastructure, and good tooling and APIs that connect to what they already use. Google aims to provide all three, which gives the company an edge, Woodward said in a video interview.
“Gemini Pro is just a natural extension of Gemini for developers to start building applications,” Gualtieri said. “The real competition though will be open source models downloaded from HuggingFace. There are certainly use cases for megamodels, but there will be tens of thousands of small expertise-specific models — just like human expertise. You don’t need a model to summarize Shakespeare if your use case is commercial real estate leasing contracts.”
Pricing for Gemini Pro is reduced per character to attract more developers, while indemnification coverage has been expanded, Woodward said.
Pricing for Gemini Pro is $0.00025 per 1,000 characters and $0.0025 per image for input, and $0.0005 per 1,000 characters for output.
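To see what those rates mean for a given workload, the arithmetic can be sketched as below. The three rates come from the figures above; the function name `estimate_cost_usd` is hypothetical, and the sketch assumes characters are billed pro rata rather than rounded up to the next 1,000, which the announcement did not specify.

```python
# Rates as reported in the article (US dollars).
INPUT_USD_PER_1K_CHARS = 0.00025
INPUT_USD_PER_IMAGE = 0.0025
OUTPUT_USD_PER_1K_CHARS = 0.0005


def estimate_cost_usd(input_chars, output_chars, input_images=0):
    """Rough cost estimate for one workload.

    Assumption: billing is pro rata per character; actual billing
    granularity may differ.
    """
    if min(input_chars, output_chars, input_images) < 0:
        raise ValueError("counts must be non-negative")
    input_cost = input_chars / 1000 * INPUT_USD_PER_1K_CHARS
    image_cost = input_images * INPUT_USD_PER_IMAGE
    output_cost = output_chars / 1000 * OUTPUT_USD_PER_1K_CHARS
    return input_cost + image_cost + output_cost


if __name__ == "__main__":
    # 10,000 input characters, 2 images and 4,000 output characters.
    print(f"${estimate_cost_usd(10_000, 4_000, input_images=2):.4f}")  # prints $0.0095
```

Under these assumptions, output characters cost twice as much as input characters, and one input image costs the same as 10,000 input characters.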
Google also announced new integrations and capabilities, including Gemini powering Google’s real-time collaboration assistant Duet AI in Workspace, and new AI-powered programming assistants and security tools in Cloud.
Meanwhile, Accenture and Google Cloud launched a new initiative to help organizations adopt generative AI to improve operations, create new lines of business, and more.
The companies will create a global, joint generative AI Center of Excellence to help businesses build and scale applications using Google Cloud’s generative AI portfolio, Accenture officials said.
“As a catalyst for business reinvention, generative AI will transform how people work and access information,” said Karthik Narain, group chief executive of technology at Accenture, in a statement. “Organizations want to move from experimentation with generative AI to scaled implementations faster. Accenture’s deep expertise in managing and scaling large language models tailored for business needs, paired with tools like Accenture’s model switchboard, can help accelerate adoption.”
Accenture itself has more than 300 scaled generative AI projects and AI solutions, and more than 1,450 AI patents and patent-pending applications.
Indeed, some of the major systems and service provider’s customers are looking for help with their generative AI efforts. Independence Blue Cross is one such enterprise customer.
“At Independence Blue Cross, we’ve harnessed the power of AI to better manage massive amounts of data and show opportunities to improve member experiences and introduce new services,” Michael R. Vennera, executive vice president and chief strategy, corporate development and information officer, Independence Health Group, said in a statement.
“We are looking forward to working with Accenture and Google Cloud through their new Center of Excellence to explore new ways generative AI can help proactively manage our members’ health.”