How SaaS Companies Can Monetize Generative AI
You’ve already been part of a conversation at your company, either as a contributor or an observer, about how your customers can get more value from your products once they are infused with Generative AI, LLMs or custom AI/ML models.
Universally, product roadmaps are being upended to incorporate AI. As you hash out your approach and draw up the enhanced roadmap, I want to share some words of advice from the good ol’ California Gold Rush: Don’t show up to the gold rush without a shovel!
Similarly, don’t overlook the monetization aspect of your SaaS and AI. Factor it in at the outset and integrate the right plumbing from the start, not as an afterthought once you have launched.
What’s Changing? SaaS Is Shifting to Metered Pricing
Two years ago, I wrote about the inevitable shift to metered pricing for SaaS. At the time, the catalyst that would propel the shift was unknown, but the foundational thesis was that the shift itself was inevitable. No one could have predicted in 2021 that a particular form of AI would serve as that catalyst.
The first thing to realize is that this is not merely a “pricing” change. It is a monetization model change. A pricing change would be a change in what you charge, for example, going from $79 per user/month to $99 per user/month. A monetization model change is a fundamental shift in how you charge, which inevitably will also change what you charge. It’s a business model change.
Traditionally, SaaS pricing has been a relatively lightweight exercise, often decoupled from product or product teams. With a per-user or per-seat model, all that was needed was a price point set high enough (and in some cases arbitrarily high) to cover underlying costs with the desired margin. It was essentially a one-size-fits-all approach that required almost no usage instrumentation or product usage tracking and reporting.
SaaS and AI Turn This on Its Head
Your technology stack will increasingly include third-party value-add AI/ML components, with additional custom models layered on top. You are going to operate in a multi-vendor ecosystem at the business tier, not just the infrastructure tier. These new value-added business-tier components will in turn come with usage-based pricing and charge models. See ChatGPT pricing.
Each user of your SaaS application will stretch and use these metered components in different ways, which pushes you to charge on a metered basis as well so that revenue stays aligned with underlying costs.
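To make the point concrete, here is a rough sketch of what happens to a flat per-seat price once a metered component sits underneath it. Every number in it (the seat price, the per-request backend cost, the request counts) is a made-up assumption for illustration, not a real vendor rate.

```python
# Hypothetical numbers: a flat seat price versus a metered backend cost.
SEAT_PRICE = 99.00           # assumed flat price per user per month, USD
COST_PER_AI_REQUEST = 0.02   # assumed backend cost per AI request, USD


def monthly_margin(ai_requests: int) -> float:
    """Margin on one seat after paying the metered AI backend cost."""
    return round(SEAT_PRICE - ai_requests * COST_PER_AI_REQUEST, 2)


light_user = monthly_margin(200)    # occasional use
heavy_user = monthly_margin(6000)   # heavy daily use

print(f"light user margin: {light_user:.2f}")
print(f"heavy user margin: {heavy_user:.2f}")
```

Under these assumed figures the light user leaves a healthy margin while the heavy user costs more to serve than the seat brings in, which is the mismatch a metered model is meant to remove.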
Deploy a Proven and Scalable Approach
While on the surface it may seem daunting, believe me, this is a welcome change. Lean into it.
Not only will it enable you to provide your customers with flexible and friendly consumption-based pricing, but it will also drive a level of operational efficiency and discipline that will further contribute to your bottom line.
Start with decoupled metering, and then layer a usage-based pricing plan on top.
Established companies are already building generative AI into their products. For example, Stripe leverages GPT-4 from OpenAI to enrich the customer-facing experience in its documentation. Instacart has also integrated with ChatGPT to create an Ask Instacart service. The app will allow users to ask food-related questions in conversational language, covering topics such as healthy meal formulations, recipe ideas based on given ingredients and shopping lists generated from the ingredients of a particular recipe.
Beyond integrating with ChatGPT and other services, traditional software companies are developing their own GenAI technologies as well. For example, Adobe has rolled out Adobe Firefly to offer its own text- and image-generation capabilities to creatives.
As these capabilities become natively integrated and expected by customers, it will be imperative to track usage and develop a flexible, transparent pricing model that scales to all levels of consumption.
Usage-Based Pricing Is a Natural Fit for Generative AI Companies
Generative AI and Usage-Based Pricing: A Complementary Pair
ChatGPT parses the text prompt to generate an output based on its “understanding” of that prompt. Prompts and outputs vary in length, and their size is directly related to resource consumption: a larger prompt requires more resources to process, and a smaller one requires fewer. Additionally, the usage profile can be expected to vary significantly from customer to customer. One customer may use the tool only sparingly, while another could be generating new text multiple times daily for weeks on end, and the pricing model must account for this variability.
On top of this, services like ChatGPT are themselves priced according to a usage-based model. This means that any tools leveraging ChatGPT or other models via API will be billed based on the usage; since the backend costs of providing service are inherently variable, the customer-facing billing should be usage-based as well.
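As a minimal sketch of how a variable backend cost falls out of a token-based rate, consider the following. The `TokenRates` type and the rate values are assumptions invented for this example; check your provider’s current price list for real figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Hypothetical per-1,000-token rates in USD; not real vendor prices."""
    input_per_1k: float
    output_per_1k: float


def backend_cost(input_tokens: int, output_tokens: int, rates: TokenRates) -> float:
    """Cost of one model call when input and output tokens are billed separately."""
    cost = (input_tokens / 1000) * rates.input_per_1k
    cost += (output_tokens / 1000) * rates.output_per_1k
    return round(cost, 6)


rates = TokenRates(input_per_1k=0.03, output_per_1k=0.06)  # assumed values
short_call = backend_cost(input_tokens=100, output_tokens=200, rates=rates)
long_call = backend_cost(input_tokens=4000, output_tokens=2000, rates=rates)

print(short_call, long_call)
```

The two calls differ in cost by more than an order of magnitude, so a single flat charge per request would either overcharge the short call or undercharge the long one.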
To deliver the fairest and most transparent pricing, and to enable frictionless adoption and user growth, these companies should look to usage-based pricing with a product-led go-to-market motion. The combination of elastic frontend usage and variable backend costs positions generative AI products as ideal fits for a usage-based and product-led approach.
How to Get Started
Meter frontend usage and backend resource consumption
Rather than building these models from scratch, many companies elect to leverage OpenAI’s APIs to call GPT-4 (or other models) and serve the response back to customers. To obtain complete visibility into usage costs and margins, each API call to OpenAI and each response returned should be metered to capture the size of the input and the corresponding backend costs, as well as the output, processing time and other relevant performance metrics.
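One way to picture this kind of metering is a thin wrapper that records a usage event around every model call. The sketch below is only an illustration: `UsageEvent`, `metered_call`, `count_tokens` and `fake_model` are hypothetical names, the model call is a local stub rather than a real API request, and the word-based token count is a crude stand-in for a real tokenizer.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UsageEvent:
    """One metered model call (hypothetical event shape)."""
    customer_id: str
    input_tokens: int
    output_tokens: int
    latency_seconds: float


def count_tokens(text: str) -> int:
    # Crude stand-in: whitespace-separated words. Real tokenizers differ.
    return len(text.split())


def metered_call(customer_id: str, prompt: str,
                 model: Callable[[str], str],
                 sink: List[UsageEvent]) -> str:
    """Call the model and append one usage event describing the call."""
    start = time.perf_counter()
    response = model(prompt)
    latency = time.perf_counter() - start
    sink.append(UsageEvent(customer_id, count_tokens(prompt),
                           count_tokens(response), latency))
    return response


def fake_model(prompt: str) -> str:
    # Placeholder for a real API call; it simply echoes the prompt twice.
    return f"{prompt} {prompt}"


events: List[UsageEvent] = []
reply = metered_call("cust-1", "suggest a healthy dinner", fake_model, events)
print(events[0])
```

Keeping the wrapper separate from pricing logic means the same events can later feed cost reports, performance dashboards and billing without changing the call sites.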
By metering both the customer-facing output and the corresponding backend actions, companies can create a real-time view into business KPIs like margin and costs, as well as technical KPIs like service performance and overall traffic. After creating the meters, deploy them to the solution or application where events are originating to begin tracking real-time usage.
Track usage, margins and account health for all customers
Once the metering infrastructure is deployed, begin visualizing usage and costs in real time as usage occurs and customers leverage the generative services. Identify power users and lagging accounts and empower customer-facing teams with contextual data to provide value at every touchpoint.
Since generative AI services like ChatGPT use a token-based billing model, obtain granular token-level consumption information for each customer using your service. This helps to inform customer-level margins and usage for AI services in your products, and it is valuable intel going into sales and renewal conversations. Without a highly accurate and available real-time metering service, this level of fidelity into customer-level consumption, costs and margins would not be possible.
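A per-customer rollup of token consumption, cost and margin could look roughly like this. The record layout, the customer names and the `COST_PER_1K_TOKENS` rate are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical metered records: (customer_id, tokens_used, revenue_usd)
records = [
    ("acme", 12000, 1.50),
    ("acme", 8000, 1.00),
    ("globex", 90000, 2.00),
]
COST_PER_1K_TOKENS = 0.03  # assumed backend rate, USD


def summarize(rows):
    """Aggregate tokens and revenue per customer, then derive cost and margin."""
    totals = defaultdict(lambda: {"tokens": 0, "revenue": 0.0})
    for customer_id, tokens, revenue in rows:
        totals[customer_id]["tokens"] += tokens
        totals[customer_id]["revenue"] += revenue

    summary = {}
    for customer_id, t in totals.items():
        cost = t["tokens"] / 1000 * COST_PER_1K_TOKENS
        summary[customer_id] = {
            "tokens": t["tokens"],
            "cost": round(cost, 2),
            "margin": round(t["revenue"] - cost, 2),
        }
    return summary


summary = summarize(records)
for customer_id, row in summary.items():
    print(customer_id, row)
```

With these invented numbers one account is profitable and the other is served at a loss, which is exactly the kind of signal worth having before a renewal conversation.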
Launch and iterate with flexible usage-based pricing
After deploying meters to track the usage and performance of the generative AI solution, the next step is to monetize this usage with usage-based pricing. Identify the value metrics that customers should be charged for. For text generation this could be the word count or the total processing time to serve the response; for image generation it could be the size of the input prompt, the resolution of the image generated or the number of images generated. Commonly, the final pricing will be built from some combination of multiple factors like those described.
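To illustrate a price built from more than one value metric, here is a small sketch for image generation that combines image count, resolution and prompt size. The rate card values and the `ImageJob` type are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Hypothetical rate card in USD; illustrative values only.
PRICE_PER_IMAGE = {"512x512": 0.02, "1024x1024": 0.04}
PRICE_PER_1K_PROMPT_CHARS = 0.01


@dataclass
class ImageJob:
    resolution: str
    images: int
    prompt_chars: int


def charge_for(job: ImageJob) -> float:
    """Charge = per-image price at the job's resolution + a prompt-size fee."""
    image_charge = PRICE_PER_IMAGE[job.resolution] * job.images
    prompt_charge = job.prompt_chars / 1000 * PRICE_PER_1K_PROMPT_CHARS
    return round(image_charge + prompt_charge, 4)


jobs = [
    ImageJob("512x512", images=10, prompt_chars=1000),
    ImageJob("1024x1024", images=4, prompt_chars=2000),
]
invoice_total = round(sum(charge_for(job) for job in jobs), 2)
print(invoice_total)
```

Because the rates live in plain data rather than in the metering code, the plan can be adjusted and re-run against the same usage records while you iterate on pricing.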
After you create the pricing plan and assign it to customers, usage will be tracked and billed in real time. The on-demand invoice is kept up to date, so at any time both the vendor and customers can view current usage charges.
Integrate with your existing tools for next-generation customer success
The final step once metering is deployed and the billing service is configured is to integrate with third-party tools inside your organization to make usage and billing data visible and actionable. Integrate with CRM tooling to augment customer records with live usage data or help streamline support ticket resolution.
With real-time usage data being collected, integrate this system with finance and accounting tools for usage-based revenue recognition, invoice tracking and other tasks.
Amberflo for Generative AI
Amberflo provides an end-to-end platform for customers to easily and accurately meter usage and operate a usage-based business. Track and bill for any scale of consumption, from new models in beta testing up to production-grade models with thousands of daily users. Amberflo is flexible and infrastructure-agnostic to track any resource with any aggregation logic.
Build and experiment with usage-based pricing models, prepaid credits, hybrid pricing or long-term commitments to find the best model and motion to suit any unique business and customer base. Leverage real-time analytics, reporting and dashboards to stay current on usage and revenue, and create actionable alerts to receive notifications when key thresholds or limits are met.