Building AI-Driven Applications with a Multimodal Approach
Generative AI is the latest era of AI/machine learning (ML) that is unlocking new opportunities and tackling previously unaddressed challenges. It can create new content and ideas, including conversations, stories, images, videos and music. It is powered by large models that are pretrained on vast amounts of data and commonly referred to as foundation models (FMs).
Using AI to get the most out of an organization’s data strategy can boost employee productivity, improve collaboration and enable smarter decision-making, resulting in real, quantifiable business value. It lets developers reimagine their applications and create new customer experiences while shortening time to revenue.
The Evolution of Generative AI
In spite of the excitement around AI, we are still in the early stages of its transformation. The current wave of generative AI is focused on creating new content based on a snapshot of data that existed when the model was trained. This has tremendous value for improving and automating business processes.
However, the real value of generative AI lies in the next phase, where it can drive insights and decision-making. Today's early generative AI applications are all content-oriented, focused on processing vast amounts of data to produce fairly accurate results. That level of accuracy is not yet enough to drive decisions.
The first generation of these applications works well for the business-to-consumer market, where sifting through large quantities of data quickly to produce a condensed or summarized view has a huge benefit.
Often, the generated content serves as a starting point for more focused human involvement in completing the job to be done. It accelerates the low-value-add work where people spend a lot of time, whether that is building an outline, writing code, building data sets for testing or performing an analysis on data. It raises human productivity on tasks that are not mission critical and that can be undone or revisited with relatively little effort. Copilot X is a great example of a feature where the timeliness of the output matters more than the accuracy of the content.
Quantity Versus Quality
However, the real value of AI comes from the ability to draw meaningful insights and drive outcome-based decision-making. These are the cases where quality and accuracy matter more than anything.
For AI to deliver the transformation it promises, it needs to move from low-impact, low-risk content generation, where limited fidelity and accuracy are tolerable, to high-impact, high-risk analysis that drives decision-making and demands high fidelity and accuracy.
Enterprises will evaluate the return on investment (ROI) of these new AI features and ask how the features help them differentiate themselves to customers, so business-to-business applications need to drive outcomes that justify that ROI. They need to do this in the context of their existing data ecosystem and in a secure manner.
The rapid advancement and adoption of AI have democratized access to the core technology at the heart of this revolution: foundation models. The tremendous pace of innovation around large language models (LLMs) in the open source community has produced several open source LLMs on which to build an AI business.
Smaller models can also run on lower-powered hardware, including an iPhone, which lowers the barrier to tinkering and creativity. The availability of these smaller but sufficiently high-quality models has fueled innovation and unlocked possibilities for individuals and institutions around the world. In the end, the best model is the one that can learn constantly and quickly and hence can be iterated and fine-tuned in a short period of time.
This has lowered the barrier to entry for training and experimentation from the resources of a few very large enterprises or a major research organization to one person, an evening and a beefy laptop, or even a personal computing device like an iPhone. These advances will in turn lead enterprises to evaluate the impact of AI in the context of their own businesses.
Multimodal Platforms Hold the Key to Driving AI-Powered Apps
To be successful, enterprises need access to multiple foundation models, spanning modalities such as text, images and video, that can be fine-tuned on their proprietary data to deliver unique and business-relevant insights. They want it to be easy to take a base FM and build differentiated apps using their own data. Data is more important than ever and is the cornerstone of a successful AI initiative. This includes a reliable and performant data platform that supports unstructured operational and nonoperational data, with deeper integration with the analytics and ML platforms to enable prescriptive analytics.
To achieve this in real time, those platforms need deeper integration with the up-to-the-minute data stored in an operational database. Databases need to handle multiple modalities of data without adding performance or latency overhead.
By building on top of a multimodal and low-latency data platform, enterprises can realize the vision of prescriptive analytics and drive AI-powered applications.