The COVID-19 pandemic accelerated the adoption of artificial intelligence (AI) and machine learning in 2021. Enterprises’ need for automation, combined with advances in AI hardware and software, is turning applied AI into a reality.
Here are five AI trends to expect in 2022:
Trend 1: Large Language Models (LLMs) Define the Next Wave of Conversational AI
Language models use natural language processing techniques and algorithms to determine the probability of a given sequence of words occurring in a sentence. These models can predict the next word in a sentence, summarize textual information, and even create visual charts from plain text.
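To make the idea of "the probability of a sequence of words" concrete, here is a minimal sketch of a bigram language model in Python. It is a toy illustration only: the three-sentence corpus and the function names are assumptions made up for this example, and real LLMs use neural networks rather than raw word counts.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def next_word_probability(counts, prev, word):
    """Estimate P(word | prev) from the bigram counts."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

def sequence_probability(counts, sentence):
    """Multiply the bigram probabilities along the whole sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        prob *= next_word_probability(counts, prev, curr)
    return prob

def predict_next_word(counts, prev):
    """Return the most frequent follower of `prev`, or None if unseen."""
    followers = counts[prev]
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical toy corpus, invented for this sketch.
corpus = [
    "the model predicts the next word",
    "the model summarizes the text",
    "the next word is predicted",
]
model = train_bigram_model(corpus)

print(predict_next_word(model, "next"))
print(sequence_probability(model, "the model predicts the next word"))
```

A sequence that never appeared in the training text gets a probability of zero here, which is one reason practical models rely on smoothing or learned representations instead of plain counts.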
Large language models (LLMs) are trained on massive text datasets. Google’s BERT and OpenAI’s GPT-2 and GPT-3 are some examples of LLMs. GPT-3 has 175 billion parameters and was trained on 570 gigabytes of text. These models can generate anything from simple essays to complex financial models.
The South Korean company Naver announced that it had built one of the most comprehensive AI-based language models, HyperCLOVA, a GPT-3-like Korean language model. Huawei’s PanGu-Alpha and Baidu’s Ernie 3.0 Titan are trained on terabytes of Chinese datasets comprising ebooks, encyclopedias, and social media.
In 2022, we will see large language models becoming the foundation for next-generation conversational AI tools.
Trend 2: The Rise of Multimodal AI
Deep learning algorithms have traditionally focused on training models on one source of data. For example, a computer vision model is trained on a set of images, while an NLP model is trained on textual content. Speech processing deals with acoustic model creation, wake word detection, and noise cancellation. This type of machine learning is associated with single-modal AI, where the outcome is mapped to a single type of data: images, text, or speech.
Multimodal AI is the ultimate convergence of computer vision and conversational AI models to deliver powerful scenarios closer to human perception. It takes AI inference to the next level, combining visual and speech modalities.
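One simple way to combine modalities is late fusion, where each single-modal model scores the input on its own and the scores are then merged. The Python sketch below illustrates that idea under stated assumptions: the class labels, scores, and weights are hypothetical, and this is not how DALL-E, MUM, or GauGAN2 work internally.

```python
def late_fusion(modality_scores, weights):
    """Weighted average of per-class probabilities from each modality."""
    total_weight = sum(weights[name] for name in modality_scores)
    # Assumes every modality reports scores for the same set of labels.
    labels = next(iter(modality_scores.values())).keys()
    return {
        label: sum(
            weights[name] * scores[label]
            for name, scores in modality_scores.items()
        ) / total_weight
        for label in labels
    }

# Hypothetical outputs of a vision model and a speech model for one input.
vision_scores = {"dog": 0.7, "cat": 0.3}
speech_scores = {"dog": 0.4, "cat": 0.6}

fused = late_fusion(
    {"vision": vision_scores, "speech": speech_scores},
    {"vision": 0.5, "speech": 0.5},  # equal trust in both modalities (assumption)
)
best_label = max(fused, key=fused.get)
print(best_label, fused)
```

The weights let a system lean on whichever modality is more reliable in a given setting, for example trusting vision more in a noisy room.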
The most recent example of multimodal AI is DALL-E from OpenAI, which can generate images from text descriptions. The name is a portmanteau of the artist Salvador Dalí and Pixar’s WALL·E. For example, when the text prompt “a clock in the shape of a doughnut” is sent to the model, it generates images that match the description.
Google’s Multitask Unified Model (MUM) is another example of multimodal AI. It promises to enhance users’ search experience by prioritizing the results based on contextual information mined from 75 different languages. MUM uses the T5 text-to-text framework and, according to Google, is 1,000 times more powerful than BERT, the popular transformer-based model for natural language processing.
NVIDIA’s GauGAN2 model generates photorealistic images from simple textual input. It combines segmentation mapping, inpainting, and text-to-image generation in a single model, making it a powerful tool to create photorealistic art with a mix of words and drawings.
Going forward, we will see the convergence of computer vision and language/speech models that make AI more natural and richer.
Trend 3: Simplified and Streamlined MLOps
Machine learning operations (MLOps), the practice of putting machine learning into industrial production, is complex! The sheer number of tools and frameworks available for implementing MLOps makes it overwhelming.
MLOps today, in many ways, is similar to DevOps in 2012. Organizations soon realized how valuable DevOps was, but they struggled to implement it due to a lack of guidance. The toolchain was complex, and the ecosystem was fragmented.
MLOps covers everything from installing and configuring the training and inference infrastructure to configuring the feature store and the model registry, monitoring the models for decay, and detecting model drift.
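Drift detection, one of the tasks listed above, can be sketched in a few lines of Python. This example uses the population stability index (PSI) to compare a feature's training-time distribution with live data. The choice of PSI, the 0.2 threshold (a common rule of thumb), the function names, and the sample data are all assumptions for illustration, not a method prescribed by any particular MLOps platform.

```python
import math

def population_stability_index(reference, live, bins=10):
    """Compare two samples of one numeric feature; larger means more drift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_frac = bin_fractions(reference)
    live_frac = bin_fractions(live)
    return sum(
        (l - r) * math.log(l / r) for r, l in zip(ref_frac, live_frac)
    )

def has_drifted(reference, live, threshold=0.2):
    """Flag drift when PSI exceeds the threshold (rule of thumb, an assumption)."""
    return population_stability_index(reference, live) > threshold

# Hypothetical feature values: training data versus two live samples.
training_values = [i / 100 for i in range(100)]
live_unchanged = list(training_values)
live_shifted = [0.5 + i / 200 for i in range(100)]

print(has_drifted(training_values, live_unchanged))
print(has_drifted(training_values, live_shifted))
```

In practice a check like this would run on a schedule against recent inference inputs, with an alert or retraining job triggered when the flag is raised.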
MLOps is one of the concepts that has been incorporated into cloud-based ML platforms such as Amazon Web Services’ Amazon SageMaker, Azure ML, and Google Vertex AI. These capabilities, however, cannot be used in hybrid and edge computing environments. As a result, monitoring models at the edge proves to be a significant challenge for enterprises, and it becomes even more challenging when dealing with computer vision and conversational AI systems.
Thanks to the maturity of open source projects such as Kubeflow and MLflow, MLOps has become quite accessible. Over the coming years, we will see a streamlined and simplified approach to MLOps that spans cloud and edge computing environments.
Trend 4: AI-Driven Developer Productivity
AI will influence almost every aspect of IT, including programming and development.
During the last couple of years, we have seen tools such as Amazon CodeGuru that provide intelligent recommendations to improve code quality and identify an application’s most expensive lines of code. More recently, GitHub Copilot debuted as an “AI pair programmer” to assist developers in writing efficient code. Salesforce research teams have launched CodeT5, an open source project that will help Apex developers with AI-powered coding. Tabnine, formerly Codota, brought intelligent code completion to mainstream development environments. Ponicode is another AI-driven tool that can create, visualize, and run unit tests for functions.
The rise of large language models (LLMs) and the broader availability of open source code are enabling IDE vendors to build intelligent code generation and analysis. Going forward, expect to see tools that can generate high-quality and compact code from inline comments. They will even be able to translate code written in one language to another, enabling application modernization by converting legacy code to modern languages.
Trend 5: New Verticalized AI Solutions from Platform Companies
Leading AI vendors, including Amazon, Google, and Microsoft, are focusing on commercializing their research and development efforts. They offer managed services through their cloud platforms or build hardware appliances that come with AI accelerators and pre-trained models targeting specific scenarios.
Amazon Connect and Google Contact Center AI are classic examples of vertical integration. Both leverage machine learning to provide intelligent routing, bot-driven conversations, and automated assistance for contact center agents.
When it comes to appliances, AWS Panorama and Azure Percept are built to deliver turnkey AI capabilities. AWS Panorama connects to existing IP cameras to perform computer vision-based inference. Customers can train new models in the cloud and deploy them to Panorama devices at the edge. Azure Percept takes a similar approach to deliver computer vision and conversational AI capabilities at the edge. Microsoft has built Percept on the IoT, AI, and edge computing services already available on Azure.
Finally, services such as Amazon Lookout for Equipment and Google Cloud Visual Inspection AI leverage cloud-based AI platforms to perform predictive maintenance of devices and anomaly detection in products. These services are highly customized for retail and manufacturing verticals.
In 2022, we will see AI platform and cloud providers leverage cutting-edge research and existing managed services to deliver solutions targeting niche use cases and scenarios.