
The Rise of Small Language Models

As language models evolve to become more versatile and powerful, it seems that going small may be the best way to go.
Feb 16th, 2024 3:00am

The power of large language models (LLMs) has grown substantially over the last couple of years. These versatile AI-powered tools are deep learning artificial neural networks trained on massive datasets, capable of leveraging billions of parameters (or machine learning variables) to perform various natural language processing (NLP) tasks.

These tasks run the gamut from generating, analyzing and classifying text, to producing rather convincing images from a text prompt, translating content into different languages, and powering chatbots that can hold human-like conversations. Well-known LLMs include proprietary models like OpenAI’s GPT-4, as well as a growing roster of open source contenders like Meta’s LLaMA.

But despite their considerable capabilities, LLMs can nevertheless present some significant disadvantages. Their sheer size often means they require hefty computational resources and energy to run, which can put them out of reach for smaller organizations that might not have the deep pockets to bankroll such operations. Larger models also carry the risk of algorithmic bias introduced via datasets that are not sufficiently diverse, leading to faulty or inaccurate outputs, including the fabricated responses the industry calls “hallucinations.”

How Small Language Models Stack up Next to LLMs

These issues are among the many factors behind the recent rise of small language models, or SLMs. These models are slimmed-down versions of their larger cousins, and for smaller enterprises with tighter budgets, SLMs are becoming a more attractive option: they are generally easier to train, fine-tune and deploy, and also cheaper to run.

Small language models are essentially more streamlined versions of LLMs, with smaller neural networks and simpler architectures. Compared to LLMs, SLMs have fewer parameters and don’t need as much data or time to train: think minutes or a few hours of training time, versus many hours or even days for an LLM. Because of their smaller size, SLMs are therefore generally more efficient and more straightforward to implement on-site, or on smaller devices.
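
To put that size difference in concrete terms, here is a minimal sketch, assuming the Hugging Face transformers library and PyTorch are installed, that loads a small model (DistilBERT, used purely as an example) and reports its parameter count and rough memory footprint:

    # Rough footprint check for a small model; assumes `pip install transformers torch`.
    from transformers import AutoModel

    # DistilBERT is used here only as a convenient, widely available small checkpoint.
    model = AutoModel.from_pretrained("distilbert-base-uncased")

    num_params = sum(p.numel() for p in model.parameters())
    approx_mb = num_params * 4 / 1024**2  # 4 bytes per parameter at 32-bit precision

    print(f"Parameters: {num_params / 1e6:.1f}M (~{approx_mb:.0f} MB in fp32)")

A model of this size fits comfortably in memory on an ordinary laptop, whereas the weights of the largest LLMs typically do not.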

Moreover, because SLMs can be tailored to narrower, more specific applications, they are more practical for companies that need a language model trained on a more limited dataset and fine-tuned for a particular domain.
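
As a rough illustration of that kind of domain-specific fine-tuning, here is a minimal sketch assuming the Hugging Face transformers and datasets libraries (plus PyTorch); the model choice, the support-ticket texts and the labels are all hypothetical placeholders, not taken from any real deployment:

    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    # Tiny, purely illustrative "domain" dataset of support-ticket snippets.
    data = Dataset.from_dict({
        "text": ["The password reset link is broken",
                 "My invoice total looks wrong this month",
                 "I can't log in after changing my email",
                 "Why was I charged twice for one order?"],
        "label": [0, 1, 0, 1],  # 0 = account issue, 1 = billing issue
    })

    model_name = "distilbert-base-uncased"  # a small, widely available checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

    tokenized = data.map(tokenize, batched=True)

    args = TrainingArguments(
        output_dir="domain-slm",          # where checkpoints are written
        num_train_epochs=1,               # minutes, not days, on modest hardware
        per_device_train_batch_size=2,
        logging_steps=1,
    )

    Trainer(model=model, args=args, train_dataset=tokenized).train()

A real project would use hundreds or thousands of labeled examples and a held-out evaluation set, but the overall workflow stays this compact.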

Additionally, SLMs can be customized to meet an organization’s specific requirements for security and privacy. Their smaller codebases and relative simplicity also reduce their vulnerability to malicious attacks by minimizing potential surfaces for security breaches.

On the flip side, the increased efficiency and agility of SLMs may translate to slightly reduced language processing abilities, depending on the benchmarks the model is being measured against.

Nevertheless, some SLMs, like Microsoft’s recently introduced 2.7 billion-parameter Phi-2, demonstrate state-of-the-art performance in mathematical reasoning, common sense, language understanding and logical reasoning that is remarkably comparable to, and in some cases exceeds, that of much heftier LLMs. According to Microsoft, the efficiency of the transformer-based Phi-2 makes it an ideal choice for researchers who want to improve the safety, interpretability and ethical development of AI models.
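
For readers who want to experiment with a model of this size locally, here is a minimal, illustrative sketch; it assumes a recent version of the Hugging Face transformers library (which includes the Phi architecture), enough disk space and memory for the 2.7 billion-parameter checkpoint, and a made-up prompt:

    from transformers import pipeline

    # Download and load Microsoft's Phi-2 from the Hugging Face Hub (several GB).
    generator = pipeline("text-generation", model="microsoft/phi-2")

    prompt = "Instruct: Explain why smaller language models can be cheaper to deploy.\nOutput:"
    result = generator(prompt, max_new_tokens=80, do_sample=False)

    print(result[0]["generated_text"])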

Other SLMs of note include:

  • DistilBERT: a lighter and faster version of Google’s BERT (Bidirectional Encoder Representations from Transformers), the pioneering deep learning NLP model introduced back in 2018 (see the usage sketch after this list). There are also Tiny, Mini, Small and Medium versions of BERT, which are scaled down and optimized for varying constraints, ranging from roughly 4.4 million parameters in the Tiny version to around 41 million in the Medium version. There is also MobileBERT, a version designed for mobile devices.
  • Orca 2: developed by Microsoft by fine-tuning Meta’s LLaMA 2 on synthetic data generated from a statistical model rather than drawn from real life. The result is enhanced reasoning ability, with performance in reasoning, reading comprehension, math problem solving and text summarization that can overtake that of models ten times its size.
  • GPT-Neo and GPT-J: with 125 million and 6 billion parameters respectively, these alternatives were designed by the open source AI research consortium EleutherAI as smaller, open source counterparts to OpenAI’s GPT models. These SLMs can be run on cheaper cloud computing resources from CoreWeave and TensorFlow Research Cloud.
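
To give a flavor of how little code it takes to put one of these smaller models to work, here is a brief, illustrative sketch that runs DistilBERT through the Hugging Face transformers fill-mask pipeline on a CPU; the example sentence is invented:

    from transformers import pipeline

    # DistilBERT predicting the masked word in a sentence; no GPU required.
    unmasker = pipeline("fill-mask", model="distilbert-base-uncased")

    for prediction in unmasker("Small language models are easier to [MASK] than large ones."):
        print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")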

Ultimately, the emergence of small language models signals a potential shift from expensive and resource-heavy LLMs to more streamlined and efficient language models, arguably making it easier for more businesses and organizations to adopt and tailor generative AI technology to their specific needs. As language models evolve to become more versatile and powerful, it seems that going small may be the best way to go.
