Lifelong Machine Learning: Machines Teaching Other Machines
For humans, learning happens over a lifetime: we gain, share and develop skills along the way, and continuously adapt them to new situations. We don’t usually think of machines as learning in such a collaborative way, or over the long term. However, new research into a subset of machine learning called lifelong learning (LL) suggests that machines are indeed capable of this human-like learning, in which they learn and accumulate knowledge over time and build upon it to adapt their skills to new scenarios.
Now, a team of researchers from the University of Southern California, led by Laurent Itti and graduate student Yunhao Ge, has developed a tool that allows artificially intelligent agents to engage in this type of continuous and collective learning. In a recently published paper titled “Lightweight Learner for Shared Knowledge Lifelong Learning,” the researchers describe how their Shared Knowledge Lifelong Learning (SKILL) tool helped AI agents each initially learn one of 102 different image recognition tasks, then share their know-how with other agents over a decentralized communication network. This collective transmission of knowledge eventually leads to all agents mastering all 102 tasks, while each still retains the knowledge of its initially assigned task.
“It’s like each robot is teaching a class on its specialty, and all the other robots are attentive students,” explained Ge in a statement. “They’re sharing knowledge through a digital network that connects them all, sort of like their own private internet. In essence, any profession requiring vast, diverse knowledge or dealing with complex systems could significantly benefit from [AI using] this SKILL technology.”
Avoiding ‘Catastrophic Forgetting’
Lifelong learning is a relatively new field in machine learning in which AI agents learn continually as they encounter new tasks. The goal of LL is for agents to acquire knowledge of novel tasks without forgetting how to perform previous ones. This approach differs from typical “train-then-deploy” machine learning, where agents cannot learn progressively without “catastrophic interference” (also called catastrophic forgetting): the AI abruptly and drastically forgets previously learned information upon learning new information.
According to the team, their work represents a potentially new direction in the field of lifelong machine learning, as current work in LL typically involves a single AI agent learning tasks sequentially, one at a time.
In contrast, SKILL involves many AI agents all learning at the same time, in parallel, which significantly accelerates the learning process. The team’s findings demonstrate that with SKILL, the time required to learn all 102 tasks is reduced by a factor of 101.5, which could be a huge advantage when self-supervised AI learning is deployed in the real world.
“Most current LL research assumes a single agent that sequentially learns from its own actions and surroundings, which, by design, is not parallelizable over time and/or physical locations,” explained the team.
“In the real world, tasks may happen in different places. [...] SKILL promises the following benefits: speed-up of learning through parallelization; ability to simultaneously learn from distinct locations; resilience to failures as no central server is used; possible synergies among agents, whereby what is learned by one agent may facilitate future learning by other agents.”
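The speedup claim above can be illustrated with some back-of-the-envelope arithmetic. This is not the paper’s measurement, just a sketch of why the reported 101.5x for 102 tasks is close to the ideal of 102x: every per-head sharing cost below is an assumed, illustrative value.

```python
# Illustrative arithmetic only, not reproduced from the paper: with N
# agents learning N tasks in parallel, the ideal speedup over a single
# sequential agent is N. The small gap from that ideal reflects the
# (assumed) cost of sharing and absorbing the other agents' heads.
N = 102                    # number of tasks, one per agent
T = 1.0                    # time for one agent to learn one task (arbitrary units)
share_cost = 0.00005 * T   # assumed per-head cost of receiving a shared head

sequential_time = N * T                    # one agent, all tasks back to back
parallel_time = T + (N - 1) * share_cost   # all agents learn at once, then share
speedup = sequential_time / parallel_time
print(f"speedup ≈ {speedup:.1f}x (ideal would be {N}x)")
```

The smaller the shared head relative to the full model, the closer the speedup gets to the ideal factor of N, which is why SKILL keeps the shared modules lightweight.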
‘Common Neural Backbone’
To create SKILL, the researchers took inspiration from neuroscience, in particular zeroing in on the theory of the “grandmother cell” or gnostic neuron — a hypothetical neuron that represents a complex but specific concept or object. This neuron is activated when the person senses or perceives that specific entity.
For the researchers, the theory of the grandmother cell translated into an approach of designing lightweight lifelong learning (LLL) agents with a common, generic and pre-trained neural “backbone” capable of tackling image-based tasks. As the team points out, this method enables “distributed, decentralized learning as agents can learn their own tasks independently”. Because the learning happens in parallel, the technique also makes accelerated and scalable lifelong learning possible.
“Agents use a common frozen backbone and only a compact task-dependent ‘head’ module is trained per agent and task, and then shared among agents,” clarified the team. “This makes the cost of both training and sharing very low. Head modules simply consist of a classification layer that operates on top of the frozen backbone, and a set of beneficial biases that provide lightweight task-specific re-tuning of the backbone, to address potentially large domain gaps between the task-agnostic backbone and the data distribution of each new task.”
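The frozen-backbone-plus-head idea the team describes can be sketched in a few lines. This is a deliberately minimal stand-in, not the authors’ implementation: the `backbone` function, `Head` class and the agent dictionaries below are all illustrative names, and the real SKILL backbone is a large pre-trained image network rather than a toy projection.

```python
def backbone(x):
    """Stand-in for the frozen, task-agnostic backbone: a fixed feature
    extractor that is shared by all agents and never trained."""
    return [xi * xi for xi in x] + [sum(x)]

class Head:
    """Compact task-specific module trained per agent and task: a set of
    lightweight biases re-tuning the backbone's features, plus a linear
    classification layer on top (both hypothetical simplifications)."""
    def __init__(self, n_features, n_classes):
        self.biases = [0.0] * n_features                       # task-specific re-tuning
        self.weights = [[0.0] * n_features for _ in range(n_classes)]

    def predict(self, x):
        feats = [f + b for f, b in zip(backbone(x), self.biases)]
        scores = [sum(w * f for w, f in zip(row, feats)) for row in self.weights]
        return scores.index(max(scores))                       # predicted class index

# Sharing knowledge means shipping only the small head, never the
# backbone, which keeps both training and communication costs low.
agent_a = {"task_0": Head(n_features=3, n_classes=2)}  # agent A learned task 0
agent_b = {}
agent_b.update(agent_a)   # agent B receives A's head and can now do task 0
```

The design point this sketch captures is that the expensive, shared part (the backbone) is fixed and identical everywhere, so agents only ever exchange the cheap, task-specific parts.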
The researchers say that SKILL is similar to crowdsourcing, where a group of people share their skills and knowledge to find a common solution to a problem. They believe that machines could use a similar approach to become “comprehensive assistants” to aid human professionals in fields like medicine. In conjunction with other emerging fields of research like social intelligence for AI, other experts point out that lifelong machine learning could be crucial in developing artificial general intelligence (AGI).
Read more in the team’s paper.