We’ve seen a plethora of possibilities in recent years when it comes to applying artificial intelligence to our daily lives — whether that’s in the recommendation engines that underpin the digital services many of us enjoy, or behind the systems that guide autonomous vehicles or help us develop cutting-edge meta-materials. In particular, so-called deep learning — a family of advanced machine learning techniques that attempt to emulate the inner workings of the human brain — is helping us pave the way toward new life-saving medicines, supercharge natural language processing, and even identify faraway galaxies.
While there is a lot of potential, there are still issues to iron out when it comes to using deep learning in situations where data is spread out over a continuous span of time — such as tracking a patient’s health information, or the ups and downs of a particular stock in the financial markets (a.k.a. time-series data). That’s because deep learning relies on artificial neural networks (ANNs), which consist of interconnected computational nodes stacked in distinct layers. These layers work in concert to detect patterns in the data as it filters through, but because of the way the layers are structured, ANNs are better at handling discrete chunks of data that arrive at fixed intervals, and are therefore less well-suited to time-series data that transforms continuously over time.
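The layer-by-layer structure described above can be sketched as a chain of discrete transformations, where each layer’s output becomes the next layer’s input. A minimal illustration in Python (the two-layer network and its random weights are hypothetical, just to show the fixed, stepwise structure — they are not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(h, W, b):
    # One discrete layer: a linear map followed by a tanh nonlinearity.
    return np.tanh(h @ W + b)

# A tiny stack of two layers with random (hypothetical) weights.
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

x = np.array([0.5, -1.0, 2.0])   # one fixed-size input sample
h = layer(x, W1, b1)             # hidden state after layer 1
y = layer(h, W2, b2)             # output after layer 2

# The network advances its hidden state in fixed, discrete jumps:
# there is no notion of "halfway between layer 1 and layer 2".
print(y.shape)  # (2,)
```

Each pass through a layer is one indivisible step, which is why data arriving at irregular moments in time doesn’t map neatly onto this architecture.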
So it’s no wonder that late last year, when a team of researchers from the University of Toronto proposed rethinking the way ANNs are designed, other experts in the field took notice. The team’s paper puts forth what are called neural ordinary differential equations, which tackle this problem by incorporating concepts from calculus into the deep learning process rather than relying on discrete layers. Calculus is the branch of mathematics that studies continuous change, so instead of breaking the data up into separate pieces, the team’s approach is better suited to data that arrives at irregular intervals, such as a patient’s health records accumulated over a lifetime.
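The core observation behind the paper is that a residual layer, which nudges the hidden state by h ← h + f(h), looks like one Euler step of the differential equation dh/dt = f(h, t). Replace the stack of layers with an ODE solver and the hidden state becomes defined at every point in time, so it can be read out at arbitrary, irregularly spaced moments. A toy sketch of that idea (the dynamics function and observation times are illustrative, and a simple fixed-step Euler integrator stands in for the adaptive solvers used in practice):

```python
import numpy as np

def f(h, t, W):
    # "Learned" dynamics dh/dt = f(h, t); here a fixed tanh map
    # stands in for a trained network.
    return np.tanh(h @ W)

def odeint_euler(f, h0, ts, W, steps_per_interval=100):
    """Integrate dh/dt = f(h, t), returning h at each requested time.

    ts may be irregularly spaced -- e.g. timestamps of patient check-ups.
    """
    hs = [h0]
    h, t = h0.copy(), ts[0]
    for t_next in ts[1:]:
        dt = (t_next - t) / steps_per_interval
        for _ in range(steps_per_interval):
            h = h + dt * f(h, t, W)   # one Euler step
            t += dt
        hs.append(h.copy())
        t = t_next
    return np.stack(hs)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3)) * 0.1
h0 = np.array([1.0, 0.0, -1.0])

# Irregularly spaced observation times -- something a discrete
# layer stack cannot express directly.
ts = [0.0, 0.3, 1.1, 1.15, 2.7]
trajectory = odeint_euler(f, h0, ts, W)
print(trajectory.shape)  # (5, 3): one hidden state per observation time
```

In the paper itself the Euler loop is replaced by an off-the-shelf adaptive ODE solver, which chooses its own step sizes and trades accuracy against compute on the fly.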
“Some machine learning models are very interpretable but are too simple to be accurate in the real world,” paper co-author and University of Toronto graduate student Ricky Chen told The New Stack. “These are often used by statisticians, but their simplicity can lead to wrong conclusions. Others are high-performant but do not admit simple explanations for their predictions. These are often used in computer vision and natural language processing, with very good performance in real-world applications but are too unreliable for more sensitive tasks (like self-driving cars) due to their unexplainable nature. We’re hoping to build models that can be both interpretable and complex enough to reflect real-world phenomenon. Taking inspiration from the physicists’ favorite type of model — differential equations — seemed like a natural step in that direction.”
Better Modeling of Continuous Change
Not only would the team’s approach be more computationally efficient, another major advantage is that it can be applied even in situations where the algorithm doesn’t know what’s going on under the hood, making it more versatile than other methods. Such a machine learning tool would be useful for learning about complex, real-world phenomena that generate continually transforming data.
“This ranges from physical interactions, such as learning and predicting simple physics, like how objects collide, and more complex scenarios like weather or tornado forecasting, to applications that benefit from having a continuous prediction such as modeling a patient’s health records — which can, for instance, be used to determine when a patient should take certain tests, before bad symptoms show up,” explained Chen.
For now, there’s still some way to go before neural ordinary differential equations gain wider implementation, though it’s clear that the future impact of this new family of deep neural network models will likely be far-ranging. Encouragingly, the approach has already been applied to other research projects, such as modeling point clouds in 3D graphics and more accurately identifying colon glands in medical imaging. Currently, the team is working on introducing more mechanisms into their model, such as elements that would account for randomness and uncertainty in future predictions.
Images: University of Toronto