Facebook, Microsoft Bring Interoperable Models to Machine Learning Toolkits
A growing number of machine learning toolkits, especially for deep learning, have become available recently for developers; Google, Microsoft, Amazon, Facebook, Samsung, Intel and Baidu all have released their own, plus there are key frameworks like Torch, Caffe2 and Theano, and newer frameworks like Marvin that have come from academic groups.
Different machine learning teams use different frameworks, especially as they all have different advantages and disadvantages, but until now, there haven’t been many options for moving from one framework to another without completely recreating your data model.
Reproducibility is important for researchers who want to compare and replicate results published by other teams who use different tools or programming languages, as well as for developers who want to prototype a data model in one framework and then use a different toolkit in their production system. That usually means throwing away the work that’s been done and recreating it in the new framework, which slows teams down just at the point that they’re ready to roll out their machine learning system.
One way around this is to use Keras, a Python library that offers high-level building blocks for developing deep learning models using TensorFlow, Theano, Deeplearning4j and the Microsoft Cognitive Toolkit (CNTK). The same model can be used with different frameworks just by setting flags, without changing any code. Testing with an early version of the CNTK integration shows that the different frameworks give better results in different areas. Keras is designed to be more user-friendly than the frameworks themselves and it’s excellent for prototyping and experimentation.
It isn’t as good at scaling past a single-node machine though, which is increasingly the direction that deep learning is taking (to scale out larger, deeper networks); you have to add in other tools like Elephas to do that effectively. And because the Keras API acts as an abstraction layer, it doesn’t always expose the more powerful options in the underlying toolkits unless you tweak the Keras framework itself.
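As a rough illustration of what “setting flags” means for Keras, the sketch below assumes the two mechanisms Keras has documented for choosing a backend: the `KERAS_BACKEND` environment variable and the `backend` field of a `keras.json` config file. The helper names (`write_keras_config`, `select_backend_for_process`) and the list of backend names are hypothetical and chosen for this example; the script writes to a temporary directory rather than to a real Keras config location.

```python
import json
import os
import tempfile

# Backend names assumed for this sketch; check the Keras release you use.
KNOWN_BACKENDS = ("tensorflow", "theano", "cntk")


def write_keras_config(backend, config_dir):
    """Write a keras.json that selects `backend` and return the file path."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError("unknown backend: %r" % (backend,))
    config = {
        "backend": backend,
        "floatx": "float32",
        "epsilon": 1e-07,
        "image_data_format": "channels_last",
    }
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "keras.json")
    with open(path, "w") as fh:
        json.dump(config, fh, indent=2)
    return path


def select_backend_for_process(backend):
    """Set KERAS_BACKEND for this process; do this before importing keras."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError("unknown backend: %r" % (backend,))
    os.environ["KERAS_BACKEND"] = backend


if __name__ == "__main__":
    # Write the config into a scratch directory instead of the user's home.
    scratch = tempfile.mkdtemp()
    print(write_keras_config("cntk", scratch))
    select_backend_for_process("theano")
    print(os.environ["KERAS_BACKEND"])
```

The model-building code itself stays the same in either case; only the configuration changes, which is the property the article describes.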
The Open Neural Network Exchange (ONNX) format that Facebook and Microsoft have collaborated on takes a different approach. It’s an open-source standard for representing deep learning models that will let you transfer them between CNTK, Caffe2 and PyTorch.
Unlike Keras, ONNX is just standardizing the way the data model is represented. Each framework can be used directly, but the model can be reused across different frameworks.
It’s common for machine learning projects to use multiple frameworks at different stages of development; one framework may be optimized for fast training (great for prototyping ideas) while another works with different network architectures, so it is more suited for building a final product. Now that smartphones often have an “AI core” in their systems-on-a-chip, more of the work can be done on devices, so a framework like Caffe2 would be a natural choice, given that it is optimized for inferencing on the phone. ONNX makes it easier to move from one framework to another.
“The main intent of Microsoft and Facebook’s work on ONNX is to address and (hopefully) repair the disconnects that can occur when AI projects move from dev/test into production,” Pund-IT Principal Analyst Charles King told The New Stack. “Support for data/model interchangeability is critical for developers hoping to create applications for multiple platforms since major vendors are also moving forward with their own proprietary AI frameworks (Apple’s Core ML is a good example),” King noted.
“It’s a nice way to bring different frameworks together so we can exchange models, because every framework has its own strengths and weaknesses and this will free people from being tied to one,” Microsoft Technical Fellow Xuedong Huang agreed.
“CNTK is fantastic for massive scale deep learning; if you have a massive amount of data, it’s the fastest. Caffe2 is very good for creating small models for computer vision workloads. PyTorch is very good at giving you the flexibility you need in research; it’s slow but it has the flexibility. When you move from early-stage research to massive industrial research, you might need to switch from PyTorch to CNTK, for example, and with ONNX, now you can easily do that.”
ONNX is also convenient for silicon vendors who are increasingly adding optimizations to speed up neural networks; Nvidia’s Deep Learning SDK covers a wide range of frameworks, but the more frameworks share the same representation for data models, the less work it is to optimize for them.
The latest versions of Caffe2 and PyTorch already have ONNX support and the next release of CNTK will add it. Microsoft and Facebook are also working on reference implementations, tools and a “model zoo” of model configurations that you can use to get started on a machine learning project quickly.
Many deep learning toolkits use computation graphs to represent neural networks, but they’ve always had their own graph formats. ONNX has a definition of an extensible computation graph model that all three frameworks can use, as well as definitions for standard data types and built-in operators. The ONNX computation graph is a list of nodes with inputs and outputs. Each node calls an operator; the operators aren’t stored in the graph, but each framework that supports ONNX will implement the operators for the data types the standard supports.
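To make the idea concrete, here is a toy sketch of a computation graph stored as a list of nodes that name their operator, inputs and outputs, with the operator implementations supplied separately by each “framework.” It is not the ONNX file format (which is defined by the ONNX specification) and does not use any ONNX library; the class names, the two operator tables and the `run` function are all hypothetical, and only the operator names `Add` and `Relu` are borrowed from ONNX.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    op_type: str        # operator name; the implementation lives elsewhere
    inputs: List[str]   # names of values this node consumes
    outputs: List[str]  # names of values this node produces


@dataclass
class Graph:
    nodes: List[Node]   # assumed to be listed in execution order
    inputs: List[str]
    outputs: List[str]


# Two stand-in "frameworks", each implementing the same operators its own way.
FRAMEWORK_A_OPS: Dict[str, Callable] = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Relu": lambda x: [max(0.0, v) for v in x],
}
FRAMEWORK_B_OPS: Dict[str, Callable] = {
    "Add": lambda a, b: list(map(sum, zip(a, b))),
    "Relu": lambda x: [v if v > 0 else 0.0 for v in x],
}


def run(graph: Graph, ops: Dict[str, Callable], feeds: Dict[str, list]) -> list:
    """Evaluate the graph using one framework's operator table."""
    values = dict(feeds)
    for node in graph.nodes:
        impl = ops[node.op_type]  # look the operator up by name
        args = [values[name] for name in node.inputs]
        # This sketch only handles single-output operators.
        values[node.outputs[0]] = impl(*args)
    return [values[name] for name in graph.outputs]


# y = Relu(Add(x, b)), described once and shared by both operator tables.
GRAPH = Graph(
    nodes=[
        Node("Add", ["x", "b"], ["sum"]),
        Node("Relu", ["sum"], ["y"]),
    ],
    inputs=["x", "b"],
    outputs=["y"],
)

if __name__ == "__main__":
    feeds = {"x": [1.0, -2.0, 3.0], "b": [0.5, 0.5, -5.0]}
    print(run(GRAPH, FRAMEWORK_A_OPS, feeds))
    print(run(GRAPH, FRAMEWORK_B_OPS, feeds))
```

The graph carries only names and wiring, so the same description can be handed to either operator table; that separation is what lets a shared graph format work across frameworks that each implement the operators themselves.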
The first version concentrates on the operators and types for inferencing, but recurrent neural networks are a high priority for the PyTorch team working on ONNX support.
ONNX isn’t the only attempt to simplify the task of using multiple frameworks, King points out. The Predictive Model Markup Language (PMML) supports model interoperability for traditional neural networks, like backpropagation networks. A number of companies, including Amazon, IBM and Google, support and use PMML. The group that created PMML is currently developing a successor technology, Portable Format for Analytics (PFA), that is more flexible (using JSON instead of XML) and also handles data preparation along with the models themselves.
“ONNX should benefit a range of AI and associated machine learning (ML) and deep learning (DL) processes, especially if it grows beyond the initial support,” King said. “Whether Microsoft and Facebook’s ONNX eventually emerges as an accepted standard remains to be seen but the two companies are heading in the right direction.”
Microsoft is a sponsor of The New Stack.
Feature image via Pixabay.