AWS Launches an IDE to Manage the Full Machine Learning Lifecycle

Dec 3rd, 2019 2:08pm by

As it did last year, Amazon Web Services made machine learning and artificial intelligence a big theme of its annual Re:Invent user conference, being held in Las Vegas this week. AWS CEO Andy Jassy devoted nearly an hour of his three-hour keynote to new AI/ML technologies and services that the company has launched for the show.

Perhaps most notably for developers, AWS has launched a fully integrated development environment (IDE), called SageMaker Studio, to manage the complete lifecycle of machine learning jobs that are being developed and run on the cloud service. It unifies all of the company’s tools under a single user interface on the SageMaker platform, which the company released two years ago. The launch also includes a number of additional capabilities, including debugging, monitoring and even the automatic creation of ML models.

SageMaker Studio “pulls together for the first time dozens of machine learning tools within a single pane of glass,” said Matt Wood, AWS vice president of artificial intelligence, also during the keynote. “Our aim is to put machine learning in the hands of more developers and data scientists than ever before.”

SageMaker Studio gives developers the ability to build, train, tune, and deploy their ML models from a central console. Developers can easily switch between steps, the company promises, so they can iterate faster. Folders can be created to manage and share all the resources of a given project.

“The updates to SageMaker are huge. AWS now has an end-to-end and complete ML PaaS,” The New Stack contributing analyst Janakiram MSV tweeted from the show floor.

Concept Drift No Longer

The atomic unit of work for SageMaker Studio will be a notebook, a template for code and comments based on the open source Jupyter Notebooks. Developers can work in SageMaker Notebooks (currently in preview), and when they are run, AWS can automatically allot the correct amount of hardware and cloud services, eliminating the need for the dev to manage infrastructure.

As training models requires a lot of iterative testing, AWS assembled SageMaker Experiments, which automatically captures the input parameters, configuration and results of each experiment, providing a way to inspect and compare multiple training runs in real time. The company has also launched Amazon SageMaker Processing, a Python SDK that can run preprocessing, postprocessing and model evaluation workloads, all on fully managed infrastructure. In particular, this can help with all the messy preprocessing work, such as converting the data set to the input format, rescaling or normalizing numerical features, and transforming data into richer formats (replacing mailing addresses with GPS coordinates, for instance).
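To make the "rescaling or normalizing numerical features" step concrete, here is a minimal sketch in plain Python. It does not use the SageMaker Processing SDK; the function names and the sample column of house sizes are hypothetical, chosen only to illustrate the two transformations.

```python
from statistics import mean, pstdev


def min_max_rescale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical column of house sizes in square feet.
square_feet = [850, 1200, 1500, 2400]
rescaled = min_max_rescale(square_feet)
standardized = standardize(square_feet)
```

In a managed preprocessing job, logic of this kind would typically live in a script that runs over the whole data set before training.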

Several tools have been added to improve the performance of the models themselves. A new debugger has been added to scrutinize why ML models aren’t getting optimal results. It inspects your models, automatically collects operational metrics, and provides alerts and advice on optimizing training times and improving model quality. This SDK can work with TensorFlow, Keras, Apache MXNet, PyTorch and XGBoost. The results can be viewed in notebooks from directly within SageMaker Studio.

Also on hand to help is the SageMaker Model Monitor, which automatically monitors ML models during runtime, sending out alerts when data quality issues appear. Jassy noted that this monitor is especially good at detecting the issue of “concept drift,” where a previously successful model ceases being useful because of some recent change in the basic training data (an abrupt shift in housing prices, for instance). It creates a baseline set of statistics during training, which is then used to compare the results of the model in production. Alerts can be fed back to SageMaker Studio and/or to CloudWatch. You can even tie it to SageMaker Processing, where it will infer a new model that can be quickly spun up.
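The baseline-versus-production comparison can be illustrated with a small sketch in plain Python. This is not the Model Monitor API: the function names, the choice of mean and standard deviation as the baseline statistics, and the three-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, pstdev


def build_baseline(training_values):
    """Summarize a training-time feature with its mean and standard deviation."""
    return {"mean": mean(training_values), "std": pstdev(training_values)}


def drift_detected(baseline, production_values, threshold=3.0):
    """Flag drift when the production mean sits more than `threshold`
    baseline standard deviations away from the baseline mean."""
    if baseline["std"] == 0:
        return mean(production_values) != baseline["mean"]
    shift = abs(mean(production_values) - baseline["mean"]) / baseline["std"]
    return shift > threshold


# Hypothetical house prices, in thousands of dollars.
training_prices = [300, 310, 295, 305, 290]
baseline = build_baseline(training_prices)

steady = drift_detected(baseline, [298, 307, 302])   # close to the baseline
shifted = drift_detected(baseline, [420, 435, 410])  # abrupt price jump
```

A real monitor would track many features and richer statistics, but the shape is the same: compute a summary once at training time, then raise an alert when live data moves too far from it.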

And if you are wondering if AWS could just automate the entire process of creating useful ML models, then take a look at SageMaker Autopilot, which was designed to inspect a set of raw data, stored as object data in S3 for instance, and automatically create a set of classification and regression machine learning models. This eliminates the tedious chore of creating a predictive model by hand, through multiple iterations and slight adjustments to the foundational algorithms. From this work, it trains an Inference Pipeline, which can then be put to work at a real-time endpoint or prepped for batch processing.

Beyond SageMaker, the company has also introduced a new series of compute instances specially crafted for the inferencing portion of ML workloads. While everyone worries about the testing phase, Amazon itself has found that the majority of hardware costs in ML workloads, such as supporting the Alexa voice command service, come from the inferencing (customer-facing) part of the workload, Jassy noted. The Inf1 instances, with up to 16 ARM-based AWS Inferentia chips each, promise to offer a 40% lower cost per inference, and 3x the performance, compared with current G4 instances.

In the keynote, Wood sketched out how Autopilot, along with the other SageMaker features, could be used in production. He wanted to predict housing prices in the U.S. All he needed was a CSV file with sales data of U.S. housing, along with features of each house (number of bedrooms, etc.). Autopilot can run up to 50 models at once. “The dirty secret of ML is that you don’t just train a single model, you train dozens and pick the best ones,” Wood quipped. Autopilot iteratively homes in on the best set of algorithms, data features and parameters to provide the best possible models. You can use the debugging tool to understand how each model was created, and, when the model you choose gets put into operation, the monitoring service will let you know when predictions start to differ from the baseline, which would signal the need for a remodeling.
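The "train dozens and pick the best ones" idea can be shown at toy scale with a plain-Python sketch. It is not how Autopilot is implemented and uses no AWS API; the two candidate models, the bedroom and price figures, and the use of mean squared error on a held-out set are all assumptions for illustration.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Candidate 1: always predict the average training price."""
    avg = mean(ys)
    return lambda x: avg


def fit_linear(xs, ys):
    """Candidate 2: a one-feature least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var_x
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared prediction error on a held-out set."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


# Hypothetical data: number of bedrooms -> sale price in thousands of dollars.
train_x, train_y = [1, 2, 3, 4], [150, 200, 250, 300]
val_x, val_y = [2, 5], [210, 340]

# Train every candidate, score each on the validation set, keep the best.
candidates = {"mean": fit_mean, "linear": fit_linear}
scores = {
    name: mean_squared_error(fit(train_x, train_y), val_x, val_y)
    for name, fit in candidates.items()
}
best_name = min(scores, key=scores.get)
```

An automated service would search over far more algorithms, feature transformations and hyperparameters, but the selection step is the same comparison of validation scores.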
