This article is a part of the series where we explore cloud-based machine learning services. After covering Azure ML Services, Google Cloud ML Engine, and Amazon SageMaker, we will take a closer look at IBM Watson Studio Cloud.
IBM has consolidated a variety of data science products and services into Watson Studio, a unified environment for preparing, exploring, and visualizing data, and for creating, training, and deploying machine learning models. Standalone offerings, including Data Science Experience Cloud and SPSS Modeler, are integrated with Watson Studio.
IBM Watson Studio Desktop Beta is available for Microsoft Windows and Mac users. It’s an IDE that addresses data prep and data modeling needs on individual desktops. The output from the tool, a fully prepared dataset, can be uploaded to the cloud for training ML models.
Like its competitors, IBM is leveraging the building blocks of its public cloud platform for Watson Studio. Large datasets are stored in IBM Cloud Object Storage (COS) while the compute service is used for training and hosting machine learning and deep learning models backed by powerful CPU and GPU infrastructure. The service is also integrated with Apache Spark clusters for distributed processing.
Getting Started with IBM Watson Studio Cloud
Watson Studio is one of the many services available under the Watson portfolio of IBM Cloud. It can be accessed from the AI section of the catalog. The Lite plan, which comes with a limit of 50 capacity unit-hours per month, is free for developers. The plan includes a single small compute environment with 1 vCPU and 4 GB RAM, which is good enough to experiment with the service.
As with most IBM Cloud services, we need to create an instance of Watson Studio, which is associated with a specific billing plan and geographic location. As mentioned above, the Lite plan is a great way to explore the service.
Once the service instance is created, the next step is to create a project that acts as the logical container for the datasets, models, deployments, and API credentials. Each project type is pre-configured for a specific task typically performed by data scientists. The Standard type is a generic project for working on any type of asset. Each project may have additional collaborators with varying access levels.
Once a project has been created, assets can be created as needed. Datasets may be uploaded to object storage, from where they can be accessed by other assets such as the Modeler and Jupyter Notebooks.
Fully prepared datasets added to the project are automatically uploaded to the IBM Cloud Object Storage service. These data assets are centralized resources available to any other asset in the project.
The next step is to add a model that gets trained based on the data asset. Watson Studio can be used for creating NLP, vision, and generic models. The Watson Machine Learning Model is ideal for classical machine learning that solves regression and classification problems.
Before starting the modeling process, the project needs to be associated with compute infrastructure. This service is branded as Watson Machine Learning. The Lite plan includes a generous limit of 5 deployed models, 5,000 predictions per month, and 50 capacity unit-hours per month. The best thing is that the plan includes NVIDIA K80 GPUs which are great for experimenting with deep learning.
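To get a feel for what these limits mean in practice, here is a minimal sketch of the arithmetic. The quota figures are the ones quoted above; the consumption rate passed to training_hours_available is a hypothetical number chosen only for illustration, since the rate depends on the environment selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LitePlanQuota:
    """Monthly limits quoted in the article for the Lite plan."""
    deployed_models: int = 5
    predictions_per_month: int = 5_000
    capacity_unit_hours: float = 50.0


def predictions_per_day(quota: LitePlanQuota, days_in_month: int = 30) -> int:
    """Whole predictions available per day if usage is spread evenly."""
    return quota.predictions_per_month // days_in_month


def training_hours_available(quota: LitePlanQuota, cuh_per_hour: float) -> float:
    """Wall-clock compute hours before the capacity unit-hour limit is reached.

    cuh_per_hour is an assumed consumption rate, not an official figure.
    """
    if cuh_per_hour <= 0:
        raise ValueError("cuh_per_hour must be positive")
    return quota.capacity_unit_hours / cuh_per_hour


quota = LitePlanQuota()
print(predictions_per_day(quota))            # 5,000 predictions over 30 days
print(training_hours_available(quota, 2.0))  # assumes 2 capacity units per hour
```

The rate actually charged for a given CPU or GPU environment should be taken from the current IBM Cloud pricing page rather than from this sketch.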
Once the dataset and training infrastructure are in place, we are all set to kick off the training job. Watson Studio supports creating a model from scratch or uploading a Predictive Model Markup Language (PMML) XML file.
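For readers who have not seen PMML before, the sketch below inspects a tiny, hand-written PMML document with Python's standard library before it would be uploaded. The model, its field names, and the summarize_pmml helper are all hypothetical, and the namespace is hard-coded to PMML 4.2, so files in other versions would need a different namespace URI.

```python
import xml.etree.ElementTree as ET

# Hand-written example document; not exported from any real tool.
PMML_TEXT = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="hypothetical house price model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="area" optype="continuous" dataType="double"/>
    <DataField name="price" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="area"/>
      <MiningField name="price" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="area" coefficient="2.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Namespace of PMML 4.2; an assumption about the file being inspected.
NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def summarize_pmml(text: str) -> dict:
    """Return the PMML version, declared data fields, and target fields."""
    root = ET.fromstring(text)
    fields = [
        field.get("name")
        for field in root.findall("pmml:DataDictionary/pmml:DataField", NS)
    ]
    targets = [
        field.get("name")
        for field in root.findall(".//pmml:MiningField", NS)
        if field.get("usageType") == "target"
    ]
    return {"version": root.get("version"), "fields": fields, "targets": targets}


print(summarize_pmml(PMML_TEXT))
```

A quick check like this helps confirm that the file declares the expected input and target columns before handing it to the service.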
The model is associated with the centralized dataset, the compute infrastructure, and a runtime based on Spark Scala.
Choosing the Automatic model type prepares the data automatically and prompts us to choose the features and labels. Depending on the label, Watson Studio suggests an appropriate technique for training the model.
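The idea of picking a technique from the label column can be illustrated with a simple heuristic. This is only a sketch of the general approach, not Watson Studio's actual logic: the three categories, the suggest_technique function, and the max_classes threshold are all assumptions made for this example.

```python
from typing import Sequence


def suggest_technique(label_values: Sequence[object], max_classes: int = 20) -> str:
    """Guess a training technique from the values in the label column.

    Two distinct values suggest binary classification, many distinct
    numeric values suggest regression, and anything else is treated as
    multiclass classification. The threshold is arbitrary.
    """
    distinct = set(label_values)
    if len(distinct) < 2:
        raise ValueError("label column needs at least two distinct values")
    if len(distinct) == 2:
        return "binary classification"
    all_numeric = all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in distinct
    )
    if all_numeric and len(distinct) > max_classes:
        return "regression"
    return "multiclass classification"


print(suggest_technique(["yes", "no", "yes"]))
print(suggest_technique(["setosa", "versicolor", "virginica"]))
print(suggest_technique([i * 1.5 for i in range(100)]))
```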
Once the training job is done, the model is saved and is ready to be deployed for predictions. A deployment results in a Web Service that exposes a REST endpoint. Developers can invoke the service by embedding the API key generated by Watson Studio.
The tool also generates a form to test the predictions, which comes in handy when evaluating the model.
The REST endpoint may be invoked from any language or even cURL. The username and password needed to generate the token may be retrieved from the Watson Machine Learning service credentials.
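The two-step flow described above, exchanging the username and password for a token and then calling the scoring endpoint with that token, might look roughly like this with Python's standard library. Both URLs are placeholders, the payload shape with fields and values is an assumption, and the requests are only built here, not sent; the real values come from the service credentials and the deployment details page.

```python
import base64
import json
import urllib.request

# Placeholder URLs; replace with the values shown for your own deployment.
TOKEN_URL = "https://example.invalid/identity/token"
SCORING_URL = "https://example.invalid/deployments/my-deployment/online"


def build_token_request(username: str, password: str) -> urllib.request.Request:
    """Build a token request that authenticates with HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    basic = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        headers={"Authorization": f"Basic {basic}"},
        method="GET",
    )


def build_scoring_request(token: str, fields: list, rows: list) -> urllib.request.Request:
    """Build a scoring request that carries the bearer token and a JSON body."""
    body = json.dumps({"fields": fields, "values": rows}).encode("utf-8")
    return urllib.request.Request(
        SCORING_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


token_request = build_token_request("my-username", "my-password")
scoring_request = build_scoring_request("token-from-step-one", ["area"], [[120.0]])
print(token_request.get_method(), token_request.full_url)
print(scoring_request.get_method(), scoring_request.full_url)
```

Passing either request to urllib.request.urlopen would send it; the token returned by the first call would then replace the placeholder string used in the second.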
IBM Watson Studio abstracts the complexity involved in training machine learning models. From data preparation to model deployment, the platform simplifies the data science workflow.
When compared to other cloud-based offerings, Watson Studio doesn’t support exporting a fully trained model. Hosting an externally-trained ML model within Watson is not supported either. IBM doesn’t seem to leverage containers and Kubernetes for scaling the training and deployment of models. However, the wizard-style development environment makes up for these limitations. As an end-to-end platform, IBM Watson Studio delivers what is expected from a typical ML PaaS.
Feature image via Pixabay.