VMware Extends vSphere Capabilities for Nvidia’s AI and GPUs

29 Sep 2021 10:26am, by

VMware’s vSphere 7 Update 3 release further opens access to Nvidia vGPUs and AI through self-service API options, while also adding a number of new vSphere capabilities for applications running in virtual machines (VMs), containers or Kubernetes.

The company is also today releasing VMware vSAN 7 Update 3 to improve infrastructure support for VMware’s vSAN environments.

Both releases reflect VMware’s ambition to simplify and further automate CI/CD, infrastructure-management and IT needs for developer and operations teams across a mix of different environments, including multiclouds.

The vSphere release follows the availability, since April, of GPU giant Nvidia’s AI Enterprise suite to vSphere users, to help scale AI applications and their development across multicloud virtual infrastructures. Since then, vSphere has supported Nvidia’s AI frameworks, CUDA applications, models and SDKs, under the terms of the licensing agreement between the two companies.

With Update 3, developers and DevOps folks will be able to use Kubernetes commands to provision VMs on hosts with vGPUs, Sheldon D’Paiva, senior director of product marketing at VMware, told The New Stack. This capability will also help users build and run their AI apps on GPU-enabled hardware using a self-service model.

“When combined with all the GPU-related enhancements we introduced in Update 2, developers will have a lot of power at their fingertips,” D’Paiva said. “Prior to this capability, developers had to make requests to the IT team for GPU systems that they could run their apps on. This process is often slow and bureaucratic, taking days or even weeks before they receive what they need. This obviously slows down development velocity and, ultimately, time-to-market for new applications.”

The AI and developer-ready infrastructure features include:

  • vSphere VM Service support for vGPUs: By using the VM Service through the Kubernetes API, developers can provision VMs that leverage underlying GPU hardware resources for their AI/ML workloads (as described above).
  • Simplified setup of vSphere with Tanzu: “Faster and easier setup of networking with fewer steps and inputs needed provides improved time to value,” D’Paiva said.
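
As a rough illustration of the self-service model described in the first item, the sketch below builds the kind of VirtualMachine manifest a developer might submit through the Kubernetes API. It assumes the `vmoperator.vmware.com/v1alpha1` API group and the `className`, `imageName`, `storageClass` and `powerState` fields used by the VM Service; the namespace, VM class, image and storage policy names are hypothetical placeholders, and a vGPU-backed VM class is assumed to have been defined by an administrator beforehand.

```python
import json


def build_vgpu_vm_manifest(name, namespace, vm_class, image, storage_class):
    """Return a VirtualMachine manifest as a plain dict.

    The field names follow the VM Service (VM Operator) v1alpha1 schema
    as an assumption; check them against the version running in your
    environment before relying on them.
    """
    return {
        "apiVersion": "vmoperator.vmware.com/v1alpha1",
        "kind": "VirtualMachine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The VM class is where an administrator would attach a vGPU
            # profile; the developer only references it by name.
            "className": vm_class,
            "imageName": image,
            "storageClass": storage_class,
            "powerState": "poweredOn",
        },
    }


# All values below are hypothetical examples, not names from the article.
manifest = build_vgpu_vm_manifest(
    name="ml-training-vm",
    namespace="ai-team",
    vm_class="vgpu-a100-small",
    image="ubuntu-20-04-vgpu",
    storage_class="vsan-default-policy",
)

# kubectl accepts JSON as well as YAML, so no extra library is needed.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the printed JSON to a file and running `kubectl apply -f` against it in a suitably configured namespace would be the usual next step. The point of the sketch is that the developer names a GPU-capable VM class rather than filing a request with the IT team.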

Scaling improvements include better resilience and monitoring for persistent memory systems in vSphere, as well as NVMe/TCP support, which gives users “a fast, simple and cost-effective way to get the most out of their existing storage investments,” D’Paiva said. VMware has collaborated with Dell Technologies to deliver support for this capability.

Simplified operations include “easier setup” of NSX Security from within the vSphere Client and distributed resource scheduler (DRS) configuration enhancements to help “avoid moving (and thus disrupting) larger or more critical workloads whenever possible,” D’Paiva said.

vSAN 7 Update 3 is intended to increase the availability, security and resiliency of VMware’s infrastructure solution and to help operations teams troubleshoot their vSAN environments “quickly and easily with the new tools,” D’Paiva said. The new capabilities D’Paiva described include:

  • Enhanced Availability for Cloud Native Applications: additional “levels of fault tolerance” improve resiliency by keeping data available at the edge and in stretched clusters.
  • Simplified Troubleshooting and Remediation: new “health checks” provide metrics for improved visibility into the switch fabric that connects the vSAN hosts, which D’Paiva said “ensures higher levels of consistency across a cluster.”

Feature image via Pixabay.