Project Tortuga: Cluster and Cloud Management for High-Performance Computing
Workload management vendor Univa has open sourced technology originally known as Unicloud, then later dubbed Navops Launch, to provide an improved on-ramp to the cloud for high-performance computing (HPC) workloads.
The technology will be released as Project Tortuga under an Apache 2.0 license. It’s a general-purpose cluster- and cloud-management framework with applications including HPC, Big Data frameworks, Kubernetes and scale-out machine learning/deep learning environments.
Enterprise customers who have compute farms of several thousand servers doing tasks such as modeling and regression are looking to move HPC workloads to the cloud, according to Gary Tyreman, Univa president and CEO.
Tortuga automates the deployment of clusters in on-premises, cloud and hybrid-cloud configurations through repeatable templates.
Handling both virtual and bare-metal environments, it includes cloud-specific adapters for Amazon Web Services, Google Cloud, Microsoft Azure, OpenStack and Oracle Cloud Infrastructure with full support for bring-your-own image (BYOI). It has a built-in policy engine that allows users to dynamically create, scale and tear down cloud-based infrastructure according to changing workload demand. Management, monitoring and accounting of cloud resources are the same as for local servers.
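To make the idea of demand-driven scaling concrete, here is a minimal sketch of how such a policy might be expressed. It is not Tortuga's actual policy engine or API; the class, field and function names are hypothetical, and the rule (a fixed number of pending jobs per cloud node, clamped between a floor and a ceiling) is an assumption chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical burst policy; field names are illustrative, not Tortuga's."""
    jobs_per_node: int = 4   # pending jobs one cloud node is assumed to absorb
    min_nodes: int = 0       # floor: zero means tear everything down when idle
    max_nodes: int = 10      # ceiling: caps the number of cloud nodes


def desired_cloud_nodes(pending_jobs: int, policy: ScalingPolicy) -> int:
    """Return how many cloud nodes the policy wants for the current backlog."""
    needed = math.ceil(pending_jobs / policy.jobs_per_node)
    return max(policy.min_nodes, min(policy.max_nodes, needed))


def scaling_action(current_nodes: int, pending_jobs: int, policy: ScalingPolicy) -> int:
    """Positive result: nodes to add. Negative: nodes to remove. Zero: no change."""
    return desired_cloud_nodes(pending_jobs, policy) - current_nodes
```

With the default values, a backlog of nine pending jobs and no running cloud nodes yields a request for three nodes, and an empty queue with three nodes running yields a request to remove all three.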
The Wharton School at the University of Pennsylvania turned to Univa Grid Engine and Navops Launch to scale its HPC environment for researchers. The combination integrated the school’s on-premises hardware with AWS compute resources with no downtime. It also enables bursting capabilities so researchers can launch as many computationally intensive jobs as they need.
As an open source project, Tortuga could take a number of different avenues, according to Rob Lalonde, vice president and general manager of Navops. Those include adapters for additional cloud providers, support for new cluster technologies and infrastructure-as-a-service platforms such as OpenShift for bursting to the cloud, a web UI and additional capabilities for the policy engine.
The company’s Navops product line is focused on the journey to cloud. The company unveiled Navops Command, an automated workload placement and policy management solution for Kubernetes, back in 2016, but HPC workloads have largely been stuck on-premises, Tyreman said. In the past two years, however, migration to the cloud has become a double-digit percentage of the company’s accounts.
“We see those two combining where people are building and looking to us to build a cloud-native HPC environment. That means there’s container-based runtimes or applications in the environment that are suited for Kubernetes, container-based runtimes or applications that are HPC-oriented, meaning batch, as well as non-container-based workloads. Stretching Kubernetes on-premise and to the cloud is one of the key value props,” he said.
Launch consists of a core that uses Puppet to create resource adapters that can speak to bare-metal provisioning as well as cloud APIs.
“By pointing to a different resource adapter, I can say, ‘Give me a node that has this software and this resource adapter that has that hardware.’ That hardware can be in the cloud or behind your firewall,” Tyreman said.
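The pattern Tyreman describes, where the same node request is satisfied by whichever resource adapter it is pointed at, can be sketched roughly as follows. This is an illustration only, not Tortuga's code: the class names, the `add_node` helper, the profile strings and the adapter registry are all hypothetical, and the adapters return descriptive strings instead of provisioning anything.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRequest:
    """Hypothetical request: what software to install on what kind of hardware."""
    software_profile: str
    hardware_profile: str


class ResourceAdapter(ABC):
    """Common interface; each backend decides how a node is actually obtained."""

    @abstractmethod
    def provision(self, request: NodeRequest) -> str:
        ...


class BareMetalAdapter(ResourceAdapter):
    def provision(self, request: NodeRequest) -> str:
        # A real adapter would drive bare-metal provisioning here.
        return f"on-premises node [{request.hardware_profile}] running {request.software_profile}"


class CloudAdapter(ResourceAdapter):
    def __init__(self, provider: str) -> None:
        self.provider = provider

    def provision(self, request: NodeRequest) -> str:
        # A real adapter would call the cloud provider's API here.
        return f"{self.provider} instance [{request.hardware_profile}] running {request.software_profile}"


# Illustrative registry: the caller selects a backend by name only.
ADAPTERS = {
    "local": BareMetalAdapter(),
    "aws": CloudAdapter("aws"),
}


def add_node(adapter_name: str, request: NodeRequest) -> str:
    """Satisfy the same request through whichever adapter is named."""
    return ADAPTERS[adapter_name].provision(request)
```

In this sketch, calling `add_node("local", request)` and `add_node("aws", request)` with the same `NodeRequest` differs only in the adapter name, which mirrors the point that the hardware can sit in the cloud or behind the firewall.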
The company has created some confusion because a part of Navops Command, which is focused on Kubernetes, was also called Navops Launch. That was part of Unicloud, which involved Puppet system configuration tools, Docker containers, Kubernetes and the minimalist Atomic Host Linux from Red Hat. Navops Command added scheduling and policy management capabilities to the mix.
In a previous post for The New Stack, Lalonde wrote about achieving cloud-native HPC capabilities in a mixed-workload environment.
Navops is a sponsor of The New Stack.