
3 DevOps Insights from NIO, Maker of Self-Driving Cars, DC/OS and Kubernetes Power User

15 Feb 2018 10:05am, by Gou Rao, Portworx
Gou Rao, Portworx co-founder and CTO, was previously CTO of Dell’s Data Protection division and of Citrix Systems’ ASG; co-founder and CTO of Ocarina Networks and of Net6; and a key architect at Intel and Lockheed Martin. He holds computer science bachelor’s (Bangalore University) and master’s (University of Pennsylvania) degrees.

One of the benefits of working at a startup is developing really deep relationships with customers. As chief technology officer, I love working with each and every one of our customers. They inspire me and our team to come to work every day and innovate. I’ve also learned a ton from them. Since we make a cloud-native storage solution for containerized applications, our customers are by definition on the bleeding edge of technology. Today’s post is about the industry and technology insights that I’ve learned from one of the coolest companies I’ve ever worked with: NIO.

It is hard to make a short list of the cool attributes of NIO, but here are a few of my favorites.

We recently sat down with Satya Komala, head of autonomous vehicle cloud at NIO, to talk about some of the higher-level issues they’ve had to overcome to run their platform on containers, Mesosphere DC/OS and Kubernetes. Here are three things I learned from my discussions with NIO.

IoT is Generating a Massive Amount of Data, We Need to Think Systems, Not Clouds.

When we think of the Internet of Things (IoT), we've been trained to think of internet-connected toasters, fridges and thermostats. There might be a lot of these devices, but they won't produce much data. The much bigger part of IoT will be industrial devices, like self-driving cars. Each of NIO's autonomous vehicles produces up to 24TB of data per day. Just ten cars produce 240TB per day. For planning purposes, NIO is looking to have 120PB of data this year. That volume completely changes how NIO has to approach building out its infrastructure.
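To make those figures concrete, here is a back-of-the-envelope sketch of the arithmetic. It assumes decimal units (1TB = 10^12 bytes) and that every car hits the upper bound of 24TB per day; the function names are my own, purely for illustration.

```python
# Back-of-the-envelope fleet data volumes.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).
TB = 10**12
PB = 10**15

PER_CAR_PER_DAY = 24 * TB     # upper bound quoted in the article
PLANNED_CAPACITY = 120 * PB   # planning figure quoted in the article


def fleet_bytes_per_day(cars: int) -> int:
    """Daily volume if every car produces the upper-bound 24 TB."""
    return cars * PER_CAR_PER_DAY


def car_days_to_fill(capacity: int) -> float:
    """How many car-days of recording fit into the given capacity."""
    return capacity / PER_CAR_PER_DAY


if __name__ == "__main__":
    print(fleet_bytes_per_day(10) / TB)        # 240.0 (TB per day)
    print(car_days_to_fill(PLANNED_CAPACITY))  # 5000.0 (car-days)
```

Under these assumptions, 120PB corresponds to roughly 5,000 car-days of recording, for example about 14 cars recording every day for a year.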

Having a fleet of cars producing this much data confirms in many ways the prediction of Peter Levine, partner at the famous venture capital firm Andreessen Horowitz. In "The End of Cloud Computing," Levine's argument is simple: Centralized data storage and processing doesn't make sense in a world where data must be processed in real time and the datasets are too large for the network to move quickly enough.

NIO's experience bore this out. As Komala told me, “When I started [at NIO], I was coming from Tesla, which, as I mentioned before, has a Level 2 [semi-autonomous] vehicle. Since Tesla uses the cloud, I started by evaluating cloud providers, but I quickly realized that a Level 4 vehicle [fully autonomous vehicle] is a completely different ballgame. One car generates 12 to 24 terabytes of data a day in development mode. Ten cars is a whole 240 terabytes per day and there is no way I can get any of that data into cloud. So the problem statement completely changed.”
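A rough calculation shows why shipping that much data to a cloud provider is impractical. The sketch below assumes decimal units and a perfectly saturated link with no protocol overhead; the 10 Gbit/s uplink is a hypothetical figure I picked for illustration, not one NIO quoted.

```python
# How much sustained bandwidth does a daily data volume need?
# Assumptions: 1 TB = 10**12 bytes, a fully saturated link, zero overhead.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def required_gbit_per_s(tb_per_day: float) -> float:
    """Sustained rate needed to move tb_per_day within 24 hours."""
    bits_per_day = tb_per_day * TB * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


def hours_to_upload(tb: float, link_gbit_per_s: float) -> float:
    """Time to push tb terabytes over a link of the given speed."""
    seconds = tb * TB * 8 / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    # Ten cars at 24 TB each.
    print(round(required_gbit_per_s(240), 1))  # 22.2 (Gbit/s, around the clock)
    # Hypothetical 10 Gbit/s uplink: one day of data takes over two days.
    print(round(hours_to_upload(240, 10), 1))  # 53.3 (hours)
```

Even in this idealized case, ten cars would need more than 22 Gbit/s of sustained upload, and a 10 Gbit/s link would fall further behind every day.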

Instead of focusing on data and applications, NIO focused on infrastructure and systems. This includes running Docker directly on bare metal servers, instead of relying on VMs, to avoid unnecessary performance overhead and keep the environment in its edge locations simple.

A Broad Mix of Fast Data and Big Data Applications Drive Insights, Requiring Flexible Operations Platforms

Wrangling and gaining insights from such a massive amount of data requires specialized tooling. If there was any doubt that we've moved on from an Oracle-based world, NIO's stack settles it. NIO uses a range of tools to stream, store and process the data coming off its fleet. The chief insight here is that for each specialized task, they use a specialized service.

For instance, the data each car produces is almost exclusively high-quality, 1080p images and videos from the car’s 12 cameras. All this data must first be streamed off the car and stored long term for regulatory reasons, as well as analyzed.

Streaming leverages Kafka. Batch workloads take advantage of Spark running on top of Hadoop HDFS. Machine learning models are trained with Spark ML and TensorFlow.

As a result of using so many different services, it is hard for NIO to become a deep expert in operating each individual service the way database administrators used to be experts in ops for a single “pet” database. This is where platforms like Mesosphere DC/OS and Kubernetes come in. They provide a consistent way to deploy, upgrade, back up and migrate any stateful service that runs on Linux. With a consistent platform like this in place, NIO is free to innovate and experiment with many different data services, without worrying about having to hire specialized operations teams to manage them.

Not All Problems Have Been Solved, Pick a Few Partners to Go Deep with, and Innovate

The world has never before had a car that produces 24TB of data per day, so by definition, out-of-the-box solutions don’t exist for operating a platform at this scale. These are completely new problems that require new technologies. That’s exciting but also daunting. A key insight from Komala was that you should focus on a small number of critical areas in your stack and find the right partners to solve those problems and innovate in those spaces.

NIO told me a great story about Big Switch Networks, for example. The challenge that NIO was having was getting overlay networking to work at scale on Kubernetes. IP space management in particular was a huge challenge because of the sheer number of IPs associated with each container cluster. The company wanted a software-defined network that was integrated with the CNCF Container Network Interface (CNI) standard so that it could have a flatter network architecture and automate IP space management and network space management. The problem was that this didn't exist. Having identified the problem, NIO then worked with Big Switch to extend support for CNI, resulting in a stable networking solution today in its platform.
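To give a feel for why IP space adds up so quickly in a container cluster, here is a minimal sketch using Python's standard `ipaddress` module. It is a generic illustration, not NIO's or Big Switch's actual design: the cluster range `10.244.0.0/16`, the /24 block per node and the `plan_node_subnets` helper are all hypothetical.

```python
import ipaddress
from itertools import islice


def plan_node_subnets(cluster_cidr: str, node_prefix: int, nodes: int):
    """Carve one container subnet per node out of a cluster-wide range.

    Hypothetical helper for illustration only.
    """
    cluster = ipaddress.ip_network(cluster_cidr)
    if node_prefix < cluster.prefixlen:
        raise ValueError("node subnets cannot be larger than the cluster range")
    available = 2 ** (node_prefix - cluster.prefixlen)
    if nodes > available:
        raise ValueError(
            f"{cluster_cidr} only holds {available} /{node_prefix} subnets"
        )
    # subnets() yields the blocks in address order; take the first `nodes`.
    return list(islice(cluster.subnets(new_prefix=node_prefix), nodes))


if __name__ == "__main__":
    for subnet in plan_node_subnets("10.244.0.0/16", 24, 3):
        # Each /24 reserves 256 addresses for the containers on one node.
        print(subnet, subnet.num_addresses)
```

With these made-up numbers, a single /16 supports at most 256 nodes, and every node reserves 256 addresses whether or not it uses them. Multiply that across several clusters and it becomes clear why automating the bookkeeping matters.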

What’s Next?

In such a new space, you can’t expect that all problems will be solved right away. Picking the right challenges to take on, with the right partners, is a key to pushing the envelope in an evolving space. In our own corner of the internet, we’re excited to work with customers like NIO to push the boundaries of DC/OS storage and Kubernetes storage forward. There is always something to build and always something to learn.

Portworx sponsored this story.

CNCF and Mesosphere are sponsors of The New Stack.
