How Docker Helped Yelp Leave the Monolith Behind

Yelp is an extraordinarily simple app. You type in what you’re looking for and where, and it returns a list of reviews for different businesses in that location. But this simplicity masks an enormous amount of complexity behind the scenes. The application is composed of millions of lines of code, written in at least two different programming languages, and the company employs over 300 developers tasked with delivering new features every single day.
For a company that prides itself on collaboration and rapid change, this monolithic codebase was a huge burden. That’s why Yelp has been moving away from that monolith to a service-oriented architecture hosted on Amazon Web Services. The problem is that when Yelp started out 11 years ago, Amazon Web Services didn’t even exist, and many of its critical services still run in-house. So the company turned to Docker two years ago as a means of smoothing out the differences between its own conventional data centers and its AWS instances.
Docker is famously easy to get started with. There are countless tutorials, including Docker’s own step-by-step beginner’s guide, as well as books and manuals, such as James Turnbull’s “The Docker Book” in which he writes:
“It’s very easy to run your first Docker container. If you just want to dip your toes in the water to experiment, the barrier to entry is pretty low: all you need to know is how to edit a few configuration files.”
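To get a sense of how low that barrier is, a first container really can be a one-liner. The sketch below is illustrative, assuming only that Docker is already installed on the host:

```
# Pull a tiny image from Docker Hub (on first use) and run a single
# command inside a throwaway container.
docker run --rm busybox echo "hello from a container"

# Confirm nothing from that run is left behind once the command exits.
docker ps -a
```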
Much like Yelp itself, Docker’s simplicity conceals an incredible sophistication lurking just below the surface. “[Docker] has a considerable depth of complexity,” Turnbull says. “There’s a lot of ways you can configure memory [or] CPU. You can have a whole Rails stack and you can have a whole LAMP stack. You can have thousands of containers on separate machines and you can orchestrate them.”
If you really want to use Docker in production, Yelp Director of Operations Sam Eaton says you need a reasonable understanding of Linux, how Docker uses the Linux kernel, and the limitations of the approach. “It’s in a large deployment when you actually start to be concerned about those deep technical details,” he says. “When you want to kind of deploy it at scale, you want to understand what it’s built on.”
That said, Eaton’s 60-person infrastructure team is highly technical and well-versed in a variety of open source technologies, so they didn’t need any particular training to implement Docker.
The Yelp internal application programming interfaces (APIs) are the glue that holds the company’s service-oriented architecture together. Crucially, Eaton’s team was able to keep the same APIs even after Dockerizing its services. “APIs are very important. You have to try to get them right, and you don’t want them to move very much if you want people to collaborate well and want people to talk to each other’s services,” he says.
Docker turned out to be a great fit for the company’s needs, Eaton says. “Without requiring collaboration, more could be done by the developers to give them more independence,” he explains. “The time it takes to deploy a single Docker container is a matter of seconds.”
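As a rough illustration of that speed (the image and service names here are invented, not Yelp’s), starting a pre-built service image is a single command:

```
# Build the service image once from its Dockerfile ...
docker build -t example/review-service:1.2.3 .

# ... then starting a container from it takes only seconds, since
# Docker just adds a thin writable layer on top of the cached image.
docker run -d --name review-service -p 8080:8080 example/review-service:1.2.3
```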
The team also tried using Netflix’s Asgard to manage this architecture, which helped with auto-scaling groups and service isolation. But it didn’t quite meet Yelp’s needs, because of the huge duplication of effort required to run a service on both the local infrastructure and the Amazon cloud. Eaton says Dockerizing the company’s services frees up developers to work at their own pace and isolates them from one another. Since everything is so much smaller, he says, deployment is much faster, and it opens up the possibility of combining existing services to create new functionality.
Yelp is also now using Docker heavily in its testing environment, “to isolate our tests — little copies of parts of our infrastructure.”
In general, Yelp runs about seven million tests a day, with Docker housing about a million of them and counting. “We’re rapidly migrating a majority of our system toward Docker, at the moment, because it’s a massive pain to have to migrate one system to another.”
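A minimal sketch of that kind of test isolation, assuming a Python test suite under a hypothetical tests/unit directory, would look something like this:

```
# Run one slice of the suite in its own disposable container, so test
# runs on the same host can't interfere with one another.
docker run --rm \
  -v "$PWD":/src \
  -w /src \
  python:2.7 \
  python -m unittest discover -s tests/unit
```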
Docker and the Security Question
Turnbull reminds us that Docker isn’t a security control. “The Docker container is a fairly loose abstraction,” he says. “So it’s not a hard shell around your processes, so it’s not an additional security level. Docker is designed to make running workloads easier, it’s not designed to be like a sandbox environment.”
The big difference is that a virtual machine (VM) carries a full operating system installation, with the associated overhead of virtualized device drivers, memory management and other resources. Containers, by contrast, share the kernel and resources of their host operating system. Containers are, therefore, smaller and faster to start than virtual machines, but don’t provide as much isolation.
Eaton sees the security model trade-off as a real concern for certain companies, but not an issue for Yelp, which is using Docker only internally with its own people and APIs. He says containers are designed for situations in which you need to isolate applications based on dependencies and memory, but you aren’t trying to isolate users from each other.
If people want the convenience and scalability of Docker and the security of traditional virtual machines in a multitenancy environment, Eaton offers a kind of Russian nesting doll solution in which you run Docker containers on virtual machines. “They might run a virtual machine per tenant, but obviously the more layers of indirection like this, the more complex your environment becomes to manage,” he says.
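In practice that nesting looks like ordinary Docker usage, just scoped to a per-tenant virtual machine; the host and image names below are hypothetical:

```
# The VM boundary provides the hard multitenant isolation; inside each
# tenant's VM, Docker is used exactly as it would be on bare metal.
ssh tenant-a-vm 'docker run -d --name tenant-a-app example/tenant-a-app:latest'
ssh tenant-b-vm 'docker run -d --name tenant-b-app example/tenant-b-app:latest'
```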
Jared Rosoff, director of product management and architecture for VMware’s Cloud Native Apps, warns that when you are talking about security regulations and compliance, even the combined one-two punch of containers running on top of virtual machines doesn’t give enough of a view into what’s going on.
Docker has made efforts to build a stronger security posture. It announced in the spring of 2015 that it had worked with the Center for Internet Security (CIS) to produce a benchmark document that detailed recommendations for securing Docker deployments. Chris Swan notes in a blog post on InfoQ that “the study is detailed enough to serve as a resource to profile a specific Docker environment and determine practical steps that can be taken to improve its security. Where multiple choices exist due to differences in the underlying host OS, numerous references are given to external guidance documents authored by members of the Docker core team and others.”
At DockerCon in June, Docker introduced developers to Notary, its open source system for certifying the validity of the sources of Docker images pushed to public repositories and encrypting the contents of those images. In August, the company released its own branded implementation of Notary, called Docker Content Trust.
Docker’s aim is to deploy a signing and encryption system that’s reliable on both the sending and receiving ends of the network, so that the security of the network in between is no longer an issue.
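With Docker Content Trust, that amounts to flipping one environment variable; the image name below is illustrative rather than a real repository:

```
# Enable content trust for this shell session (Docker 1.8 and later).
export DOCKER_CONTENT_TRUST=1

# Pushes now sign the tag, and pulls verify the publisher's signature,
# refusing images whose content doesn't match what was signed.
docker push example/review-service:1.2.3
docker pull example/review-service:1.2.3
```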
Rosoff says that though virtual machines can feel a little heavy to developers, they tend to be more convenient for ops teams, who can maintain the user experience for developers while keeping the VM’s deeper insight into infrastructure and its security manageability.
“As a developer, I start off really happy being able to run containers like that, but the picture starts to get really complicated,” Rosoff said. “If I run virtualized workloads, I get all my stuff from IT for free: monitoring dashboards, capacity planning and things like that.” Now the developers — who normally relied on the operations team to create, manage and monitor dashboards and capacity plans for all those Linux hosts — have to create their own management systems.
“The problem comes when your security audit team comes in and [asks] ‘Is this HIPAA compliant or PCI compliant?’ and operations can’t find any visibility inside those workflows,” he says.
To overcome these problems, Rosoff says we’ll need a new ecosystem of tools that close the visibility gap.
But Eaton says that the next generation of developers and operations teams will have to accept the DevOps culture of giving up control over their own environments and increasing collaboration. Containers will play a huge role in that transformation.
“[Docker] provides the developers with more of the ability to do more of the management of the systems themselves,” he says. “It makes it easier for them to manage and be responsible for all their own services, without having to ask for operational help.”