Pinterest Uses Varnish VCL to Manage 50 Billion Pins a Month
“Oh for Vulcan’s sake, you mean I have to learn ANOTHER proprietary language?” I hear you ask. The Varnish engineers gave this question deep consideration and concluded that yes, yes, we need this language. And engineers from Pinterest and Tesla agreed.
Varnish is an open source HTTP accelerator for large web-based sites. “Put in front of application server, it is super simple, therefore it is also 200 to 1000 times faster. So every time you move data from caching layer to application server, Varnish will supply the data in 30-40 microseconds as opposed to typical cache which is 10 to 20 milliseconds,” said Per Buer, founder and chief technology officer of Varnish Software, speaking at the Varnish Summit in San Francisco earlier this year.
Later that day, Reza Naghibi, senior software engineer at Varnish Software, explained why they decided to create a Domain Specific Language (DSL), and why you should care. “We’re giving you all the power to define your own rules. VCL gives you total control,” he said.
Designed to be intuitive, the Varnish Configuration Language (VCL) is based on C, so it feels familiar to most developers, Naghibi explained. The syntax is kept very basic and straightforward. He called VCL fail-safe, with a very strict rule set: it does not cache anything unless all of the rules for caching have been met.
In addition, VCL is not interpreted: it is translated to C and compiled, and it has conditionals but no loops. The functionality is grouped in subroutines that take no arguments and return no values. Data is exchanged only through HTTP headers, and VCL has the power to manipulate those headers, including the power to override TTLs, strip cookies and rewrite URLs. A key differentiator, Naghibi said, is the ability to update VCL at runtime. You can change VCLs and have the change take effect without a server restart.
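To make the three operations Naghibi lists concrete, here is a minimal VCL 4.0 sketch. It is not taken from the talk: the backend address, the /old/ and /new/ URL prefixes, the one-hour TTL and the file path in the comments are all assumptions chosen for illustration, and it presumes a site that serves only anonymous, cacheable content.

```vcl
vcl 4.0;

# Hypothetical application server sitting behind Varnish.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Rewrite URLs: map a legacy /old/ prefix onto /new/.
    # regsub() returns the URL unchanged when the pattern does not match.
    set req.url = regsub(req.url, "^/old/", "/new/");

    # Strip cookies from the request so it stays eligible for caching.
    # Only safe under the assumption that no content is personalized.
    unset req.http.Cookie;
}

sub vcl_backend_response {
    # Drop cookies set by the backend and override its TTL with one hour.
    unset beresp.http.Set-Cookie;
    set beresp.ttl = 1h;
}

# Runtime update without a restart, using the management CLI
# (the configuration name and file path are hypothetical):
#   varnishadm vcl.load sketch_v2 /etc/varnish/sketch.vcl
#   varnishadm vcl.use sketch_v2
```

Because neither subroutine returns explicitly, Varnish falls through to its built-in logic afterwards, which is where the strict default caching rules are applied.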
If you “can’t” do it in VCL, there are workarounds to enable the functionality, he said, but the language itself is kept very simple. Multiple VCLs can be run concurrently and the switch between them is instantaneous and seamless.
VCL does more than cache, said Naghibi.
One company that has already made extensive use of VCL is Pinterest. Jenifer Zinner, Pinterest traffic and site reliability engineer, spoke at the Summit about the specifics of how Pinterest created a Content Delivery Network (CDN) that leverages Varnish functionality, especially VCL, and had many useful tips on upgrading to the latest Varnish version. A video of her full talk is available online.
Pinterest is a visual bookmarking tool that allows you to bookmark or “Pin” a link, said Zinner. When Pins are saved, they are collected into Boards.
Don’t be fooled by the homespun nature of the site; Pinterest is working at massive scale, with 100 million active users and 50+ billion pins across 1 billion boards every month. The service processes 180,000 requests per second through the Varnish CDN, mostly non-cache hits, representing over 10 million unique user actions per minute.
Pinterest runs Varnish on Amazon Web Services (AWS) and is getting great results, Zinner said. The company started using AWS in 2010, and since then AWS “has evolved under us,” she said, so it is now moving to Amazon’s Virtual Private Cloud (VPC).
The company’s primary concern when designing in the cloud is resiliency. The company relies on the auto scaling and configuration management functionality provided by Varnish and controlled by VCL. Auto scaling automatically adds or removes instances from a managed group based on load.
The Pinterest cloud actively scales its primary application tier based on user traffic, with over 60 auto-scaling events across the application tier per day, all handled automatically through VCL programming. As a result, Zinner said, the ops team creates 60 unique VCL configurations every day.
Auto-scaling generates about 12,000 server turnovers per month, she explained, and the health probes for backend servers alone may soon exceed the kernel’s total TCP socket allocation. Pinterest has found VCL’s load balancing, its role as an integration point, and its ability to work around unstable backends to be invaluable.
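As a rough illustration of why each auto-scaling event can produce a new VCL configuration, and why probes consume sockets, consider the minimal VCL 4.0 sketch below. It is not Pinterest’s configuration: the backend names, IP addresses, /health endpoint and probe timings are hypothetical. Each backend is declared statically with its own health probe, so adding or removing an instance means generating and loading a new VCL, and every declared backend is polled on its own connection.

```vcl
vcl 4.0;

import directors;

# Hypothetical health check shared by all application backends.
probe healthcheck {
    .url = "/health";
    .interval = 5s;   # poll each backend every five seconds
    .timeout = 1s;
    .window = 5;      # look at the last five probes...
    .threshold = 3;   # ...and require three successes to count as healthy
}

# Hypothetical application instances; an auto-scaling event would
# change this list and therefore require a regenerated VCL.
backend app1 {
    .host = "10.0.0.11";
    .port = "8080";
    .probe = healthcheck;
}

backend app2 {
    .host = "10.0.0.12";
    .port = "8080";
    .probe = healthcheck;
}

sub vcl_init {
    # Round-robin load balancing across the declared backends.
    new app_pool = directors.round_robin();
    app_pool.add_backend(app1);
    app_pool.add_backend(app2);
}

sub vcl_recv {
    # Route the request to the next healthy backend in the pool.
    set req.backend_hint = app_pool.backend();
}
```

A backend that fails its probes is skipped by the director, which is one way VCL can shield users from an unstable instance while it is being replaced.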
With these numbers, reliability is critical. “Monthly uptime has been amazing,” said Zinner.
The service also uses VCL’s ability to work with HTTP headers, and the team has found the TCP kernel tuning to be “awesome,” she said.
In an earlier talk at the summit, Tesla web platform architect Rajasekar Jegannathan explained how he created Tesla’s private CDN. He enthused about VCL’s capabilities and how the language gives Tesla the control necessary to manage its global network. “I love VCL,” he said.
Do you need to learn yet another DSL? You might want to if it can do all this.