Why Wells Fargo Wants to ‘Repave’ Its Platform Every Day
Despite the triumphant return of stonewashed denim and throwback Air Jordans, we really are not living in the 1990s. In no area is this more obvious than in technology, where we’ve gone from a burgeoning internet to a world where an internet connection is required to do almost anything. However, with ubiquity comes opportunity — for businesses, for consumers, and, unfortunately, for criminals.
That’s why another big shift from the era of the first internet boom is in how companies treat their infrastructure. For forward-thinking companies, the immutable nature of things like containers, functions and application instances also presents an opportunity to improve security. Killing and relaunching application components from a known-good state can be more secure (and more efficient) than constantly patching, configuring and generally doing everything possible to keep a server running.
As Wells Fargo modernizes its infrastructure and development practices with Pivotal Cloud Foundry (PCF), it is also stepping up its game around cybersecurity. As one of the world’s largest banks, it can’t take any chances around keeping sensitive data safe from intruders. Many security initiatives focus on application vulnerabilities and on detecting intrusions, but PCF helps ensure safety at the platform level via zero-downtime patching and “repaving” VMs to a known-clean state.
In this presentation from SpringOne Platform 2018, Lance Rochelle, Cloud Foundry product owner and technology manager at Wells Fargo, explains how the company is combating advanced persistent threats, as well as an onslaught of CVEs, by repaving its entire platform multiple times per week, with a goal of doing so every day by the end of 2019.
You can watch the video for all the details, or keep scrolling for highlights from his talk, which have been edited for brevity.
Here are the excerpts from Rochelle’s talk on Wells Fargo’s security makeover:
From ‘Why patch?’ to ‘Let’s do this monthly.’
Twenty years ago, we used uptime as a metric for success. Everybody and their brother had these legacy servers. They were afraid to restart them. They had thousands of days of uptime. You couldn’t really patch them because you were afraid that the disks wouldn’t spin back up or the memory would fail.
[In] 2002, somebody said, “You know, really we should patch, even if it’s just to do a sanity check.” Maybe, maybe not. That’s really the way it worked in a lot of enterprise environments that I worked in. About 2004, we’re like, “This patching thing’s not really too bad. Let’s go with once, twice a year, maybe three times a year.” … [E]ventually we get down to once a quarter, and then around 2011, 2012, some of those wonderful security folks were like, “Let’s get on a regular patching cycle. Let’s do it once a month.”
CVEs Never Stop Coming
There are thousands upon thousands of CVEs that are reported every month. What we don’t want to do is leave those CVEs out on a platform, out in application code. We want to make sure that they’re patched and updated… How many of you guys deal with compliance and licensing, and then they come after you because some library somewhere is no longer under maintenance, or some library somewhere has a vulnerability that’s tied back to a CVE?
My estimation is that there’s going to be 19,500-plus CVEs reported this year. That’s a lot. There were only 14,714 last year. … The only way to keep up with those threats is to automatically update everything constantly.
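To put those two figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above (the speaker's estimate of 19,500 and last year's 14,714); the monthly and daily averages are simple divisions, not reported statistics.

```python
# Figures quoted in the talk: last year's count and the speaker's estimate for this year.
cves_last_year = 14_714
cves_estimate_this_year = 19_500

increase = cves_estimate_this_year - cves_last_year
growth_pct = 100 * increase / cves_last_year

# Simple averages, assuming reports were spread evenly across the year.
per_month = cves_estimate_this_year / 12
per_day = cves_estimate_this_year / 365

print(f"year-over-year increase: {increase} ({growth_pct:.1f}%)")
print(f"average per month: {per_month:.0f}, per day: {per_day:.0f}")
```

At roughly 1,600 new CVEs a month on average, a monthly manual patch cycle would always be working through a backlog, which is the case for automating updates.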
Daily Repaving Requires Automation
What’s stopping you from rebuilding an environment once a day? I can tell you that if you manually patch your environments, you’re never going to get to once a day. You’re going to struggle with once a month and you’re going to really struggle with once a week…
…But I want to break it down for you so that you understand that when you deploy an application and that application has a number of instances, and those number of instances follow … a cloud native practice, that they have the ability to be patched … without causing customer impact. We’ve seen this since we do it once a week right now, and we’re going to be going to once a day by the end of next year. We’ve seen this probably 7 [hundred] or 800 times between all of our [PCF] foundations.
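The idea that a multi-instance application can be rebuilt without customer impact can be illustrated with a minimal Python sketch. This is a toy simulation, not how PCF or BOSH actually implements repaving: the `Instance` class, the `rolling_repave` function, the `batch_size` parameter and the image labels are all hypothetical names made up for illustration. It only shows the principle that if instances are taken out of rotation a few at a time, some are always left serving.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    image: str            # hypothetical label for the base image the instance was built from
    serving: bool = True  # whether the instance is currently taking traffic


def rolling_repave(instances, new_image, batch_size=1):
    """Rebuild instances batch by batch; return the fewest instances serving at any point."""
    if not 1 <= batch_size < len(instances):
        raise ValueError("batch_size must be >= 1 and leave at least one instance serving")
    min_serving = len(instances)
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for inst in batch:
            inst.serving = False      # take the instance out of rotation
        min_serving = min(min_serving, sum(i.serving for i in instances))
        for inst in batch:
            inst.image = new_image    # stand-in for destroying and recreating from a clean image
            inst.serving = True       # put the rebuilt instance back in rotation
    return min_serving


# Four instances of one app, repaved one at a time.
app = [Instance(f"web/{n}", "image-v1") for n in range(4)]
lowest = rolling_repave(app, "image-v2")
print(lowest, {inst.image for inst in app})
```

In this sketch an app with four instances never drops below three serving while every instance ends up on the new image. An app with a single instance has no such headroom, which is why the multi-instance, cloud native practice matters here.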
Convince Your Customers with Results … and Patience
When [internal customers] ask me for a [maintenance] schedule, I don’t give it to them. And then, I explain that we have a pre-approved change record that allows us to do it any night after our critical online window. And then they go, “Well, how are you going to notify us?” And I’m like, “I’m not.”
I will tell you: At first … you spend hours — hundreds of hours — in meetings. But then, after the first five or six times that it happens and other app teams are like, “Dude, we haven’t had an issue yet. How many times have you repaved?” And I’m like, “I don’t know, like 40.”
And then the other app teams start being advocates because they’re like, “We don’t have to be on a bridge line from 7 at night to 2 a.m. when you guys patch…” The first couple of customers, though, you have to spend an enormous amount of time explaining to them. Which is why [this] presentation is usually two and a half hours … I have to start with “What is Cloud Foundry?”
Partnering with Pivotal for Maximum Security
[The way PCF manages repaving] allows us to deploy a piece of code to the environment and know that every time it’s instantiated, it’s based on the same set of code that was originally deployed a week ago, a day ago, a month ago.
Pivotal is a great partner for us but, as Wells Fargo, we don’t trust anybody anyway [so we take some additional steps internally to satisfy our requirements].
We then use a scanning agent, similar to what Pivotal uses, and then we compare notes. Our security vulnerability assessment team and their security vulnerability assessment team compare notes every time there’s an update to make sure that we’re not missing something that they’ve identified, and they’re not missing something that we’ve identified.
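The "compare notes" step can be pictured as a set comparison between two lists of findings. The Python sketch below is only an illustration under that assumption: the talk does not describe the actual tooling or data format, and the `compare_findings` function and the `FINDING-…` identifiers are placeholders invented for the example.

```python
def compare_findings(ours, theirs):
    """Split two scanners' findings into shared items and items only one side reported."""
    ours, theirs = set(ours), set(theirs)
    return {
        "confirmed_by_both": sorted(ours & theirs),
        "only_ours": sorted(ours - theirs),      # the other team may have missed these
        "only_theirs": sorted(theirs - ours),    # our team may have missed these
    }


# Placeholder identifiers, not real vulnerability IDs.
internal_scan = ["FINDING-001", "FINDING-002", "FINDING-004"]
vendor_scan = ["FINDING-002", "FINDING-003", "FINDING-004"]

report = compare_findings(internal_scan, vendor_scan)
for bucket, items in report.items():
    print(bucket, items)
```

The two "only" buckets are the interesting ones: each is a list of items one team should raise with the other after an update.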