Among the large cloud vendors, Google is perhaps best known for its deep engineering bench. It is, after all, the company that brought the open source Kubernetes container orchestration system to the world. But to bring the enterprise all the benefits that Kubernetes offers — agility, economy, resilience — it would need an additional set of expertise.
Enter Will Grannis, a veteran of enterprise technology from his time at Boeing and other companies. For Google, he created and now manages the Office of the Chief Technology Officer, a business unit for a company that has long prided itself on not having CTOs. It is the role of the OCTO to work with the company’s largest enterprise customers, using Google Cloud services such as Anthos, a managed Kubernetes service.
Recently, we spoke with Grannis about how the OCTO works with enterprise customers, the challenges of multicloud and what Anthos offers over regular Kubernetes distributions.
Why did you start the Office of the CTO?
About five years ago, we saw an increasing amount of demand coming from our enterprise customers, saying the bridge between them and Google wasn’t as defined and structurally sound as they’d like it to be.
Let’s say that I’m a financial services company and I’ve been dabbling with Kubernetes and building my own platform internally, but I’m quickly discovering that I don’t want to necessarily operate and maintain that platform myself at some scale. It was great to experiment, but now I’m going to go into production. I’m going to do a little bit more. Who at Google do I engage with? So without an obvious entry point, a customer would have to assemble an answer from six or seven different product areas and synthesize that answer themselves.
Google doesn’t have chief technology officers. So in a strictly functional organization of engineering, product, sales and marketing, we also had to define for ourselves what we wanted this job to look like. I came from the enterprise, not pure tech — 20 years in aerospace, manufacturing and industrial. I was always like, “If I had this team at Google, I could more quickly work from the requirements through the tech and back to a solution.” But one didn’t exist. So I decided to build it.
So how did you populate the office?
First, I had to set the foundation for what the team would look like. I based it on what I think is going to be the new pattern in enterprise integration; the cloud is driving this a lot. It’s multidomain: mobile, storage, network, compute. It’s AI, analytics. We’d have to be able to cover a very broad set of technical areas.
We would need people that had real experience in the enterprise: not people that had built products and sent them over the transom, but people who have been trying to figure out “How do I assemble 14 different vendors to solve this problem for me?” They had to be facilitators more than anything. I don’t believe in the concept of the Genius CTO who answers all the questions — especially in this multidisciplinary, fast-moving world.
I believe it’s a peer group. So the first set of people that we brought in had a combination of enterprise and technical experience. We hired a CTO from JP Morgan. We hired the head of software engineering for GE Healthcare. So we brought in people that had a technical foundation but also had solved [problems] of the enterprise, using the cloud. We put them into this peer group. Then we went and tried to engage with customers and solve problems in a collaborative way. We were really learning.
In the enterprise, the CTO is a synthesizer a lot of times. My relationship was usually with the lines of business and they would say, “We want to build a new retail bank” or “We want to build a new experience for our customers. How do we do that?” I would have to take that need, think about it across this broad set of technology services and then go try something. Pilot it out.
At Google, we didn’t have a function within engineering that would gather these enterprise requirements, synthesize them, share them with the product areas and give them early warning signals for the emerging patterns.
We’ve been working on Kubernetes for six, seven years, but we’ve just recently, over the last couple of years, put Anthos into the world. As Kubernetes rises throughout the community, developers and operators can speak a common language even if they’re using different cloud providers to begin with. But it takes years for that to happen. Our team was involved in the very early going, taking GKE [Google Kubernetes Engine] and providing the rationale for why you should do it long term. Which is the core of Anthos. That was three and a half years ago.
What did you find when you started interacting with customers?
One is that as much as we knew about technology and as much as we thought we knew about how to do our job, we had to learn a lot. The first thing we had to learn was: what is a process we can take a customer through that matches the design thinking methodology we’ve used at Google? Google Ventures has published a book called “Sprint,” which is how we do product development and design at Google. We had to take that and adapt it to the enterprise.
What do I mean by that? I mean that if a customer says, “Hey, we’d really like to talk about a really strategic thing that we want to go do. It’s complex, it’s multi-service. We don’t really know how to put it in terms of individual product areas you understand,” we might get engaged. The first question we’ll ask is, “What problem are you actually trying to solve?” Which sounds very obvious. But taking a step back and asking what problem they’re trying to solve really starts to reveal what the users want. What do their customers want? And that helps us think through the trade-offs between multiple types of technology.
What is one of the most frustrating current conditions if you’re a CIO or a technology leader in a company? It’s when you go through application rationalization and you realize you’ve got all these applications tied to mainframes or they’re tied to this infrastructure. Trying to move them is going to be really hard.
This is actually what drove containers at Google. Because of our approach to commodity infrastructure, we want to be able to move an application and change the underlying servers underneath it with live migration and no downtime. We had done this for a decade-plus. Then we started to see customers ask, “Is there a way that we can write programs once and then have them sit on top of infrastructure, even if it was, you know, from different clouds or our own private cloud?” There’s a pattern between what we did with containers and what customers are asking for, and containers are a good solution for it. That’s why Anthos, at its core, started with containers.
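The “write once, sit on top of any infrastructure” idea Grannis describes is what a Kubernetes Deployment manifest captures. As a minimal sketch (the application name, image and replica count below are placeholders, not anything from the interview), the same manifest can be applied unchanged to GKE, an Anthos cluster on-prem or any other conformant Kubernetes cluster:

```yaml
# A minimal Deployment: the application is described once, and any
# conformant Kubernetes cluster (GKE, Anthos on-prem, another cloud's
# managed Kubernetes) can schedule it onto its own servers.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing-api          # hypothetical application name
spec:
  replicas: 3                # the platform keeps three copies running
  selector:
    matchLabels:
      app: billing-api
  template:
    metadata:
      labels:
        app: billing-api
    spec:
      containers:
      - name: billing-api
        image: example.com/billing-api:1.4.2   # placeholder image
        ports:
        - containerPort: 8080
```

Because the manifest names no specific machines, the cluster, not the application owner, decides which servers run the workload, which is what allows the infrastructure underneath to change without the description of the application changing.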
AWS and Microsoft saw that same pattern. What separates your approach from theirs?
We use a software-based approach. What that means is, we believe in the stack. If a customer has hardware and they want to put Anthos on that hardware, fine. There’s a way to get it onto that hardware. If they want to utilize Cisco or HP or whoever to provide the appliance and the hardware, they can do that as well. So they get the choice. They get to maintain some of these vendor relationships. It’s more software-based than hardware-based, less of an appliance and more of a software solution.
The second thing is that we started with containers. Some of our competitors, for example, start with their on-prem appliance, and it’s just to move virtual machines [VMs] from one host to another. We don’t think that approach buys very much. We think all you’ve done is move a VM from one location to another. That’s why starting with containers was our path. I would also point out that we are extending it toward what we think is the future. Google’s engineering approach has always been to pull apart black boxes, characterize and instrument them and then drive high performance. We need to, or else we’d go out of business.
Istio [the service mesh] is a good example of how we believe that the move to containers and the move to more modern programming paradigms is also a move toward more loosely coupled services and microservices. With Anthos you get containers at the core, but you also get Istio, which is service management. Let’s allow programmers to join in and develop pieces and understand how their parts connect to the whole.
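The service management Grannis refers to operates on exactly these loosely coupled pieces. As a sketch (the service and subset names are hypothetical), an Istio VirtualService can shift traffic between two versions of a microservice without any application code changing, which is what lets separate teams develop their pieces and still control how the parts connect:

```yaml
# Hypothetical Istio VirtualService: route 90% of traffic to v1 of a
# service and 10% to a newer v2, so one team can roll out its piece
# gradually while Istio manages how it connects to the whole.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: checkout             # hypothetical service name
spec:
  hosts:
  - checkout
  http:
  - route:
    - destination:
        host: checkout
        subset: v1           # subsets are defined in a DestinationRule
      weight: 90
    - destination:
        host: checkout
        subset: v2
      weight: 10
```

The routing lives in the mesh’s configuration rather than in any one service, so connection behavior can be observed and adjusted centrally.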
What sort of jobs is the Google Cloud taking on for customers?
All forms. I would say four or five years ago there was a lot of this technical compute bursting. A media company wanted to render, and it pumped the excess beyond what it could handle on-prem out to the cloud. Similarly, maybe a financial services organization was using cloud as an augmentation to core systems. That’s not what we see anymore.
What we see now is core workloads moving to the cloud. This is all industries — retail, financial services, media. Roughly 80-85% of the companies we deal with already have a strategy in place for multicloud.
Now those patterns can be somewhat different. Sometimes it’s abstracting over storage and compute. Sometimes it’s “I want to have special services or managed services from different cloud providers.” Sometimes they’re different applications, but at the firm level, it’s still multicloud. Most of them are starting in hybrid. We see that most patterns of multicloud are going to start with learning how to implement Kubernetes, how to implement this kind of new stack, and then gaining the confidence to push out to the public cloud with a single control model.
I know all the clouds support Kubernetes, but each cloud has a “special sauce” that makes it difficult to move a workload from one cloud to another. What approach do you recommend for companies that want to be in multiple clouds while still enjoying each cloud’s special services?
Step one is don’t start from the technology and try to force-fit that into your model. The ones that are having the most success define an objective, and it’s usually a multiyear objective. They have these business principles and then they work backward into the technology.
At what point should an organization give up on trying to manage Kubernetes in-house, and opt for a service such as Anthos?
Say a large financial services company is experimenting with Kubernetes to build out its own platform services, and the developers love it and the operators love it, and then they calculate the costs. [They will think,] “So if we continue to do this, what will it take us to design, manage and operate ourselves? Who is on the hook for all of that? We are. So does that imply now that we’re in the business of maintaining infrastructure?” Not a great place to be, especially in a hyper-competitive environment where people are looking at your stock price and they’re looking at your costs.
So is managing infrastructure really a core business for them? If it is, let’s have that conversation. If it isn’t, there are more questions coming from CFOs. There are more questions coming from CEOs and the board along the lines of, “We’re really glad we figured out these patterns, but is there a viable solution to manage this for us? With strict policy governance, keys we control, all the different features: security, policy management at scale, telemetry so we can characterize it and hold people accountable for performance and reliability. If all that’s available, why would we still choose to run this ourselves?”
Fast forward two and a half years. We have Anthos. Now that those choices are available, the justification for building it yourself is getting much more difficult.
I would say most of the use cases we see today for managing your own infrastructure fall into a couple of buckets. One is you’re in a super highly regulated market, and so you’re operating in a place where regulatory and business strategy haven’t yet reconciled. So you’re not going to go take a big risk without some assurance that you can do so in the right way in that regulatory environment.
The second that we’re seeing is, let’s say you’re a defense contractor, for example, and you have to segregate your programs based on the customer segment that you’re in. That can require a high degree of air gapping or a disconnected mode. That’s one of those use cases that will be taken care of over time, but right now it’s still hard [as a service].
Does the OCTO have any feedback mechanism back to the engineers?
We live in engineering at Google; we roll up as a cost of R&D. We’re not in the sales team. So as a result, we spend about 60 to 70% of our time with customers, but then we spend 30% of our time with the engineering teams, synthesizing what we’re hearing and sharing it back so that they can have this early warning system of shifts. You can imagine that a company like ours wants horizontal insights, right? We want to know what’s happening in financial services, retail, media, healthcare, public sector, so that we can build infrastructure and technology that’s as applicable as possible to all of them.
Because we’re not in any individual product area and because we aren’t limited to any particular GA or vertical, we provide that nice horizontal view. Cloud has engineering, sales, marketing and product, and OCTO is the only function that cuts horizontally across all of them. And it’s the same way with our customers. So we’re constantly looking for patterns. We actually have programmatic ways of sharing those emerging themes internally.