Alex Williams: We’ve talked about what you’ve done in the past, and we got into Cloud Foundry a little bit. The one question I have been thinking about is: what’s your mandate in your new role with the Cloud Foundry Foundation?
Sam Ramji: The mandate is to generate a global ecosystem of apps and app developers that are writing to Cloud Foundry.
To give that a sense of scope: when I left Microsoft in 2009 it was a $65 billion revenue company with $24 billion in profits. The important thing is: that was all stacked underneath an ecosystem that was generating about $450 billion — about an eight-to-one ratio between revenue realized by the ecosystem and that recognized by Microsoft. That’s really good platform economics. Even more importantly, that’s not just all custom software. That’s not people just writing things on the fly. That was powered by 70,000 independent software vendors and 1.2 million partners. So, when I think about the shape of a platform ecosystem, it’s an inverted pyramid.
The mandate for Cloud Foundry is to be the platform at the bottom of an inverted pyramid where we can enable thousands of ISVs, along with millions of developers, to build standardized software that can simply deploy and install into a Cloud Foundry fabric.
Alex: That’s a lot different than in those days you’re talking about, where it was a pyramid …
Sam: Right, you had your component suppliers that would roll up to your subsystem suppliers that would roll up to your assembled parts suppliers, and then you turn that into a product, and maybe you deliver that finished product out to your users. This is the opposite of that — it’s a platform that sits underneath it all and enables, and ideally organizes, all that activity.
So, our real mandate is to make the market for enterprise software development and apps more efficient. We think that the cleanest way to do that is to look directly at the developers, the development experience, and deployment into the cloud. If we’re executing well in Cloud Foundry, then developers who are building custom solutions — whether they work for agencies or systems integrators, or directly for enterprises, or even startups — are finding it easier to write code and have it just run, reducing their operational burden.
You could build packaged software that is solely dependent on Cloud Foundry, so that you could say, “I want SugarCRM,” and, just like you can click a tile in Android or in iOS and then the app is installed, you could have that kind of experience for SugarCRM, or Amdocs, or Documentum, as a data center-scale application. Just click the tile, it goes and finds its dependencies through Cloud Foundry and Cloud Foundry infrastructure, and it runs. It’s almost unimaginable now — it sounds like science fiction, but that’s what a really good app platform would do for cloud computing.
Alex: We’re seeing examples of that with the capabilities in some of these new platform-as-a-service environments: Cloud Foundry’s use inside ActiveState, for instance, and in container orchestration.
Sam: The key to a successful platform is almost a chicken and egg problem. Everybody talks about “the platform bootstrapping problem,” which is, it needs to be ubiquitous. To achieve ubiquity, there has to be some value to everybody in having it be everywhere.
The value that you need to provide as a platform is portability. You have to believe — belief is important, and it has to actually work — that you can take your application from one Cloud Foundry environment to another, to another, to another. Whether it’s your scripts, or your code, or whatever it is that you think of as making the application — and whether those environments are public and hosted, or are your own on-premises — that portability’s got to work. That’s why you would deploy this everywhere, so that it becomes ubiquitous.
The flipside of portability is interoperability. Below the pyramid, it’s got to interoperate with the leading, and even marginal, IaaS technologies. Just like PaaS is not a great cognitive frame, IaaS is a little loose. It’ll change; right now IaaS is more of a bumper sticker than the actual car. So, you need cluster management, orchestration, and integration with virtual machines and virtualization engines.
Sometimes you need something closer to bare metal — look at IBM SoftLayer, for example. There is a real world — with real metal, real electrons, real electricity and real networks plugging-in — that has to be made available to an application platform like Cloud Foundry. We must be able to be very widely adapted.
The success of Windows in the ’90s was that regardless of the set of PC hardware components — and there were millions of components and hundreds of millions of combinations — they all just worked. The drivers had been built, there were Device Driver Kits, there was a hardware qualification lab, and you knew that these things could be certified. Regardless of how many mainstream or marginal things you plugged-in, it would all talk to an operating system that would run the applications you cared about. That portability that you get at the top layer is the purpose of an application platform.
Certainly, we have to differentiate through elasticity — it’s called Cloud Foundry because it provides cloud scale, but that’s not valuable if you can’t move your apps around. So in order to enable that, you have to have interoperability below so that you can adapt this thing into many different data centers, whether it’s AWS, or Bluemix, or Equinix …
Alex: … Which really speaks to this major theme in 2015 of developing and managing applications at scale across multiple data centers and cloud services. Doesn’t Cloud Foundry have to be installed in different places in order to create that elasticity?
Sam: You can create elasticity in a single environment or at a single rack in a data center — that’s the atomic level of elasticity, and that’s one of the first places that was built and was proven. If you can’t do it there, you shouldn’t even talk about doing it at a bigger scale.
Alex: Cloud Foundry faces a challenge similar to that of other applications that have been architected over the past several years. There are these databases and service technologies where you can replicate across multiple data centers simultaneously…
Sam: … Sure, you’ve got Cassandra, you’ve got Etcd …
Alex: … Cassandra, MapR, lots of open source projects around that. Does Cloud Foundry need to work on finding new ways to broaden and tighten the weave of that elasticity?
Sam: Fortunately, we already have a lot of things that we can depend on. We can take advantage of OpenStack wherever it’s deployed. Even if it’s not deployed, we can say, “Put in OpenStack and put Cloud Foundry on top of that.” That takes care of the virtualization of a lot of data center resources, and then reifies that virtualization through APIs into Cloud Foundry, and makes the data center available for the Elastic Runtime.
A lot of other people are trying to solve similar problems. With Kubernetes, Google has done an elegant job of building not only container management but also “container containers,” which they call “pods.” Sometimes containers need to have affinity: a particular combination of a web server that’s doing read-mostly access, and then maybe an uploader or a similar system that’s sharding information to a database — the reads, the writes, and storage, where the reads and the writes are two quite different applications. Those three things should have some affinity. That’s one of the things that Kubernetes handles.
Then you’ve got the CoreOS folks, and they’ve got Etcd as part of that environment. Etcd is working very well, so we have taken advantage of Etcd in the next generation of Cloud Foundry, called Diego. It’s a place that wherever you have a statement of truth — which could be configuration, or it could be a piece of monitoring data — we know that that’s replicated at scale as a consistent data fabric, on which orchestration, cluster management, routing, and monitoring, can then depend.
Alex: There are other orchestration environments, other PaaS environments, and other container environments for people to choose from — there’s OpenShift, there’s Mesos — what is the interoperability that the Cloud Foundry Foundation believes it is responsible for?
Sam: The Hard Rock Cafe put it best: “Love all, serve all.”
There are materially different capabilities of different schedulers. Mesos is very good for big data workloads — no surprise, because that set of pressures was the context in which that technology was created. A big data workload tends to be homogeneous — you dispatch a set of tasks to a cluster, you expect about the same number of tasks to be done on every machine, it takes a consistent unit of time, you pull it back, and you’re done.
Diego is more for app workloads, which are fairly spiky, with heterogeneous demands across a cluster. Each app is hopefully going to participate well with one another, so they’ll step in and hand back work-time that they don’t need, so that another app can pick it up.
Where you fit the different cluster managers — Diego, Mesos, Kubernetes — is unclear. It does seem that you need to have pluggable algorithms. Maybe we end up plugging-in different schedulers to do different kinds of workloads. For now, Cloud Foundry needs to not try to be all things to all people. It needs to be highly interoperable, but it needs to stay firmly focused on being the best app platform that anybody has ever seen for the cloud.
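(Editor's illustration, not part of the conversation: the idea of "pluggable algorithms" can be sketched as a table of interchangeable placement functions. Everything below is hypothetical, including the names round_robin, least_loaded and SCHEDULERS. It is a toy under stated assumptions and does not reflect how Diego, Mesos, or Kubernetes actually schedule work. The batch-style strategy deals tasks out evenly, while the app-style strategy places each task on whichever machine is currently least loaded.)

```python
from typing import Callable, Dict, List

# Hypothetical type: a scheduler maps task sizes onto named machines.
Scheduler = Callable[[List[int], List[str]], Dict[str, List[int]]]


def round_robin(tasks: List[int], machines: List[str]) -> Dict[str, List[int]]:
    """Batch-style placement: deal tasks out evenly, ignoring their size."""
    placement: Dict[str, List[int]] = {m: [] for m in machines}
    for i, task in enumerate(tasks):
        placement[machines[i % len(machines)]].append(task)
    return placement


def least_loaded(tasks: List[int], machines: List[str]) -> Dict[str, List[int]]:
    """App-style placement: send each task to the machine with the least load so far."""
    placement: Dict[str, List[int]] = {m: [] for m in machines}
    for task in tasks:
        target = min(machines, key=lambda m: sum(placement[m]))
        placement[target].append(task)
    return placement


# The "plug-in" point: pick a strategy by workload kind (assumed labels).
SCHEDULERS: Dict[str, Scheduler] = {"batch": round_robin, "apps": least_loaded}


def schedule(kind: str, tasks: List[int], machines: List[str]) -> Dict[str, List[int]]:
    """Dispatch to whichever scheduler is registered for this workload kind."""
    return SCHEDULERS[kind](tasks, machines)


if __name__ == "__main__":
    spiky = [9, 1, 1, 1]  # one heavy task and three light ones
    print(schedule("batch", spiky, ["a", "b"]))
    print(schedule("apps", spiky, ["a", "b"]))
```

With a spiky task list, the even deal stacks a light task on top of the heavy one, while the load-aware strategy routes the light tasks elsewhere. That is one way to picture why different workloads might want different schedulers.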
On the container side — and this is what many of us tried to teach Microsoft — you have to start by respecting the investments that people have made in their existing technology. So, if you’ve done a bunch of Docker — and frankly, who hasn’t done a bunch of Docker at this point, even if it’s in a lab mode? — you’ve got a bunch of Docker images. You’ve said, “Here’s what I want in my golden containers.”
Diego can already load Docker images into Cloud Foundry. There’s a technical nit, which is that it loads them into a warden container. We go back and forth a bit between calling it “warden” and “garden” — technically, under Diego, we should call it “garden.” But, we can standardize on an image format that respects all the effort that people have put into Docker. And anybody else who’s trying to build a cluster and container management system ought to figure out how to interoperate with the Docker format.
There will probably be other formats on the horizon as well. The CoreOS folks have put out a call for an app container spec related to Rocket, which is their container. We shouldn’t try to make any big bets that constrain people’s choice in the future. We have to do a core set of things better than anybody else: Diego and Lattice and the Cloud Foundry core have to be able to manage application workloads that allow developers to just drop code in and have it run. You should be able to cf push code, and we should make it all work.
If enterprise IT wants to bless a particular image with a particular image format, or they’re going to use a particular style of virtualization, or they’re going to use something like SoftLayer, our technology’s got to be flexible enough that it just says “yes” to all of that while focusing on being the best possible app platform.
Alex: Who’s going to do all this work? Who’s signed up and how’s it all going to get done?
Sam: That’s the awesome thing. The core of Cloud Foundry is co-opetition. Already we have forty members — we’ve got some huge technology companies, we’ve got some large and medium-sized services companies, and we’ve got customers and users of the technology. That’s who’s going to do the work. Primarily, the load will be on all those companies that are making their business in Cloud Foundry implementations — IBM, HP, ActiveState, Pivotal, EMC, SAP — those are the big companies that push it in specific directions.
Historically, Cloud Foundry has been contributed to very heavily by Pivotal, the company that spun the project out. I think of everything in terms of what pressures of the environment produced a particular species — why does the beak of the finch look this way? The Pivotal business has been to build infrastructures to support business solutions for particular companies. A corporation that wants to build an amazing, next-generation set of user experiences very rapidly might ask Pivotal to build that.
It’s a very different set of contextual pressures from what IBM needs to offer in Bluemix. IBM’s customers are going to say, “How does this work with my other IBM investments — my WebSphere, my information management technologies?” It’s very different from what EMC is looking at: “How does this integrate with my storage and my security infrastructure?” It’s very different from what SAP’s customers are looking for, which is, “How can this do a great job of exposing my SAP services within HANA? Can I get access to my back-end systems, to my SAP fortress that I’ve got all this investment in?”
If we do things right — if we ring the bell with the Cloud Foundry Foundation — then we end up making the whole more than the sum of the parts, helping each of our member organizations develop toward common goals, and also keeping the architecture an “architecture for participation,” to use Tim O’Reilly’s expression. That allows the infrastructure to graduate, so that it can support all of these different cases without becoming overburdened or stalled, as you might think could happen with that many players in the game.
To that end, we’re blessed by a ton of goodwill. We seem to have a big tailwind of goodwill pushing us forward, from a range of developers in the community as well as the core corporate sponsors. We also have an amazing head of technology, Chip Childers, who is part of the Apache Software Foundation. He’s done a lot of work in CloudStack. He’s seen how these kinds of communities can get accelerated, and he’s seen how they can get disrupted and decelerated. His active hand on the wheel — guidance for the technology road map, as well as for the process of getting participation and getting new developers trained — I think that’ll make all the difference.
Alex: What are the trade-offs with this approach, with those companies participating in the development of the project?
Sam: The trade-off is always the same in these environments: it’s complexity in return for capacity. There’s a lot more people available, and a lot more time, code and contribution. Because there are so many more people involved and so many more interests involved, invariably the trade-off is the complexity of orchestrating it — organizing, making sure the meetings happen, sequencing the priorities, making sure the stories get logged, bringing more people into the community and getting them to understand the code base and how to contribute.
If you have a small project with a small number of contributors and you leave it closed, you’re always going to be able to go faster down one linear, directed path. But if you want to create an infrastructure that’s going to be resilient, that’s going to last for the future and adapt, then you have to give up that control. Part of “lasting for the future” is adapting to changes that you can’t predict. You have to lean back, open up your arms a little, and say, “Okay, we’re going to commit to dealing with the complexity, because that’s going to make this a better infrastructure for the long term.”
Alex: Sam, thank you very much for your time — I appreciate it. It’s good to talk with you.
Sam: It’s always a pleasure, Alex.
ActiveState, HP, Pivotal and SAP are sponsors of The New Stack.
Feature image via Flickr Creative Commons.