The New Stack Makers: Adrian Cockcroft on Sun, Netflix, Clojure, Go, Docker and More
Part two of my interview with Adrian Cockcroft covers Adrian’s early interests in just learning how things work and where that led him and his career. Plus he has some insights about cloud computing, hot topics such as Go, Docker and Clojure…and a lot more. Here’s part one for reference.
Alex: There’s a lot of marketing about the complexities of cloud computing and cloud technology.
Adrian: A couple of comments there. One is that it’s unfamiliar more than it’s hard. So don’t confuse unfamiliar with complicated or hard-to-do. And people find it hard to distinguish. Like NoSQL is unfamiliar for most people. It’s got some different concepts. It’s ridiculously easy once you’ve learned how to do it. It’s much simpler than MySQL or whatever. It’s almost nothing to learn, but it’s got some different concepts in it. Once you’ve got your head around it, “Is that all there is? Okay.” And it’s like, “Doesn’t it do anything more?” “No that’s all there was.” But it has these other attributes that it scales but it’s very, very simple. You’ve traded off the query complexity and the consistency models for the ability to scale, and then you have to deal with the lack of features in some other level, and do more in your own code, and understand some different ways of working. So that’s one of the key differences.
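To make that trade-off concrete, here is a minimal sketch in Go. It assumes a hypothetical in-memory `KVStore` with only `Put` and `Get`, standing in for a NoSQL store; it is not the API of Cassandra or any real database. Because the store has no secondary indexes, the application maintains its own email-to-ID entry and performs the "query" as two key lookups.

```go
package main

import "fmt"

// KVStore is a hypothetical, in-memory stand-in for a NoSQL key-value store.
// It offers only Put and Get: no joins, no secondary indexes, no transactions.
type KVStore struct {
	data map[string]string
}

// NewKVStore returns an empty store.
func NewKVStore() *KVStore {
	return &KVStore{data: make(map[string]string)}
}

// Put stores value under key, overwriting any previous value.
func (s *KVStore) Put(key, value string) {
	s.data[key] = value
}

// Get returns the value stored under key and whether it was present.
func (s *KVStore) Get(key string) (string, bool) {
	v, ok := s.data[key]
	return v, ok
}

// SaveUser writes the user record and, because the store has no secondary
// indexes, also maintains an email-to-ID lookup entry in application code.
func SaveUser(s *KVStore, id, name, email string) {
	s.Put("user:"+id, name)
	s.Put("email:"+email, id)
}

// FindNameByEmail does by hand what a SQL engine would do with an indexed
// query: one lookup for the ID, then one lookup for the record.
func FindNameByEmail(s *KVStore, email string) (string, bool) {
	id, ok := s.Get("email:" + email)
	if !ok {
		return "", false
	}
	return s.Get("user:" + id)
}

func main() {
	store := NewKVStore()
	SaveUser(store, "42", "Ada", "ada@example.com")
	name, ok := FindNameByEmail(store, "ada@example.com")
	fmt.Println(name, ok)
}
```

The sketch is single-process and not safe for concurrent use. It only illustrates the shape of the interface and the extra work pushed into application code, not how a real store distributes or replicates data.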
It’s easier than it was in 2009. Well this is really what the most fun part was. We were reading the Amazon whitepapers, reading the Google whitepapers and making stuff up. And we had some really experienced people that had done distributed systems before and we had big arguments and we tried stuff out and figured out something that worked. Most of the PaaS (offerings) have solved this problem one way or another. There was no Cassandra at that time, certainly in usable form. Distributed systems now are mature and stable. You can use ZooKeeper and whatever.
People were saying that it was needed but no one believed it.
Adrian: Well there were people building these systems but they were mostly proprietary. Oracle had Coherence. There were a bunch of software layers that you could buy which were distributed systems layers. But they were mostly commercial off-the-shelf software and you got into the architecture and you were kind of stuck in that architecture, and maybe it did what you wanted, maybe it didn’t. The open source ones were much more malleable. Netflix put people on the project and changed it until it did what we wanted, and other people did the same thing.
You essentially had to build the path.
And now people can follow that path and…
Adrian: The problem if you’re trying to sell software is you have to have the software reach a broad market. It’s very hard to have the purity to really understand what people need. Whereas if you actually have the real problem and you can change the product to meet your need and you’re big enough to do that, and you have the right talent to do it, you can actually get a real solution to your problems.
How did you get interested in technology when you were growing up? Was it before college, was it during college?
Adrian: No it was before. My dad was really the main thing. He is a university lecturer in statistics, but he was programming in the 1960s.
Where was that?
Adrian: It was in the UK. He went to university when I was small, and he received his Statistics degree and then got a job as a lecturer. But with statistics you need computers to add the numbers up. So he had these adding machines and he’d do statistics. He had hand-cranked adding machines and calculators and things like that. So they were always around. And the school I went to was close to the university that he worked at. There was a terminal at school, and when I was 12 you could put your name on a list and at lunch break you could go and spend half an hour playing with typing BASIC into a DEC System-10 and playing Lunar Lander and stuff. So I studied programming when I was 12 which was fairly unusual in 1972. I was born in 1960.
What were you programming in 1972?
Adrian: The DEC System-10 was a big mainframe-like box at Hatfield Polytechnic at the time (now the University of Hertfordshire).
And you had a combined interest in Physics, or there was an intersection there?
Adrian: I was interested in how things work; I still am, and I kind of take things apart. Physics is the ultimate in “how things work.” From first principles, work out how anything works. If you’ve got some physics background, that is what they teach you to do. So that scientific method of applying the laws of physics to things and deconstructing it into rules of behavior is something I apply to computer things. I did a Physics degree and I was doing Applied Physics in Electronics and I built some stuff. So I got a job at a consulting company building real-time embedded systems and signal processing control systems.
I did signal processing control theory in physics, so I knew how the stuff you’re trying to control worked because that’s the physics. And I knew the signal processing control theory, and some programming, and how to program microprocessors. So that was my job and I did that for a few years. We were using Sun workstations as development machines.
Where was this? Who did you work for?
Adrian: Cambridge Consultants. It’s still around; the R&D company is in Cambridge (United Kingdom). We were one of the first users of Sun workstations. We were using PDP-11s before that. This was the early 1980s. And then I joined Sun in 1988 because they opened a sales office nearby. I mostly wanted to know what machines were coming next.
Because you had been working with these Sun workstations?
Adrian: Yeah, I had been working on these latest, and I was like, “What are you doing next?” They wouldn’t always tell me what was coming next, so if I went to work there I could find out what’s happening next. I was always looking for this future generation stuff. At some point in the 1980s I met Bill Joy, and I realized that he lived five years into the future and I wanted to be like that.
…Well, I wanted to be Bill Joy. But I wanted to somehow get the viewpoint he had, far enough into the future that he could see where things were going to go. That’s one of those formative moments. That aspect of what Bill was doing in the 1980s at Sun was interesting. And then when I worked at Sun I didn’t really get to know him. I wasn’t working with him. Joining the VC firms means that I get to see stuff that isn’t going to happen for a few years because people are still dreaming up the idea.
Was it similarly so at Sun that you were dreaming up stuff that was not going to happen for a few years?
Adrian: Yeah. The high-end servers take five years to make, from designing the chip. You spend years getting the chip built and then you build the machine. From first running – one of those E10K type servers or a workstation – from the time it first ran internally to external product launch would be maybe a year. But you’d have access to people talking about how it was going to work before that.
So how did your career progress after leaving Sun?
Adrian: Okay, let’s summarize forty years in four minutes.
What I was doing at Cambridge Consultants was fairly high-end embedded systems: the later 68000 CPU boards processing in C – really high-end embedded system, signal processing, control theory stuff. And then we started working with transputers, and then parallel computing algorithms and things like that as another set of projects. The transputer was a 1980s British design for a microprocessor, a massively parallel microprocessor, which was a big thing. MPP machines were big at that time. The Intel Hypercube was one of them. There were a bunch of MPP systems, like Thinking Machines and Connection Machines. So I was in that space a little bit. So when I joined Sun I knew quite a lot about algorithms and development and performance, and I’d be tuning my code so I knew about performance tuning. And I was also a one-day-a-week sysadmin for the machines that CCL had. So I was DevOps, if you like. I was a developer four days a week and I was the ops guy one day a week. And I was the contact with Sun upgrading these machines and keeping them running and making sure we had backups, managing them from a UNIX point of view. So I was the UNIX sysadmin. Somebody else was the overall sysadmin person, like it was his real job. I was the UNIX guy because he was looking after the VMS machines and whatever else they had.
So, when I joined Sun I was a fairly experienced UNIX sysadmin on Suns, and a developer that could deal with this space. And that was interesting because I found that within the overall Sun community I knew more about performance tuning than most people. And then I would do some of the more technical projects where they’d need somebody to help tune some code or something to win a benchmark, so I would be involved in that kind of benchmarking. And then as we moved from workstations to servers, the machines just got bigger and the problems got bigger, and I never really became a full DBA database guy. But I would sometimes tune the machines the databases were running on and build some performance tools. At my talk yesterday morning I highlighted a thing I did twenty years ago called “Virtual Adrian,” which was a script you run on Solaris machines that tells you what’s wrong with them; it was written in the mid-90s. It’s still running at a number of sites that run Solaris machines now. If you can find a big Solaris machine it’s probably still running Virtual Adrian.
I became the capacity planning performance guy for Sun, and wrote a book, Sun Performance and Tuning. Most people that had ever worked in looking after a Sun machine at some point got a copy of that book. It’s not a very big community but I sold a decent number of books. And then I became a distinguished engineer at Sun for that work basically. I was the go-to person for performance and capacity planning and performance tools which is why this conference is my natural habitat. This is a “going back to basics” for me. And then a few different jobs at Sun: I was in the performance team that was doing performance tuning of applications and the operating system. Then I joined the high performance computing team. I was the chief architect for high performance computing, worrying about interconnects and InfiniBand and 10 gigabit Ethernet ten years ago.
Then in ‘04 Sun laid off that entire team as it shrank – you know, one of its contractions – and we all left. I joined eBay at that point – back in 1999 they had a big outage, and in 1999-2000 I was part of the team that helped eBay get going again, and I talked to them a lot about how to do capacity planning and set up some processes. Four years later they were still running the processes and I’d kept in touch, so when I needed a job I called up eBay. So I went to work there for a few years. And then after two or three years there Netflix came up. Netflix was just at the point where it was starting to need to scale and I figured out the combination of the performance work I did at Sun and scalability stuff I learned at eBay – consumer based, web-scale stuff – was interesting to Netflix. So that’s why I joined Netflix.
That role started as helping them with capacity planning?
Adrian: No. At Netflix I was managing the personalization engineering team. So I was actually managing the development team, which was something I wanted to do. But the algorithms had to scale. So it was figuring out how to get algorithms to scale, and the systems around them. But Netflix was still pretty small at the time: small number of engineers, small in terms of deployment. It was a few tens of machines. Then we started growing rapidly as streaming kicked in around that time. So gradually after about two or three years I took the team I was running then and added some extra people and did the cloud migration. I then switched roles and became an individual contributor as the cloud architect in the cloud platform team.
Once we made the move, Netflix created a team to do the cloud platform. They hired a manager, Yury Izrailevsky, and he decided that he needed an architect who could figure out how all the stuff worked from a technical point of view, and I was explaining to him how it worked because I’d been architecting it while leading the team. Managers at Netflix are basically architects as well, and that was the role I was in. So I switched over to him, worked for him as the architect and stopped looking after the personalization stuff. Other people took that on. So that was a sideways shift and I did that for three or four years.
So now you’ve shifted into this new role, and you’re talking to a lot of people, you’re out and about…
Adrian: I took most of what I was doing in my spare time at Netflix and made it my day job, my main job, so that’s really the flip. I was actually spending more and more time outbound at Netflix promoting the open source projects and trying to understand what was going on out there and helping people externally, and also talking to VCs about interesting companies that were building technologies and might be interesting to Netflix. So those conversations are not that different really. For me it’s actually a much smaller company so that’s nice to be in a group of tens of people instead of over a thousand. But what I’m doing from day to day is a little different. I’m talking to a slightly different set of people and it’s a much broader set of things to be interested in. So that’s nice as well.
What are you seeing in this broad landscape, coming up on the middle part of the year? You have events like this (Monitorama); this is kind of an interesting event where there’s a community growing around this capability of understanding capacity and performance.
Adrian: I think it’s always been there and this event has just turned up and captured people’s imagination as being around open source tools in this space, which is a nice subset. It’s very much practitioners – people doing it. There’s a bit of vendor support but it’s not vendor-driven. Most of the conferences in this space are much more vendor-driven.
There’s a lot of problems to solve in this space – things that in my keynote I was talking about: the challenges of monitoring in a cloud-based, cloud-native environment where you’ve got enormously sophisticated automation to deploy everything and you start auto-scaling everything. Then you actually want to measure what’s going on and react to it automatically. Once you’ve closed that feedback loop you’re doing control systems stuff. Most monitoring tools are nowhere near that. They have a person staring at the thing who will make a change, right? So the sysadmin is part of the control loop.
What Netflix is doing is not having the sysadmin be part of the control loop. They’re automating the whole loop. When you do that your monitoring system has to be more dependable than the thing it’s monitoring. Otherwise, it will just screw you up. Every time it breaks it will trash your environment because it will randomly do the wrong thing because it’s seeing the wrong input. It’s like if you get a car engine and you unhook one of the sensors it will run rough. If you get a bad sensor, say your knock sensor, or you unplug one of the wires, it will probably still run but it won’t be happy. So those are the kinds of problems that I think are interesting, and that was what I was highlighting to people here. So there’s a lot of interest in it. And the whole big data analytics means there’s a lot more analysis you can do. And there’s a lot more computational power. So you can actually do a lot more now than you were able to do a while ago. And a lot more things are being built as open source, as community built things. So I think it’s an interesting time to be trying to build that stuff.
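The closed-loop idea can be sketched as a single step of a proportional controller. This is a minimal illustration in Go under stated assumptions, not Netflix's implementation: the `Sample` and `Policy` types, the field names, and the scaling rule are all hypothetical. The point it shows is the one made above: when the input is stale or implausible, the loop holds the current size rather than acting on a bad sensor.

```go
package main

import (
	"fmt"
	"math"
)

// Sample is a hypothetical reading delivered by a monitoring system.
type Sample struct {
	CPUUtil    float64 // average fleet CPU utilization, expected in [0, 1]
	AgeSeconds float64 // how long ago the reading was taken
}

// Policy holds the hypothetical tuning values for the control loop.
type Policy struct {
	Target        float64 // utilization the loop tries to hold
	MaxAgeSeconds float64 // readings older than this are treated as stale
	Min, Max      int     // bounds on fleet size
}

// DesiredInstances is one step of a proportional control loop. It resizes
// the fleet so utilization moves toward the target. If the sample is stale
// or out of range, it returns the current size unchanged, so a broken
// monitoring feed does not cause a wrong scaling action.
func DesiredInstances(current int, s Sample, p Policy) int {
	bad := math.IsNaN(s.CPUUtil) || s.CPUUtil < 0 || s.CPUUtil > 1
	stale := s.AgeSeconds > p.MaxAgeSeconds
	if bad || stale {
		return current // sensor unplugged or lying: hold steady
	}
	want := int(math.Ceil(float64(current) * s.CPUUtil / p.Target))
	if want < p.Min {
		want = p.Min
	}
	if want > p.Max {
		want = p.Max
	}
	return want
}

func main() {
	p := Policy{Target: 0.5, MaxAgeSeconds: 60, Min: 2, Max: 100}
	// Fresh reading above target: the fleet grows.
	fmt.Println(DesiredInstances(10, Sample{CPUUtil: 0.75, AgeSeconds: 5}, p))
	// Same reading, but ten minutes old: the loop refuses to act on it.
	fmt.Println(DesiredInstances(10, Sample{CPUUtil: 0.75, AgeSeconds: 600}, p))
}
```

A real system would also need damping, cooldown periods, and alerting when the feed goes bad; this sketch deliberately leaves those out.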
And what’s become apparent in the last few years is that the open source products are the most scalable and dependable products.
You know, the ones that have maybe a company looking after them, but it doesn’t really totally control them. So you go to DataStax for Cassandra to get a dependable distributed system data store or Basho for Riak or whatever. Maybe IBM has a product in that space, but not really, and if they did they’d want a lot of money for it. So then the products are being beaten into shape by large-scale users. So there’s this interesting shift that the major sources of interesting software now are Netflix’s GitHub site, Twitter’s GitHub site, LinkedIn, Facebook, Google, mostly on GitHub. You go and get your open source from the web scale companies that are investing and have the best people working for them solving these distributed system problems, and you assemble your code out of that. You’ve got Hadoop and Kafka and Cassandra and whatever. You’ve got all these different pieces and you glue them all together. You’re a startup and you’ve got three people and you’ve just leveraged a massive brain trust and you can do it for free. You can buy support for some of them but that is something that’s really only become apparent in the last few years. That is now the state of the art in getting stuff done. And the big vendors are designed out, and they’re trying to make themselves become relevant to that conversation by buying up some of these companies. I think that as the big IT vendors get more and more marginalized they are going to just start buying things.
Do you have any reflections on Docker, for instance? It seems to be an outcome of what you’re talking about.
Adrian: Docker is interesting. It’s a good example of getting the right thing at the right time with the right name and going viral.
It’s a metaphor.
Adrian: You get a metaphor. It’s a conceptually simple idea with a name that matches, a logo that’s good, and it’s sort of almost like the marketing clicked in and everyone goes, “I get that.” And there are actually other things that do this that have been around for a while that no one’s heard of and can’t remember the names, and have lost. It may not even be the best technology for doing it but they pushed all the right buttons and they’ve succeeded. Naming matters, it turns out.
Naming does matter.
Adrian: A lot of companies pick really stupid names. There’s a huge difference.
But the metaphor’s right.
Adrian: Conceptually, yes. So the idea that you can package something up and you can move around this container. Containerization is a ten-year-old thing. Paul Strong at Sun was working on Solaris containers ten years ago. He is in the CTO group at VMware now. He’s been there for a while, an old friend of mine. None of this is new conceptually, but the idea that you can package it is good. Everyone’s adopted Docker, so the question really is, “Is the company that made Docker going to make money out of it?” Or is everyone else just going to say, “Yeah, okay, we’ve done that.” I’m not sure. So that’s interesting.
There’s a lot of buzz about Go as a programming language.
Adrian: The really cool things I’m seeing being built, some of them are in Go, and the other one is Clojure. A lot of the best programmers and the most productive programmers I know are writing everything in Clojure and swearing by it, and then just producing ridiculously sophisticated things in a very short time. And that programmer productivity matters. So that’s interesting. Go is interesting as it’s replacing C and C++ as the systems programming language and lots of things are now being written in it. Personally, I’m happy because originally I was a C programmer and back in the 80s I was also an occam programmer, and Go is basically like C and occam mashed together. It’s got the channel-based communication model. Occam is not a very well-known language right now but if you looked at it you’d go, “Oh, yeah, I can see where Go came from.” It’s the same channel- and parallelism-based model, which is CSP. Technically, there’s something called pi-calculus that it’s based off of and it came from CSP by Tony Hoare. Communicating Sequential Processes is the theory behind it and that was back in the 70s.
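For readers who have not seen the channel-based model, here is a minimal sketch in Go. The function names `produce`, `square`, and `collect` are made up for this illustration. Each stage is an independent process (a goroutine), and the stages share nothing except unbuffered channels, so every send waits until the receiver is ready, which is the synchronous style of communication CSP describes.

```go
package main

import "fmt"

// produce sends the integers 1..n on an unbuffered channel, then closes it.
// Each send blocks until the receiving stage is ready to take the value.
func produce(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i
		}
	}()
	return out
}

// square is a pipeline stage: it receives values from in and sends their
// squares on its own output channel, closing it when in is exhausted.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * v
		}
	}()
	return out
}

// collect drains a channel into a slice, preserving arrival order.
func collect(in <-chan int) []int {
	var result []int
	for v := range in {
		result = append(result, v)
	}
	return result
}

func main() {
	// Three concurrent processes wired together by two channels.
	fmt.Println(collect(square(produce(5)))) // prints [1 4 9 16 25]
}
```

Order is preserved here because each stage has exactly one sender and one receiver; fanning out to several workers would give up that guarantee.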
I’ve heard a distinction between concurrency and parallelism. Is there a distinction there that is relevant as these infrastructures become more…
Adrian: You could argue about the labels. There’s different programming models. We’re now in the many-core thing. You get your laptop and it’s got four cores or eight cores; your phone has too, right? So if you can’t figure out how to use that then your program is not going to run well. So the problem is how do you express things in a parallel way that programmers can understand. Go seems to have got enough traction to be interesting.
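As one hedged example of expressing work in a parallel way, the Go sketch below splits a slice across one goroutine per core and combines the partial results. `ParallelSum` is a hypothetical helper written for this illustration; `runtime.NumCPU` and `sync.WaitGroup` are from the Go standard library.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// ParallelSum splits nums into roughly equal chunks, sums each chunk in its
// own goroutine, and then adds up the partial sums. Each goroutine writes
// only to its own slot of partial, so no locking is needed.
func ParallelSum(nums []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers)
	chunk := (len(nums) + workers - 1) / workers // ceiling division

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start := w * chunk
		if start >= len(nums) {
			break // fewer chunks than workers
		}
		end := start + chunk
		if end > len(nums) {
			end = len(nums)
		}
		wg.Add(1)
		go func(w, start, end int) {
			defer wg.Done()
			for _, v := range nums[start:end] {
				partial[w] += v
			}
		}(w, start, end)
	}
	wg.Wait() // all partial sums are complete after this point

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	nums := make([]int, 100)
	for i := range nums {
		nums[i] = i + 1
	}
	// One worker per logical CPU reported by the runtime.
	fmt.Println(ParallelSum(nums, runtime.NumCPU())) // prints 5050
}
```

For a sum this small the goroutine overhead outweighs any gain; the pattern only pays off when each chunk carries real work.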
You were trying to think of other services. Maybe you could conclude this interview in that context of the metaphors that are resonating now, and what those metaphors mean as we look forward into the next six, twelve, and eighteen months. We have the “shipping” metaphors which are interesting, and it’s very much of a contrast…
Adrian: Going back to the speed of development: like I said, there isn’t any executive that wants his company to be slower at product development. So if you look at how you really speed things up you have to take the hand-offs out of the process. So every time you get a team giving something to another team – Development to QA to Ops to whatever – every one of those is a synchronization point that slows everything down. If you can avoid that then you’ve saved yourself a lot. So speed matters. Taking those steps out for big companies means you re-org really, because that’s what DevOps is about. If adopting DevOps doesn’t involve a re-org, then you’re not doing it right. So that’s one of the reasons it’s hard to adopt. But, once you get your head around it, you realize what you’re doing is streamlining things and you have to smash the groups together so that developers do their own operations, and Operations people and Development and QA… You get rid of the artificial barriers, and in operations you get rid of the stove-piped fiefdoms of the storage guys and network guys and the database guys and sysadmins. So you have to kind of mash this stuff back together again to make it efficient, and that’s to make the speed of delivery efficient. They got siloed for optimizing for cost rather than for speed. So this is kind of a cost-versus-speed thing. And the pendulum is swinging back away from cost to speed. Because the cost of infrastructure is so low that now the time it takes to develop something is the biggest problem, so you’ve got to speed things up. So that is causing people to think about things in different ways, and different products are appearing, and the scale that people are dealing with things, and the “software eating the world” kind of ideas where every company now has to be a software company. You can’t not be a software company because every product somewhere has software in it.
And everything you do, if it’s marketing or sales, you’re doing real-time bidding for ads.
The hardware is getting abstracted.
Adrian: Yeah. But it’s almost impossible to have a company now that doesn’t have to deal with, in some way, all of this high-tech software and software development and technology. You can’t abstract it away and say, “I’m just going to put that in a corner and it takes care of itself.” Payroll or whatever is all it really used to be, and now it’s fundamental to the business. And that’s really what the Phoenix Project book is about: moving IT from being a peripheral activity to being central to the business. So that’s the big change and then people are processing and dealing with that. And I think that a lot of the old SaaS-based companies that are coming up are just trying to automate the pain points that they’re seeing in that space and trying to figure out “What are the big pain points?” and “How do you sell to these people?” That’s kind of what I’m doing in my day job now – mostly focusing on early stage enterprise IT startups that are finding a pain point and trying to figure out “Is that a really big scalable pain point, or is this just a feature of something else?” You don’t really want to invest in something that’s going to be an acqui-hire in six months. You’re always trying to find the one that’s going to be a billion-dollar IPO in three years. That’s the game.
And we have to think within that framework of “speed, speed, speed” increasingly to find those types of companies.
Adrian: Yeah. But it’s spread out. So lots of companies that didn’t used to need speed are now going out of business if they don’t figure it out. That’s what’s happening in the wider market of people that make widgets in the Midwest or whatever. Those people now have to figure out how to develop software at speed.
One of my analogies: thermostats usually are done with a tilt – the old-style thermostats. It’s a bi-metallic strip and it tilts a mercury tilt switch. And when it warms up it tilts and turns it on, and when it cools down it tilts the other way and turns it off. So there’s your thermostat. And for decades you’re making thermostats, and as long as you can buy mercury cheaply you’re okay. Maybe somewhere along the way you had the one with an LCD screen on it and a couple of relays. And then Nest comes out and you go like, “What?” This thing’s got an iPhone app and all its electronics and it’s basically a phone stuck on the wall. That’s bad enough and then Google buys them. Historically you’re a mercury tilt switch manufacturer and now your main competitor is Google, and that’s not a comfortable place to be. Companies are trying to deal with that. I mean, that’s an extreme example, but that kind of thing – the Internet of Things – is causing lots of companies to have to deal with this stuff. So that’s interesting in terms of the wide adoption. I think the Internet of Things is really starting to happen, and starting to drive a lot of activity and a lot of people are building around it. So that’s probably the next wave that’s going to happen. I’ve got some connections in that space. My first job was building embedded real-time things with computers in them. Back to the old days.
Well, Adrian, this has been a wonderful interview. Thank you very much for taking this time here, and visiting our city of Portland for Monitorama.
Adrian: Thanks and congratulations on launching The New Stack. I’m reading all of the articles as they come out to my RSS feed. Great to see a different viewpoint from the other sort of GigaOM or TechCrunch kind of view of things. So it’s an appreciated and good addition to what’s happening.
Well, thank you and we’ll talk soon.
Adrian: Yeah thanks. Cheers.
Luke Lefler, our podcast producer and copy editor, contributed to editing the interview with Adrian.