Life in a Post-Container World and Why Linux Will Play a Diminished Role
Right now containers are all the rage. The money piling into the container ecosystem is massive. Uncle Scrooge would drown inside the vault — and ducks are good swimmers.
Containers have actually been with us since the late 1990s, but they are not the end of the story. The real transformation will come with a “serverless” future that will completely overturn the ops ecosystem. Companies will go out of business, new ones will spring to life, and thousands of people will have fundamental changes to their jobs. The shift to a serverless future is much bigger than your normal hype cycle — I believe the current container hoopla is a foreshock preceding a 9.0 quake.
How Did We Get Here?
Schedule a trip to the Computer History Museum in Mountain View, California, the next time you visit the Bay Area. It is breathtaking to take in the 70-plus years of modern day computing within a few short hours.
Technology moves so fast that sometimes it is hard to recognize repeating patterns in software, but those patterns do exist.
Part two of “Operating System Concepts,” the classic dinosaur operating systems book, is all about process management. In particular, one of the chapters is about CPU scheduling. This is one of those repeating patterns, and also tells us the story of where we came from.
The Unix Era
Back before the Web/Internet entered popular consciousness, we used to have magnificent siloed servers and centralized computing where many users would submit requests for time to run their applications. These were shared resources that had to be divided amongst multiple users and multiple programs. These jobs had to be scheduled and the systems had to support many users, being careful to give each user their fair amount of CPU time. Back then, it actually cost quite a bit of money to run software on computers. Ken Thompson’s Space Travel cost $75 every time you played it.
Operating systems in prominent use to this day are multi-user, even though the vast majority of businesses have no good reason for them to be. It’s not common anymore for a system to have tens to hundreds of users. Servers nowadays serve one purpose, and the users are a mere relic of the past — used only for a set of programmatically controlled daemons. Security problems are probably the most prevalent headache of multi-user systems, but in practice we don’t even use that feature anymore. We already segregate our servers by type into various groups: the databases are separate from the app servers, which are separate from the load balancers, which are separate from the queues, and so on. Sure, it’s a sin for applications to run as root, so ops people create a new user — and the trappings that come with it — for each program, and we have a wealth of daemons that monitor and track things.
So why do we keep building systems this way? Because currently we don’t have anything better. The work and knowledge required to build operating system kernels, the modules that support them, and the vast treasure troves of interfacing library support are non-trivial.
The Network Era
During the 1980s, the personal computer emerged, and that really started to destroy the concept of the ivory-tower big machines. However, I’d argue that it was the emergence of greater networking capabilities that showed how applications could proliferate and become so widely used.
The ARPANET replaced the Network Control Protocol (NCP) with TCP/IP the year I was born: 1983. Later that decade, networks were interconnecting at a rapid pace.
Programs were running everywhere, not just in universities and very large companies. Hell, even the lowly consumer had software. The early 1990s saw such an advance that it allowed the burgeoning VX scene (computer virus writers) in Europe to wreak so much havoc that companies such as McAfee made fortunes protecting end users.
In the late 1990s we had the first Internet boom … and the first bust. There might be small pullbacks happening in isolated verticals, but don’t believe the hype — there isn’t any looming bust. If anything, the current boom is still near the x-axis on our hockey stick. If you thought the last couple of years were a big deal — buckle up.
The Container Era
The early years of this century were a decade of self-realization for tech — can we even call it an industry anymore? Every organization, be it a software provider or not, is now a tech company. However, how software runs is the more important evolution. I think it’ll go down as an important chapter in history and earn its rightful place in the Computer History Museum.
I previously wrote about this in the “Death of Linux.”
Hyperbolic for sure, but the essence is very real. We had the rise of virtualization in the early 2000s, then IaaS in the mid-2000s; Heroku and friends spawned PaaS, out of which Docker (formerly known as dotCloud) came to be. All of this dramatically changed how companies delivered their software and employed their workers, enabled whole new industries that could not have existed beforehand, and forever changed the landscape of software in general.
We awoke and found ourselves staring into the Cambrian explosion of the container era.
Small companies can handle five, ten or 20 servers themselves with a handful of scripts and a good ops person or two, but once you start getting into 50-servers-and-up terrain (or the hundreds, thousands, or tens of thousands many more established companies find themselves in) you really don’t want to be thinking in terms of managing servers anymore. Hell, you never really wanted to, but it just didn’t make sense until you hit that inflection point. There’s still a lot left to be desired in the orchestration space for companies with fewer than 20 servers or so. Ask any (dev)ops person at a company of that size and they’ll probably tell you about the countless deployment/orchestration systems they’ve hacked together from a ball of twine and duct tape.
Notwithstanding, the demands of small companies have only risen sharply in the past few years. With small companies now needing more than a handful of servers or instances, you’d think we would be in shape for a return to this concept of a time-sharing, centralized-scheduling era — especially with the wealth of all the software in the resources scheduling space, à la Kubernetes, Mesos, etc.
We are, in fact, not going that route. We are leaving the era of the data center. We are at the tail end of the server paradigm itself that was started so many years ago, and we are fast approaching a resources era.
The Resources Era
The resources I’m speaking of are akin to electricity, sunlight or water.
People would look at you strangely and think you are crazy if you had a team of engineers managing your own power plant for your business. Groups of people in blue coveralls making sure it was well oiled. Capacity managers ensuring there was a backup generator. Constantly having to try to deal with peak usage, burnouts. “I’m giving it all she’s got, Captain!”
Yet, that is precisely how the majority of companies manage their server resources nowadays.
The answer is not better tooling or better ops, but rather the absence of these — it will all be abstracted away. We need to stop talking about servers and start talking about resources.
The so-called IoT is starting to heavily push us into what I consider a serverless, and more resources-based, environment.
Some people think IoT means your fridge will sing you a nice song when you get home. That line of thinking is so “Jetsons” — although full disclosure, my washing machine does.
A lot of people don’t know that virtualization, like Xen, is used in embedded systems and mobile as well. Yes, you can run a virtual computer on your phone.
What does this mean? This means there’s no need for Linux in the IoT space. Not only is there no need, but it’s actually a hindrance.
I think this trend will continue as well. When hackers can remotely kill Jeeps on the highway and there are entire conventions for autonomous truck driving, we should be thankful that most of those systems aren’t running full-on Linux servers. I don’t think we need to encourage fourteen-year-olds to take their GTA fantasies out on real-life cars.
It’s not necessarily that IoT doesn’t need Linux, it’s more that we don’t want those devices to have it. From a security perspective alone it minimizes the attack surface.
I believe with the wave of new devices coming down the pipeline we are in for a rude awakening on what we thought IoT would turn out to be. With this proliferation of devices, the networking needs will change. Networking every single device to every other will be unnecessary; but neither do we need them checking into a central set of servers.
There is a movement underway among various engineers to decentralize data away from the silos of big companies such as Facebook and Google. Companies like Sandstorm are breaking ground here — it is one of the software enablers of this realm.
In short, this is the coming political change that the so-called personal cloud brings. Don’t tell Nest, but with the personal cloud, this vision is realized. You won’t have to sign an end-user licensing agreement to change your thermostat. Some of us might be tinkerers, but the rest of us just want a thermostat.
Right now, if you were to try and modify one of these devices, you’d definitely fall into the maker/hacker category. I believe we are very close to having much more standardized support for modifying these devices without having to muck about in a Linux shell.
Extremely poor tooling in the ops realm has pushed us down the path of obviating the need for much of that manual work. Our server environment is fast approaching this new paradigm, and it means all the old sysadmin/devops/etc. industries are going to die with it. With their removal from the future, so, too, shall all their tools go. If that means we go to services, then it’s only natural that those services consolidate under the provider-of-resources bundle.
This new paradigm is basically the fact that most of the tooling and human resources we have built up around server usage is not needed if we don’t have to manage servers in the future the way we do currently. The resources will be directly available to the end developer’s application, and all the elaborate ceremony that we currently do goes away.
What does this really mean?
Let’s use an example.
Logging is the most caveman of practices out there, but I can’t name one company that doesn’t log something. In fact, most companies have a default practice of logging “all the things.” The trouble is that no one looks at logs. No one takes action on logs.
On an evolutionary scale, logging is primordial soup. It’s what we did before we invented fire.
The fact that logging hasn’t been removed from our systems yet is simply because it’s the lingua franca of ops people. Any ops person can awk/sed/cut/head/tail their way to finding why problems are happening.
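That awk/sed/cut triage is easy enough to sketch in a few lines; the log format below is invented purely for illustration:

```python
from collections import Counter

# Hypothetical log lines in a "<timestamp> <LEVEL> <message>" format.
lines = [
    "2016-01-04T10:00:01 INFO request served",
    "2016-01-04T10:00:02 ERROR upstream timed out",
    "2016-01-04T10:00:03 ERROR upstream timed out",
    "2016-01-04T10:00:04 WARN slow query",
]

def top_errors(log_lines, n=3):
    """The manual triage every ops person does: find the most common ERROR messages."""
    errors = (line.split(" ", 2)[2] for line in log_lines if " ERROR " in line)
    return Counter(errors).most_common(n)

print(top_errors(lines))  # the duplicated timeout floats to the top
```

It works, and it has worked for decades — which is exactly the problem: a human still has to run it and act on it.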
Logging led to widespread application performance monitoring (APM) half a decade ago.
APM led to stronger and more useful metrics systems adoption both in and outside of apps. After all, if you aren’t directly managing the physical resources, your problems are directly related to your application problems, not the drive blowing up. Anyone using something like AWS today doesn’t deal with those problems anymore. It’s been abstracted away.
These more fine-grained metric tools have led us to feedback control and adaptive systems.
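The core of such feedback control is a simple proportional rule — scale capacity by the ratio of observed load to target load, the same idea behind autoscalers like Kubernetes’ Horizontal Pod Autoscaler. A minimal sketch, with illustrative numbers:

```python
import math

def desired_replicas(current, observed, target, lo=1, hi=100):
    """Proportional feedback: scale the replica count by observed/target load,
    clamped to a [lo, hi] range so the controller can't run away."""
    if observed <= 0:
        return lo
    want = math.ceil(current * observed / target)
    return max(lo, min(hi, want))

# Load doubles -> replicas double; load halves -> replicas halve.
print(desired_replicas(4, observed=200, target=100))  # 8
print(desired_replicas(4, observed=50, target=100))   # 2
```

Run in a loop against live metrics, a rule this small replaces the human who used to watch the graphs.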
We can checkpoint processes (freeze a copy of them Han Solo-style), and we have DVR-style playback systems in place for production quality systems now. It’s only a matter of time before the data science people get involved. The future will no doubt lead us toward more proactive, self-adaptive control systems. If you are interested in what the future holds in this area you should most definitely check out William Louth’s work at AutoLetics.
All of this means that companies that specialize in things like logging won’t exist in the future. Monitoring companies as we know them today won’t exist in the future. Don’t get me wrong, I spend all day staring at a black screen with multi-colored text that looks right at home in the 1970s, but that’s not the future.
If you aren’t the provider of resources, your company that caters to servers — relics of a bygone era — will die. Investors: you might want to start the unloading now — in three to five years it’ll be too late.
Ops Financial Pressure
There is extreme financial pressure to alleviate ops spending from companies as well. Don’t believe me?
Even the lowest paid ops person makes $100,000 to $120,000 here in San Francisco. However, many of the common jobs that these people will work on are being replaced by services and more advanced tooling. Why spend that much each year on someone to stand up a server, integrate it, then ensure it works versus spending 1/20th of it on a service instead?
If you are a competent ops person here in San Francisco you can command a $180,000 per year salary. Compare that to what you could get for $150,000 per year — a rather damn good software engineer. Most ops people don’t even code! This is why the scripting languages such as Ruby/Python are so prominent in the discipline.
Having said that, the skill set needed to deal with those rascally servers is one fraught with late nights and empty liquor bottles. I firmly believe that the traditional ops job, while lucrative now, won’t be nearly as prominent in the future. The machines will eventually take over as we shift to unikernels and beyond.
Storage was arguably one of the first pieces of the monolith that got separated from the kernel, but not in the manner that most people thought. After all, you still needed file system drivers to mount external file systems.
Later on, a few engineers with a financial services firm in New Jersey started ripping out the networking stack from the kernel to speed up their high frequency trading (HFT). The kernel was too slow. They needed to process things just a bit faster. You see, what used to be a boon in traditional systems — context switching — turned out to be a bane for these traders. They didn’t have lots of processes — they wanted one process with the lowest latency and lowest overhead. Nowadays, kernel bypass is almost a given in many HFT firms, and many software-defined networking (SDN) ecosystems are adopting the approach as well.
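The overhead those traders were dodging is easy to feel for yourself. This is an illustrative micro-benchmark, not HFT code: read the same bytes one at a time through the kernel (a system call per byte) versus from user-space memory.

```python
import os
import time

N = 20000
payload = b"x" * N

# Kernel path: every os.read is a system call crossing into the kernel.
r, w = os.pipe()
os.write(w, payload)  # small enough to fit the default pipe buffer
t0 = time.perf_counter()
got = b"".join(os.read(r, 1) for _ in range(N))
syscall_time = time.perf_counter() - t0
os.close(r)
os.close(w)

# User-space path: the same bytes, byte by byte, no kernel crossing.
t0 = time.perf_counter()
got2 = bytes(payload[i] for i in range(N))
memory_time = time.perf_counter() - t0

assert got == got2 == payload
print(f"syscalls: {syscall_time:.4f}s  memory: {memory_time:.4f}s")
```

The per-call cost is tiny, but multiply it by millions of packets per second and it becomes the latency floor that kernel bypass removes.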
Next up: the file system itself started cropping up in user-land — how else were you going to do all the neat tricks found in modern-day systems, like dynamically resizing an instance while it’s running? There are quite a few container-using companies that also utilize FUSE and GlusterFS. Then you have projects like IPFS with radically different visions of the future when it comes to file systems.
Essentially, many of the older kernel responsibilities have been drifting away into the user realm for quite some time, and it’s only been accelerating lately.
The Developer/Kernel Divorce
So, if we are in the container era now, what comes next?
Unikernel adoption is next in line. The tooling is still too nascent for mass adoption, but make no mistake: you probably won’t have a large DevOps team in the future, if one at all. In a serverless future, there simply won’t be a need for such a team. If those engineers exist, they’ll work at the application level, not the systems level.
There will come a distinct split between the application layer and the operating system, which will fade away into the providers of resources. Those providers are the natural progression of what we have been seeing in the past 15 years.
The split, when I was going to school, was clearly user-land versus kernel-land and based on the concept of rings.
Back then if you had ‘Ring 0,’ in hacker parlance, you owned it. Ring 3 was user/app land; however, we are pushing in a direction where these lines are getting rather blurred.
Docker and the rest of the container ecosystem have been instrumental in pushing developers in this direction. It might not have been what their respective companies intended, but it is what is happening.
What makes the unikernel the natural progression?
What the hell is a unikernel, to begin with? If you think of Amazon’s EC2 service, those computers you are using are not real computers. They don’t exist in real life. They are virtual instances of a computer using a modified version of the Xen software. However, they typically run Linux, which was meant to be run on real hardware. Linux needed plenty of extra code to ensure support on a wide range of hardware and was meant to be multi-user and multi-process. Unikernels don’t need all that extra bloat; they are designed to do one thing and do it well: they are single-user and single-process.
Unikernels provide the same or similar features as containers (fast boot times, light memory footprint, packaging), but with additional security guarantees, amongst many other great features.
There is no shell to login to and it’s one application, not a host kernel switching between a bajillion processes that shouldn’t even exist. This eliminates most security vulnerabilities right out of the gate. In total, $4.6 billion of investor money went into security in the past two years. Computer security is not a small problem, and unikernels basically kill the majority of it, because the largest portion of security problems in the past have dealt with the fact that Linux is a multi-user system with multiple processes. ‘Getting root’ on a computer is the number one aim of any hacker, because once they have that you are screwed. Complicate that with the fact that many large organizations use the concept of trusted networks, and now your entire organization is screwed. There is no ‘getting root’ on a unikernel, because there is no root to be had.
The sheer size of unikernels versus the size of your average Docker container is something that a lot of people are starting to notice. Deployment, then, is not preparing a machine for your containers, but simply pushing the change and scaling the new changes up or down directly on Xen. Who needs all this other orchestration/scheduling software? Yours truly has written no fewer than four distinct deployment systems in the past so many years. Now, I’m not saying orchestration and scheduling software is going to go away, just merely their current incarnations. Much of the hassle that orchestration deals with involves things like user permissions and external software dependencies — the type of stuff that unikernels simply don’t have to deal with. Future scheduling and orchestration software will only have to deal with raw resource usage, not be some 1,000-line bash kludge that no one in your company wants to touch with a ten-foot pole.
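A scheduler that only deals with raw resource usage can be surprisingly small. Here is a toy first-fit sketch — host names, capacities and task demands are all invented for illustration:

```python
def schedule(tasks, hosts):
    """First-fit placement over raw resources.
    hosts maps name -> [free_cpu, free_mem_gb] (mutated as tasks land);
    tasks maps name -> (cpu, mem_gb). Returns task -> host, or None if nothing fits."""
    placement = {}
    for task, (cpu, mem) in tasks.items():
        placement[task] = None
        for host, free in hosts.items():
            if free[0] >= cpu and free[1] >= mem:
                free[0] -= cpu
                free[1] -= mem
                placement[task] = host
                break
    return placement

hosts = {"xen-a": [4, 8], "xen-b": [2, 4]}               # hypothetical hypervisor hosts
tasks = {"api": (2, 4), "worker": (2, 4), "cache": (2, 2)}
print(schedule(tasks, hosts))  # api and worker fill xen-a, cache lands on xen-b
```

No user accounts, no package dependencies, no config management — just a bin-packing loop over CPU and memory, which is all a unikernel asks for.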
Unikernels are so powerful that we now have research papers with titles such as “Just-In-Time Summoning of Unikernels” that describe launching unikernels in response to network traffic.
Life in a Post-Container World
I believe the combination of containerization, cluster-level scheduling, poor tooling, IoT, kernel abandonment and ops pressure are pushing us towards an environment where we won’t be interacting with servers anymore. This is going to have a dramatic influence on the development world with lasting repercussions.
I don’t quite know what to expect past unikernels, but I do know we won’t be managing servers the way we do now, and that future is coming soon.
It’s going to be exciting.