Testing Developer Velocity: AWS EC2 vs Lambda vs Lambda on Stackery
Stackery sponsored this post.
Lambdas deliver better velocity than a traditional managed Virtual Machine (VM) like Amazon’s EC2. But how much more velocity exactly? I decided to find out.
Meet Your Racer
I’m a kind of smart developer: I’ve built web apps and done front-end work for a couple of years. My home machines are running Mint. But I’m by no means a Linux nerd: I know half a dozen commands, and when people start talking about the many flavors of *nix my eyes glaze over.
To back me up, I have the great Stackery team, all of whom might be a bit better qualified since they work on a product that interfaces with AWS. But I’m their community developer, which means I hopefully take the viewpoint of the average web developer.
I don’t have fixed rules but, in general, I will get any level of help I need except for having someone really smart like Chase Douglas sit next to me throughout the process. As I’ll say a few times, this isn’t measuring the raw time it takes for each step to complete, but the time it takes for a web developer to figure out menus, configuration, and documentation.
EC2 Service: 5 Hours, 15 Minutes
I am certain that there are Linux nerds, or AWS nerds, who could get an EC2 Service running in just an hour. But again, the goal here is to measure the real time it takes for a web developer to get these services up and running. Let’s look at the time breakdown:
- Creating and starting the server: 1 hour 10 minutes.
- Picking an instance: I didn’t have a budget for this project, so it took me a minute to identify an Amazon Machine Image (AMI) that would definitely be free. I also needed some time to make sure this Linux version wasn’t too far from the Mint I was used to.
- Adding storage — 8GB seems like plenty… right?
- Configuring the security group. The image defaults to enabling SSH access, but that default triggers a warning, and we need some extra configuration if we actually want to serve HTTP requests. Since my application involves sensitive data, I couldn’t just set 0.0.0.0/0 and let any IP address through.
- Launching: coffee break time! This last step took 10-12 minutes. After repeating it a few times, I suspect part of this was interface delay; the in-page “refresh” didn’t seem to work as described…
- Getting into the server: 1 hour 15 minutes.
- Setting up my local key. I’m pretty embarrassed by how long this took. I’ve never had trouble setting up access like this before! But I was on a new machine and forgot I had to chmod 400 my key, and the error messages I got when trying to use it were… unclear.
- Actually logging in. I’ve always used the default ssh -i command without a username and logged in as root. This is apparently not permitted on AWS instances.
- Install Node: 15 minutes.
- I started just trying to grab Node directly. That wouldn’t work for me, so I needed to get Node Version Manager (NVM) and install from there. This didn’t seem to be a common problem, but I’d still recommend using NVM from the start!
- Install packages: 1 hour 45 minutes.
Why did this take so long? Swap space. It turns out that trying to run npm install to grab a few basic packages (including express) on a nano instance can actually fill the memory. To my surprise, you can’t just pull a slider to increase memory without wiping the instance’s storage. By only installing a few packages at a time I managed to get under the memory limit. I thought this strategy was faster than starting over on an instance with more memory, but I can’t say for sure.
Rather than fuss with a proper connection for my IDE I just copy/pasted into the SSH terminal. I didn’t use boilerplate for the short and simple server code. But I did realize later I needed to unblock port 3000 in the security group configuration.
- Connect my database: 30 minutes.
For this bit, I was able to use an AWS guide verbatim but gosh it still took time. I would have thought there was a simple UI for this since it’s such a common task, but sadly no!
This part didn’t take long! I had to add some new dependencies, configure the connection, and voila. I really wasn’t up for any special secrets config, but I don’t see how it’s likely anyone will ever get access to this instance. Notably, this is the single area where EC2 has a straightforward advantage: Lambda code isn’t as securely hidden so a naive implementation (keys/secrets right in the source code) is a bit safer on EC2!
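To make the security group step above a little more concrete, here is a minimal sketch of how a CIDR rule matches source addresses, and why 0.0.0.0/0 lets every IPv4 address through. The helper names `ipToInt` and `cidrContains` and the example addresses are mine, chosen for illustration; this is not how AWS implements security groups, only the matching idea behind the rule format.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split('.')
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Hypothetical helper: does the CIDR block (e.g. "203.0.113.0/24") contain ip?
function cidrContains(cidr, ip) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  // A /0 prefix has an empty mask, so every address matches.
  // (Handled separately because a 32-bit shift by 32 is a no-op in JavaScript.)
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipToInt(base) & mask) >>> 0) === ((ipToInt(ip) & mask) >>> 0);
}

// 0.0.0.0/0 matches any source address.
console.log(cidrContains('0.0.0.0/0', '8.8.8.8'));
// A /24 only matches addresses that share the first three octets.
console.log(cidrContains('203.0.113.0/24', '203.0.113.7'));
console.log(cidrContains('203.0.113.0/24', '203.0.114.7'));
```

A narrower rule, such as a single office address written as a /32, is the kind of thing I would reach for when the data is sensitive.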
Lambda Configured by AWS Console: 1 Hour, 16 Minutes
- Create the API endpoint and the lambda: 35 minutes.
- Create an API endpoint. Remember that a lambda doesn’t have any presence “online” — no IP address, no URL — without an API endpoint connected to it. The only bugaboo here was getting a domain issued.
- Create a lambda: 1 minute.
- Create a relational database service (RDS) instance. None of the options offered any real chance to do something unworkable, so I stuck with the defaults and got things set up quickly.
- Connect to and populate the DB: 40 minutes.
It took just a bit of doing to connect to my RDS instance, but after that, I wrote some SQL and I was ready to go. To go this route I did have to hard-code the RDS credentials into my lambda, which isn’t ideal, but I’m not aware of any significant security risks involved with having auth data in my Lambda code, provided that code isn’t in a public repo.
- Write the code: 10 minutes.
Even with this very simple example, there is a lot less code to write for a lambda than an express service on EC2.
Lambda Configured by Stackery: 14 Minutes
- Create all resources and connect them: 5 minutes.
- All the creation and configuration is handled from a single canvas view in the Stackery dashboard.
- Four of these five minutes were taken up waiting for the three-resource stack to deploy via CloudFormation.
- Write the code: 5 minutes.
Since I can work in my own IDE and push changes from GitHub to my stack, this part was a bit faster than the other paths.
- Configure/populate the DB: 4 minutes.
My big frustration here was that I wanted to develop locally with the lovely SAM CLI but RDS instances aren’t yet supported (and it seems like a big task to do so.) And I didn’t want to burn the time it would take to set up the local RDS tools and connect the SAM CLI local image.
Since Stackery Lambdas are automatically given all the parameters and config they need to access RDS, I could just do a one-time lambda to populate the DB.
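As a sketch of the general pattern, here is how a function might read its database settings from its environment instead of hard-coding them. The variable names (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`) and the `dbConfigFromEnv` helper are assumptions for illustration only; they are not the names Stackery actually injects.

```javascript
// Hypothetical helper: collect database settings from environment variables.
// The variable names below are illustrative, not Stackery's actual names.
function dbConfigFromEnv(env = process.env) {
  const required = ['DB_HOST', 'DB_USER', 'DB_PASSWORD', 'DB_NAME'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Fail fast with a clear message rather than a confusing connection error.
    throw new Error(`Missing database settings: ${missing.join(', ')}`);
  }
  return {
    host: env.DB_HOST,
    user: env.DB_USER,
    password: env.DB_PASSWORD,
    database: env.DB_NAME,
  };
}
```

The resulting object can be handed to whichever database client the function uses, and the credentials never appear in the source.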
This Wasn’t a Fair Fight
We’ve attempted to measure the difference between delivering the same service, but there’s one issue that we really haven’t addressed: Our EC2 service is currently something of a black box.
Our Lambda will produce highly detailed information about its execution time, how many errors it has encountered, and how long it’s been running. EC2, by contrast, will only report automatically from the OS layer: memory usage and uptime.
A more predictable consequence of the fundamental differences between the two is management and dynamic loading: we need to either shut down our EC2 instance manually or invest more time in hypervisor tools.
None of these is a dealbreaker: EC2 is a blank canvas, and we can absolutely add observation and monitoring tools. In my time at New Relic, as we delivered more and more advanced Application Performance Monitoring that instrumented individual lines of application code, our top-selling tool was always server monitoring, which didn’t tell you much more than what Amazon CloudWatch will tell you about EC2 instances.
The point is that all of this is possible on EC2, but this is a contest of velocity, so anything that would take extra dev time has been left out. Suffice it to say: These features are already built into our Lambda stack.
Lambdas really do offer a significant improvement in developer velocity, with Stackery making these tasks even faster. These effects would only be magnified by a more complex stack, and this comparison left out steps like implementing any security on our EC2 instance beyond closing ports.
Will you complete tasks on your team in 1/7th the time if you use Stackery? No. Most of the time spent on delivering a feature is spent coding, and the time it took to write the actual JavaScript didn’t show as much variation.
My next post will compare these three setups six months later in the life cycle: when our simple service has stopped working, and we need to explore what’s going wrong.