The Distance from Data to You in Edge Computing
In recent months I’ve been taking a closer look at how data travels between the browser, the server, and the database. To help visualize and calculate the distances involved, I built a prototype application using Next.js and deployed it to Vercel.
I’ve been asking folks on Twitter to “submit their edge” so I can build up a better picture of where in the world requests to and from my databases are coming from. If you like, please go ahead and anonymously submit your edge.
You can preview the app and see the code on GitHub using the links below.
- 🚀 Live Preview: https://cockroachdb-edge-locations.vercel.app/
- ⚙️ GitHub Repository: https://github.com/PaulieScanlon/cockroachdb-edge-locations
Here’s a short explanation of what the dots on the globe represent.
- green dots: The approximate locations of folks who have submitted their edge.
- red dot: The location of a single region Vercel Serverless Function.
- orange dots: The locations of three multiregion AWS Lambda Functions.
- blue dots: The locations of three CockroachDB multiregion Serverless Databases.
The green dots you’ll see around the globe are the approximate locations of users who have clicked the “submit” button. In the app I’m using request-ip and fast-geoip to translate the IP address on the req object of the function into an approximate geographical location. Then, using node-postgres, I create a new row in the nearest CockroachDB multiregion database. CockroachDB handles replication across the other databases so that all three regions stay in sync. (Full disclosure: I work for Cockroach Labs.)
You can see the source code for the “create” functions on the links below:
- ⚙️ Vercel Serverless Function: cockroachdb-edge-locations/pages/api/create.js
- ⚙️ AWS Lambda Function: edge-locations-serverless-api/create.js
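The linked functions are the real thing. As a rough illustration of the flow described above (IP address, then location, then a new row), here is a minimal sketch rather than the code from the repo. It assumes the geolocation package is the npm module fast-geoip, whose lookup resolves to an object with a city and an ll pair of latitude and longitude. The table name, column names, and DATABASE_URL variable are hypothetical, and the dependencies are passed in so the sketch runs without installing anything.

```javascript
// Sketch of a "create" API route. Dependencies are injected to keep the flow
// easy to follow; in a real Next.js function they would likely be:
//   getClientIp -> require('request-ip').getClientIp
//   lookup      -> require('fast-geoip').lookup  (resolves to { city, ll: [lat, lng], ... } or null)
//   pool        -> new (require('pg').Pool)({ connectionString: process.env.DATABASE_URL })
// The "locations" table, its columns, and DATABASE_URL are hypothetical.
const createHandler = ({ getClientIp, lookup, pool }) => async (req, res) => {
  if (req.method !== 'POST') {
    return res.status(405).json({ message: 'Method not allowed' });
  }

  // 1. Read the caller's IP address from the request object.
  const ip = getClientIp(req);

  // 2. Translate the IP address into an approximate location.
  const geo = ip ? await lookup(ip) : null;
  if (!geo) {
    return res.status(400).json({ message: 'Could not resolve a location' });
  }
  const [lat, lng] = geo.ll;

  // 3. Insert a row. The parameterized query keeps request data out of the SQL text.
  const result = await pool.query(
    'INSERT INTO locations (city, lat, lng) VALUES ($1, $2, $3) RETURNING id',
    [geo.city, lat, lng]
  );

  return res.status(200).json({ id: result.rows[0].id, city: geo.city, lat, lng });
};
```

In pages/api/create.js you would export the result of calling createHandler with the three real dependencies as the default export.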
Using the Vercel docs alone, I wasn’t able to determine exactly what these locations (green dots) were. For instance, when I submit, the location displayed (Horsham, West Sussex) is ~28 miles from where I actually am.
To get a clearer understanding, I asked Vercel directly. They were kind enough to reply to my email and here’s what they said:
“My understanding of this is that we use MaxMind as a database to cross-reference the IP information and get a latitude/longitude for the request and then we output that. Maybe the best way to describe this is just an estimate of a user’s location based off of IP information using geolocation services accessed via Vercel.” (Vercel Sales Engineer)
Now that I have a better understanding of the starting point, I can begin to plot the data journey.
The Anatomy of a Request
As you may already know, you (your browser) can’t (in most cases) directly communicate with a database.
There are a number of reasons, many related to security (browsers aren’t secure). Instead, any requests you make from your browser have to go via a server; or, in this case, a Serverless Function or Lambda Function. The server can then perform the necessary security checks before requesting access to the database.
If all is “A-Okay!”, the database can respond in accordance with the request, e.g. with some data.
This leads me to the next part of the data journey. Where are the servers?
Where Are the Servers?
When deploying an app to Vercel, the static parts — HTML, CSS, Images — are deployed and globally distributed around a Content Delivery Network (CDN).
The idea being, the closer these assets are to the user, the shorter the distance traveled and the faster the website will “load”.
A Serverless Function or Lambda Function can’t be distributed around a CDN in the same way; instead, it is deployed to a specific region. Vercel is built on AWS, and Vercel Serverless Functions can only be deployed to a subset of the available AWS regions.
On the Hobby plan (which I’m using) and the Pro plan, you can only deploy a Serverless Function to a single region. It is possible to deploy to multiple regions, but you’d need to upgrade to a Vercel Enterprise plan. Yikes!
For my purposes, I didn’t want to upgrade to a Vercel Enterprise plan. Instead, I set up an API Gateway on AWS, with Route 53 handling the routing, and deployed multiple Lambda Functions to match the regions of the CockroachDB multiregion serverless databases, which are free and come with a 5GB storage limit. CockroachDB is cloud native and can be deployed to AWS or GCP; for this application I’ve deployed to AWS. This setup is currently costing me $2.30 a month.
Because I work at Cockroach Labs, I’ve been granted access to a private beta version of multiregion serverless (it will be available publicly later this year), which allows me to deploy my database to multiple regions. Once again the idea being, the shorter the distance traveled, the faster the website will “load”.
Vercel Single Region Serverless Function
With the limitations of the single region Serverless Function on Vercel (Hobby and Pro plans), no matter where in the world you are (including being geographically close to a database region), your request would have to go via the region of the Serverless Function.
From my location in the UK, the request has to travel across the Atlantic Ocean to the Vercel Serverless Function located somewhere in us-east-1. From there, my specific CockroachDB configuration determines that the closest database is also located in us-east-1. The database accepts the request and sends the data back to the Serverless Function, which sends the response back across the Atlantic Ocean to my location in the UK.
The approximate distance traveled for this journey is as follows:
- One Way:
- ~3,683 miles
- ~5,928 kilometers
- Round Trip:
- ~7,367 miles
- ~11,857 kilometers
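For anyone curious how a figure like this can be worked out, one common approach is the haversine (great-circle) formula. The sketch below is my illustration of that technique, not necessarily the calculation the app performs. The coordinates are rough, hypothetical stand-ins for my location and for us-east-1, so the numbers it prints won’t exactly match the ones above.

```javascript
// Great-circle (haversine) distance between two latitude/longitude points.
const EARTH_RADIUS_KM = 6371; // mean Earth radius
const KM_PER_MILE = 1.609344;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

const haversineKm = (from, to) => {
  const dLat = toRadians(to.lat - from.lat);
  const dLng = toRadians(to.lng - from.lng);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(from.lat)) * Math.cos(toRadians(to.lat)) * Math.sin(dLng / 2) ** 2;
  // Clamp to 1 so floating-point error can't push asin out of range.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(a)));
};

const kmToMiles = (km) => km / KM_PER_MILE;

// Hypothetical coordinates: the exact points used by the app aren't given here.
const user = { lat: 51.06, lng: -0.33 }; // roughly Horsham, West Sussex
const serverlessFunction = { lat: 38.95, lng: -77.45 }; // roughly Northern Virginia (us-east-1)

const oneWayKm = haversineKm(user, serverlessFunction);
const roundTripKm = oneWayKm * 2;

console.log(`One way: ~${Math.round(kmToMiles(oneWayKm))} miles / ~${Math.round(oneWayKm)} km`);
console.log(`Round trip: ~${Math.round(kmToMiles(roundTripKm))} miles / ~${Math.round(roundTripKm)} km`);
```

The round trip is simply double the one-way distance, because the function and the database sit in the same region in this setup.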
AWS Multiregion Lambda Functions
With AWS multiregion Lambda Functions and the geographically aware API Gateway, requests are routed via the nearest Lambda Function.
This time, from my location in the UK, the request only has to travel to eu-central-1 (Frankfurt). Once again, my specific CockroachDB configuration determines that the nearest database is also in eu-central-1. The database accepts the request and sends the data back to the Lambda Function, which sends the response back to my location in the UK.
The approximate distance traveled for this journey is as follows:
- One Way:
- ~452 miles
- ~728 kilometers
- Round Trip:
- ~905 miles
- ~1,456 kilometers
That makes the AWS multiregion route roughly 88% shorter than the Vercel single region route.
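The comparison can be reproduced from the two round-trip figures. This is a small sketch using the rounded mile values quoted in this article; the function name is just for illustration.

```javascript
// Relative saving of one route over another, as a percentage.
const percentShorter = (longer, shorter) => (1 - shorter / longer) * 100;

// Round-trip figures from the two journeys above, in miles.
const vercelRoundTripMiles = 7367;
const awsRoundTripMiles = 905;

console.log(percentShorter(vercelRoundTripMiles, awsRoundTripMiles).toFixed(1)); // prints 87.7
```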
If you’d like to try this yourself, hit “submit” and then use the toggle switch to submit via either the Vercel Serverless Function, or the AWS Lambda Function, and have a look at the distance of your data journey.
I’ll be honest, setting up the API Gateway and Lambda Functions on AWS was no walk in the park. It was my first time using AWS and it did take a fair amount of reading to determine which of the many AWS services I needed to use.
Below are some details that might help steer you in the right direction if you’re thinking of doing something similar.
I used the Serverless Framework to create the Lambda Functions and deployed them to multiple regions using a GitHub Action. You can see the repo for my API on the link below:
- ⚙️ GitHub Repository: https://github.com/PaulieScanlon/edge-locations-serverless-api
The Lambda Functions are deployed behind an API Gateway with a custom domain. You can see the default route on the link below; depending on where you are in the world, you’ll see an appropriate region. For me, in the UK the region displayed is eu-central-1, but yours might be different.
- 🔗 GET /: https://api.crl-devrel.net/
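A default route that reports its own region could look something like the sketch below. It relies on AWS_REGION, an environment variable the Lambda runtime sets for you. The response shape is my own guess for illustration, not necessarily what the live endpoint returns.

```javascript
// Sketch of a default-route Lambda handler behind API Gateway (proxy integration).
// AWS_REGION is set by the Lambda runtime; the JSON body shape is hypothetical.
const handler = async () => {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      message: 'A-Okay!',
      region: process.env.AWS_REGION || 'unknown',
    }),
  };
};

module.exports = { handler };
```

Because the same code is deployed to every region, the value in the response tells you which copy of the function handled your request.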
I’m using a Route 53 hosted zone. I’ve added three A Records with Geolocation routing and defined a Differentiator to route traffic from different regions to the three Lambda functions.
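I set my records up in the console, but the same idea can be expressed as the ChangeBatch payload accepted by Route 53’s ChangeResourceRecordSets API. The sketch below only builds that payload, so it runs without the AWS SDK. It assumes alias A records pointing at regional API Gateway domain names, and it assumes the API’s SetIdentifier field is what plays the role of the Differentiator. Every domain name, hosted zone ID, and location mapping is a placeholder, and only the two regions named in this article are shown.

```javascript
// Builds a ChangeBatch for Route 53's ChangeResourceRecordSets API:
// one geolocation alias A record per regional API endpoint.
// All names, zone IDs, and location mappings passed in below are placeholders.
const buildGeoRecords = (recordName, targets) => ({
  Comment: 'Geolocation routing to regional API endpoints',
  Changes: targets.map(({ region, geoLocation, dnsName, hostedZoneId }) => ({
    Action: 'UPSERT',
    ResourceRecordSet: {
      Name: recordName,
      Type: 'A',
      SetIdentifier: region, // must be unique among records sharing this name and type
      GeoLocation: geoLocation,
      AliasTarget: {
        DNSName: dnsName,
        HostedZoneId: hostedZoneId, // the alias target's zone, not your own hosted zone
        EvaluateTargetHealth: false,
      },
    },
  })),
});

const changeBatch = buildGeoRecords('api.example.com', [
  {
    region: 'eu-central-1',
    geoLocation: { ContinentCode: 'EU' }, // traffic originating in Europe
    dnsName: 'd-eu.example.com',
    hostedZoneId: 'ZEXAMPLEEU',
  },
  {
    region: 'us-east-1',
    geoLocation: { CountryCode: '*' }, // '*' marks the default location record
    dnsName: 'd-us.example.com',
    hostedZoneId: 'ZEXAMPLEUS',
  },
]);

console.log(JSON.stringify(changeBatch, null, 2));
```

Including a default record matters with geolocation routing: without one, requests from locations that don’t match any record get no answer.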
The advantage of Vercel is there’s little to no infrastructure setup required. However, in my case it was more of a hindrance than a help. The application I’ve developed and the requirements I had didn’t quite fit the pricing model — curse my luck!
Setting up your own infrastructure isn’t the easiest thing to do, but it’s not the hardest either. I’d also like to add: don’t let the gatekeepers deter you from trying this yourself. Managing infrastructure can provide a level of flexibility you might not achieve using a managed service like Vercel. If you’ve got the time and curiosity, it’s worth a look. It’s also worth noting that AWS knowledge is a valuable skill to have.
I’ve been publicly discussing this project on Twitter and I’ve had a number of folks mention I should use Edge Functions. I do need to dig a little deeper into this but for now I’ll just quote the Vercel docs verbatim: “Most Node.js APIs are not available.”
Edge Functions also have additional limitations relating to the code size limit. These are:
- Hobby – 1MB
- Pro – 2MB
- Enterprise – 4MB
My requirement to use node-postgres would exceed the 4MB limit on a Vercel Enterprise plan, so for now at least, Edge Functions aren’t quite the right fit for this project.
If you find yourself in a similar situation I think it’s worth considering alternatives. AWS has a fantastic array of services and whilst it can be a bit tricky to get started, I personally feel (with my limited knowledge) there are fewer restrictions.
I plan to keep learning about AWS and building in public and if you’re doing the same, let’s talk! You can find me on Twitter here: @PaulieScanlon.
In the meantime, go ahead and submit and lemme see your Edge.