Tutorial: Host a Serverless ML Inference API with AWS Lambda and Amazon EFS

In this tutorial, I will walk you through the steps involved in hosting a PyTorch model on AWS Lambda backed by an Amazon EFS file system. The function is exposed through an API Gateway. Assuming you followed the steps mentioned in the previous tutorial, you should have the EFS file system ready with PyTorch modules and the ResNet model. We will attach that to a Lambda function to host the inference API.
This article is a part of the series on making AWS Lambda functions stateful with Amazon EFS (Part 1, Part 2, Part 3).
Prerequisite: IAM Role for AWS Lambda Function
Before we go ahead with the Lambda function, we need to have an IAM role in place. This role gives the Lambda function the permissions it needs to access the EFS file system and to create Elastic Network Interfaces within the VPC.
Choose the Lambda use case to create a role.
Add AWSLambdaVPCAccessExecutionRole and AmazonElasticFileSystemClientFullAccess policies to the role and save it.
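For readers who prefer scripting over the console, the two steps above can be sketched roughly as follows. This is a minimal sketch, not the method used in this tutorial: the helper name create_lambda_role and the role name in the usage comment are hypothetical, and the function expects an IAM client such as the one returned by boto3.client("iam").

```python
import json

# Managed policy ARNs for the two policies named above.
LAMBDA_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole",
    "arn:aws:iam::aws:policy/AmazonElasticFileSystemClientFullAccess",
]

# Trust policy that lets the Lambda service assume the role.
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_lambda_role(iam_client, role_name):
    """Create the role, attach both managed policies, and return the role ARN."""
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY),
    )
    for policy_arn in LAMBDA_POLICY_ARNS:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return response["Role"]["Arn"]


# Usage (needs boto3 and AWS credentials; the role name is hypothetical):
#   import boto3
#   create_lambda_role(boto3.client("iam"), "lambda-efs-inference-role")
```

Passing the client in as an argument keeps the sketch free of credentials and makes it easy to try against a stub before touching a real account.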
Because the Lambda function runs inside a VPC, it has no internet access by default. Configure a NAT Gateway to provide outbound connectivity, which the function needs in order to download the image passed to it. Refer to the documentation for the details.
Create the Lambda Function
Create a Lambda function with Python 3.8 runtime and the execution role set to the IAM role created in the previous step.
From the VPC section of the Lambda function, select the same VPC that was used during the creation of the EFS file system. Add the default security group which allows inbound and outbound traffic within the VPC.
Under the file system section, choose the same file system that was used in the previous part of the tutorial. Type /mnt/ml for the local mount path.
Edit the basic settings section to increase the RAM and timeout settings. Increase the memory to 1024 MB and timeout to 10 minutes.
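The console settings described so far (VPC, file system mount, memory, and timeout) can also be written down as a single configuration payload. The sketch below is an illustration under assumptions: build_function_config is a hypothetical helper, the subnet, security group, and access point values are placeholders you would supply, and the resulting dictionary is meant to be passed to the Lambda client's update_function_configuration call.

```python
def build_function_config(function_name, subnet_ids, security_group_ids,
                          access_point_arn):
    """Return keyword arguments for Lambda's update_function_configuration."""
    return {
        "FunctionName": function_name,
        "MemorySize": 1024,  # MB, as set in the basic settings above
        "Timeout": 600,      # seconds, i.e. 10 minutes
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        "FileSystemConfigs": [
            # Lambda attaches EFS through an access point ARN.
            {"Arn": access_point_arn, "LocalMountPath": "/mnt/ml"}
        ],
    }


# Usage (needs boto3 and AWS credentials; every identifier is a placeholder):
#   import boto3
#   config = build_function_config("my-function", ["subnet-..."], ["sg-..."],
#                                  "arn:aws:elasticfilesystem:...")
#   boto3.client("lambda").update_function_configuration(**config)
```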
Add the PYTHONPATH and MODEL_DIR environment variables to point the function to the EFS location. This will ensure that the Lambda function can access the PyTorch libraries, trained model, and the label file. Don’t miss the trailing slash in MODEL_DIR, as the code builds file paths by appending file names directly to it.
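The trailing separator matters because the function code builds paths by concatenating MODEL_DIR with a file name. If you would rather not depend on it, one possible alternative is to join the parts with os.path.join, which gives the same result with or without the separator. This is a sketch only: model_path is a hypothetical helper, and the directory /mnt/ml/model used in the comments is an assumed example value, not one taken from this tutorial.

```python
import os


def model_path(file_name, model_dir=None):
    """Build the full path to a file stored in the model directory."""
    if model_dir is None:
        # Falls back to the environment variable configured on the function.
        model_dir = os.environ["MODEL_DIR"]
    return os.path.join(model_dir, file_name)


# Both calls return "/mnt/ml/model/resnet18-5c106cde.pth" on Linux
# (the directory is an assumed example value):
#   model_path("resnet18-5c106cde.pth", "/mnt/ml/model/")
#   model_path("resnet18-5c106cde.pth", "/mnt/ml/model")
```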
Paste the below code snippet into the function code section and hit the deploy button. The same is available on GitHub as well.
```python
import io
import json
import os
import urllib.request

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

# Directory on the EFS mount that holds the weights and the label file.
# The value must end with "/" because paths are built by concatenation.
MODEL_DIR = os.getenv("MODEL_DIR")

# Load the model at module level so that warm invocations reuse it.
model = models.resnet18()
model.load_state_dict(torch.load(MODEL_DIR + "resnet18-5c106cde.pth"))
model.eval()

# Standard ImageNet preprocessing.
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
resnet_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

# Maps a class index (as a string) to [WordNet ID, human-readable label].
with open(MODEL_DIR + "imagenet_class_index.json") as json_file:
    labels = json.load(json_file)


def transform_image(image):
    # The normalization above expects exactly three channels.
    if image.mode != "RGB":
        image = image.convert("RGB")
    return image


def lambda_handler(event, context):
    data = {}
    url = event['queryStringParameters']['url']

    # Download the image and preprocess it into a 1 x 3 x 224 x 224 batch.
    with urllib.request.urlopen(url) as response:
        image = Image.open(io.BytesIO(response.read()))
    image = transform_image(image)
    image = resnet_transform(image)
    image = image.view(-1, 3, 224, 224)

    # Run inference without tracking gradients and keep the top three classes.
    with torch.no_grad():
        prediction = F.softmax(model(image)[0], dim=0)
    topk_vals, topk_idxs = torch.topk(prediction, 3)

    data["predictions"] = []
    for i in range(len(topk_idxs)):
        r = {"label": labels[str(topk_idxs[i].item())][1],
             "probability": topk_vals[i].item()}
        data["predictions"].append(r)
    return json.dumps(data)
```
Create a test event with the below configuration and trigger the function.
```json
{
  "queryStringParameters": {
    "url": "https://i.postimg.cc/v8pmjrwf/dog.jpg"
  }
}
```
Testing the function should result in the below output.
As you can see, the model is able to correctly classify the image of the dog.
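The handler reads event['queryStringParameters']['url'] directly, so an event without that parameter fails with a KeyError or TypeError. A more defensive lookup could look like the sketch below. The helper extract_image_url is hypothetical and not part of the tutorial's code; it assumes the query string parameters may be missing or null when the caller sends no query string.

```python
def extract_image_url(event):
    """Return the 'url' query parameter or raise ValueError if it is unusable."""
    # Treat a missing key and an explicit null the same way.
    params = event.get("queryStringParameters") or {}
    url = params.get("url")
    if not url:
        raise ValueError("missing required query string parameter: url")
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    return url
```

Inside lambda_handler, the first lookup could then be replaced with url = extract_image_url(event), and the ValueError turned into whatever error response suits your API.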
Attach an API Gateway
It’s time to expose the function through an API Gateway. Add a trigger to the function with the following settings:
Send a cURL request to the inference API, passing the URL of the image as a query string parameter.
The first invocation takes longer because of the cold start. Subsequent calls are faster.
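Part of that difference comes from where the model is loaded: code at module level runs once per execution environment, and later invocations in the same environment reuse the result. The sketch below illustrates the pattern with a stand-in loader instead of torch.load, so it runs anywhere. The names load_model, get_model, and STATS are hypothetical and exist only for this illustration.

```python
# Counts how many times the expensive load actually runs.
STATS = {"loads": 0}

# Module-level cache; it survives across warm invocations.
_MODEL = None


def load_model():
    """Stand-in for the expensive step of reading weights from EFS."""
    STATS["loads"] += 1
    return {"name": "stand-in model"}


def get_model():
    """Load the model on first use and reuse it afterwards."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()
    return _MODEL


def handler(event, context):
    model = get_model()
    return model["name"]
```

Calling handler twice in the same process triggers load_model only once, which mirrors what happens between a cold and a warm invocation.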
Try the service with the dog and flower images hosted at the below URLs:
https://i.postimg.cc/v8pmjrwf/dog.jpg
https://i.postimg.cc/1RN54Y1n/flower.jpg
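If you would rather call the API from Python than from cURL, a small client might look like the following sketch. The endpoint in API_ENDPOINT is a made-up placeholder, so replace it with the invoke URL of your own API Gateway trigger. The helper names are hypothetical, and the sketch assumes the API returns the JSON string produced by the handler unchanged.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Placeholder endpoint (an assumption); use your own API Gateway invoke URL.
API_ENDPOINT = "https://example.execute-api.us-east-1.amazonaws.com/default/inference"

IMAGE_URLS = [
    "https://i.postimg.cc/v8pmjrwf/dog.jpg",
    "https://i.postimg.cc/1RN54Y1n/flower.jpg",
]


def build_request_url(endpoint, image_url):
    """Append the image URL as a properly encoded 'url' query parameter."""
    return endpoint + "?" + urlencode({"url": image_url})


def classify(endpoint, image_url, timeout=60):
    """Call the inference API and return the decoded JSON response."""
    request_url = build_request_url(endpoint, image_url)
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs real HTTP requests against your endpoint):
#   for image_url in IMAGE_URLS:
#       print(classify(API_ENDPOINT, image_url))
```

Encoding the image URL with urlencode avoids problems when the URL itself contains characters such as & or ?.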
Congratulations! You have successfully hosted a PyTorch model in AWS Lambda to deliver a serverless machine learning API.
Amazon Web Services is a sponsor of The New Stack.
Feature by Roman Kraft on Unsplash.