
Tutorial: Host a PyTorch Model for Inference on an Amazon EC2 Instance

Nov 4th, 2020 10:45am by

In this tutorial, I will walk you through the steps involved in hosting a PyTorch model on Amazon Web Services’ EC2 backed by an EFS file system. This article is a part of the series on making AWS Lambda functions stateful with Amazon EFS (Part 1, Part 2).

Assuming you followed the steps mentioned in the previous tutorial, you should have an EC2 instance with a mounted EFS file system. We will now configure it to run a Flask server that exposes a PyTorch inference API.

Our goal is to install all the dependencies of PyTorch and test the inference on EC2 before running it in AWS Lambda.

Preparing the EC2 Instance

We will start by creating two directories that will hold the PyTorch modules and the pre-trained ResNet model respectively.
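Later steps in this tutorial reference /mnt/efs/fs1/ml/lib and /mnt/efs/fs1/ml/model, so a sketch of this step, assuming the EFS file system is mounted at /mnt/efs/fs1 as in the previous tutorial, might look like:

```shell
# Create directories on the EFS mount for the Python modules (lib)
# and the pre-trained ResNet model (model)
sudo mkdir -p /mnt/efs/fs1/ml/lib /mnt/efs/fs1/ml/model
```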

Next, let’s take ownership of the EFS file system. This will enable us to install everything that’s needed for the tutorial.
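A minimal way to do this, assuming the default ec2-user account on Amazon Linux 2 and the /mnt/efs/fs1 mount point:

```shell
# Make ec2-user (the default Amazon Linux 2 user) the owner of the EFS mount
sudo chown -R ec2-user:ec2-user /mnt/efs/fs1
```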

It’s time to install Python 3.8 and the Git client on the Amazon Linux 2 instance.
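On Amazon Linux 2, Python 3.8 is distributed through the amazon-linux-extras repository, so the installation looks like:

```shell
# Python 3.8 ships in the amazon-linux-extras repository on Amazon Linux 2
sudo amazon-linux-extras install -y python3.8
# Install the Git client
sudo yum install -y git
```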

Let’s add a symbolic link to Python 3.8. This will make it easy to deal with different versions of Python installed on the machine.
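A sketch of the symlink, assuming Python 3.8 was installed to /usr/bin/python3.8 (the default location on Amazon Linux 2):

```shell
# Make "python3" resolve to the newly installed 3.8 interpreter
sudo ln -sf /usr/bin/python3.8 /usr/local/bin/python3
```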

We will then install the pip utility.
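One common way to do this is with the official bootstrap script from the PyPA:

```shell
# Bootstrap pip for Python 3.8
curl -O https://bootstrap.pypa.io/get-pip.py
python3.8 get-pip.py --user
```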

Installing PyTorch and Hosting the Inference API

Start by cloning the GitHub repository that has the model and the inference code.
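The repository URL is not shown here; a sketch with a placeholder you would replace with the actual repository from this series:

```shell
# REPO_URL is a placeholder -- substitute the repository used in this series
REPO_URL="https://github.com/<your-account>/<repo-name>.git"
git clone "$REPO_URL"
```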

Navigate to the ec2 directory and run the following command to install the Python modules including PyTorch and Flask.
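Assuming the repository ships a requirements.txt in its ec2 directory (an assumption; the file name may differ), the installation targets the EFS lib directory:

```shell
cd ec2
# Install PyTorch, Flask, and the other dependencies into the
# EFS-backed lib directory rather than the default site-packages
python3.8 -m pip install -t /mnt/efs/fs1/ml/lib -r requirements.txt
```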

The above command installs the Python modules in the lib directory of the EFS file system. This is the most important step where we populate the directory with all the dependencies that AWS Lambda will need.

According to the official documentation of pip, the -t or --target switch installs packages into a specific directory. We will leverage this to ensure that the modules are installed in one of the EFS directories instead of the default location. You can see the installed modules in the /mnt/efs/fs1/ml/lib directory.

Running the Inference API

Now that we have the basic environment configured, we are almost ready to host the PyTorch inference API.

First, let’s tell the Python runtime where to find the PyTorch and Flask modules. This can be achieved by setting the PYTHONPATH environment variable.
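Pointing PYTHONPATH at the EFS lib directory populated in the previous step:

```shell
# Add the EFS lib directory to Python's module search path
export PYTHONPATH=/mnt/efs/fs1/ml/lib
```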

The trained model and label file are also passed through an environment variable. They are currently available in the ./model directory.

Let’s copy them to the /mnt/efs/fs1/ml/model/ directory created in the EFS file system.
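Copying both files from the repository into the EFS model directory:

```shell
# Copy the trained model and the label file to the EFS model directory
cp ./model/* /mnt/efs/fs1/ml/model/
```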

Before running the inference service, we need to set the MODEL_DIR environment variable with the location of the model and label file.
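Setting the variable to the EFS model directory created earlier:

```shell
# Point the inference service at the model and label file on EFS
export MODEL_DIR=/mnt/efs/fs1/ml/model
```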

We are now ready to launch the inference service.
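The launch command might look like the following; the entry-point file name app.py is an assumption and may differ in the cloned repository:

```shell
# Start the Flask inference server ("app.py" is an assumed entry-point name)
python3.8 app.py
```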

The above command launches the Flask server listening on the default port, 5000.

Since the security group associated with the EC2 instance has port 5000 open, we should be able to hit the endpoint.

Classifying an Image with the Inference API

Since the inference API expects the URL of an image, upload an image to a hosting service and send its URL as a parameter to the service. You can find a sample dog image that I uploaded at

Let’s send the same URL to the inference API.
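A sketch of the request; the endpoint path and the query parameter name are assumptions, so adjust them to match the Flask routes in the cloned repository:

```shell
# EC2_PUBLIC_IP and IMAGE_URL are placeholders for your instance and image
curl "http://$EC2_PUBLIC_IP:5000/?url=$IMAGE_URL"
```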

The response from the API confirms that the inference service is working.

Giving Back the Ownership of EFS to POSIX User ID

Now that we are done with the configuration and testing, it’s time to relinquish the ownership of the EFS root. We do that by making the POSIX user the owner of the file system.

Make sure you run the command below before terminating the EC2 instance.
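The ownership change mirrors the chown performed earlier, but targets the POSIX user ID configured on the EFS access point; UID/GID 1001 is an assumption, so substitute the IDs from your own access point:

```shell
# Hand the EFS root back to the POSIX user ID of the access point
# (1001:1001 is an assumed UID/GID -- use your access point's values)
sudo chown -R 1001:1001 /mnt/efs/fs1
```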

This tutorial demonstrated how to use an EFS file system to host Python modules and a trained model to run the inference API. With the file system fully populated with everything that’s needed for the inference service, we are ready to run it in AWS Lambda.

In the next part of the tutorial, we will port the inference API to Lambda to turn that into a serverless API. Check back tomorrow.

Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at
