Serverless: IOpipe Launches a Monitoring Tool for AWS Lambda
At AWS Summit in New York this week, IOpipe launched version 1.0 of its Lambda monitoring tool into general availability and stepped up to become an advanced tier partner with AWS.
IOpipe builds on the serverless premise that, while there are still servers, they become less of a worry for application developers. But rather than bringing about a NoOps future, serverless shifts the need for monitoring and managing performance into a complex application architecture that may call on third-party services through APIs, make use of internal microservices, and invoke small functions that each carry out a specific task.
That creates a need for what IOpipe co-founder and Chief Technology Officer Erica Windisch calls “Application Ops.” Windisch — who will be talking on debugging and tracing Lambda applications at the upcoming Serverlessconf in New York on Oct. 8-11 — says one of the ideal ways to use IOpipe is to have a code editor and a window with IOpipe running side by side. “Dev Tools and application tools are converging now.
“With IOpipe, you can be working in a code editor and add the IOpipe Serverless Framework plugin or use our decorator in your application. Then you can have IOpipe open in another window and have information fed into that,” she said.
IOpipe has been built to provide a suite of monitoring and debugging tools that allow deeper insight into what is happening when each Lambda function of a serverless workflow is called.
Co-founder and CEO Adam Johnson outlined the product’s starting point: “We started a year and a half ago, basically by talking to as many Lambda users as we could find. The overwhelming response we got from them is a lack of visibility and instrumentation. For many serverless adopters, sure, there are challenges with development, but there are tools emerging for that. But when you go into production, there aren’t a great set of tools to monitor your serverless applications and figure out what is going on when something does go wrong.”
Windisch said IOpipe gives users deeper insight into the errors that occur and their frequency, memory leaks, durations, and the length of cold starts, a crucial indicator of the performance of a serverless application.
“One common thing that happens is there might be errors in your application when connecting to your storage. You can run alerts for those and with IOpipe, so you have easy access into your stack. You can dive in and see how many invocations are failing because of errors in connections with your database,” said Windisch.
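To make the idea of counting failed invocations by cause concrete, here is a minimal sketch of a decorator that wraps a Lambda-style handler and records the duration and error type of each call. It is not IOpipe's API: the names monitored, REPORTS, count_errors and save_record are hypothetical, and the in-memory list stands in for whatever collector a real agent would report to.

```python
import functools
import time

# Hypothetical in-memory sink; a real agent would ship reports to a collector.
REPORTS = []


def monitored(handler):
    """Wrap a Lambda-style handler and record one report per invocation."""

    @functools.wraps(handler)
    def wrapper(event, context):
        started = time.perf_counter()
        report = {"function": handler.__name__, "error": None}
        try:
            return handler(event, context)
        except Exception as exc:
            # Keep the exception type so failures can be grouped later.
            report["error"] = type(exc).__name__
            raise
        finally:
            # Runs on success and on failure, so every call is reported.
            report["duration_ms"] = (time.perf_counter() - started) * 1000.0
            REPORTS.append(report)

    return wrapper


def count_errors(reports, error_name):
    """Count invocations that failed with the given exception type."""
    return sum(1 for r in reports if r["error"] == error_name)


@monitored
def save_record(event, context):
    # Stand-in for a failed storage or database connection (assumption).
    if event.get("db_down"):
        raise ConnectionError("could not reach database")
    return {"saved": True}
```

Calling count_errors(REPORTS, "ConnectionError") after a batch of invocations would then show how many of them failed while connecting to the database.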
Time Is of the Essence
It is also possible to use custom metrics in your IOpipe pipeline. Windisch gives the example of a metric that tracks the number of records a particular Lambda processes from Kinesis. That way, it might become apparent, for example, that any Lambda processing under 500 records works fine, but that processing starts to fail when more than 500 records are regularly being processed. That gives developers the sort of information they need to debug problems faster as they arise.
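The 500-record example can be sketched roughly as follows. This is an illustration only, not IOpipe code: it assumes the standard Kinesis event shape, in which Lambda receives a "Records" list, and the field names records_processed and failed are hypothetical stand-ins for a custom metric and an error flag attached to each invocation.

```python
from collections import defaultdict

THRESHOLD = 500  # batch size taken from the example in the text


def records_in_batch(event):
    """Number of Kinesis records delivered to this invocation."""
    return len(event.get("Records", []))


def failure_rate_by_batch_size(invocations, threshold=THRESHOLD):
    """Group invocations by batch size and return the failure rate per group.

    Each invocation is a dict with a 'records_processed' custom metric and a
    boolean 'failed' flag (both hypothetical field names).
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [failures, invocations]
    for inv in invocations:
        group = "over" if inv["records_processed"] > threshold else "at_or_under"
        totals[group][0] += int(inv["failed"])
        totals[group][1] += 1
    return {group: failed / count for group, (failed, count) in totals.items()}
```

A clearly higher failure rate in the "over" group than in the "at_or_under" group would be the signal Windisch describes: the function copes with small batches and breaks on large ones.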
Johnson points to the common use case of image processing, where serverless may be used to take an image from object storage, run additional processing like resizing or ML analysis, and then save the resulting data and resized images to another object storage bucket. “Reaching out to S3, relying on so many third party calls, like Google or Amazon’s image processing algorithms… it does reach a threshold where it becomes a concern because there are so many moving parts. We have just released a tracing plugin so you can trace the performance of those calls, visualize those, track how long those calls are taking in the function, so you can see where the time is spent,” said Johnson.
Gaining insight into how long each step in a process takes is a key application ops monitoring task. Windisch described cold starts as being one of the key metrics to start with when monitoring application performance in a serverless environment:
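As a rough illustration of what tracing those calls involves, the sketch below times each step of an image pipeline with a small context manager. It is not the IOpipe tracing plugin: Tracer and process_image are hypothetical names, and the fetch, resize and store callables are placeholders for the real object storage and image processing calls.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Collect named timing spans for one invocation (hypothetical helper)."""

    def __init__(self):
        self.spans = []  # list of (name, duration in milliseconds)

    @contextmanager
    def span(self, name):
        started = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the wrapped call raises.
            elapsed_ms = (time.perf_counter() - started) * 1000.0
            self.spans.append((name, elapsed_ms))

    def slowest(self):
        """Return the name of the span that took the longest, or None."""
        if not self.spans:
            return None
        return max(self.spans, key=lambda s: s[1])[0]


def process_image(tracer, fetch, resize, store):
    """Run the three pipeline steps, timing each one separately."""
    with tracer.span("fetch"):
        image = fetch()
    with tracer.span("resize"):
        resized = resize(image)
    with tracer.span("store"):
        store(resized)
    return resized
```

After an invocation, tracer.spans holds one timing per step, which is the raw material for seeing where the time in a function is spent.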
“Serverless containers are recycled every 4.5 minutes to 4.5 hours, so that could be a difference between 5 milliseconds and 100 milliseconds for a base cold start. But the size of the Lambda function also determines how long the cold start takes, or if there are a lot of dependencies, and some languages are slower, so Java takes longer because it has to warm up the JVM container first. And if you are pulling in third party dependencies, they have to load before the cold start, so we try to give a lot of visibility into all of that. In IOpipe, you can filter by cold start and include or exclude them, so you can see if there is a performance issue only with cold starts, or if every invocation is a cold start. Depending on the use case, it may be something you need to address. Amazon has improved the cold starts time, and that’s continuing to get better, and affects less people, but for instance, if you are looking to an ElasticSearch service, during a cold start you will connect to that service and will have initialization penalties, so cold starts can significantly impact on performance which in the end impacts what you pay on Amazon.”
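One common way to tell cold starts apart from warm invocations, sketched here under the assumption that module-level code runs once per container, is a module-level flag that the handler flips on its first call. The names _COLD, handler and mean_duration, and the fields cold_start and duration_ms, are hypothetical and do not describe how IOpipe itself detects or filters cold starts.

```python
_COLD = True  # module scope runs once per container, so this starts as True


def handler(event, context):
    """Lambda-style handler that reports whether this call was a cold start."""
    global _COLD
    is_cold_start = _COLD
    _COLD = False  # every later call in the same container is warm
    return {"cold_start": is_cold_start}


def mean_duration(invocations, cold_start):
    """Average duration in ms for cold (True) or warm (False) invocations.

    Returns None when no invocation matches the filter.
    """
    selected = [
        inv["duration_ms"] for inv in invocations if inv["cold_start"] is cold_start
    ]
    if not selected:
        return None
    return sum(selected) / len(selected)
```

Comparing mean_duration(invocations, True) with mean_duration(invocations, False) shows whether a performance problem is confined to cold starts or affects every invocation.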
Johnson said that companies without a monitoring tool like IOpipe often diligently collect everything about their Lambda processes, log it to a central place, and then try to view log files by time frame whenever there is an error. In IOpipe, by contrast, users can group functions by project and view errors and logs for just those projects, or filter by time or by function name.
“When we ask devs what their workflow is for development and debugging and pushing to production, it is very much taking several steps backward when using serverless because you are relying on a lot of third party services. So local testing is not an option. A lot of people are writing code and if there is an error they need to write log codes, and they pull up five or six terminals, tracing logs on 5 or 6 functions and then seeing what is happening,” Windisch said.
“In IOpipe you can get all of that in one view, and how they are affecting each other. It lets you have all of this at your fingertips. You can then integrate with your CI/CD pipeline to see how code changes are being affected over time. It eliminates the need to iterate back and forth over a Lambda, you don’t need to go in and add print statements, you can see exactly what happened when changes occurred, rather than wolf fence your code and repeat cycles of edit-upload-test.”
The Maturing Customer Segments of Serverless
Johnson says the majority of IOpipe customers fall into three categories, reflecting the most mature segments of the serverless market.
First are the no-stack startups: “These are the 1-2 person start-ups who can build awesome tech using AWS technology and piece together something awesome pretty quickly,” Johnson said.
Second are the media companies who have an economic need to reduce costs of image and multimedia processing quickly. Johnson outlined: “There are a bunch of media companies, both old media companies that you wouldn’t expect and new style online magazines. The old media companies are making the leap from traditionally running on VMs in the data center and going to the public cloud and are consciously making a decision to skip containers and go straight to serverless. They are creating media processing pipelines where it is very bursty. When they bring in a new partner and ship content with SLAs, they need to process their entire media library in a short amount of time, so with serverless they were able to go into production over a year ago and reduce the cost of their bills significantly.”
The third group is larger companies. Johnson said these customers have already matured through one set of serverless functionality for business logic tasks and are now moving on to big data processing using serverless: “These enterprises started using Lambda for cron-style workloads, cleaning up VMs, clean up DNS records, the normal things DevOps teams would be doing. That was one of the early examples we were seeing but now they are moving on to doing big data transforms and ETLs. We are seeing a lot of enterprises make the shift to Aurora, for example, and using Lambda for that data processing. In any batch processing or messaging queue, you have a bunch of workers that are running in containers, processing the data and putting it somewhere. You need to be very good at knowing how many workers you need to have, or you pay more money to over-provision, or you take longer (which is what they did before). Since Lambda will spin up new containers, they only pay for exactly what they need and don’t have to worry about high availability. It is a no brainer, so we are seeing a lot of these use cases.”
Johnson says that after this, he expects to see enterprises move on to using serverless for mobile and web applications. As that space matures, IOpipe wants to mature alongside it. The team already has plans to become a full monitoring application stack for serverless. IOpipe started out as more of a forensic reporting service, able to show what went wrong after the event, but with version 1.0 users can already debug serverless applications alongside the coding environment. With new tools like the tracer plugin, the team wants to add the ability to set performance thresholds on third-party calls. Integrating with CI/CD pipelines is also on the roadmap.
And perhaps not far behind that is an autonomous environment in which threshold alerts can be set and Lambdas can be created to respond to those alerts automatically. An autonomous ops future in serverless is on its way.