Serverless approaches are becoming increasingly popular with developers who want to run code without the complexity, cost or delay of setting up and maintaining the environment for that code to run in. As serverless platforms become powerful enough to be appealing for more complex development, they’re also gaining tools for conveniences like managing external state and long-running operations.
Because serverless platforms are stateless, developers have had to add queues and databases to store the external state that almost any non-trivial code requires, usually using key-value stores like AWS DynamoDB and Azure Cosmos DB. Frameworks like architect simplify that by building in DynamoDB support, and Microsoft’s Azure Functions now has its own built-in framework, Durable Functions, which lets you define a workflow in code and orchestrate calls to other functions while storing state durably between function executions using Azure Storage queues and table storage.
Orchestrator functions are stateful workflows, but they’re still written with code; you don’t have to create a JSON schema or use a graphical workflow designer. They can call other functions, either synchronously or asynchronously, and save the output from those functions into local variables, so you can use them for function chaining or more complex orchestrated patterns like fan-out/fan-in or MapReduce. Orchestrator functions are generator functions with an orchestrationTrigger binding; they call activity functions, which look like standard Azure Functions but have an activityTrigger binding.
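To make the chaining pattern concrete, here’s a minimal sketch. The activity names (GetBill, SendReminder, RecordResult) are hypothetical, and the tiny stub dispatcher only stands in for the real Durable Task runtime; with the actual durable-functions library, the generator would be wrapped in df.orchestrator() and each yield would checkpoint progress to Azure Storage rather than resolving immediately.

```javascript
// Function-chaining orchestrator sketch. Activity names are made up;
// the stub dispatcher below replaces the real Durable Task runtime so
// the control flow can run standalone.

function* billingOrchestrator(context) {
  // Each yield hands a task descriptor back to the dispatcher; in the
  // real runtime, progress is checkpointed before the activity runs.
  const bill = yield context.df.callActivity("GetBill", context.input);
  const sent = yield context.df.callActivity("SendReminder", bill);
  return yield context.df.callActivity("RecordResult", sent);
}

// Minimal stand-in for the dispatcher: it resolves each yielded
// activity call immediately instead of queuing it durably.
function runWithStub(orchestrator, input, activities) {
  const context = {
    input,
    df: { callActivity: (name, arg) => ({ name, arg }) },
  };
  const gen = orchestrator(context);
  let result = gen.next();
  while (!result.done) {
    const { name, arg } = result.value;
    result = gen.next(activities[name](arg)); // feed the activity's output back in
  }
  return result.value;
}

const output = runWithStub(billingOrchestrator, "customer-42", {
  GetBill: (id) => ({ id, amount: 120 }),
  SendReminder: (bill) => `reminded ${bill.id} about $${bill.amount}`,
  RecordResult: (msg) => `logged: ${msg}`,
});
console.log(output); // the three activities run in sequence
```

Because the orchestrator is an ordinary generator, its yield points are natural places for the runtime to pause, persist state and resume later, which is exactly what makes the chaining pattern durable.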
They’re also ideal for long-running interactions because the function doesn’t need to stay active — and be running up charges — while it’s waiting for a response from another function or an external API (or even a person).
“The real world doesn’t work as simply as Functions execute,” Jeff Hollan, senior program manager for Azure Functions, told The New Stack.
He uses the example of a website that wants to send out reminders for bills that are due, and more reminders when they’re overdue. “A lot of Node.js developers who are using Functions have websites and behind the scenes, they need to say ‘go and contact these customers in three days’. They need to know the task will execute and execute at least once, but you can’t get that unless the function hangs around.”
You can use Durable Functions for an orchestration, for instance, that sends a text message to the 500 people who have a bill due for payment, waits for three days to see if they pay it and then does something else if they haven’t. That’s the kind of long-running process that would be difficult to set up without Durable Functions. These are processes where you’re scheduling things with multiple phases like “wake up at 5 p.m. on Friday, do this, then wait for this, and then do this other thing,” Hollan noted.
Durable Functions include APIs for coordinating long-running operations with external clients, like a REST command to start new orchestrator function instances and exposing webhook HTTP APIs to query the status of those orchestrator functions. A long-running monitoring function can also poll external endpoints, running until either a condition is met or a timer runs out, and you can change the polling interval, for example by implementing a backoff algorithm. If what you’re waiting for is a person, the orchestrator function can set a durable timer when the request is sent to them, wait for notification through a webhook that the person has approved or denied the request, and escalate if no reply is received within a set time.
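The human-interaction pattern above can be sketched as a race between an external event and a durable timer. The real library exposes calls along the lines of context.df.waitForExternalEvent() and context.df.createTimer(); in this standalone illustration they are modeled as plain descriptor objects, and the activity names ("RequestApproval", "Escalate") are hypothetical.

```javascript
// Sketch of the approval-with-timeout pattern: send a request, then
// race the approval webhook against a durable timer, escalating if
// the timer wins. The stub dispatcher replaces the Azure runtime.

function* approvalOrchestrator(context) {
  yield context.df.callActivity("RequestApproval", context.input);
  // Race the external approval event against a durable timer.
  const winner = yield context.df.taskAny([
    context.df.waitForExternalEvent("ApprovalEvent"),
    context.df.createTimer(72 /* hours, illustrative */),
  ]);
  if (winner.type === "event") {
    return `approved: ${winner.payload}`;
  }
  // No reply before the timer fired: escalate.
  return yield context.df.callActivity("Escalate", context.input);
}

// Stub dispatcher: an eventPayload of null simulates a timeout.
function runApproval(input, eventPayload) {
  const context = {
    input,
    df: {
      callActivity: (name, arg) => ({ type: "activity", name, arg }),
      waitForExternalEvent: (name) => ({ type: "event", name }),
      createTimer: (hours) => ({ type: "timer", hours }),
      taskAny: (tasks) => ({ type: "race", tasks }),
    },
  };
  const gen = approvalOrchestrator(context);
  let step = gen.next();
  while (!step.done) {
    const task = step.value;
    if (task.type === "activity") {
      step = gen.next(`${task.name} done`);
    } else {
      // Resolve the race: the event wins if a payload arrived in time.
      step = gen.next(
        eventPayload !== null
          ? { type: "event", payload: eventPayload }
          : { type: "timer" }
      );
    }
  }
  return step.value;
}

console.log(runApproval("expense-7", "manager OK")); // approved path
console.log(runApproval("expense-7", null)); // timeout path escalates
```

The escalation branch only runs when the timer descriptor wins the race, mirroring the “escalate if no reply is received within a set time” behavior described above.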
For performance reasons, functions don’t store their full runtime state. Instead, whenever the code of an orchestrator function reaches a yield keyword, the Durable Task Framework dispatcher automatically checkpoints its progress into the append-only execution history (stored in an Azure Storage table), so the local state can be rebuilt if there’s a crash or a reboot (or while the function is suspended), then adds messages to the work-item queue to schedule the work. The orchestrator function is unloaded from memory at this point, so if you’re using a consumption plan, billing stops too.
When the orchestrator function calls an activity function, the activity function gets messages from the work-item queue via the activityTrigger and sends its response to the control queue; the orchestrator function receives that response via the orchestrationTrigger. These queues are how Durable Functions can offer “at-least-once” message delivery guarantees.
When a response is received or a timer expires (or a function has to be restarted after a crash or reboot), the orchestrator re-runs the function using the execution history to rebuild local state, so there’s no need to rerun any tasks that have already completed. That means orchestrator functions have to be deterministic so that running the same code multiple times creates the same result every time; put any non-deterministic code like IO or random data in activity functions.
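The replay mechanism can be illustrated with plain generators (this is a simplification of the real dispatcher, with invented step names): completed activity results live in an append-only history, and on each replay the orchestrator runs from the top, with yields answered from history until the first unfinished task is reached.

```javascript
// Illustration of checkpoint-and-replay: each "unload" discards the
// generator, and a fresh replay rebuilds local state from history
// without re-executing completed activities.

function* orchestrator(ctx) {
  const a = yield { activity: "StepA" };
  const b = yield { activity: "StepB", input: a };
  return `${a}+${b}`;
}

let executions = 0; // counts real activity executions, not replays

function runActivity(name) {
  executions++;
  return `${name}-result`;
}

// Replay the orchestrator against the recorded history; when it hits a
// task with no recorded result, run one activity, append the result
// (the checkpoint), and report that another replay is needed.
function replay(history) {
  const gen = orchestrator({});
  let i = 0;
  let step = gen.next();
  while (!step.done) {
    if (i < history.length) {
      step = gen.next(history[i++]); // answered from history: no re-execution
    } else {
      history.push(runActivity(step.value.activity)); // checkpoint
      return { done: false };
    }
  }
  return { done: true, output: step.value };
}

const history = [];
let state;
do {
  state = replay(history); // each loop iteration models an unload/reload cycle
} while (!state.done);

console.log(state.output); // StepA-result+StepB-result
console.log(executions); // 2: each activity ran exactly once despite replays
```

This is also why determinism matters: the generator body runs several times, and only the yields are shielded from re-execution, so anything non-deterministic inside the orchestrator would make each replay diverge from the recorded history.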
To write Durable Functions in Node.js, you also need the Durable Functions for Node.js library (for Node.js 8.4.0 and greater), which is again still in preview. When your orchestrator function reaches a yield keyword, where it’s expecting a Task or TaskSet object to resolve, the Node.js shim library accepts the function execution history as a state object. It then appends the actions of the Task or TaskSet object to a list that it returns to the Functions runtime to add to the execution history, along with any output from the function and whether the function has completed. If the function hasn’t completed and receives another response, the library returns that in the same way.
Durable beyond Azure
The Azure Functions v2 runtime runs on .NET Core, which allows cross-platform development and hosting, making it easier to write and test Durable workflows locally using the Azure Functions Core Tools, not just on Azure.
That includes Durable Functions. “Durable Functions can run anywhere, including on the edge,” says Hollan. In fact, state is even more important for development on the edge where your function might be running on resource-constrained devices that don’t have the battery power to wait for long-running processes to complete, and he expects that Durable Functions will be particularly useful in IoT management and device lifecycle orchestration.
Running Durable Functions on your own systems usually still means connecting to Azure for storage. “It’s more challenging if you want to run completely disconnected [from Azure] because you need Azure Storage,” Hollan points out. The Azure Storage Emulator from the Azure SDK only runs on Windows and is intended for development rather than production use; Azurite, which runs on Node.js, is an open source clone of Azure Storage. The Azure Storage Explorer for Windows, Mac and Linux works with the emulator as well as cloud Azure Storage, so you can see the contents of the task hubs that group the Azure Storage queue and table resources used by durable orchestrator functions when you’re debugging. That’s also a way to see the history of orchestration executions, although the tracking data that Durable Functions sends to Azure Application Insights is probably a better way to monitor orchestrations.
With Microsoft’s focus on the intelligent edge, especially now that Azure IoT Edge is generally available, we expect to see offline versions of more of these basic Azure services that Durable Functions will be able to take advantage of, making this a pattern for orchestrating long-running tasks that will be useful in many places.
Microsoft is a sponsor of The New Stack.
Feature image via Pixabay.