Choreo: A Serverless Platform for Building Cloud Native Apps
The cloud has fundamentally changed the nature of programming at almost every level. We’ve moved from a situation where APIs were typically in language libraries to consuming APIs over the network via HTTP/JSON. In a precloud world, few enterprise developers needed to deal with concurrency, and now practically every programmer does. More and more concerns that were typically handled at the networking layer have shifted into software and become issues for programmers. As an industry, we’re still looking for our languages, tools and programming models to catch up.
We’re seeing languages such as Go and Rust, and more recently WSO2’s Ballerina, born into this new era, each with a specific sweet spot. In terms of programming models in the cloud native world, serverless is perhaps the only technology that has gotten anywhere near the level of hype and mindshare of Kubernetes. It is, however, rather a nebulous and ill-defined term, referring to a disparate group of technologies in which, from the developer’s point of view, the underlying computers don’t matter. The term seems to have been coined by Ken Fromm back in 2012, who wrote:
The phrase “serverless” doesn’t mean servers are no longer involved. It simply means that developers no longer have to think that much about them. Computing resources get used as services without having to manage around physical capacities or limits. Service providers increasingly take on the responsibility of managing servers, data stores and other infrastructure resources. Developers could set up their own open source solutions, but that means they have to manage the servers and the queues and the loads.
Function as a Service, or FaaS, has become so intertwined with the term serverless that the two are almost interchangeable, but it should be noted that there are other serverless products that cover specific domains such as queues, databases and some low-code systems such as Choreo or Zapier.
This excitement around FaaS started with Amazon Web Services’ Lambda in 2014. Lambda operates rather like a typical message-queuing system from the late 1990s or early 2000s. You deploy some code that is dormant until triggered by an event, such as a message landing on a queue, an HTTP call, or a file arriving in a particular location. When triggered, the code runs, and when finished, it shuts down. The underlying platform handles this, including concurrent executions of functions, so you can have multiple copies running when appropriate.
This has a number of advantages: It is relatively easy for developers to write and reason about; it gives you an implicit level of robustness and high availability and is a particularly good fit when dealing with unpredictable load; and code that isn’t running doesn’t cost you money — you only pay for what you use.
Of course, everything in software architecture is a trade-off and FaaS is no exception. There are some important limitations to be aware of: You have limited control over the resources given to each runtime invocation, such as CPU or I/O; functions are not ideal for long-running processes and indeed are typically capped in terms of how long they can run for; and the dynamic scaling nature of FaaS can cause problems if other parts of your infrastructure don’t scale as well.
In addition, most function invocations are considered stateless, meaning that a function cannot access state from a previous invocation unless that state has been persisted somewhere external, such as a database. If you need to do this, there are a couple of options: AWS offers Step Functions, which allow developers to tie together multiple functions using JSON, and Azure has Durable Functions, which support the ability to suspend the state of a function and allow it to restart where it left off. Neither option is ideal, however.
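To make the statelessness point concrete, here is a minimal Python sketch of a FaaS-style handler. It is an illustration only, not code for Lambda or any other real platform: the `handle_event` function, the event fields and the dict standing in for an external database are all assumptions.

```python
# A plain dict stands in for an external store such as a database (assumption).
EXTERNAL_STORE = {}

def handle_event(event, store):
    """Hypothetical FaaS-style handler that keeps nothing in memory between calls."""
    key = event["customer_id"]
    # Load whatever earlier invocations persisted for this key.
    total = store.get(key, 0)
    total += event["amount"]
    # Persist the new state so the next invocation can see it.
    store[key] = total
    return {"customer_id": key, "running_total": total}

first = handle_event({"customer_id": "c1", "amount": 10}, EXTERNAL_STORE)
second = handle_event({"customer_id": "c1", "amount": 5}, EXTERNAL_STORE)
print(second["running_total"])  # prints 15
```

The second invocation only sees the first one's result because it was written to the store; any local variable inside the handler is gone once the call returns.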
These challenges around state management make using FaaS for integration programming problematic, and this is the area that WSO2’s Choreo targets.
Build API-Based Applications
Choreo, currently in beta, is an iPaaS built on top of Ballerina, WSO2’s language designed for writing small to medium-sized network-distributed applications. It can be considered opinionated software in the sense that it provides all the core functionality you need to support the complete software development life cycle for building API-based applications — from coding through to observability — straight out of the box. Nuwan Dias, vice president and deputy chief technology officer for API management and integration at WSO2, told me that he regards Choreo as a serverless platform for APIs.
“Choreo gives you a platform where you can use serverless for building actual APIs. So you get a serverless experience, you don’t think about servers, you implement your API logic, and then you run it.”
Like FaaS, Choreo can scale the number of running instances up or down, and as part of the future GA release, users will be able to limit the total number of instances of a given Choreo program the platform allows to run concurrently.
Choreo has built-in support for a number of publicly accessible APIs, such as the World Bank Open API, Open Weather Map, COVID-19 stats and the like, as well as API access for tools such as GitHub, Slack, Salesforce, Google Workspace (formerly G Suite), Spotify, SendGrid and Medium, with more integrations being added all the time.
Where the platform really shines, however, is for users wishing to either work with third-party APIs or develop and publish their own. In addition to working with prebuilt connectors, developers can use Choreo’s HTTP connector for any suitable third-party API. Choreo then provides a comprehensive set of API management features for the HTTP services exposed through the system, including built-in circuit breakers. The underlying Ballerina language includes built-in support for JSON/XML as data types and, as James Clark recently discussed with me on the InfoQ podcast, makes handling plain data straightforward.
To get a sense of this, have a look at this quick example from the Choreo docs which demonstrates how you can create, test and publish a simple REST API from scratch.
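For readers unfamiliar with the circuit breakers mentioned above, the following Python sketch shows the general pattern: after a run of failures, calls are rejected immediately for a cool-down period rather than sent to a struggling upstream API. It is not Choreo's implementation; the class name, the thresholds and the `flaky_api` function are assumptions made for illustration.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying the remote API."""

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result

def flaky_api():
    # Hypothetical upstream call that always fails.
    raise ConnectionError("upstream unavailable")

breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_api)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")
print(outcomes)  # prints ['failed', 'failed', 'skipped']
```

The third call never reaches the upstream API, which is the point of the pattern: the caller fails fast and the remote service gets room to recover.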
API management in Choreo is facilitated by five key components:
- Choreo Console: Allows API creators to develop, document, secure, test and version APIs. It also provides mechanisms for API publishers to deploy and publish APIs and apply rate-limiting policies.
- API Developer Portal: Allows API publishers to host and advertise their APIs and API consumers to discover, evaluate, subscribe to and consume APIs securely.
- API Key Manager: The default Identity Provider (IdP) for Choreo which acts as the Secure Token Service (STS). Users with administrator privileges can also configure an external authorization server, such as Okta, as an IdP via the Choreo Console.
- Traffic Manager: Helps regulate API traffic, makes APIs and applications available to consumers at different service levels, secures APIs against malicious attacks, and applies rate-limiting policies.
- Choreo Connect: An API gateway designed specifically for microservices.
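The rate-limiting policies mentioned in the list above are often implemented with a token bucket. The source does not say which algorithm Choreo uses, so the Python sketch below is only an example of the general idea; the `TokenBucket` class and its parameters are assumptions.

```python
import time

class TokenBucket:
    """Illustrative token-bucket limiter; not Choreo's implementation."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so time can be simulated
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Simulated clock so the example does not need to sleep.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]  # third request exceeds the burst size
fake_now[0] = 1.0                           # one second later, one token is back
later = bucket.allow()
print(burst, later)  # prints [True, True, False] True
```

Different service levels can then be modeled as buckets with different `rate` and `capacity` values per consumer.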
Under the hood, the OpenAPI Specification (also known as Swagger) is used for the representation of an API. You can import an API using a Swagger definition in the Choreo Console, and a Choreo user can view, edit, import or download an API definition in Swagger the same way.
Once an API or other Choreo application is deployed, the platform has built-in support for observability, allowing teams to visualize and monitor running applications, including viewing status, latencies, throughput data, diagnostic data and logs.
A key aspect of Choreo is that applications are developed for it using Ballerina, which has some similarities to scripting languages and is highly pragmatic. Since Ballerina is intended to be used in industry rather than academia, it emphasizes reliability and maintainability. Its relatively small size and simplicity also give it a gentle learning curve, making it straightforward for individual programmers to pick up.
Several of the design choices in Ballerina make it particularly suited to API programming in this cloud native era. One example is that the type system is designed to be network friendly with a fundamental set of data types that map one-to-one to JSON. Writing for InfoQ, Dakshitha Ratnayake observed that, “This allows a JSON payload from the wire to come immediately into the language and be operated on without transformation or serialization.”
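As a rough analogy in Python rather than Ballerina, the sketch below shows what it looks like when JSON kinds map straight onto built-in data types, so that a payload can be worked on without declaring model classes first. The payload and its field names are made up for illustration, and Python still needs an explicit parse step that the quote says Ballerina avoids.

```python
import json

# Hypothetical payload as it might arrive over the wire.
payload = '{"city": "Colombo", "temps": [29.5, 31], "humid": true, "alerts": null}'
data = json.loads(payload)

# Each JSON kind arrives as a built-in Python type; no model classes are needed.
kinds = {key: type(value).__name__ for key, value in data.items()}
print(kinds)

# The parsed values can be operated on directly and written back out as JSON.
data["max_temp"] = max(data["temps"])
print(json.dumps(data))
```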
Ballerina also has strong support for concurrency, which works in a similar way to Go and includes support for lightweight threads, known as strands in Ballerina. These are analogous to goroutines in Go or virtual threads in the forthcoming Project Loom in Java. From a programming perspective, strands look like OS threads but are in fact a runtime construct that does not map directly to an OS thread. The runtime virtual machine can map potentially millions of these to a very small set of OS threads. Because they are lightweight, you can use many more of them, and because creating them is cheap, you no longer have to pool and share them, which greatly simplifies writing multithreaded code.
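The many-to-few mapping described above can be illustrated in Python with `asyncio` tasks. This is an analogy under stated assumptions, not a model of Ballerina's scheduler: `asyncio` tasks are cooperative coroutines that all run on a single OS thread, whereas runtimes with lightweight threads can spread them over several. The `worker` and `main` names are hypothetical.

```python
import asyncio
import threading

async def worker(i):
    # Awaiting yields to the scheduler instead of blocking an OS thread.
    await asyncio.sleep(0.01)
    return threading.get_ident()

async def main(n):
    # Start n tasks directly; no pool is needed because each task is cheap.
    return await asyncio.gather(*(worker(i) for i in range(n)))

thread_ids = asyncio.run(main(10_000))
print(len(thread_ids), "tasks ran on", len(set(thread_ids)), "OS thread(s)")
```

Starting ten thousand OS threads for the same job would be far more expensive, which is why code written against OS threads usually resorts to pooling.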
In integration coding, a visual representation that allows developers to observe flows and interactions between endpoints is extremely useful. Ballerina is able to provide this. To achieve it, the low-code editor is a direct representation of the Ballerina syntax tree, not an abstraction layer. What this means is that, unlike in previous generations of low-code tooling, developers can edit the Ballerina source code directly and have those changes reflected back in the low-code editor’s visual representation without the risk of one being unable to represent the other; the code generated in the low-code environment is just code — there is no other abstraction involved. As I noted in a previous article for The New Stack:
“The Choreo platform stores the generated source file in a private Git repo associated with the user account. Users can clone the repo, edit the Ballerina code using Microsoft VS Code, commit their changes, and merge to the same repo. Choreo will pick up the updated code, display it in the graphical and text editors, and use it with the build pipeline.”
In addition, the Ballerina VS Code plugin can generate a sequence diagram dynamically from the source code.
Choreo and Ballerina are ambitious projects. These four key themes — network interaction, data, concurrency and the visualization of the code — make Choreo a unique and interesting option for cloud native businesses. You can already start experimenting with Choreo for free, and a GA release is expected early next year.
The pricing model is based on the number of users and steps in a sequence and ranges from free for a single developer running a simple flow up to £995/month for a group running more complex workflows. Unlike many serverless environments, Choreo programs are not limited in terms of how long they can run for, although more complex workflows will be more expensive.