Stargate Data Gateway Aims to Ease ‘Read-the-Manual Fatigue’

26 Oct 2020 11:35am, by

DataStax’s Stargate open-framework data gateway, recently released as a technical preview, aims to let developers work easily with different data formats and query types, whether JSON, GraphQL, SQL or something else.

Stargate is, of course, built on Apache Cassandra, DataStax’s flagship NoSQL database. It was designed to alleviate “read the manual” fatigue for each new project, according to a blog post introducing its implementation of Cassandra Query Language (CQL) and a REST API for CRUD access to data in tables.
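To make “a REST API for CRUD access to data in tables” concrete, here is a minimal sketch of how the four CRUD operations are conventionally mapped onto HTTP methods against a table resource. The base URL, the path layout and the `crud_request` helper are hypothetical and chosen only for illustration; they are not taken from Stargate’s documented routes.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical base URL and path layout, for illustration only.
BASE_URL = "http://localhost:8082/keyspaces"

# Conventional REST mapping of CRUD operations to HTTP methods.
METHODS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


@dataclass
class HttpRequest:
    method: str
    url: str
    body: Optional[str] = None


def crud_request(op, keyspace, table, key=None, row=None):
    """Describe the HTTP request for one CRUD operation on a table row."""
    if op not in METHODS:
        raise ValueError(f"unknown operation: {op}")
    url = f"{BASE_URL}/{keyspace}/{table}"
    if op != "create":
        # Reads, updates and deletes address a single row by its key.
        if key is None:
            raise ValueError(f"{op} needs a primary key")
        url = f"{url}/{key}"
    # Only create and update carry a JSON payload.
    body = json.dumps(row) if op in ("create", "update") and row is not None else None
    return HttpRequest(METHODS[op], url, body)
```

Under these assumptions, `crud_request("read", "library", "books", key="42")` describes a `GET` on the row with key 42, and the client never writes a line of CQL.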

However, the team envisions a day soon when Stargate users don’t necessarily have to know CQL and is working toward a more agnostic framework that enables community members to create extension modules.

“The direction that we want to move in is abstracting [away] those Cassandra-specific concepts entirely, such that the application developer’s only thinking is in terms of their application and their application objects. So what that means is, they tell us what their books, their authors, their posts, whatever, are, [and] we take that and transparently map that into the database, specifically, in this case, it’s Cassandra,” said Chris Splinter, senior product officer, open source for DataStax.

The project has since released a GraphQL API and a Documents API, which lets most Cassandra distros work with JSON through a REST API.

Plays Nicely with Microservices

“Developers actually don’t directly use a database. They use a database API. Typically the way that they used to do that was through drivers and things like JDBC [Java Database Connectivity] and a bunch of things like that,” said Ed Anuff, chief product officer for DataStax. “But increasingly … that’s not how they think about data. They think about data in terms of, ‘Can I use an API? Can I use a REST API? Can I have a microservice that surfaces data for my applications?’”

A data gateway “takes your database and makes it play well within a microservice architecture,” he said.

Developers are “even looking at new APIs for doing that. They look at things like gRPC, if they want to do really high performance, high throughput, or they’re looking at GraphQL. If they want to go and build, for example, front-end app applications or they’re designing things like, perhaps building a Node.js application. And we said, ‘How can we bring that into the Cassandra world?’ That led to saying, ‘You know, really, what you need is this microservice gateway directly embedded within Cassandra that allows it to talk to these different types of access mechanisms to handle different types of data, and to handle different types of APIs.’ And so that’s where Stargate came from.”

It’s part of the company’s focus on making Cassandra more cloud native. Earlier this year, DataStax released a Kubernetes operator for Cassandra and launched the Cassandra-as-a-service Astra. Stargate has evolved from its focus on addressing the API and microservices problem for Cassandra.

“We said, ‘Let’s make sure that from day one, this is out there on GitHub, that we’re working with developers, that we’re engaging with the community,’” Anuff said.

The core tenet of it was to have innovation as a first-class citizen, said Splinter. “We purposely built this architecture to be pluggable from the ground up. And we wanted to make sure that anybody could come in and add their extensions or functionalities to Stargate as a framework.”

“Another big thing that we took inspiration from is this concept of a Cassandra coordinator. And the Cassandra coordinator is responsible for all of the request routing to the storage nodes in a Cassandra ring,” Splinter said, adding that it’s a tried-and-true approach that builds on the Dynamo routing concept that’s been in Cassandra since the beginning.

“We didn’t want to reinvent the wheel here. And effectively what we did was modularize that code base so that we could add new access patterns and data types. And what this gives you at the end product is actually an architecture where you’re separating compute and storage that allows you to scale both of those units independently, which is a very common model for efficient use of cloud infrastructure.”

Cloud Native Design

“Today developers have a smorgasbord of [database] options from which to choose, whether document or key-value or columnar or relational or multimodel,” Matt Asay recently wrote, noting that perhaps there are too many options. For instance, DB-Engines lists 359 different databases.

“The goal of Stargate is to make your data available for you through whatever API you can dream up regardless of the backing datastore,” that introductory blog post states.

The Stargate data gateway is deployed between the client applications and the database. When Stargate is deployed, it joins the Cassandra cluster as a coordinator node but does not store any data.

Stargate is broken up into modules that fit into three broad categories:

  • API extensions: Responsible for defining the API, handling requests and converting them to database queries, dispatching those queries to persistence services, and returning responses to clients. These extensions use both the authentication extensions and the persistence extensions.
  • Persistence extensions: Implement the coordination layer to execute requests passed by API services to underlying data storage instances.
  • Authentication extensions: Responsible for access control to Stargate’s APIs.
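The three categories above can be sketched as a small set of interfaces. This is an illustrative outline only: every class and method name here (`AuthExtension`, `PersistenceExtension`, `ApiExtension`, `handle`, `to_query` and the toy implementations) is hypothetical and does not come from the Stargate code base, which is written in Java.

```python
from abc import ABC, abstractmethod


class AuthExtension(ABC):
    """Access control for the APIs."""

    @abstractmethod
    def is_allowed(self, token: str) -> bool: ...


class PersistenceExtension(ABC):
    """Executes queries against the underlying data store."""

    @abstractmethod
    def execute(self, query: str) -> list: ...


class ApiExtension(ABC):
    """Defines an API and relies on the other two extension types."""

    def __init__(self, auth: AuthExtension, persistence: PersistenceExtension):
        self.auth = auth
        self.persistence = persistence

    @abstractmethod
    def to_query(self, request: dict) -> str: ...

    def handle(self, token: str, request: dict) -> dict:
        # Check access first, then convert and dispatch the request.
        if not self.auth.is_allowed(token):
            return {"status": 401, "rows": []}
        rows = self.persistence.execute(self.to_query(request))
        return {"status": 200, "rows": rows}


# Toy implementations, used only to exercise the interfaces.
class StaticTokenAuth(AuthExtension):
    def __init__(self, valid_tokens):
        self.valid_tokens = set(valid_tokens)

    def is_allowed(self, token):
        return token in self.valid_tokens


class InMemoryPersistence(PersistenceExtension):
    def __init__(self):
        self.executed = []

    def execute(self, query):
        self.executed.append(query)
        return [{"query": query}]


class TableReadApi(ApiExtension):
    def to_query(self, request):
        return f"SELECT * FROM {request['keyspace']}.{request['table']}"
```

Because `ApiExtension` depends only on the two abstract interfaces, a different API style or a different storage backend could be swapped in without touching the other pieces, which is the kind of pluggability the article describes.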

When a request is sent to Stargate, the API Service handles it, translating it into a database query, and dispatches it to the Persistence Service. The Persistence Service then sends the request to the storage replicas of that row using Cassandra’s internal QueryHandler. The Persistence Service processes the request and responds to the client once it receives acknowledgements from the number of storage replicas specified by the request consistency level, the blog post states.
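The acknowledgement rule in that last step can be illustrated with a short sketch. The consistency level names and the quorum formula (a majority of the replicas) follow Cassandra’s standard definitions; the function names and the way acknowledgements are modeled as a list of booleans are assumptions made for this example, not Stargate’s implementation.

```python
from typing import Iterable


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements a consistency level demands."""
    table = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    if level not in table:
        raise ValueError(f"unsupported consistency level: {level}")
    needed = table[level]
    if needed > replication_factor:
        raise ValueError("consistency level needs more replicas than exist")
    return needed


def coordinate(acks: Iterable[bool], level: str, replication_factor: int) -> bool:
    """Return True as soon as enough replicas have acknowledged.

    `acks` models replica responses in arrival order: True for an
    acknowledgement, False for a failure or timeout (a simplification).
    """
    needed = required_acks(level, replication_factor)
    received = 0
    for ack in acks:
        if ack:
            received += 1
        if received >= needed:
            return True  # respond to the client without waiting for the rest
    return False
```

With three replicas, a `QUORUM` request succeeds once two of them acknowledge, so one slow or failed replica does not block the response.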

“If you’re using something like Istio — that is what Google and others within the Kubernetes ecosystem are using for their microservice gateways — the stuff we’re doing [with] Stargate will be perfectly at home in that type of architecture,” said Anuff, adding that in that case, it will expose microservices from Cassandra that can be consumed from within your service mesh.

“Stargate’s ultimate goal is to have pluggable APIs on the frontend and pluggable storage engines on the backend with all the Dynamo magic happening in the middle,” the blog post says.

Going forward, the project aims to focus on APIs for more kinds of data, including gRPC and streaming.

It’s not the only API data gateway out there — Microsoft, Google and others have their own versions.

“We’re really targeting the cloud native design from the get-go,” Splinter said. “Whereas some of the previous abstraction layers on top of the likes of Postgres, MySQL … they came from a previous generation, so that wasn’t necessarily where they got started. But we’re really taking a born-in-cloud approach here.

“As far as the other data gateways that I’ve seen emerging, I’ve seen individual REST layers or individual GraphQL layers, but one of the really unique things about Stargate is that it gives the developers options,” he said. They can go with REST or GraphQL or in the future gRPC. “And so it’s this single framework that can host many of the different protocols, which I personally haven’t seen, on top of any other offerings.”
