FaunaDB Harnesses the Serverless Cloud

3 Apr 2017 10:09am

What if you could adopt one data management system as you start up your business and it would grow along with you, eliminating the need to ever add another one?

That’s the goal of FaunaDB, which describes itself as an adaptive operational database that’s global, consistent and fast. The San Francisco-based startup just launched the FaunaDB Serverless Cloud, which enables developers to implement and elastically scale cloud applications with no capacity planning or provisioning. An on-premises version, now in beta, is due to be commercially available in a few months; the company also offers a private cloud version.

In a blog post, CEO Evan Weaver claims several “firsts”:

  • The first database built for serverless applications
  • The first globally replicated database-as-a-service
  • The first transactional, strongly consistent, multi-region database available to the public

It also touts being the first cloud database to seamlessly replicate data in real time across both Amazon Web Services and Google Cloud Platform. It plans to add support for Microsoft Azure later this year.

Born of Frustration

Weaver and his team built Fauna out of their frustration with building out databases at Twitter. Remember the prevalence of the “Fail Whale”?

“Twitter at the time was stuck between a rock and a hard place because Twitter was at the vanguard of mobile-first adoption,” Weaver explained.

“For almost three years, whenever we racked new hardware, it would almost immediately be redlined by latent demand because people were pounding on us really hard for the API as well as the relentless user growth over that period of time.

“The systems we built were incredibly efficient but had scalability and performance problems. They were very inflexible because they were point solutions. …We were really frustrated being in a product company that had product deadlines for shipping stuff and fighting fires, we could never reach escape velocity and build a reusable platform … and that was a big drag on the product teams,” he said.

After leaving Twitter to form the new company, they found basically nothing had changed for companies seeking to solve the problem of flexible data at scale, especially flexible operational data.

They saw a pattern: Companies would install myriad different databases.

“They’d start with the SQL thing, then some key-value thing for some data they needed to scale or be more performant. Then they’d [use] a graph system or a geospatial thing. Then they’d add a message bus like Kafka to tie it all together. Then there would be two or three or four analytics things. The data set would be replicated 15 or 20 times over, and their engineering team would spend the majority of their time fussing with this low-level infrastructure and trying to integrate it, and the product velocity would grind to a halt. It’s the same thing we experienced at Twitter, but it plays out at all levels of scale,” he said.

“You can get big machines in the cloud now, so you can scale individual workloads much higher than when we were at Twitter, but you can’t safely modify it, basically,” Weaver said. “The more teams interacting with that same dataset, the more likely some will cause a problem. People get afraid of making a schema change or adding a new dataset. Performance becomes unpredictable. So there are all these things adjacent to scale that are totally unsolved.”

“That’s what we started Fauna to fix,” he said.

Weaver and his team set out to build a safe, reusable, general-purpose operational data platform that lets a small company get started at very low cost in the serverless cloud, then have confidence that its data system will grow as the company does, without having to add other systems or refactor an application to keep the business moving.

Throwback to the ’80s?

On the serverless cloud, users don’t have to operate anything and are charged only for their active use time. FaunaDB is relational, but not SQL, and it’s adaptive in that it lets you change your infrastructure footprint on the fly.

Weaver describes it as something of a throwback to what SQL systems did in the ’80s, when there was a big, monolithic system like a mainframe and everything integrated with that.

The cloud changed that. Now people are trying to deploy to commodity hardware, distributed clusters, multi-data center setups, hybrid clouds, private clouds and also scale to much higher levels of throughput, availability and data size, he said, so the productivity benefits of having one big mainframe got lost with NoSQL and point solutions. (He says people are fleeing NoSQL.)

So Fauna is that monolithic model for the cloud, he said. You can create as many sub-databases as you want.

The interaction model uses modern application development patterns familiar to the developer, he said.

“You insert documents and build relations on top of them. So you don’t have to denormalize anything. You also don’t have to flatten anything like an SQL system. You get all the features you want from SQL, like transactions, unique indexes, compound indexes and views, but you also get all this stuff that’s great for modern application development like graph queries, temporalities, you can go back in time to see how your database changed, build a change feed for any query.”
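To make the idea of documents, relations and “going back in time” concrete, here is a minimal in-memory sketch in Python. It is not FaunaDB’s API or query language; the `TemporalStore` class, its methods and the sample documents are all hypothetical, invented only to illustrate the pattern of keeping every version of a document and following references at read time.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class TemporalStore:
    """Toy document store (hypothetical, not FaunaDB) that keeps every version."""

    # doc id -> list of (timestamp, data) versions, oldest first
    versions: Dict[str, List[Tuple[int, Dict[str, Any]]]] = field(default_factory=dict)
    clock: int = 0  # logical clock, bumped on every write

    def write(self, doc_id: str, data: Dict[str, Any]) -> int:
        """Append a new version instead of overwriting; return its timestamp."""
        self.clock += 1
        self.versions.setdefault(doc_id, []).append((self.clock, dict(data)))
        return self.clock

    def read(self, doc_id: str, at: Optional[int] = None) -> Optional[Dict[str, Any]]:
        """Return the newest version written at or before `at` (default: now)."""
        at = self.clock if at is None else at
        current = None
        for ts, data in self.versions.get(doc_id, []):
            if ts <= at:
                current = data
        return current

    def find_by(self, field_name: str, value: Any, at: Optional[int] = None) -> List[str]:
        """Ids of documents whose field equalled `value` at time `at`.

        A real index would be maintained on write; this scan keeps the sketch short.
        """
        hits = []
        for doc_id in self.versions:
            doc = self.read(doc_id, at)
            if doc is not None and doc.get(field_name) == value:
                hits.append(doc_id)
        return sorted(hits)


store = TemporalStore()
store.write("user:1", {"name": "Ada"})
t_draft = store.write("post:1", {"author": "user:1", "title": "Draft"})
store.write("post:1", {"author": "user:1", "title": "Final"})

# A relation is a stored reference followed at read time, so nothing is denormalized.
post = store.read("post:1")
author = store.read(post["author"])

# Temporal read: the same document as it looked when the draft was written.
earlier_post = store.read("post:1", at=t_draft)
```

Here `post` holds the latest title while `earlier_post` holds the draft, which is the kind of history query the quote describes. A production system would also need transactions and persistent indexes, which this sketch leaves out.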

FaunaDB is implemented in Scala and Java and runs on any modern operating system, including Linux, Windows, BSD, and OS X. It’s implemented on the JVM for portability and queried via type-safe embedded DSLs, like LINQ.

MongoDB users will be familiar with the service’s document patterns, as well as the ability to compose indexes together. Users can build a regular index over a document, or make a different index that represents a graph, then join and intersect those indexes or take a temporal view of the document, so the result can be fed into a social network theme, he said, adding that the company plans to expand into more query domains like search and geospatial.
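As a rough illustration of what composing indexes can mean, the Python sketch below builds two ordinary indexes and one graph-shaped index in memory, then joins and intersects them. The data, the `build_index` helper and the `timeline` function are hypothetical and are not FaunaDB calls; they only show the set operations involved.

```python
from collections import defaultdict
from typing import Any, Dict, Set

# Hypothetical sample data, invented for this illustration.
follows = [("ada", "grace"), ("ada", "linus"), ("bob", "grace")]  # (follower, followed)
posts = {
    "p1": {"author": "grace", "tag": "databases"},
    "p2": {"author": "linus", "tag": "kernels"},
    "p3": {"author": "grace", "tag": "compilers"},
    "p4": {"author": "bob", "tag": "databases"},
}


def build_index(docs: Dict[str, Dict[str, Any]], field_name: str) -> Dict[Any, Set[str]]:
    """Regular index: field value -> set of ids of documents with that value."""
    index: Dict[Any, Set[str]] = defaultdict(set)
    for doc_id, doc in docs.items():
        index[doc[field_name]].add(doc_id)
    return index


posts_by_author = build_index(posts, "author")
posts_by_tag = build_index(posts, "tag")

# Graph-shaped index: follower -> set of users they follow.
following: Dict[str, Set[str]] = defaultdict(set)
for follower, followed in follows:
    following[follower].add(followed)


def timeline(user: str) -> Set[str]:
    """Join the follow graph with the author index: posts by everyone `user` follows."""
    result: Set[str] = set()
    for author in following.get(user, set()):
        result |= posts_by_author.get(author, set())
    return result


# Intersect the joined result with a second index to narrow it by tag.
ada_database_posts = timeline("ada") & posts_by_tag["databases"]
```

The join walks one index to produce keys for another, and the intersection is a plain set operation on the results; a database would do the same work against stored indexes rather than Python sets.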

An example JavaScript query on its website returns a change feed of city crime watch reports. It illustrates graph queries, multi-level joins, indexes, and temporality — all at the same time. Weaver boasted that expressions in other languages are equally simple.
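The query on Fauna’s website is not reproduced here. As a stand-in, this Python sketch shows the general shape of a change feed for a query: replay an ordered event log and keep only the events that match a predicate and are newer than a given timestamp. The `Event` type, the `change_feed` function and the sample crime watch reports are hypothetical, not FaunaDB’s API or data.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    ts: int                # logical timestamp of the change
    doc_id: str            # which document changed
    action: str            # "create" or "update"
    data: Dict[str, Any]   # document contents after the change


# Hypothetical event log, ordered by timestamp.
log = [
    Event(1, "report:1", "create", {"city": "Oakland", "status": "open"}),
    Event(2, "report:2", "create", {"city": "Berkeley", "status": "open"}),
    Event(3, "report:1", "update", {"city": "Oakland", "status": "closed"}),
    Event(4, "report:3", "create", {"city": "Oakland", "status": "open"}),
]


def change_feed(
    events: Iterable[Event],
    matches: Callable[[Dict[str, Any]], bool],
    after_ts: int = 0,
) -> Iterator[Event]:
    """Yield events newer than `after_ts` whose document satisfies the query."""
    for event in events:
        if event.ts > after_ts and matches(event.data):
            yield event


# Everything that changed for Oakland reports since timestamp 1.
oakland_changes = list(change_feed(log, lambda doc: doc["city"] == "Oakland", after_ts=1))
```

A client that remembers the last timestamp it saw can call `change_feed` again with that value to pick up only newer changes.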

NVIDIA’s Support

The company was formed in 2012 and has spent the past four years working on FaunaDB. It has raised $7.61 million, with Scott McNealy, founder of Sun Microsystems, and Olivier Pomel, founder of DataDog, among its investors.

It recently brought on board Chris Anderson, a co-founder of Couchbase, as director of developer experience. Fauna also has partnered with Serverless, Inc. to improve the developer experience with serverless architecture.

NVIDIA is among the enterprise customers using Fauna in a private cloud for several of its services. Last year it launched GeForce Experience 3.0 [software used in GeForce graphics card-equipped PCs] backed by Fauna in multiple data centers. Weaver said it’s been “pushing pretty hard on the enterprise story”: operations, performance and feature set.

Other smaller customers include service business software vendor Breezeworks; Longform, a website of curated articles; and ThinAir, an endpoint security vendor that offers a service that protects enterprise trade secrets and IP from exfiltration.

“To meet our customers’ security requirements, we often need to migrate from the cloud to on-premises data centers quickly and easily. FaunaDB Serverless Cloud uniquely supports these requirements,” said Cliff Moon, ThinAir director of engineering.

Feature Image: “Zoo” by Ibinic, licensed under CC BY-SA 2.0.

