
How Radical API Design Changed the Way We Access Databases

17 Mar 2022 9:00am, by John Page
John Page is a document database veteran who, after 18 years building full-stack document database technologies for the intelligence community, joined MongoDB. He now builds robots for fun and tests and writes about databases to pay for the robot parts.

API design starts with knowing who your consumers are, how they will use your API and how they will expect it to work. But all too often, API design is more akin to dutifully following established patterns and long-held conventions.

When we first started working on MongoDB in 2007, we wanted to reimagine the way developers interacted with databases. The service API we created, the MongoDB Query API, was a radical approach because it gave developers access to the database through objects rather than strings of pseudo-English text. MongoDB’s pure object API was revolutionary in 2007. Since then, the MongoDB Query API has inspired legions of imitators, though we believe they map to a less functionally rich underlying database.

A Look Back at SQL

For decades, developers have had to request operations in a database using SQL, the Structured Query Language built for relational databases. When client-server applications arrived, SQL was wrapped in thin layers like data access objects (DAO) and Open Database Connectivity (ODBC) designed to send SQL statements to the database and parse the response.

SQL emerged at a time of dramatic changes in user interface design. Teletype printers were giving way to green-on-black CRT displays, and a radical new editor was created to let developers see and edit multiple lines of code at the same time. This editor was called Visual, later shortened to vi, which evolved into Vim.

At the same time, a new UI for accessing data was created to allow users without formal training in mathematics or programming to construct complex queries. That UI was called Structured English Query Language (SEQUEL), and it allowed users to talk to a computer in formalized English to view and modify data. (It was later shortened to SQL for trademark reasons.)

The original designers of SQL, Donald Chamberlin and Ray Boyce, thought it would be used by planners and other professionals, not database management experts, to perform ad-hoc queries. Chamberlin has written that he has been surprised to see SQL so frequently used by trained database specialists for repetitive transactions. (Boyce died at 26, just after the paper introducing SEQUEL appeared.)

Whether you are writing analytic requests or having your application code interact with the database, if you’re a developer who understands imperative coding, SQL is not the best way to access a database.

At the same time, SQL isn’t purely declarative either: the difference in performance between a good and a bad SQL query that perform the same task is directly related to how each query is written.

A Change in Mindset

One of the early design decisions we made at MongoDB was to focus on interaction with the database using a pure object-based API. There would be no query language. Instead, every request to the database would be described as a set of objects that were intended to be constructed by a computer as much as by a human (in many cases, more often by a computer).

This approach allowed programmers to treat a complex query the same as creating a piece of imperative code. Want to retrieve all the animals in your database that have exactly two legs? Then create an object, set a member, “legs,” to two and query the database for matching objects. What you get back is an array of objects. This model extends to even the most complex operations.
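As a minimal sketch of the two-legged animals example, the filter below is built like any other object. The helper `findTwoLegged` is a hypothetical name introduced here for illustration, and `collection` is assumed to be a driver collection handle that exposes `find(filter)`.

```javascript
// Build the filter the same way you would build any other object.
const query = {};
query.legs = 2;

// Hypothetical helper (not part of any MongoDB driver): `collection` is
// assumed to be a driver collection handle exposing find(filter).
function findTwoLegged(collection) {
  return collection.find(query);
}
```

With the Node.js driver, something like `await findTwoLegged(db.collection("animals")).toArray()` would resolve to an array of matching documents; the collection name `animals` is an assumption.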

This approach enabled developers to build database queries as code. It was a leap from a query language mindset to a programmer’s mindset, one that significantly sped up development and improved query performance. This API approach to database operations helped kick-start MongoDB’s rapid adoption and growth in our early years.

The challenge of using object-oriented programming with tabular databases has been a source of friction for years. Object relational mappers (ORMs) have proliferated as Band-Aids, obscuring the tabular model from the end developer for basic use cases. But ORMs flourish at the expense of developer understanding and influence over performance, while optimization is always subject to the inherent friction of the tabular database.

The Practice of Object-Oriented Programming

Thinking in objects isn’t always obvious. Many first-time users of MongoDB start with its JavaScript-based shell, an interactive, client-side JavaScript REPL (read-eval-print loop). Although you are, indeed, writing programs, the experience is so immediate that you could mistake it for a SQL prompt rather than the code development playground it really is. This can lead you to think of MongoDB queries as separate from the API you call; you might picture them as JavaScript even when the rest of your code is in some other language. To master MongoDB, it’s important to think of the queries not as a language per se but as an object model.

I’d go as far as to say the most powerful queries are not written by hand but generated by higher-level functions, and that powerful aggregation queries should be thought of as imperative coding, more like Apache Spark than SQL.

For example, here are two ways to write the same request, one with a query language mindset and one with a programmer’s mindset. I’m using the classic programmer’s interview question FizzBuzz as an analytic query in both styles.
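What follows is an illustrative sketch of the two styles, not a definitive implementation. It assumes documents carry a numeric field `n` and that the answer is written to a field named `result`; both field names, and the helpers `isMultipleOf`, `ifThenElse` and `fizzBuzz`, are hypothetical names chosen for this example. Only `$set`, `$cond`, `$eq` and `$mod` are MongoDB aggregation operators.

```javascript
// Query language mindset: one hand-written, deeply nested expression.
const handWritten = [
  { $set: { result: { $cond: [
    { $eq: [{ $mod: ["$n", 15] }, 0] }, "FizzBuzz",
    { $cond: [
      { $eq: [{ $mod: ["$n", 3] }, 0] }, "Fizz",
      { $cond: [
        { $eq: [{ $mod: ["$n", 5] }, 0] }, "Buzz",
        "$n",
      ] },
    ] },
  ] } } },
];

// Programmer's mindset: small functions that return expression objects.
const isMultipleOf = (field, k) => ({ $eq: [{ $mod: [field, k] }, 0] });
const ifThenElse = (test, yes, no) => ({ $cond: [test, yes, no] });

// Rules are plain data, checked in order; adding a rule is a one-line change.
const rules = [[15, "FizzBuzz"], [3, "Fizz"], [5, "Buzz"]];

function fizzBuzz(field) {
  // Fold from the last rule inward so the first rule ends up outermost.
  return rules.reduceRight(
    (otherwise, [k, label]) => ifThenElse(isMultipleOf(field, k), label, otherwise),
    field
  );
}

const generated = [{ $set: { result: fizzBuzz("$n") } }];
```

Both arrays describe the same pipeline, so either could be handed to a driver's `aggregate` call. The second version keeps each decision in a named function that can be tested without a database.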

Although the total code size in this sample is roughly the same, decomposing it like this makes it easier to read, write, debug and maintain. For a less trivial case, it also makes for far smaller query code.

Conclusion

In the same way that MongoDB reimagined the concept of interacting with data and aligned it with developers, I would urge you to look at your own service APIs and make sure you’re not just blindly following an established pattern. Focus on who your consumers are, how they will use it and how they will want to work with it.

And, while your APIs should initially be simple and easy to work with, think about how they can be both extensible and backward compatible in later versions. Above all, don’t become afraid to break long-held conventions when doing so best enables your users. Remember, the first cars were steered with levers rather than steering wheels. Rethinking one of the most common classes of service APIs was a key part of MongoDB’s strategy.
