JSON and Relational Tables: How to Get the Best of Both
The Relational Model
I’ll start with the relational model, which is a general-purpose model that uses data normalization to ensure data integrity and avoids data duplication. In addition, SQL makes data access, manipulation and modeling flexible and easy. The strength of the relational model, when coupled with SQL, lies in its general-purpose design: You can use normalization techniques to take data from an application and split it into stand-alone logical parts by storing these in separate tables.
Imagine a student class schedule. In its simplest form, such a data item would have four parts that in combination make up the schedule: a) the information about the student for whom the schedule is for, b) the courses the student wants to attend, c) the time when the courses are scheduled, d) the classrooms where the courses are held. Each piece of information is important for students to know where and when they are supposed to be, but the relational model would normalize these four pieces into separate tables.
Why is this normalization important? Because it increases the reusability of the data for other applications or purposes. In this example, teachers also have schedules, which look different than student schedules. Teachers care about which courses they are giving, and when and in what classrooms they are. Yet the teacher’s schedule is likely to look very different from a student’s because the teacher only teaches a few courses while students probably also attend many courses taught by other teachers.
Next, add the facilities team to the mix. They probably do not care about the teachers or the students, but they care very much about which classrooms need special setups. They also need to know when classes are not being used in order to schedule maintenance.
Thanks to the relational model and the normalization of the data, all these questions can be answered without having to restructure the data physically or logically.
Yet, while the relational model gives us this nice, general-purpose design, it is not always the easiest for developers to use. That’s because developers usually build apps in terms of object-oriented programming, where objects are structured hierarchy into classes and therefore aren’t naturally aligned with the notion of normalized data in rows and columns within separate tables.
JSON Document Model
The JSON document model overcomes many of the rows-and-columns limitations of the relational model by allowing apps to directly map objects into a corresponding hierarchical JSON format. This can drastically reduce the complexities of object-relational mapping.
Furthermore, JSON is a self-describing data format that makes inter-application communication easy, because a JSON document not only contains the actual data, but also information about the data — the metadata or schema. However, although the JSON document model provides great benefits to developers by more closely mapping to the object-oriented nature of the application’s internal operations, it isn’t ideal either.
Here’s one weakness: Due to the self-describing and hierarchical nature of JSON documents, information is stored redundantly, leading to inefficiencies and potential inconsistency down the road.
In the scheduling example, what happens if the facilities crew needs to take a classroom out of commission due to a water leakage? If the data for the student and teacher schedules were all modeled in JSON format, chances are that each student and teacher would have one JSON document with redundant data about the affected classroom.
Those documents may not have the same structures: One document may have the student information as the root element while the other has the teacher information as the root element. The student schedule document may have information about each class’s teachers, but the teacher schedule document may not have any information about individual students or about co-teachers.
In this example, the facilities-management application would have to somehow propagate the classroom change, searching for the classroom in differently structured JSON documents and replacing it with another. While doable, JSON adds complexity to the facilities app, which arguably should not even have to worry about student and teacher schedules and their associated data structures altogether, but which should merely map everything in classroom 225 to classroom 316B for the next two weeks.
In contrast, if the data was modeled relationally, there would only be one table that contains all schedules — and therefore only one table would require an update. All three applications, whether used by students, teachers or facilities staff, would automatically use the correct data consistently. (Of course, this example assumes all data to be in one database or document store for simplicity.)
The Worst of Both Worlds?
To get around the problem of inconsistency, some JSON document databases recommend normalizing documents using references: Instead of including the classroom information in the student or teacher schedule documents, the documents may simply include an ID for that classroom residing in another document. The problem with that is that normalizing documents this way makes a JSON document work like a relational database, which completely defeats the simplicity benefits of the document model, and now developers have a model that is the worst of both worlds.
By the way, it’s difficult to model many-to-many relationships using the JSON document model. Attempts to model the relationships lead to even greater data duplication and the potential for additional inconsistencies.
There are different ways developers get around these issues, and historically developers have often resorted to using object-relational mapping (ORM) frameworks to consume and manipulate the data in relational databases in hierarchical object-oriented form, while leaving the data itself in relational form.
Problem solved? No. ORMs are not perfect either. They add a layer of abstraction that developers have limited control over, often provide their own conflict or collision resolution that may or may not be adequate, and may not take full advantage of all the features a given database engine provides. ORMs can quickly become the lowest common denominator moving the application even further away from the data and efficient manipulation of it than originally desired.
To sum up:
The relational model is a general-purpose model that makes it easy to query and manipulate parts of data but sometimes poses challenges for developers when consumed by their apps.
The JSON document format makes it easy for developers to consume data in hierarchical form but adds data duplication and the associated inconsistency and reusability challenges on its own.
ORMs alleviate the task of decomposition and reconstruction of data from object-oriented hierarchical structure to relational tables but may bring their own set of challenges to the overall architecture.
The Best of Both Worlds
These issues with relational databases and JSON document models are well known, and many database providers have been working to offer a solution. A new approach is a feature in the Oracle Database called JSON Relational Duality. This feature offers a way forward that combines the benefits of the JSON document and relational worlds in a single database, hence the name JSON Relational Duality, while avoiding the tradeoffs I’ve discussed so far.
The new feature is available in Oracle Database 23c Free—Developer Release, which anyone can download and use; no lawyers will ever come knocking. If you are curious about that release, check out my conversation about it with James Governor, co-founder of RedMonk. (It’s an unprecedented move by Oracle to offer a new database version free to developers before the paid version arrives. But I digress.)
Let’s get into what JSON relational duality can do:
By allowing a capability called a JSON Relational Duality View, data is still stored in relational tables in a highly efficient normalized format but is accessed by apps in a way that looks like JSON documents. This allows developers to continue to think in terms of JSON documents for data access while, behind the scenes, the system uses a highly efficient and multipurpose relational model for data storage.
In this and other ways, Duality Views hide all the complexities of database-level concurrency control from the user, providing document-level serializability.
Duality Views can be declared over any number of tables using intuitive GraphQL or SQL/JSON syntax. For example, the earlier described example could be defined as the following Duality View which renders the relational data available in the students, schedules, courses and teachers’ tables as a JSON document corresponding to an app-tier StudentSchedule object.
Developers can easily define different Duality Views on the same or overlapping set of relational tables, making it easy to support many use cases on the same data (such as TeacherSchedule and StudentSchedule Duality Views that share common tables for courses and schedules).
Using Duality Views, developers have JSON document access to all data, including access to data stored in relational tables. At the same time, they can access the relational data directly using SQL, if they choose. This way, applications using Duality Views can now simply read a document from the database, make any changes they need and write the modified document back to the database without having to worry about the underlying relational structure.
This is where the magic trick comes in: The database will consume the JSON document and do the correct creates, reads, updates and deletes on the corresponding rows based on JSON Relational Duality View definitions, with full ACID concurrency and consistency controls, runtime optimizations and all in just one round trip to the database. And the best part of it is developers can manipulate documents realized by Duality Views in the ways they’re used to, using their usual drivers, frameworks, tools and development methods.
Lock-Free Concurrency Control
Duality Views also benefit from a novel lock-free or optimistic concurrency control architecture, so developers can manage their data consistently across stateless operations. You can find more details about that here, but the basic idea is that stateless document-level serializability using built-in optimistic concurrency control avoids pessimistic locking and the associated performance issues so that developers can continue to focus on building their app instead of debugging concurrency control mechanisms and race conditions.
Why use JSON Relational Duality:
- Duality Views eliminate the need for object-relational mapping (ORM) frameworks and JSON-relational de-/serializers.
- Document-centric applications can access Duality Views via document APIs, such as Oracle Database API for MongoDB and Oracle REST Data Services (ORDS), or they can use standard-based SQL/JSON functions.
- Application operations against Duality Views are optimally executed inside the database since they enable fetching and storing of all rows needed for an app-tier object use case in a single database round trip and database-optimized operation.
- Duality Views eliminate data duplication by allowing many Duality Views to be defined over the same or overlapping relational tables giving every application its business objects without the need of duplicating data.
- Duality Views allow relational data modification concurrently and consistently via JSON document-based PUT and POST operations.
By giving developers the flexibility and data access benefits of the JSON document model as well as the storage efficiency and general-purpose power of the relational model, JSON Relational Duality is a big step forward in simplifying app development.
The cool thing is that anyone can try these capabilities for free by downloading Oracle Database 23c Free—Developer Release.
Furthermore, Oracle is also making it easy for you to experience Duality View in building apps with well-documented and easy-to-learn tutorials on GitHub. You can browse and download the tutorials, and use them in this release. In the tutorials, you can use SQL, REST and Oracle Database API for MongoDB to try features, capabilities and examples related to Duality Views. Last but not least, Oracle also provides LiveLabs to help you play with Duality Views in a ready-to-run environment.