Generative AI, Streaming Data, and More Come to MongoDB Atlas
At its MongoDB .local NYC developer conference today, document database leader MongoDB is announcing a slew of new features for its cloud-based Atlas database service. The new capabilities focus on new workloads (including real-time streaming data and generative AI), developer productivity, and the migration of relational databases to MongoDB’s document store platform.
MongoDB views its market strategy as one of competing for a broadening array of “workloads.” It sees itself as having penetrated many large accounts but initially for niche applications. Its own version of “land and expand,” therefore, is to grow the number of workloads it’s used for by its many customers and, along the way, grow its appeal to new customers too. As a cloud-first company, MongoDB is initially landing these workloads in its Atlas service.
On the broadened workload front, then, MongoDB is announcing the private preview of Atlas Stream Processing, making a big push towards real-time and streaming data — which can come from logs, financial market data sources, telemetry and observability platforms, or IoT devices. This new capability is built around MongoDB’s core document data model, handling continuous data processing, validation, and stateful windowing.
Stateful abstractions over streaming data make working with it more like working with conventional data at rest, which is the paradigm most familiar to developers. The trick to enabling developers to build real-time applications is to let them apply their existing skillsets, rather than making them code streaming data apps one way and conventional data apps a different way. As such, MongoDB says Stream Processing is a key part of Atlas, albeit in private preview to start.
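To make the "existing skillsets" point concrete, here is a hypothetical stream-processing pipeline expressed as plain Python dicts, on the assumption that Stream Processing reuses MongoDB's familiar aggregation-pipeline idiom. The stage names (`$source`, `$tumblingWindow`, `$merge`), connection names, and field names are illustrative assumptions, not confirmed API:

```python
# Hypothetical Atlas Stream Processing pipeline, sketched as Python dicts.
# Stage, connection, and field names are illustrative assumptions.
pipeline = [
    # Read from a streaming source (e.g., a Kafka topic of sensor readings)
    {"$source": {"connectionName": "iotKafka", "topic": "sensor_readings"}},
    # Validation: keep only documents with a plausible temperature value
    {"$match": {"temperature": {"$gte": -100, "$lte": 200}}},
    # Stateful windowing: average temperature per device over 60-second windows
    {"$tumblingWindow": {
        "interval": {"size": 60, "unit": "second"},
        "pipeline": [
            {"$group": {"_id": "$deviceId",
                        "avgTemp": {"$avg": "$temperature"}}}
        ],
    }},
    # Continuous processing: merge windowed results into an Atlas collection
    {"$merge": {"into": {"db": "telemetry", "coll": "minute_averages"}}},
]
```

A developer who already knows `$match` and `$group` from queries against data at rest would read this immediately; only the source, window, and sink stages are new.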
In Search of AI
Next on the workload front is the public preview of Atlas Vector Search. This feature aims to accommodate requirements for hot new tech, like generative AI/large language models (LLMs) and semantic search. In these use cases, chunks of data (whether text, images, or audio) that can be used to train LLMs are each encoded as a single value called a vector.
Atlas will allow vectors to be stored, of course, but that’s the easy part. The big value is that Atlas Vector Search provides for the indexing and querying of vectors, allowing applications to find similar ones in the database and then use LLMs to, in the company’s words, “probabilistically construct sentences from prompts, generate images from captions, or return search results that are more accurate and contain greater context than traditional search engines.” Atlas Vector Search also allows customers to augment pre-trained LLMs with their own data for more relevant results.
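Conceptually, a vector search ranks stored embeddings by similarity to a query embedding. The dependency-free Python sketch below illustrates the idea with cosine similarity over a toy "collection" (the document texts and embedding values are invented); the point of a service like Atlas Vector Search is to do this ranking at scale via an index, rather than by brute-force scan:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "collection": documents with invented, illustrative embeddings
docs = [
    {"_id": 1, "text": "blue running shoes", "embedding": [0.9, 0.1, 0.0]},
    {"_id": 2, "text": "red dress shoes",    "embedding": [0.7, 0.3, 0.1]},
    {"_id": 3, "text": "garden hose",        "embedding": [0.0, 0.2, 0.9]},
]

query = [0.8, 0.2, 0.05]  # hypothetical embedding of a query like "sneakers"

# Rank documents by similarity, most similar first -- the operation a
# vector index accelerates so it need not scan every stored vector
ranked = sorted(docs,
                key=lambda d: cosine_similarity(d["embedding"], query),
                reverse=True)
```

Both shoe documents outrank the garden hose for this query, which is the "semantic" part of semantic search: nearness in embedding space stands in for nearness in meaning.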
Atlas Vector Search is integrated with the open source LangChain and LlamaIndex frameworks, which will be appreciated by developers who have already come up the generative AI learning curve. The frameworks let such developers access LLMs from cloud providers and model providers like Anthropic, Hugging Face, and OpenAI.
Complementing all this is the new ability to dedicate specific nodes in an Atlas cluster to the Atlas Search service. MongoDB says Atlas Search Nodes, now in public preview, will enable workload isolation, resource optimization, and better performance at scale. Atlas Search Nodes service both Atlas Vector Search and the broader Atlas Search facility.
The streaming data and generative AI features are a big deal, but there’s more in the new capability manifest. Further Atlas enhancements include these nuggets:
- Atlas Time Series collections will now offer improved scalability, and support for deletes — which further enables modification of previously ingested time series data.
- Atlas Device Sync, which allows offline devices to sync with the mothership cluster, will now offer tiered device sync, as a private preview.
- Atlas SQL, which provides a SQL query interface to MongoDB for BI tools, is being brought to general availability (GA), along with MongoDB-developed connectors for Microsoft’s Power BI and Salesforce’s Tableau.
- Atlas Data Federation and Online Archive will enter private preview on Azure (the features had already been available on AWS). Online Archive provides for automatic tiering of Atlas databases to different cloud object storage options, while retaining the ability to query. Atlas Data Federation lets customers mix and match data from Atlas databases and cloud object stores, essentially giving Atlas access to cloud data lakes.
- And last, but certainly not least, MongoDB and Atlas will add improvements in query performance and resource efficiency. The company said the platforms now offer up to a 50% improvement for grouping operations on subfields, up to 90% improvement for filtering on complex expressions, and a 4x-30x speedup for lookups in replica sets.
All about Developers
MongoDB has always been a company that has targeted developers as its VIPs, so improvements to the “developer experience” are part of every major product cycle. This time around, the company is announcing support for new programming languages.
For example, the MongoDB Driver for Kotlin is now GA, enabling server-side Kotlin development, in addition to previously supported client-side development using the Realm Kotlin SDK. MongoDB’s PyMongoArrow library for Python development is now GA as well. The library, which is maintained by MongoDB, lets developers convert MongoDB data to Pythonic data structures including Pandas DataFrames, NumPy arrays, and Apache Arrow tables. This allows data scientists to work with data in MongoDB directly, while still using the tools, language, and libraries to which they are accustomed.
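The conversion PyMongoArrow automates is, at heart, a pivot from MongoDB's row-oriented documents to the column-oriented layout that DataFrames and Arrow tables are built from. A dependency-free sketch of that pivot, with invented field names (the real library performs this efficiently at the driver level rather than in pure Python):

```python
# Row-oriented documents, as they might come back from a MongoDB query
documents = [
    {"symbol": "MDB",  "price": 380.5, "volume": 1200},
    {"symbol": "MSFT", "price": 330.1, "volume": 5400},
    {"symbol": "CRM",  "price": 210.7, "volume": 2100},
]

def to_columns(docs, fields):
    """Pivot row-oriented documents into a column-oriented dict --
    the shape pandas and PyArrow construct DataFrames/tables from."""
    return {f: [d.get(f) for d in docs] for f in fields}

columns = to_columns(documents, ["symbol", "price", "volume"])
# columns["price"] is now one contiguous list: [380.5, 330.1, 210.7];
# pandas.DataFrame(columns) or pyarrow.table(columns) accepts this shape
```

Columnar layout is what makes the downstream analytics tooling fast, which is why doing this conversion once, close to the driver, matters to data scientists.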
Mobile and Containers
And in the world of containerized development, the Atlas Kubernetes Operator will offer simplified installation and use via the Atlas Command Line Interface (CLI). Specifically, developers can install the Operator, generate security credentials, then import existing MongoDB Atlas projects and deployments with a single CLI command.
Turning to application modernization, and the ability to take over more workloads from relational databases, MongoDB is bringing its Relational Migrator tool to GA. Relational Migrator works on a three-step paradigm: (1) analyze the source relational database and design the document schema to which it will be migrated; (2) migrate the data; and (3) generate application code to query and maintain the data.
The Relational Migrator supports Oracle Database, Microsoft SQL Server, MySQL, and PostgreSQL source databases and will work with both Atlas and self-managed MongoDB. It can perform one-time “snapshot” migrations, but will also support a continuous migration mode, to facilitate parallel operation of the source and destination systems, using a change data capture (CDC) mechanism. This will keep the source and destination databases in sync, until customers are ready to cut over to the new document-based system completely.
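The heart of step (1) is remapping normalized tables into documents, most commonly by collapsing a one-to-many join into an embedded array. A simplified, hypothetical illustration of that remapping in plain Python (table and field names invented; the actual tool designs and applies such mappings for you):

```python
# Normalized relational data: customers and orders in separate "tables"
customers = [
    {"customer_id": 1, "name": "Acme Corp"},
    {"customer_id": 2, "name": "Globex"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 99.0},
    {"order_id": 11, "customer_id": 1, "total": 45.5},
    {"order_id": 12, "customer_id": 2, "total": 12.0},
]

def embed_orders(customers, orders):
    """Collapse the one-to-many customer->orders join into documents
    carrying an embedded 'orders' array, as a migrator might."""
    by_customer = {}
    for o in orders:
        by_customer.setdefault(o["customer_id"], []).append(
            {"order_id": o["order_id"], "total": o["total"]}
        )
    return [
        {"_id": c["customer_id"], "name": c["name"],
         "orders": by_customer.get(c["customer_id"], [])}
        for c in customers
    ]

docs = embed_orders(customers, orders)
# Each customer document now contains its orders, so the application
# reads one document instead of performing a join
```

In continuous (CDC) mode, the same mapping would be applied to each change event from the source database until cutover, keeping the embedded documents in sync.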
MongoDB’s announcements cover a huge surface area. The company is working to accommodate generative AI, analytics and BI, streaming data, mobile, cloud, and container scenarios. And it’s doing tons of groundwork to woo mobile developers, relational database developers, and data scientists. All up, MongoDB is clearly working to expand its utility, its appeal, and its franchise. MongoDB’s appeal to developers is certainly not in dispute, and this batch of announcements shores it up handily. But the company is clearly serious about winning over new constituencies as well and, in so doing, getting corporate tech leadership to see Atlas as a versatile platform that can service a broad range of needs, not just capably, but productively as well.
MongoDB is a client of Brust’s analyst and advisory firm, Blue Badge Insights.