Wasm-Based SQL Extensions — Toward Portability and Compatibility

WebAssembly (Wasm) is becoming well known for letting users run code written in different languages in the browser. But that’s not all it lets you do. Wasm’s portability, speed and security make it a great way for you to create platforms and extensible frameworks that let users compile their code to Wasm and run it in your system quickly.
Databases and other data-intensive systems are great candidates for becoming Wasm-powered platforms. When you have a lot of data, it’s cheaper to move the compute to the data than the other way around. Wasm gives us the tools to do this well, but it’s missing a few features. We can either each build those features on our own in a thousand incompatible ways, or build them together in the open.
Many SQL databases already have extensibility features that let you create new functions, aggregates, types and more. For example, in databases like PostgreSQL, each extension has an installation script written in SQL and may also include C code that is compiled to a shared library. The C code may use database APIs and implement logic that would be hard to write in procedural SQL languages.
These shared libraries don’t create a secure sandbox, so you can’t easily prevent an extension from using too many resources, corrupting memory or messing with the system. They’re also not very portable, since you have to compile them for each platform on which you run the database.
SQL extensions are a natural fit for Wasm because Wasm modules are portable, sandboxed and “capability-safe,” which means they can only access what you explicitly give them permission to. SingleStore released Wasm-powered extensibility last summer, including user-defined functions (UDFs) created from Wasm. We’re not alone either: several other products and open source projects are also working on Wasm-based extensibility.
As with other Wasm use cases, people working on SQL extensions quickly realized they need some way to pass data like strings, lists and records in and out of Wasm. The core Wasm spec doesn’t provide one: it defines only numeric types and linear memory, which is a flat array of bytes, not higher-level types.
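To make the problem concrete, here is a minimal sketch of the kind of hand-rolled convention a platform ends up inventing. It is plain Python that only simulates a Wasm boundary: a `bytearray` stands in for linear memory, and the names `alloc`, `write_string`, `guest_upper` and `call_upper`, as well as the choice to return a pointer to a (pointer, length) pair, are assumptions made for illustration, not any real platform’s ABI.

```python
import struct

# A flat, byte-addressable buffer standing in for a Wasm module's linear memory.
memory = bytearray(64 * 1024)  # one 64 KiB Wasm page
next_free = 8                  # hypothetical bump-allocator cursor


def alloc(size: int) -> int:
    """Reserve `size` bytes and return their offset (toy bump allocator)."""
    global next_free
    ptr = next_free
    next_free += size
    if next_free > len(memory):
        raise MemoryError("out of linear memory")
    return ptr


def write_string(text: str) -> tuple[int, int]:
    """Copy the UTF-8 bytes of `text` into memory; return (pointer, length)."""
    data = text.encode("utf-8")
    ptr = alloc(len(data))
    memory[ptr:ptr + len(data)] = data
    return ptr, len(data)


def read_string(ptr: int, length: int) -> str:
    """Decode `length` bytes starting at `ptr` as UTF-8."""
    return bytes(memory[ptr:ptr + length]).decode("utf-8")


def guest_upper(ptr: int, length: int) -> int:
    """Simulated guest export: only integers cross the boundary.

    The result string is written to memory, and the return value is the
    offset of an 8-byte little-endian (pointer, length) pair describing it.
    """
    out_ptr, out_len = write_string(read_string(ptr, length).upper())
    ret = alloc(8)
    struct.pack_into("<II", memory, ret, out_ptr, out_len)
    return ret


def call_upper(text: str) -> str:
    """Simulated host wrapper: the glue every caller must agree on."""
    ptr, length = write_string(text)
    ret = guest_upper(ptr, length)
    out_ptr, out_len = struct.unpack_from("<II", memory, ret)
    return read_string(out_ptr, out_len)
```

Everything in `call_upper` is glue that the host and the guest must agree on in advance: who allocates, how strings are encoded, and how a two-part result comes back through a single integer. None of it is specified by core Wasm.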
As a result, each Wasm platform tends to come up with its own Application Binary Interface (ABI), procedure call mechanism, mapping to gRPC or other solution for describing high-level interfaces and types. Together these lead to a huge amount of fragmentation. Wasm created for one platform can’t be used on another, and users need a different set of tools for each language on each platform, which is both inconvenient for users and a waste of resources to develop.
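A small sketch shows why such conventions don’t mix. Both “platforms” here are hypothetical: platform A passes a string as a (pointer, length) pair, while platform B passes a single pointer to a 4-byte little-endian length prefix followed by the bytes. Neither is meant to describe a real product.

```python
import struct


def write_a(memory: bytearray, ptr: int, text: str) -> tuple[int, int]:
    """Hypothetical platform A: raw UTF-8 bytes; caller passes (ptr, len)."""
    data = text.encode("utf-8")
    memory[ptr:ptr + len(data)] = data
    return ptr, len(data)


def read_a(memory: bytearray, ptr: int, length: int) -> str:
    return bytes(memory[ptr:ptr + length]).decode("utf-8")


def write_b(memory: bytearray, ptr: int, text: str) -> int:
    """Hypothetical platform B: 4-byte length prefix; caller passes ptr only."""
    data = text.encode("utf-8")
    struct.pack_into("<I", memory, ptr, len(data))
    memory[ptr + 4:ptr + 4 + len(data)] = data
    return ptr


def read_b(memory: bytearray, ptr: int) -> str:
    (length,) = struct.unpack_from("<I", memory, ptr)
    return bytes(memory[ptr + 4:ptr + 4 + length]).decode("utf-8")


memory = bytearray(256)
ptr, length = write_a(memory, 16, "wasm")

# Read with the convention it was written in: the string survives.
round_trip = read_a(memory, ptr, length)

# Read the same bytes with platform B's convention: the first four bytes of
# the text are misinterpreted as a length, so the original string is lost.
misread = read_b(memory, ptr)
```

The bytes in memory are identical in both reads; only the agreed-upon interpretation differs. A module compiled against one convention silently produces wrong data on a host that expects the other.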
However, there is a way out of this fragmentation nightmare: the WebAssembly System Interface (WASI) and the component model. WASI is a subgroup of the WebAssembly Community Group (CG), and it’s working to define standardized interfaces for common system resources and a component model. Wasm Components are wrappers around core Wasm modules, giving us a way to statically link them together and include high-level interfaces and types in the binary.
The component model provides a general solution, with a path to standardization, for the high-level types and interfaces that platforms currently implement in a huge variety of bespoke ways. If we want to prevent fragmentation, reduce the amount of duplicate work done in the Wasm + SQL ecosystem, and make extensions work in a wide variety of projects and products, the component model and WASI are the answer.
That’s why SingleStoreDB is championing the WASI SQL Embedding proposal, which describes how Wasm can be embedded in SQL environments as extensions. The standard will leverage the component model and its interfaces to provide a way for users to create SQL extensions using only open source component model tools like cargo-component and ComponentizeJS.
The WASI SQL Embedding proposal is fully open source and part of the WASI subgroup. If you’re interested in being part of a more cohesive and less fragmented SQL-extension ecosystem based on Wasm, come join us.