What news from AWS re:Invent last week will have the most impact on you?
Amazon Q, an AI chatbot for explaining how AWS works.
Super-fast S3 Express storage.
New Graviton 4 processor instances.
Emily Freeman leaving AWS.
I don't use AWS, so none of this will affect me.
Operations / Software Development

TLA+ App Modeling Language Comes Correct with a Proper Foundation

Cloud providers use TLA+ to ensure the consistency of their systems. Now the Linux Foundation has set up an organization to get the design language in front of more developers.
Apr 24th, 2023 8:00am by
Featued image for: TLA+ App Modeling Language Comes Correct with a Proper Foundation
Image via Pixabay.

“Coding is to programming what typing is to writing,” mathematician and accidental distributed systems pioneer Leslie Lamport has said, by way of explaining the challenge developers face in building complex systems. In other words, programming requires more thought and planning than just slinging code into the terminal.

For Lamport, developers can lose track of the programs they write by getting too far down into the weeds, worrying about the syntax of the language itself. And this is how the bugs creep in, by devs losing focus on the program’s core algorithm (every program is basically an algorithm, for Lamport).

So, in the late 1990s, Lamport created the TLA+ formal specification language, which, for him, provides a high-level pseudocode to more easily understand and reason about complex programs, especially concurrent and distributed systems.

TLA+ basically provides a blueprint of how a program works, one that can be mathematically tested for correctness (Sample Code).

Now, the Linux Foundation wants to bring more formal discipline to the practice of software development. And so the organization has set up a foundation for TLA+.

Initial members will include the Amazon Web Services (AWS), Oracle and Microsoft. Lamport, now Microsoft Research distinguished scientist, will continue to work on the project.

No Longer a Secret

Despite its amazing powers, TLA+ isn’t particularly well-known in the dev community, admitted Markus Kuppe, Microsoft Principal Research Software Development Engineer, by e-mail.

TLA+ is not widely taught in academia, so engineers will face a steep learning curve if they want to learn it on the job. And given the corporate pressure to produce code rapidly certainly doesn’t lend itself to taking extra time to create a blueprint of what they’ve created. (Lamport himself has speculated that the fear of math, even among software engineers, has held the specification back from wider adoption).

Nonetheless, it may be time to take another look at TLA+, especially with the increase of distributed systems being build for cloud native computing.

“By using TLA+ to model systems, companies can avoid costly bugs, improve reliability, and better understand their systems’ behavior,”  Kuppe wrote. “Existing TLA+ users frequently report that the initial investment in writing blueprints yields significant dividends later.”

For the developer, Kuppe promised that “by using TLA+ to create correct systems,” developers will “avoid patching issues at 3:00 in the morning.”

Many top IT providers already use the technology, including Intel, AWS, Microsoft and Oracle (others keep their use confidential, Kuppe mentioned). Microsoft used it to ensure strong consistency in its Cosmos DB database service, as did AWS for its DynamoDB database service. Oracle has used it to reinforce its Oracle Cloud Infrastructure.

Accidental Expert

Lamport came to be an early pioneer in distributed systems pretty much by pure serendipity.

A mathematician by training, he had read a 1975 Network Working Group RFC on synchronizing duplicate databases over the ARPAnet. He wrote a response paper, which grew to be hella influential, borrowing from Einstein’s Special Relativity, to explain that causality can be used to describe any distributed system. He defined a distributed system as a state machine, where multiple computers cooperate to maintain this single machine. This idea went on to inform how most system builders think about distributed systems.

“I never even thought about a distributed system before I wrote that paper,” Lamport has said. Nonetheless, he won the 2013 Turing Award for this work (You might also know Lamport by way of his almost universally-used LaTeX typesetting system for formatting scientific documents).

Building off his work in distributed systems, he created TLA+ in 1993 as a way to reason about programs, using time-based Temporal Logic (hence the acronym TLA+: Temporal Logic of Actions). The sequential logic for a simple program, written for a single processor, can be readily understood by a developer. But when the execution gets spread across multiple systems, the number of possible executions, and their side effects, gets too complex for even the 10x programmer to grok. TLA+ provides a mathematical model to simplify things.

A Proper Home

The Foundation will provide training, fund research, develop tools and work to build a community around TLA+. It will also work to evolve the language to meet user needs.

A board for the TLA+ Foundation, comprising representatives from the foundation member companies, will lead the organization and determine its strategic direction.

The language is largely complete, so the foundation won’t have much to do in the way of further development (with more focus on removals than additions, Kuppe wrote), though it will offer a RFC process for suggesting changes.

The organization will also work on major upgrades for existing TLA+ support tools, including the model checker TLC model checker and the TLAPS proof system, as well as introduce new, developer-friendly tools.

“The foundation will help spread that work among more hands,” Lamport told the Linux Foundation, in a blog post to be released on Tuesday.

Group Created with Sketch.
THE NEW STACK UPDATE A newsletter digest of the week’s most important stories & analyses.