
Souped-Up Gopher: Project Gemini’s Plan to Revolutionize Internet Browsing

14 Mar 2021 6:00am

Screenshot of a Gemini site as viewed in Lagrange

A team of highly motivated and principled developers is quietly building an entirely new web of content, served by different servers and accessed with an entirely new kind of software.

The project has been consciously designed to invite the involvement of others as its community grows. The plan also involves keeping out, hopefully forever, some of the worst features that crept into our modern web. It offers a fresh and thought-provoking perspective on some of the choices we’ve already made in today’s online world, and it raises interesting questions. If you were designing a new protocol for sharing documents and files over our vast global networks, what would you leave in?

And more importantly, what would you leave out?

‘Souped-Up Gopher’

“Project Gemini” began in June 2019, the work of a self-described “loose and informal community” that aims to address what they call the “misfeatures” of today’s web browsing, like pop-up ads, tracking, and autoplaying videos.

“You may think of Gemini as ‘the web, stripped right back to its essence’,” explains the official FAQ, “or as ‘Gopher, souped-up and modernized just a little’, depending upon your perspective (the latter view is probably more accurate).”

For those who weren’t around in the internet’s early days, Gopher was an early protocol for searching and browsing online content. It was rapidly displaced by the web, which offered easier navigation and support for images. Gemini builds on Gopher’s base and adds a few features.

Its improvements on Gopher include allowing the use of non-ASCII character sets, and identifying binary content with MIME types (allowing the transfer of a variety of formats including plain text, rich text, HTML, and Markdown). Crucially, it also allows links within documents.

But Project Gemini put just as much consideration into what to leave out. Anyone who’s read a web server’s log knows that HTTP requests include a wealth of information, including “User-Agent” and “Referer” headers. But there’s only one field in a Gemini request: what URL was requested.
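To make the contrast with HTTP concrete, here is a minimal Python sketch of what a Gemini request looks like on the wire. It assumes the request format given in the Gemini specification (an absolute URL followed by a carriage return and line feed, with a 1024-byte cap on the URL); the function name build_request and the example.org address are purely illustrative.

```python
from urllib.parse import urlsplit

# Assumption: the 1024-byte cap comes from the Gemini specification.
MAX_URL_BYTES = 1024


def build_request(url: str) -> bytes:
    """Return the entire request: the absolute URL followed by CRLF.

    There are no headers, no cookies and no user-agent string; the URL is
    the only thing the server ever learns from the request itself.
    """
    parts = urlsplit(url)
    if parts.scheme != "gemini" or not parts.hostname:
        raise ValueError("expected an absolute gemini:// URL")
    encoded = url.encode("utf-8")
    if len(encoded) > MAX_URL_BYTES:
        raise ValueError("URL is longer than 1024 bytes")
    return encoded + b"\r\n"
```

Calling build_request("gemini://example.org/docs/faq.gmi") returns the URL bytes plus the two-byte line ending, and that is the whole request.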

“This goes a very long way to preventing user tracking,” notes the FAQ.

And since there’s exactly one kind of transaction — getting something — the response headers can indicate the type of content being served… and nothing else. “To minimize the risk of Gemini slowly mutating into something more web-like, it was decided to include one and exactly one piece of information in the response header for successful requests.”
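A rough Python sketch of how a client might read that single response header follows. It assumes the layout given in the Gemini specification: a two-digit status code, a space, and one "meta" string, where codes in the 20s mean success and the meta string is then the MIME type. The names parse_header and content_type are hypothetical.

```python
from typing import Optional, Tuple


def parse_header(line: bytes) -> Tuple[int, str]:
    """Split a response header line into (status, meta)."""
    text = line.decode("utf-8").rstrip("\r\n")
    status, _, meta = text.partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError("status must be exactly two digits")
    return int(status), meta


def content_type(line: bytes) -> Optional[str]:
    """Return the MIME type for a successful response, otherwise None."""
    status, meta = parse_header(line)
    # Assumption: statuses 20-29 are the success range, and for those the
    # meta field carries the content type and nothing else.
    return meta if 20 <= status <= 29 else None
```

For a header such as b"20 text/gemini; charset=utf-8\r\n", content_type returns the MIME string. For any non-success status it returns None and leaves the interpretation of the meta field to the caller.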

One missing header is the protocol version number, because there will only ever be one version of the protocol. “[T]he plan is to ‘get it right the first time’ as much as possible, then freeze the protocol specification forever after, without upgrades, enhancements or extensions… There are plenty of things that Gemini is useful for and good at right now, and there is no reason to think it won’t be useful for and good at those same things decades from now.”

Or, as they put it later, a desire to eliminate the tracking common in web browsers “manifests as a deliberate non-extensibility in many parts of the Gemini protocol.”

In short, the FAQ describes Project Gemini as “a ‘less is more’ reaction against web browsers and servers becoming too complicated and too powerful…”

“Suggestions for new features will not be considered, as the protocol is considered feature complete,” explains the FAQ. “[T]he main focus of the project now is on growing the community around the protocol, as well as working on translating the existing specification into a more precise and formal version which might be considered for submission to internet standards bodies such as IETF and IANA.”

Taking Control

The project also offers a refreshing new perspective on the evolution of our HTML-based web — for example, with the conspicuous absence of a styling language like CSS. “Gemini instead takes the position that visual styling of Gemini content should be under the sole and direct control of the reader, not the writer,” argues the FAQ. “Not everybody has the same taste in colors and fonts, and no single way of styling a page will be optimal for all readers, all devices and all lighting conditions… It’s much simpler, and in fact much more liberating for content authors, to let content just be content, and leave styling to the client.”

To maintain compatibility with a do-it-yourself ethos, Project Gemini incorporates familiar, standardized, mature technologies like MIME and TLS, along with a URL-like syntax. This is also an intentional decision, specifically aimed at simplifying the creation of new Gemini-compatible browsing software (which “should be a feasible weekend programming project for a single developer…”).
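To give a sense of how small such a weekend client could be, here is a hedged Python sketch that uses only the standard library. It assumes Gemini’s default port of 1965 and, as a deliberate shortcut, switches off certificate verification; a real client would more likely remember each server’s certificate on first contact. The function names endpoint and fetch are illustrative, not part of any official tool.

```python
import socket
import ssl
from typing import Tuple
from urllib.parse import urlsplit

DEFAULT_PORT = 1965  # assumption: the default port for Gemini servers


def endpoint(url: str) -> Tuple[str, int]:
    """Extract the host and port to connect to from a gemini:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "gemini" or not parts.hostname:
        raise ValueError("expected an absolute gemini:// URL")
    return parts.hostname, parts.port or DEFAULT_PORT


def fetch(url: str, timeout: float = 10.0) -> Tuple[str, bytes]:
    """Fetch a URL and return (header line, body bytes)."""
    host, port = endpoint(url)
    context = ssl.create_default_context()
    # Shortcut for this sketch only: many Gemini servers use self-signed
    # certificates, so verification is disabled instead of pinning them.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(url.encode("utf-8") + b"\r\n")
            while True:
                data = tls.recv(4096)
                if not data:  # the server closes the connection when done
                    break
                chunks.append(data)
    header, _, body = b"".join(chunks).partition(b"\r\n")
    return header.decode("utf-8"), body
```

Calling fetch with the address of a live Gemini server returns the one-line header and the raw body. Everything else a browser does (rendering, history, bookmarks) is layered on top of this one exchange.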

“Modern web browsers are so complicated that they can only be developed by very large and expensive projects,” notes the FAQ. “This naturally leads to a very small number of near-monopoly browsers, which stifles innovation and diversity and allows the developers of these browsers to dictate the direction in which the web evolves.”

The content for Geminispace will be written in what they describe as a lightweight hypertext. There are three levels of headings and one kind of list, as well as a greater-than symbol that indicates quoted text. (At one point the FAQ notes it’s the equivalent of HTML where the only tags are <p>, <pre>, <a>, <h1> through <h3>, <ul> and <li>, and <blockquote>.)
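Because every line in this format stands on its own, a parser can be very short. The Python sketch below classifies each line by its leading characters. It assumes the usual Gemtext markers ("#" through "###" for headings, "* " for list items, ">" for quotes, "=>" for links, and a line of three backticks to toggle preformatted text); the Line class and parse_gemtext function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

FENCE = "`" * 3  # three backticks toggle preformatted mode


@dataclass
class Line:
    kind: str      # "h1", "h2", "h3", "li", "quote", "link", "pre" or "text"
    text: str
    url: str = ""  # only set for links


def parse_gemtext(source: str) -> List[Line]:
    """Classify each line of a Gemtext document by its leading marker."""
    result = []
    preformatted = False
    for raw in source.splitlines():
        if raw.startswith(FENCE):
            preformatted = not preformatted
        elif preformatted:
            result.append(Line("pre", raw))
        elif raw.startswith("=>"):
            fields = raw[2:].split(maxsplit=1)
            if not fields:
                result.append(Line("text", raw))
            else:
                # A link without a label is shown using its URL.
                label = fields[1] if len(fields) > 1 else fields[0]
                result.append(Line("link", label, fields[0]))
        elif raw.startswith("###"):
            result.append(Line("h3", raw[3:].strip()))
        elif raw.startswith("##"):
            result.append(Line("h2", raw[2:].strip()))
        elif raw.startswith("#"):
            result.append(Line("h1", raw[1:].strip()))
        elif raw.startswith("* "):
            result.append(Line("li", raw[2:].strip()))
        elif raw.startswith(">"):
            result.append(Line("quote", raw[1:].strip()))
        else:
            result.append(Line("text", raw))
    return result
```

The parser records only what kind of line each one is. How a heading or quote actually looks is left entirely to whatever client displays the result.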

The “Project Gemini” home page calls the resulting protocol “heavier than gopher… lighter than the web. Will not replace either.” But they also highlight another principle: “Takes user privacy very seriously.” Their ultimate goal is to build “a clearly demarcated space where people can go to consume *only* that kind of content in *only* that kind of way,” arguing that it’s inevitable that stripped-down protocols like Gemini (or Gopher) will create “alternative, simple-by-design spaces with obvious boundaries and hard restrictions… You can relax and get on with your browsing, and follow links to sites you’ve never heard of before, which just popped up yesterday, and be confident that they won’t try to track you or serve you garbage because they *can’t*.

“You can do all this with a client you wrote yourself, so you *know* you can trust it.”

New Day Dawning

The official FAQ describes the protocol as “largely finished” (with an exception for “small changes to remove ambiguity and address edge cases”).

And the community is growing. The Gemini software page already lists a wide range of choices, with 24 different servers written in a variety of programming languages (including Go, Rust, Python, C, Erlang, Clojure, Lisp, and even PHP). There are at least 40 different clients for browsing Gemini space — including software that runs on Android and iOS devices. And for content authors, there’s syntax highlighting for writing “Gemtext” in popular text editors like Emacs, Nano, and Vim.

There’s also a #gemini channel on the tilde.chat IRC server. (Overheard this week: “You don’t need Docker for this.”)

The Gemini FAQ reports that as of early 2021 there were 500 domains and 600 distinct IP addresses — though as of Thursday there were 236,604 Gemini-formatted URLs, according to stats compiled by gemini.bortzmeyer.org.

It feels a bit like the web must’ve felt in 1993 — populated with early adopters sharing information on bare-bones sites, but with a palpable sense of excitement and promise. There are already a few early, primitive search engines, the first pages of a Gemini-compatible Wiki, and even a portal with Gemini-to-HTTP proxy services.

As one site puts it, “Gemini used to have no culture, but now it does because people.”
