Beyond Gemini?

I’ve been thinking about Gemini’s place in the world again after a critical but fair article, “Why Gemini is not my favorite internet protocol”, was posted to Lobsters. When I arrived at the comments page, the discussion was a fairly one-sided dig at how Gemini doesn’t fix anything.

I made a few points to balance that out: that it’s essentially a social art project in response to the web, a way to escape the web world and read and publish using simple software. Replacing the web was always a non-goal, which makes Gemini a perpetual disappointment to those who do want to replace the web. At first it seems so promising and then after using it for a while you realise, hang on, this has some serious practical limits.

This wasn’t by accident, or because someone forgot that tabular data exists. Anything Gemini could do but doesn’t was almost certainly discussed on the mailing list at some point, at some length. Benevolent dictator Solderpunk entertained these discussions for a while but ultimately put their foot down and pulled the scope of gemtext back to what was originally intended: headings, links, bullets, blockquotes and preformatted blocks. When they delegated responsibility for finalising the spec in February 2021, they stated this position unambiguously:

“There is no change in my long-standing overall stance that Gemini requires no new features and that remaining work on the specification should be directed only toward addressing edge cases, removing ambiguities and increasing overall consistency.”

Two things are clear at this point. Firstly, Gemini is a success. It’s basically done, it’s filling a niche, and a certain number of people are enjoying themselves using it*. Secondly, there is plenty of scope for someone else to have a go. Why stop at Gemini? Any new proposals can use Gemini as a reference point without being bound by anything Solderpunk did.

From my own playing with the system, and from reading the mailing list on and off, I can think of a number of serious limitations in Gemini’s document format offhand:

  • Having links only on their own lines, rather than inline within text, is really a drag
  • No way to indicate basic formatting in-line in text, such as bold, emphasis, underline or code
  • No way to specify syntax highlighting
  • No tabular data without resorting to hacks like preformatted blocks
  • Readable preformatted text cannot be differentiated from artistic content like ASCII art, affecting accessibility
  • No inline images, diagrams, mathematics
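
Most of the items above follow from gemtext being strictly line-oriented: the type of a line is decided by its first few characters, and nothing inside a line is ever interpreted. Here is a minimal sketch of that idea in Python. The function name `classify_gemtext` and the tuple layout are made up for illustration and are not taken from any real Gemini library.

```python
FENCE = "`" * 3  # the preformatted toggle marker, built this way to avoid a literal fence


def classify_gemtext(text):
    """Return a list of (line_type, content) pairs for a gemtext document.

    Illustrative sketch only: the line type depends solely on how the
    line starts, so there is nowhere to put inline links or emphasis.
    """
    result = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            # Toggle lines switch preformatted mode on and off.
            preformatted = not preformatted
            result.append(("toggle", line[3:].strip()))
        elif preformatted:
            # Inside a preformatted block nothing is interpreted at all.
            result.append(("pre", line))
        elif line.startswith("=>"):
            parts = line[2:].split(maxsplit=1)
            url = parts[0] if parts else ""
            label = parts[1] if len(parts) > 1 else url
            result.append(("link", (url, label)))
        elif line.startswith("###"):
            result.append(("heading3", line[3:].strip()))
        elif line.startswith("##"):
            result.append(("heading2", line[2:].strip()))
        elif line.startswith("#"):
            result.append(("heading1", line[1:].strip()))
        elif line.startswith("* "):
            result.append(("list", line[2:].strip()))
        elif line.startswith(">"):
            result.append(("quote", line[1:].strip()))
        else:
            result.append(("text", line))
    return result


sample = "\n".join([
    "# A heading",
    "See => gemini://example.org/ for *more*",
    "=> gemini://example.org/ Example capsule",
    FENCE,
    "name | value",
    FENCE,
])

for line_type, content in classify_gemtext(sample):
    print(line_type, content)
```

In this sketch the second line of `sample` comes back as plain text, arrow and asterisks included, and the would-be table inside the preformatted block is just an opaque string. That is the whole format, which is why links, emphasis and tables have nowhere to live.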

Then there are some obvious limitations in the Gemini protocol itself.

  • No way to submit long-form content, like a multi-paragraph comment on a blog
  • The Trust On First Use (TOFU) server certificate model is simple but may not be strong enough for many uses
  • No way for a server to indicate the size of a resource until it’s all downloaded
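
Two of these limits can be seen directly in the wire format. As I understand the specification, a request is a single URL followed by CRLF, with the URL capped at 1024 bytes, and user input travels as a percent-encoded query string in that URL. The response header is just a two-digit status, a space and a meta string, with no field for a content length. The following Python sketch works offline on bytes only; `build_request` and `parse_header` are hypothetical helper names, and `example.org` is a placeholder host.

```python
from urllib.parse import quote

MAX_URL_BYTES = 1024  # URL length cap, as I read the Gemini specification


def build_request(url, user_input=None):
    """Build the bytes of a Gemini request line (illustrative sketch).

    User input, if any, is percent-encoded and appended as the query
    string, so it counts against the same 1024-byte URL budget.
    """
    if user_input is not None:
        url = url + "?" + quote(user_input, safe="")
    encoded = url.encode("utf-8")
    if len(encoded) > MAX_URL_BYTES:
        raise ValueError(
            f"URL is {len(encoded)} bytes, over the {MAX_URL_BYTES}-byte limit"
        )
    return encoded + b"\r\n"


def parse_header(header_line):
    """Split a response header into (status, meta).

    For a success status the meta string is a MIME type. There is no
    size field: the client reads the body until the server closes
    the connection.
    """
    line = header_line.decode("utf-8").rstrip("\r\n")
    status, _, meta = line.partition(" ")
    return int(status), meta


print(build_request("gemini://example.org/comment", "hello world"))
print(parse_header(b"20 text/gemini\r\n"))

long_comment = "word " * 300  # roughly a few paragraphs of text
try:
    build_request("gemini://example.org/comment", long_comment)
except ValueError as err:
    print("rejected:", err)
```

A short reply fits comfortably, but a few paragraphs blow through the URL budget once percent-encoding is applied, and the parsed header has nothing in it that could tell a client how much data to expect.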

To repeat, the existence of all these weaknesses is not a failure of Gemini. It was never intended to do those things. The challenge for someone who is suitably bored or innovative is to look at lists like these, expand them a bit, and work out how we could get the results we want without it inevitably snowballing into the madness that is arbitrary code execution in our browsers, people painting pictures with CSS, WASM, Web Bluetooth, and all the rest of it. There is a gaping chasm between Gemini and the modern web where something like this could be created.

It won’t be me. I have enough hobbies without trying to fix the web. But keep your eyes peeled for the person who does.


*The author included. For my part, I think I have the dubious honour of writing the first proprietary client, but that’s another story.