Being a human on the internet

If you were born before… let’s say 2005, I want to congratulate you on your new privilege. Your online accounts and their associated activity show that you existed before we had AI that could generate convincing text. So long as you continue to use the same email address or publish on the same blog, the services you use and the people you’re communicating with can reasonably assume that you’re human. If you happen to be uncritically regurgitating the thoughts of others, well, we can at least be reassured that it’s happening the traditional way.

Suppose that in 2030 a 16-year-old creates an online journal and starts writing. What would they have to do to convince you that they’re a real person and not a bot programmed to influence your views through the medium of creative writing? It seems difficult to prove. For business or government purposes we have lots of ways to identify ourselves. Piggy-backing off anything involving money is popular: using a credit card, say, to prove that you’re a real person. These strategies aren’t so good for communicating with strangers online. You’re not going to publish your driver’s licence on your website. Even if you did, most people don’t have a way to check whether it’s valid, and it wouldn’t stop someone copying your licence onto their own website.

But what do I care whether the journal is written by a real person? If a story moves and influences me, isn’t that a merit of the text regardless of who wrote it? Shouldn’t changes to my worldview be based objectively on what’s written rather than who’s speaking? Besides, everything an AI spits out is trained on human text, so am I not ultimately being influenced by human thoughts?

If you’ve spent much time in a car with air conditioning you’re probably familiar with the “recirculate” button, which chooses whether to draw air from outside or reuse the air already inside the cabin, so that the heating or cooling effect compounds more quickly. It’s irrational and I know it, but switching on the recirculated air gives me a feeling of existing in a dirty and stale environment, with an outside chance of asphyxiation.

That’s how I instinctively feel about large language models. Dirty and stale, with an outside chance of Roko’s basilisk. And really, feelings are sufficient. If AI-generated text gives me the heebie-jeebies, as I’m certain it will for many others, then I’m going to take an active interest in the provenance of what I read online. I don’t need to be “objective” in my quest for authentic human content lest I hurt the feelings of a computer program. I can care about the distinction if I want to.

In any case, I don’t think this constitutes a failure to be rational about what I’m reading. In most contexts the author of a text matters a great deal. Of course there are your basic concerns of interpretation like minding the author’s motivations and prejudices. More interestingly, there’s a chance that a human is trying to tell the truth, or at least what seems true from their perspective. If you have a program that produces combinations of words that seem good in terms of a mathematical optimisation of training data, truth is an emergent property obtained by chance, and only verifiable by checking outside the system.

Faced with a human author, you can extend good faith and assume they are telling the truth. Whether or not this is wise in a given situation, the concept makes sense as something you can choose to do. With a language model there is no moral obligation of honesty. To give it the benefit of the doubt is to put faith in a personal attribute that doesn’t exist. If a human lies or distorts the truth, there is the possibility of using context to read between the lines. What is there to read between the lines of a language model’s text? Everything and nothing is suspect, all at once.

I think there is going to be a deep interest in proving humanness online, and I’m not talking about captchas. The question is how we do it.

The most basic approach is to rely on the big central services. If Facebook were overrun by AI profiles, they could require, or at least offer, verification via government ID. This would mark your profile as verified, and through the magic of single sign-on your verified Facebook account would let you into other services, with everybody knowing your real name and, of course, your humanness.
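To make that concrete, here’s a rough sketch of what the receiving service’s side of that handshake might look like. It assumes an OpenID Connect-style ID token carrying a made-up `verified_human` claim; no real identity provider issues such a claim today, and the issuer URL is invented.

```python
# A minimal sketch, not a real API: check a hypothetical
# "verified_human" claim inside an OpenID Connect ID token.
import jwt  # PyJWT

HYPOTHETICAL_ISSUER = "https://sso.example.com"  # made-up identity provider

def is_verified_human(id_token: str, issuer_public_key: str) -> bool:
    """Return True if the token is validly signed by the identity
    provider and carries the (hypothetical) humanness attestation."""
    try:
        claims = jwt.decode(
            id_token,
            issuer_public_key,
            algorithms=["RS256"],
            issuer=HYPOTHETICAL_ISSUER,
        )
    except jwt.InvalidTokenError:
        # Bad signature, wrong issuer, expired token:
        # treat them all as "not proven human".
        return False
    return claims.get("verified_human") is True
```

The plumbing is the easy part; the question is whether you want one company holding everyone’s “is a human” bit.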

Verifying people on the internet by inspecting government ID sounds familiar… oh yes, that’s right: PGP keys. The web of trust offers a decentralised solution: humans hold key pairs, and we sign each other’s public keys to prove identity and humanness. We never used to have to say that last part out loud, but it was implied.
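Stripped of ceremony, the web of trust is just a graph problem: keys are nodes, “Alice signed Bob’s key” is an edge, and you accept a stranger’s key if it sits within a few hops of keys you’ve verified in person. A toy sketch (the names are hypothetical, and real PGP trust computation, as in GnuPG, is rather more nuanced):

```python
# Toy model of the web of trust: accept a key if it is reachable
# within max_hops signatures of our own key.
from collections import deque

signatures = {  # signer -> keys they have signed (all hypothetical)
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "carol": ["dave"],
}

def is_trusted(key: str, root: str = "me", max_hops: int = 3) -> bool:
    """Breadth-first search outward from our own key, bounded by max_hops."""
    queue = deque([(root, 0)])
    seen = {root}
    while queue:
        current, hops = queue.popleft()
        if current == key:
            return True
        if hops < max_hops:
            for signed in signatures.get(current, []):
                if signed not in seen:
                    seen.add(signed)
                    queue.append((signed, hops + 1))
    return False

print(is_trusted("dave"))     # True: me -> alice -> carol -> dave
print(is_trusted("mallory"))  # False: nobody we trust has signed it
```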

Could it make a comeback? I think we’ve already learnt this lesson twice, first from PGP advocates and then from cryptocurrency advocates: most people are uninterested in learning how to look after secret keys properly, and unwilling to accept the consequences if a key leaks. For this reason I’m also sceptical about any web of trust where the first step is having an Ethereum wallet. Ordinary folks might care about being duped by AI bots, but not that much.

So are we stuck with Facebook being the arbiter of who is human? That sounds pretty dismal to me. The government? Also pretty dismal. Maybe we’re just going to rely on good old word-of-mouth to find fellow humans in the AI swamp, with a sprinkling of our own judgement? Maybe we’re going to get a little bit more Offline and use the physical world to build our human connections? Maybe this will be a generational difference and kids won’t care in the slightest whether they’re talking to a human or a machine?

I still don’t know. But in case it matters… *waves* hello from an established human.