Vasu Menon

The Human Test

The Human Condition by René Magritte, 1933, National Gallery of Art, Washington DC

Why

After deploying this website, I came to realize a fundamental fact of the internet: it’s full of bots.

Though most bot traffic is detected and filtered by the Umami analytics system, I’m still skeptical that the numbers I see reflect actual human traffic on the website.

I wanted to create some sort of system where I can, almost surely, verify the “human-ness” of this site’s visitors. The Human Test page is just that.

There’s also a slightly interesting but technical reason, which I will discuss later.

The Idea

The idea is simple: a form (not a CAPTCHA[1]) that asks a few rudimentary questions about the purpose of your visit. As an optional reward, you get to put your name (and even a message) on “the wall”. The wall is hidden until submission, so submitting feels like opening something (a true reward), not appending to a list you’ve already seen. I hope this is enough incentive for people to try and prove their human-ness.

The Stack

The site already runs on Cloudflare Pages, so the form submission goes to a Pages Function. Before it writes anything to the database, it does a few checks. There’s a honeypot field bots fill in and humans never see, a time check that rejects anything submitted in under two seconds, and IP-hash rate limiting (one submission per 24 hours, hashed with a salt, never stored raw[2]).
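
The three checks can be sketched roughly as follows. This is a minimal TypeScript sketch, not the site’s actual code: the field names `website` and `renderedAt`, the `Verdict` values, and the `checkSubmission` and `hashIp` helpers are all assumptions made for illustration. Only the two-second threshold, the 24-hour window, and the salted hash come from the description above.

```typescript
// Hypothetical shape of a form submission; the field names are assumptions.
interface Submission {
  website: string;    // honeypot field: hidden from humans, so it should stay empty
  renderedAt: number; // millisecond timestamp recorded when the form was rendered
}

type Verdict = "ok" | "honeypot" | "too_fast" | "rate_limited";

const MIN_FILL_MS = 2_000;                   // reject anything faster than two seconds
const RATE_WINDOW_MS = 24 * 60 * 60 * 1000;  // one submission per 24 hours

// lastSeenAt is the time of the previous accepted submission for the same
// IP hash, or null if that hash has not been seen before.
function checkSubmission(
  sub: Submission,
  now: number,
  lastSeenAt: number | null,
): Verdict {
  if (sub.website.trim() !== "") return "honeypot";
  if (now - sub.renderedAt < MIN_FILL_MS) return "too_fast";
  if (lastSeenAt !== null && now - lastSeenAt < RATE_WINDOW_MS) {
    return "rate_limited";
  }
  return "ok";
}

// Salted SHA-256 of the IP via the Web Crypto API, so the raw address
// never has to be stored. Returns a lowercase hex string.
async function hashIp(ip: string, salt: string): Promise<string> {
  const bytes = new TextEncoder().encode(`${salt}:${ip}`);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, "0"),
  ).join("");
}
```

In a handler, the output of `hashIp` would serve as the lookup key for `lastSeenAt`. One caveat of this sketch: a render timestamp supplied by the client can be forged, so a stricter version might issue or sign it on the server.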

Submissions land in a Cloudflare D1 database. Each successful write also triggers a Resend email to my inbox with the full submission.
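
A minimal sketch of that write-then-notify step, under stated assumptions: the `submissions` table and its columns, the `WallEntry` shape, and the `buildEmailRequest` and `saveAndNotify` helpers are hypothetical names, not the site’s real schema or code. The D1 calls (`prepare`, `bind`, `run`) and Resend’s `POST https://api.resend.com/emails` endpoint are real, but they are described here through small local types so the example stays self-contained.

```typescript
// Minimal structural types covering only the D1 calls used below.
interface D1Statement {
  bind(...values: unknown[]): D1Statement;
  run(): Promise<unknown>;
}
interface D1Like {
  prepare(sql: string): D1Statement;
}

// Hypothetical shape of one wall entry.
interface WallEntry {
  name: string;
  message: string;
  createdAt: string; // ISO 8601 timestamp
}

interface EmailRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the HTTP request for Resend's send-email endpoint.
function buildEmailRequest(
  entry: WallEntry,
  apiKey: string,
  from: string,
  to: string,
): EmailRequest {
  return {
    url: "https://api.resend.com/emails",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        from,
        to,
        subject: `Human Test: ${entry.name}`,
        text: JSON.stringify(entry, null, 2), // full submission as plain text
      }),
    },
  };
}

type Sender = (
  url: string,
  init: EmailRequest["init"],
) => Promise<{ ok: boolean }>;

// Writes the entry first and only emails after the write has succeeded.
// Returns whether the email request was accepted.
async function saveAndNotify(
  db: D1Like,
  entry: WallEntry,
  send: Sender,
  apiKey: string,
  from: string,
  to: string,
): Promise<boolean> {
  await db
    .prepare(
      "INSERT INTO submissions (name, message, created_at) VALUES (?, ?, ?)",
    )
    .bind(entry.name, entry.message, entry.createdAt)
    .run();
  const req = buildEmailRequest(entry, apiKey, from, to);
  const res = await send(req.url, req.init);
  return res.ok;
}
```

Passing the sender in as a parameter (in a Pages Function it would simply wrap the global `fetch`) keeps the ordering easy to exercise with fakes, which is the only reason for that design choice here.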

The More Interesting Reason

The site already runs Umami, which gives me page views, referrers, session durations, etc., all without cookies. What it cannot tell me with confidence is whether those visitors are human. Every submission on the Human Test page, however, gives me a ground-truth label: if I cross-reference the submission timestamps against the analytics sessions, I get feature vectors with known labels.
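
The cross-referencing step might look something like the sketch below. Everything in it is an assumption for illustration: the `Session` shape is not Umami’s actual export schema, the three features are placeholders, and the five-second tolerance is an arbitrary guess at clock skew between the form handler and the analytics data.

```typescript
// Hypothetical, simplified view of one analytics session.
interface Session {
  id: string;
  startedAt: number; // ms since epoch
  endedAt: number;   // ms since epoch
  pageViews: number;
  hasReferrer: boolean;
}

interface LabeledExample {
  sessionId: string;
  features: number[];      // [pageViews, durationSeconds, hasReferrer as 0/1]
  confirmedHuman: boolean; // true only if a form submission matched this session
}

// Marks a session as confirmed human when any submission timestamp falls
// inside the session window, widened by toleranceMs on both sides.
function labelSessions(
  sessions: Session[],
  submissionTimes: number[],
  toleranceMs = 5_000,
): LabeledExample[] {
  return sessions.map((s) => {
    const confirmedHuman = submissionTimes.some(
      (t) => t >= s.startedAt - toleranceMs && t <= s.endedAt + toleranceMs,
    );
    return {
      sessionId: s.id,
      features: [
        s.pageViews,
        (s.endedAt - s.startedAt) / 1000,
        s.hasReferrer ? 1 : 0,
      ],
      confirmedHuman,
    };
  });
}
```

Two limits of this sketch are worth keeping in mind. A session without a matching submission is unlabeled, not a confirmed bot, since most humans will never fill out the form. And if two sessions overlap in time, a single timestamp would mark both of them, so matching on timestamps alone can be ambiguous.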

With enough confirmed human sessions I can start building a small classifier: given an analytics fingerprint, how likely is this visitor to be human? It’s a toy problem, but it’s my own data, and I collected it myself. There’s a particular charm to that.

I don’t have a concrete plan for the model yet, but once there’s enough data, I’ll revisit the classifier idea. Until then, please go fill out the form.

  1. If I find that there are actual bots spamming my form, I will 100% immediately implement a CAPTCHA, but for now, it’s ok. 

  2. Shoutout Software Security.