Why
After deploying this website, I came to realize a fundamental fact of the internet: it’s full of bots.
Though Umami, my analytics system, detects and filters most bot traffic, I’m still skeptical that the numbers I see accurately reflect human traffic on the website.
I wanted to create some sort of system where I can, almost surely, verify the “human-ness” of this site’s visitors. The Human Test page is just that.
There’s also a slightly interesting but technical reason, which I will discuss later.
The Idea
The idea is simple: a form (not a CAPTCHA1) that asks a few rudimentary questions about the purpose of your visit. As an optional reward, you get to put your name (and even a message) on “the wall”. The wall is hidden until you submit, so submitting feels like opening something (a true reward), not appending to a list you’ve already seen. I hope this is enough incentive for people to try and prove their human-ness.
The Stack
The site already runs on Cloudflare Pages, so the form submission goes to a Pages Function. Before it writes anything to the database, it does a few checks. There’s a honeypot field bots fill in and humans never see, a time check that rejects anything submitted in under two seconds, and IP-hash rate limiting (one submission per 24 hours; the IP is hashed with a salt and never stored raw2).
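A minimal sketch of how those three checks could fit together, written as a pure function so it is easy to test. The field names (`website`, `renderedAtMs`, and so on), the `Verdict` type, and the in-memory `lastSeen` map are all assumptions made for illustration; the real function may be organized differently, and the salted hashing itself is left out here.

```typescript
// Hypothetical shape of a parsed form submission; only the three checks
// themselves come from the description above.
interface Submission {
  website: string;       // honeypot field: hidden from humans, so it should stay empty
  renderedAtMs: number;  // when the form was rendered (epoch milliseconds)
  submittedAtMs: number; // when the form was submitted (epoch milliseconds)
  ipHash: string;        // salted hash of the client IP, computed elsewhere
}

type Verdict = "ok" | "honeypot" | "too_fast" | "rate_limited";

const MIN_FILL_MS = 2000;                   // reject anything under two seconds
const RATE_WINDOW_MS = 24 * 60 * 60 * 1000; // one submission per 24 hours

// lastSeen maps an IP hash to the time of its last accepted submission.
// In production this lookup would hit the database instead of a Map.
function checkSubmission(s: Submission, lastSeen: Map<string, number>): Verdict {
  if (s.website.trim() !== "") return "honeypot";
  if (s.submittedAtMs - s.renderedAtMs < MIN_FILL_MS) return "too_fast";
  const previous = lastSeen.get(s.ipHash);
  if (previous !== undefined && s.submittedAtMs - previous < RATE_WINDOW_MS) {
    return "rate_limited";
  }
  return "ok";
}
```

The checks run cheapest first, so a bot that trips the honeypot never costs a rate-limit lookup.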
Submissions land in a Cloudflare D1 database. Each successful write also triggers a Resend email to my inbox with the full submission.
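As a rough illustration of the write-then-notify step, the sketch below only builds the two payloads: a parameterized SQL insert and the HTTP request for Resend’s email endpoint. The table name, column names, `WallEntry` fields, and email subject are hypothetical, not the site’s actual schema.

```typescript
// Hypothetical record shape; the real form fields may differ.
interface WallEntry {
  name: string;
  message: string;
  reason: string;    // answer to "why are you visiting?"
  createdAt: string; // ISO 8601 timestamp
}

// Builds a parameterized statement; table and column names are assumptions.
function buildInsert(e: WallEntry): { sql: string; params: string[] } {
  return {
    sql: "INSERT INTO submissions (name, message, reason, created_at) VALUES (?, ?, ?, ?)",
    params: [e.name, e.message, e.reason, e.createdAt],
  };
}

interface EmailRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request for Resend's send-email endpoint with the full submission as text.
function buildEmailRequest(e: WallEntry, apiKey: string, from: string, to: string): EmailRequest {
  return {
    url: "https://api.resend.com/emails",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        from,
        to,
        subject: `New human: ${e.name}`,
        text: JSON.stringify(e, null, 2),
      }),
    },
  };
}
```

Inside a Pages Function, the first result would typically be passed to a D1 binding as `env.DB.prepare(sql).bind(...params).run()` (the binding name `DB` is an assumption), and the second to `fetch(url, init)` only after the write succeeds.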
The More Interesting Reason
The site already runs Umami, which gives me page views, referrers, session durations, and so on, all without cookies, but I can’t be truly confident that these visitors are human. Every submission on the Human Test page, however, gives me a ground-truth label. If you cross-reference the submission timestamps against the analytics sessions, you get feature vectors with known labels.
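The cross-referencing step could look something like the sketch below, which marks a session as confirmed human when a submission timestamp falls inside its time window. The `Session` shape, the `pageViews` feature, and the `toleranceMs` parameter are assumptions for illustration, not Umami’s actual export format.

```typescript
// Hypothetical, simplified view of an analytics session.
interface Session {
  id: string;
  startMs: number;   // session start (epoch milliseconds)
  endMs: number;     // session end (epoch milliseconds)
  pageViews: number; // one example feature
}

interface LabeledSession extends Session {
  // true means "a form submission landed in this session";
  // false means "unconfirmed", not "bot".
  confirmedHuman: boolean;
}

// Labels each session by checking whether any submission time falls within
// [startMs - toleranceMs, endMs + toleranceMs]. If several sessions overlap
// the same timestamp, all of them are labeled; resolving that ambiguity
// would need an extra signal.
function labelSessions(
  sessions: Session[],
  submissionTimesMs: number[],
  toleranceMs: number = 0
): LabeledSession[] {
  return sessions.map((s) => ({
    ...s,
    confirmedHuman: submissionTimesMs.some(
      (t) => t >= s.startMs - toleranceMs && t <= s.endMs + toleranceMs
    ),
  }));
}
```

One caveat this makes visible: the form only ever produces positive labels, so the sessions without a submission are a mix of bots and humans who simply didn’t fill it out.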
With enough confirmed human sessions I can start building a small classifier: given an analytics fingerprint, how likely is this visitor to be human? It’s a toy problem, but it’s my own data, and I collected it myself. There’s a particular charm to that.
I don’t have a concrete plan for the model yet, but once there’s enough data, I’ll revisit the classifier idea. Until then, please go fill out the form.