Preventing automated sign-ups
The Session goes through periods of getting spammed with automated sign-ups. I’m not sure why. It’s not like they do anything with the accounts. They’re just created and then they sit there (until I delete them).
In the past I’ve dealt with them in an ad-hoc way. If the sign-ups were all coming from the same IP addresses, I could block them. If the sign-ups showed some pattern in the usernames or emails, I could use that to block them.
Recently though, there was a spate of sign-ups that didn’t have any patterns, all coming from different IP addresses.
I decided it was time to knuckle down and figure out a way to prevent automated sign-ups.
I knew what I didn’t want to do. I didn’t want to put any obstacles in the way of genuine sign-ups. There’d be no CAPTCHAs or other “prove you’re a human” shite. That’s the airport security model: inconvenience everyone to stop a tiny number of bad actors.
The first step I took was the bare minimum. I added two form fields—called “wheat” and “chaff”—that are randomly generated every time the sign-up form is loaded. There’s a connection between those two fields that I can check on the server.
Here’s how I’m generating the fields in PHP:
$saltstring = 'A string known only to me.';
$wheat = base64_encode(openssl_random_pseudo_bytes(16));
$chaff = password_hash($saltstring.$wheat, PASSWORD_BCRYPT);
See how the fields are generated from a combination of random bytes and a string of characters never revealed on the client? To keep it from going stale, this string (the salt) includes something related to the current date.
Now when the form is submitted, I can check to see if the relationship holds true:
// Missing fields fall back to empty strings, so the check simply fails.
if (!password_verify($saltstring.($_POST['wheat'] ?? ''), $_POST['chaff'] ?? '')) {
    // Spammer!
}
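As a rough sketch (not the exact code running on The Session), here is one way the date-dependent salt could work end to end. The function names, the `Y-m-d` date format, and the choice to accept yesterday's salt as well as today's are all assumptions for illustration. It also swaps in `random_bytes()` for the random value. One thing to keep in mind: bcrypt only looks at the first 72 bytes of its input, so the secret plus the date needs to stay short enough that the random part still counts.

```php
<?php
// Hypothetical helpers: names, date format, and grace period are assumptions.

function signup_salt(string $secret, string $date): string
{
    // Mixing in a date means old wheat/chaff pairs stop working eventually.
    return $secret . $date;
}

function make_wheat_and_chaff(string $secret): array
{
    $wheat = base64_encode(random_bytes(16));
    $salt  = signup_salt($secret, gmdate('Y-m-d'));
    $chaff = password_hash($salt . $wheat, PASSWORD_BCRYPT);

    return ['wheat' => $wheat, 'chaff' => $chaff];
}

function wheat_matches_chaff(string $secret, string $wheat, string $chaff): bool
{
    // Try today's salt, then yesterday's, so a form loaded just before
    // midnight (UTC) still validates when it is submitted just after.
    foreach ([0, 86400] as $secondsAgo) {
        $salt = signup_salt($secret, gmdate('Y-m-d', time() - $secondsAgo));
        if (password_verify($salt . $wheat, $chaff)) {
            return true;
        }
    }

    return false;
}
```

The pair from `make_wheat_and_chaff()` would go into the two form fields, and `wheat_matches_chaff()` would run against the submitted values.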
That’s just the first line of defence. After thinking about it for a while, I came to the conclusion that it wasn’t enough to just generate some random form field values; I needed to generate random form field names.
Previously, the names for the form fields were easily-guessable: “username”, “password”, “email”. What I needed to do was generate unique form field names every time the sign-up page was loaded.
First of all, I create a one-time password:
$otp = base64_encode(openssl_random_pseudo_bytes(16));
Now I generate form field names by hashing that random value with known strings (“username”, “password”, “email”) together with a salt string known only to me.
$otp_hashed_for_username = md5($saltstring.'username'.$otp);
$otp_hashed_for_password = md5($saltstring.'password'.$otp);
$otp_hashed_for_email = md5($saltstring.'email'.$otp);
Those are all used for form field names on the client, like this:
<input type="text" name="<?php echo $otp_hashed_for_username; ?>">
<input type="password" name="<?php echo $otp_hashed_for_password; ?>">
<input type="email" name="<?php echo $otp_hashed_for_email; ?>">
(Remember, the name or ID of the form field makes no difference to semantics or accessibility; the accessible name is derived from the associated label element.)
The one-time password also becomes a form field on the client:
<input type="hidden" name="otp" value="<?php echo $otp; ?>">
When the form is submitted, I use the value of that form field along with the salt string to recreate the field names:
$otp_hashed_for_username = md5($saltstring.'username'.$_POST['otp']);
$otp_hashed_for_password = md5($saltstring.'password'.$_POST['otp']);
$otp_hashed_for_email = md5($saltstring.'email'.$_POST['otp']);
If those form fields don’t exist, the sign-up is rejected.
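A minimal sketch of that check, assuming the same md5 scheme as the snippets above. The helper names `hashed_field_name()` and `read_signup_fields()` are made up for illustration; the second one returns the submitted values keyed by their real names, or `null` if anything is missing.

```php
<?php
// Hypothetical helpers; the hashing mirrors the md5 snippets in this post.

function hashed_field_name(string $salt, string $field, string $otp): string
{
    return md5($salt . $field . $otp);
}

function read_signup_fields(string $salt, array $post): ?array
{
    // Without a usable one-time password the field names can't be recreated.
    $otp = $post['otp'] ?? null;
    if (!is_string($otp) || $otp === '') {
        return null;
    }

    $values = [];
    foreach (['username', 'password', 'email'] as $field) {
        $name = hashed_field_name($salt, $field, $otp);
        // A missing (or non-string) field means the form was not submitted
        // from a freshly loaded sign-up page, so reject the whole sign-up.
        if (!isset($post[$name]) || !is_string($post[$name])) {
            return null;
        }
        $values[$field] = $post[$name];
    }

    return $values;
}
```

Calling `read_signup_fields($saltstring, $_POST)` and treating `null` as a rejection would be one way to wire it in.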
As an added extra, I leave hidden honeypot form fields named “username”, “password”, and “email”. If any of those fields are filled out, the sign-up is rejected.
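The honeypot check could be sketched like this. The function name `honeypot_triggered()` is hypothetical, and treating whitespace-only values as empty is an assumption on my part.

```php
<?php
// Hypothetical helper: returns true if any decoy field has been filled in.

function honeypot_triggered(array $post, array $trapNames = ['username', 'password', 'email']): bool
{
    foreach ($trapNames as $name) {
        $value = $post[$name] ?? '';
        // Real visitors never see these fields, so anything other than an
        // empty (or whitespace-only) string suggests a bot filled them in.
        if (!is_string($value) || trim($value) !== '') {
            return true;
        }
    }

    return false;
}
```

If `honeypot_triggered($_POST)` returns true, the sign-up gets rejected before anything else happens.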
I put that code live and the automated sign-ups stopped straight away.
It’s not entirely foolproof. It would be possible to create an automated sign-up system that grabs the names of the form fields from the sign-up form each time. But this puts enough friction in the way to make automated sign-ups a pain.
You can view source on the sign-up page to see what the form fields are like.
I used the same technique on the contact page to prevent automated spam there too.