Fraud Prevention: Being Strict While Being Fair

At Hunter, fighting fraud is part of our daily work. Preventing the creation of duplicate or bot accounts and blocking payments made with stolen credit cards, for example, are tasks we carry out around the clock. Using both automated and manual means, we want to make sure every Hunter user is a legitimate, well-intentioned human.

This post describes our approach and the choices we’ve made to be strict while being fair to our visitors, users and customers.

Limiting trial usage

Visitors can use parts of our service without having to create an account. They get a few requests to help them determine if they wish to sign up. But the Web being what it is, some people try to take advantage of it! For this reason, our trial endpoints are limited by different means.

First, we use Google reCAPTCHA to ensure only real humans are performing requests against those endpoints. When we introduced this protection a few weeks ago, a lot of bot traffic was immediately blocked:

Fortunately, within a few days, all the bad traffic had been filtered out, and we’re now confident that we only receive “human” traffic on these endpoints. (Note: reCAPTCHA scores every interaction between 0.0, very likely a bot, and 1.0, very likely a good interaction. We kept the default 0.5 score threshold to tell a bot from a human.)

reCAPTCHA reports mostly good interactions
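For readers building something similar, here is a minimal sketch of the threshold decision described in the note above. It assumes the token has already been sent to Google's server-side verification endpoint and that the JSON response, with its `success` and `score` fields, has been decoded into a dict. The function name, the `>=` comparison, and the handling of a missing score are assumptions for illustration, not Hunter's actual code.

```python
from typing import Any, Mapping

# Threshold quoted in the post: 0.5 separates bots from humans.
SCORE_THRESHOLD = 0.5


def is_probably_human(verification: Mapping[str, Any],
                      threshold: float = SCORE_THRESHOLD) -> bool:
    """Decide whether a decoded reCAPTCHA v3 verification response passes."""
    # A failed verification (invalid or expired token) never passes.
    if not verification.get("success", False):
        return False
    # Assumption: a missing score is treated as 0.0, i.e. as a bot.
    return float(verification.get("score", 0.0)) >= threshold
```

A request whose response is `{"success": True, "score": 0.3}` would be rejected, while one scoring 0.9 would go through.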

We also rate-limit those endpoints on our end to block requesters who go above a certain threshold. As shown in the graph below, this limit blocks around 50k trial requests every hour:

Finally, we use some internal rules to decide if an IP address is allowed to take advantage of our trial API endpoints. But for this one, we’ll have to remain a bit more secret 🤫
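Leaving the secret IP rules aside, the rate limiting mentioned earlier can be illustrated with a fixed-window counter. This is a minimal in-memory sketch, not the limiter Hunter runs: the class name, the limit, and the window size are all hypothetical, and a production setup would typically keep the counters in a shared store.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client key in each time window."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # (client key, window index) -> number of requests seen so far.
        # Old windows are never purged here; a real store would expire them.
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        # All requests in the same window share the same integer index.
        window = int(self.clock() // self.window_seconds)
        self._counts[(key, window)] += 1
        return self._counts[(key, window)] <= self.limit
```

Keying the limiter on the client IP address means each address gets its own budget, and the budget resets when the next window starts.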

Preventing bad players from getting in

Once a person has used up their daily free trial quota, or needs access to features reserved for registered users, they have to create an account. To make signing up as simple as possible, we only require some basic information for the regular email registration: email, first name, last name, and password. But these four pieces of information (plus the source IP address) can actually provide a lot of really interesting insights!

Let’s take a look at the most interesting one: the email address. A lot of relevant information can be inferred from it, for example:

Is it from webmail or a disposable address provider?
Is the local part (the part before the @) gibberish, generic (president@), or does it look personal?
Is the domain name linked to a valid website, a “parked domain” website, or no website at all? Is the website secured with SSL?
How many users do we already have on that same domain?
When was the domain registered?

Many more insights can be extracted from the email address. It just depends on your creativity.
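As an illustration, a few of the email questions above can be answered with nothing more than string handling. The sketch below is built on assumptions: the domain sets are tiny hypothetical samples (real lists are much longer), and the vowel-ratio test for gibberish is a crude heuristic chosen for brevity, not a method described in this post. Checks that need network access, such as the website or the domain registration date, are left out.

```python
import re
from dataclasses import dataclass

# Tiny illustrative samples; real lists would be far longer.
WEBMAIL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}
DISPOSABLE_DOMAINS = {"mailinator.com"}
GENERIC_LOCAL_PARTS = {"admin", "contact", "info", "president", "sales"}


@dataclass
class EmailSignals:
    local_part: str
    domain: str
    is_webmail: bool
    is_disposable: bool
    is_generic: bool
    looks_gibberish: bool


def looks_gibberish(local_part: str) -> bool:
    """Crude heuristic: long local parts with almost no vowels."""
    letters = re.sub(r"[^a-z]", "", local_part.lower())
    if len(letters) < 6:
        return False  # too short to judge
    vowels = sum(ch in "aeiouy" for ch in letters)
    return vowels / len(letters) < 0.2


def email_signals(email: str) -> EmailSignals:
    # Split on the last "@" so the domain is always the final part.
    local_part, _, domain = email.strip().lower().rpartition("@")
    if not local_part or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return EmailSignals(
        local_part=local_part,
        domain=domain,
        is_webmail=domain in WEBMAIL_DOMAINS,
        is_disposable=domain in DISPOSABLE_DOMAINS,
        is_generic=local_part in GENERIC_LOCAL_PARTS,
        looks_gibberish=looks_gibberish(local_part),
    )
```

Calling `email_signals("president@example.com")` flags the address as generic, while `xkqzrtpw@mailinator.com` comes back as both disposable and gibberish.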

Then, let’s focus a bit on the source IP address. This one, too, can answer a lot of questions:

How many users do we already have on that same IP address?
Is it a VPN or Proxy IP address?
Where does the request originate from?
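Two of these IP questions can be sketched with the Python standard library alone. The network list below is a hypothetical placeholder that uses a documentation-only range; real VPN and proxy data usually comes from a dedicated provider, and geolocation needs an external database, so neither is covered here.

```python
import ipaddress
from collections import Counter
from typing import Iterable, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical proxy/VPN ranges (a documentation-only network as a stand-in).
KNOWN_PROXY_NETWORKS: Sequence[Network] = [
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_known_proxy(ip: str,
                   networks: Sequence[Network] = KNOWN_PROXY_NETWORKS) -> bool:
    """Return True if the address falls inside one of the listed ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


def accounts_per_ip(signup_ips: Iterable[str]) -> Counter:
    """Count how many existing accounts were created from each IP address."""
    return Counter(signup_ips)
```

An address with an unusually high count, or one that sits in a listed range, is a signal worth feeding into the overall decision rather than a verdict on its own.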

And finally, the first and last name. It’s hard to be extremely strict regarding those, but some inputs will definitely catch our attention: Test Test or John Doe, for example, aren’t very common names. Neither is Elon Musk1234. 😅
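A check of this kind fits in a few lines. The sketch below assumes a small hypothetical set of placeholder names and treats any digit in a name as suspicious; both choices are illustrative only.

```python
import re

# Hypothetical sample of placeholder names, compared in lowercase.
PLACEHOLDER_NAMES = {("test", "test"), ("john", "doe"), ("jane", "doe")}


def name_is_suspicious(first_name: str, last_name: str) -> bool:
    first = first_name.strip().lower()
    last = last_name.strip().lower()
    if (first, last) in PLACEHOLDER_NAMES:
        return True
    # Digits rarely appear in real names (think "Musk1234").
    return bool(re.search(r"\d", first + last))
```

A flagged name is only a hint: plenty of real people are called John Doe, which is why it makes sense to weigh this signal together with the others.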

Note: If you’re trying to build something similar, we also recommend taking a look at Clearbit Risk, which provides a score from 0 to 100 for each visitor.

By combining all those indicators, we can quickly and easily tell legitimate users apart from those who should be sent to an extra layer of verification.
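One simple way to combine indicators like these is a weighted sum compared against a threshold. The sketch below is an assumption-heavy illustration: the signal names, the weights, and the threshold are all invented for the example and do not reflect Hunter's actual rules.

```python
from typing import Mapping

# Hypothetical weights: the higher the value, the more suspicious the signal.
WEIGHTS = {
    "disposable_email": 50,
    "gibberish_local_part": 20,
    "known_proxy_ip": 25,
    "many_accounts_on_ip": 25,
    "suspicious_name": 15,
}
# Hypothetical cut-off above which a signup gets an extra verification step.
VERIFY_THRESHOLD = 40


def risk_score(signals: Mapping[str, bool]) -> int:
    """Sum the weights of every signal that fired."""
    return sum(weight for name, weight in WEIGHTS.items()
               if signals.get(name, False))


def route_signup(signals: Mapping[str, bool]) -> str:
    """Return "allow" or "extra_verification" for a new signup."""
    if risk_score(signals) >= VERIFY_THRESHOLD:
        return "extra_verification"
    return "allow"
```

With these numbers, a disposable email alone is enough to trigger extra verification, while a suspicious name only does so when another signal fires as well.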

Preventing duplicate accounts through extended validations

The previous verifications, triggered right after a new user joins Hunter, are there to ensure only real humans sign up for our service. However, with our model of 50 free requests per month, some may be tempted to create duplicate accounts to double (or more!) their monthly quota.

To prevent this, we reroute some of our newcomers to a phone number validation process (using the data gathered at the sign up step):

the user must enter a valid (and, obviously, unique) phone number
we send a validation code to that number
the user must provide the validation code they received.

To prevent abuse, we don’t allow VoIP numbers (detected thanks to the Twilio Lookup API). We also prevent users from requesting too many codes by using Redis “rate-limit” counters.

Note: we don’t use the phone number for anything other than user account validation.
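The code part of the phone validation flow described above can be sketched as follows. To keep the example self-contained, the counter is an in-memory stand-in for a Redis counter with an expiry (the usual INCR plus EXPIRE pattern); the six-digit code length, the class name, and the limits are assumptions. Sending the SMS and the VoIP lookup are left out.

```python
import hmac
import secrets
import time
from typing import Callable, Dict, Tuple


def generate_code(digits: int = 6) -> str:
    """Return a random zero-padded numeric code from a secure source."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def codes_match(expected: str, submitted: str) -> bool:
    """Compare codes in constant time to avoid leaking information."""
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())


class CodeRequestCounter:
    """In-memory stand-in for a Redis counter with a TTL."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # key -> (requests counted, time at which the counter expires)
        self._entries: Dict[str, Tuple[int, float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        count, expires_at = self._entries.get(key, (0, 0.0))
        if now >= expires_at:
            # Counter expired (or never existed): start a new window.
            count, expires_at = 0, now + self.window_seconds
        count += 1
        self._entries[key] = (count, expires_at)
        return count <= self.max_requests
```

Keying the counter on the user or on the phone number caps how many codes either can trigger within the window.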

Most users who pass this verification get their 50 free requests and can enjoy Hunter. But for a small portion of them, one last validation awaits: their LinkedIn profile.

We keep this verification for cases we are really unsure about. Here, we use a regular OAuth sign-in with LinkedIn, with the r_emailaddress and r_liteprofile scopes. It gives access to a few interesting data points, in particular:

the user’s name and email address
the user’s picture URL.

We won’t dive into the details here, but with a bit of imagination, you can guess how those two pieces of information can be used to finally allow or block a newcomer.

At this point, we have mostly ensured that only real people own a Hunter account and that each person has at most one Hunter account. Still, there’s room left for fraudulent users: payments 💰

Preventing fraudulent payments

Stripe processes all payments to Hunter. Their Radar service helps a lot in fighting fraudulent payments. It defines a nice set of default rules to block a payment, put it in review, or request 3D Secure. It also allows businesses to set up their own rules. At Hunter, we have decided to implement a strict policy regarding payments and have therefore added some custom rules, such as:

Review the payment if the CVC check failed
Block the payment if it has an elevated risk score
Send the payment through 3D Secure when it’s supported.
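To make the effect of rules like these concrete, here is the same logic re-expressed as a plain Python function. This is not Radar's rule syntax, and it is not how Stripe evaluates rules: the `Payment` type, its field names, the order in which the rules are checked, and the choice to block the "highest" risk level as well are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    cvc_check: str            # e.g. "pass", "fail", "unavailable", "unchecked"
    risk_level: str           # e.g. "normal", "elevated", "highest"
    supports_3d_secure: bool  # whether the card can go through 3D Secure


def decide(payment: Payment) -> str:
    """Return "block", "review", "request_3d_secure" or "allow"."""
    # Assumed precedence: block first, then review, then 3D Secure.
    if payment.risk_level in {"elevated", "highest"}:
        return "block"
    if payment.cvc_check == "fail":
        return "review"
    if payment.supports_3d_secure:
        return "request_3d_secure"
    return "allow"
```

Writing the rules out this way makes the trade-off visible: every legitimate payment with a failed CVC check lands in the review queue.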

Those procedures help us keep a low dispute rate and keep illegitimate users out of the service. The downside is more work for some users and for our Support team. We mitigate this impact by adjusting rules with high false-positive rates.

Here’s how those rules perform over time:

Unfortunately, from time to time, some bad players still get through. For this reason, we added another security layer based on payments. Its checks include, for example:

What’s the Stripe score for this payment?
Does the payment information make sense regarding what we already know about the customer?
How plausible is it for that customer to upgrade to a Premium plan?

If a customer seems legitimate after all those checks, they’ll be able to enjoy their Premium Hunter subscription. If not, they’ll have to go through our last verification step, which is completely manual: we ask them to provide some information about their company, their social network handles, etc.

Note: Once again, this information is only used to validate the customer’s payment.

Conclusion

By mixing automated checks (as much as we can) and manual ones, we ensure that the requests reaching our servers are made only by valid, well-intentioned people. We’re aware this might create some friction in some cases, and we sincerely apologize for it. On the other hand, keeping a clean user and customer base is better for everyone in the long run.

Using a few data points, a couple of third-party services, and a lot of imagination, we’ve built a simple but efficient system to fight fraud. It’s another example showing that you don’t always need tons of resources to build things that work for your business.

If you’d like to share your own experience or suggest some additions that would make sense, we’re all ears!