As a product owner, it is very important to protect your website from bots and scrapers. To achieve this, you need a way to tell actual users apart from bots and scripts. Ideally, this is done by giving users a challenge that only a human can solve. The classic form asks the user to identify characters that are distorted but still legible. With image processing techniques improving, it is difficult but not impossible for a script to solve such challenges. Then come image identification challenges, where you select the few images from a set that satisfy a given condition. These are harder for a script to solve than the character challenges, but still not unbreakable. And even though these techniques are not 100% effective, they can be extremely difficult to implement yourself.
Well, you don't have to build it yourself: Google has already built a service for this very purpose, called reCaptcha. Two versions are available at the time of writing, v2 and v3. Version 2 uses the image categorization challenge mentioned above. The newer version 3 uses a more advanced algorithm. It does not interrupt the user with a challenge, so the page looks completely normal. Instead, it scores each interaction by how bot-like it appears and leaves it to your site to decide what to do with a suspicious score.
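To make the version 3 idea a little more concrete: Google's documentation describes the v3 result as a score from 0.0 (very likely a bot) to 1.0 (very likely a human), and suggests 0.5 as a starting threshold. The sketch below shows one way a back-end might turn that score into a decision. The function name `classify_score` and the `"allow"` / `"challenge"` labels are made up for illustration, and the right threshold depends on your own traffic.

```python
def classify_score(score: float, threshold: float = 0.5) -> str:
    """Map a v3-style score to a hypothetical action for the back-end."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= threshold:
        return "allow"
    # Below the threshold: ask for extra verification (for example an
    # e-mail confirmation) rather than blocking the visitor outright.
    return "challenge"
```

Treating a low score as "needs extra verification" rather than "blocked" is a design choice here, so that a real user who is scored badly still has a way through.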
How to integrate Google reCaptcha into your site?
The process is quite straightforward and free. You just need a Google account (a Gmail account works) before you proceed. To protect your website from bots and scrapers, follow the steps below.
- After logging in to your Google account, go to the reCaptcha admin panel at this link.
- Add a label to easily identify the service. You can keep it the same as the product/site name.
- Next, choose the version of reCaptcha that you want to use. If you need help deciding between the two, you can refer to the official documentation.
- Add the domain of your website on which you want to integrate the reCaptcha service.
- Next, accept the terms of service after reading them (Psst, I know no one reads them anyway).
- On the next screen you will get your Site Key and Secret Key. The Site Key is a public key that you embed in the front-end of your site. The Secret Key must be kept secret from the public and should only be used in the back-end code to verify the response from the user.
- You can use the reCaptcha API to validate the response on the back-end. You can refer to this link for the API details and documentation.
- On the front-end, you can refer to this link to integrate Google reCaptcha into your site.
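The back-end validation step above can be sketched roughly as follows. This is a minimal example using only the Python standard library, not a definitive implementation. It assumes the documented `siteverify` endpoint, which takes the Secret Key and the token submitted by the browser as form fields and replies with JSON. The function names (`build_verify_request`, `parse_verify_response`, `verify_token`) and the shape of the returned dictionary are my own choices for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Verification endpoint from Google's reCAPTCHA documentation.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key: str, token: str,
                         remote_ip: Optional[str] = None) -> urllib.request.Request:
    """Build the form-encoded POST request for the siteverify endpoint."""
    fields = {"secret": secret_key, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional: the visitor's IP address
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def parse_verify_response(raw: bytes) -> dict:
    """Decode the JSON reply and keep the fields this sketch cares about."""
    payload = json.loads(raw.decode("utf-8"))
    return {
        "success": bool(payload.get("success", False)),
        "score": payload.get("score"),  # only present in v3 replies
        "errors": payload.get("error-codes", []),
    }


def verify_token(secret_key: str, token: str,
                 remote_ip: Optional[str] = None, timeout: float = 5.0) -> dict:
    """Send the token to Google and return the parsed result."""
    request = build_verify_request(secret_key, token, remote_ip)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return parse_verify_response(reply.read())
```

In a real handler you would read the Secret Key from server-side configuration (for example an environment variable, never the page source), pass in the token the browser submitted (the v2 widget posts it in a form field named `g-recaptcha-response`), and reject the request when `success` is false.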
Your website is now much better protected from bots and scrapers 🙂 If you are using WordPress and want to protect it from getting hacked, give this short article a read: Is your WordPress website redirecting users to a malicious web page?