How can you tell whether the bots visiting your website actually are who they say they are? Are they bad bots with malicious intent, or good bots crawling your site to index your content properly? It’s a given that if you want your content to appear in search results, you have to let search engines do the automated work of indexing it. That means opening the door, including the door to premium content behind a paywall, to each search engine’s web crawlers.
Are these bots legitimate, and how can you tell the difference?
Attack of the bots?
Search engines generally provide information on how to detect and verify that their bots are legitimate. Some publish static IP lists, while others, like Google or Bing, dynamically assign IPs to their web crawlers, so verifying that a crawl request is legitimate takes a few more steps.
Because we’re talking about Varnish here, of course there’s a handy VMOD you can use from VCL to handle all your bot verification needs.
Resolver VMOD and veribot.vcl
In Varnish Enterprise, the resolver VMOD lets you resolve an IP address to its domain name. This involves a reverse DNS lookup on the IP, followed by a forward DNS lookup on the resulting domain name to verify that it maps back to the same IP.
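To make the reverse-then-forward check concrete, here is a minimal sketch in Python rather than VCL. It is not the resolver VMOD's API or implementation; the function names and the `allowed_suffixes` parameter are hypothetical, and the suffixes in the usage note are illustrative values you should confirm against each search engine's own documentation.

```python
import ipaddress
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for an IP, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of addresses a hostname resolves to (may be empty)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    # Drop any IPv6 scope suffix such as "%eth0" before comparing.
    return {info[4][0].split("%")[0] for info in infos}


def is_verified_bot(ip, allowed_suffixes,
                    reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    1. Reverse lookup: IP -> hostname.
    2. The hostname must end with one of the allowed domain suffixes.
    3. Forward lookup: hostname -> addresses, which must include the IP.
    """
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    # Suffixes start with "." so "evilgooglebot.com" cannot match.
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False
    client = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(addr) == client
               for addr in forward(hostname))
```

A call might look like `is_verified_bot(client_ip, [".googlebot.com", ".google.com"])`. The forward step matters because whoever controls an IP range also controls its reverse DNS records and can make them claim any hostname; only the forward lookup is answered by the real domain owner.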
If you want an easy way to grant access based on a list of domains and User-Agents, use veribot.vcl, which is bundled with the resolver VMOD. It combines the domain resolution capabilities of the resolver VMOD with the list manipulation capabilities of the rewrite VMOD and the caching capabilities of the kvstore VMOD. Once integrated with your VCL, veribot.vcl actively verifies whether bot requests are legitimate.
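The general idea of combining a User-Agent rule list, a domain check, and a verdict cache can be sketched as follows. This is a conceptual illustration in Python, not how veribot.vcl is written or configured: `BotGate`, `RULES`, and the `verify` callback are hypothetical names, and the patterns and suffixes are example values only.

```python
import re
import time

# Hypothetical rule list: User-Agent pattern -> domain suffixes that a
# genuine crawler's hostname is expected to end with (example values).
RULES = [
    (re.compile(r"googlebot", re.I), (".googlebot.com", ".google.com")),
    (re.compile(r"bingbot", re.I), (".search.msn.com",)),
]


class BotGate:
    """Caches per-IP verification verdicts for requests claiming to be bots."""

    def __init__(self, verify, ttl=3600.0, clock=time.monotonic):
        self._verify = verify  # callable(ip, suffixes) -> bool, e.g. a DNS check
        self._ttl = ttl        # seconds a verdict stays cached
        self._clock = clock
        self._cache = {}       # (ip, suffixes) -> (expires_at, verdict)

    def allow(self, ip, user_agent):
        """Return True/False for claimed bots, None for everything else."""
        for pattern, suffixes in RULES:
            if pattern.search(user_agent):
                break
        else:
            return None  # User-Agent does not claim to be a known bot

        key = (ip, suffixes)
        now = self._clock()
        hit = self._cache.get(key)
        if hit and hit[0] > now:
            return hit[1]  # cached verdict, no DNS traffic

        verdict = bool(self._verify(ip, suffixes))
        self._cache[key] = (now + self._ttl, verdict)
        return verdict
```

Caching both positive and negative verdicts keeps DNS lookups off the request path for repeat visitors, at the cost of a verdict being stale for up to `ttl` seconds.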
Learn more about bot detection in Varnish, below.