Google warns against the use of 404 or 403 status codes to limit Googlebot’s crawl rate
Google warns against using 404 and other 4xx client errors (such as 403) to try to limit Googlebot’s crawl rate. Gary Illyes, from Google’s Search Relations team, wrote: “Please don’t do that.”
The reason for the notice. Gary Illyes said there has been an increase in site owners and content delivery networks (CDNs) trying to use 404s and other 4xx client errors (but not 429) to reduce Googlebot’s crawl rate.
Here’s what to do instead. Google provides a detailed help guide on how to reduce Googlebot’s crawling of your site. To adjust your crawl rate, you can use the crawl rate settings in Google Search Console.
Google explained that you can reduce the Googlebot crawl rate in Search Console; changes are usually reflected within days. You must verify ownership of the site before you can use this setting, and you should not set the crawl rate lower than your site actually needs. It also helps to understand what crawl budget means for Googlebot. If the crawl rate settings are not available for your site, you can submit a request to reduce the crawl rate; an increase in crawl rate cannot be requested.
Google also says that if you need to reduce the crawl rate for a short period of time (for example, a few hours or 1-2 days), you can return an informational error page with a 500, 503, or 429 HTTP response status code instead.
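For illustration only, here is a minimal sketch of what that temporary throttling could look like in a Flask app. The Googlebot user-agent check, the server_overloaded() helper and the one-hour Retry-After value are placeholders of my own, not anything Google prescribes; the point is simply that a throttled response uses 503 (or 429), not 403 or 404.

```python
# Minimal sketch: answer crawler requests with 503 + Retry-After when
# temporarily throttling, instead of 403/404 (which signal forbidden/missing).
from flask import Flask, Response, request

app = Flask(__name__)


def server_overloaded() -> bool:
    # Placeholder for your own load check (CPU, queue depth, etc.).
    return False


@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def serve(path):
    user_agent = request.headers.get("User-Agent", "")
    if "Googlebot" in user_agent and server_overloaded():
        # 503 tells Googlebot to back off and retry later.
        return Response(
            "Temporarily throttling crawl traffic.",
            status=503,
            headers={"Retry-After": "3600"},  # hypothetical one-hour retry hint
        )
    return Response(f"Normal content for /{path}", status=200)


if __name__ == "__main__":
    app.run()
```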
Why we care. If you have noticed crawling problems recently, it may be because your CDN or hosting provider is using these techniques. Consider submitting a support ticket to confirm they aren’t returning 404s and 403s to lower crawl rates.
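If you want to spot-check your own setup before opening a ticket, a rough script like the one below could scan an access log for 403/404 responses served to Googlebot. The log path, the combined log format and the simple user-agent match are assumptions about a typical nginx/Apache configuration, not something from the article; adjust them for your own server or CDN logs.

```python
# Sketch: count 403/404 responses served to Googlebot in a combined-format
# access log, which may hint that a host or CDN is throttling with those codes.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"  # hypothetical path
# Matches: "METHOD /path HTTP/1.1" STATUS SIZE "REFERER" "USER-AGENT"
LINE_RE = re.compile(
    r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def audit(log_path: str = LOG_PATH) -> Counter:
    counts = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LINE_RE.search(line)
            if not match:
                continue
            if "Googlebot" in match.group("ua") and match.group("status") in ("403", "404"):
                counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    for status, total in audit().items():
        print(f"{status} responses to Googlebot: {total}")
```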
The post Google warns against using 403 and 404 status codes to limit Googlebot’s crawl rate appeared first on Search Engine Land.