Google explains its different crawler types and their use cases
Google has added details about its crawlers to its documentation, covering Googlebot, user-triggered fetchers, and special-case crawlers.
Google has also added a JSON-formatted file that contains the IP addresses of each crawler type.
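To illustrate how such a JSON file of IP ranges can be used, here is a minimal sketch in Python. The sample data below is hand-written to mirror the general shape of Google's published files (a "prefixes" list of `ipv4Prefix`/`ipv6Prefix` entries); the specific ranges shown are illustrative, not authoritative, so fetch the real file before relying on any of this.

```python
import ipaddress
import json

# Hand-written sample mirroring the shape of Google's crawler IP files:
# a "prefixes" array whose entries hold an ipv4Prefix or ipv6Prefix.
# These example ranges are illustrative only.
sample = json.loads("""
{
  "prefixes": [
    {"ipv4Prefix": "66.249.64.0/27"},
    {"ipv6Prefix": "2001:4860:4801:10::/64"}
  ]
}
""")

def ip_in_ranges(ip: str, data: dict) -> bool:
    """Return True if the given IP falls inside any listed prefix."""
    addr = ipaddress.ip_address(ip)
    for entry in data["prefixes"]:
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        # Membership tests across IPv4/IPv6 versions simply return False.
        if addr in ipaddress.ip_network(prefix):
            return True
    return False

print(ip_in_ranges("66.249.64.5", sample))  # → True
```

In practice you would download the relevant JSON file, cache it, and check the remote address of incoming requests against it.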
Types of Google crawlers. Google has listed three types of crawlers at the top of its Googlebot page:
- Googlebot – The main crawler for Google's search products. Google says this crawler respects robots.txt rules.
- Special-case crawlers – Crawlers that perform specific functions (such as AdsBot) and may or may not respect robots.txt rules.
- User-triggered fetchers – Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on a user's request, and some Google Search Console tools send Google to fetch a page based on an action the user takes.
IP addresses. Google has also listed IP addresses and reverse DNS masks for each type.
- Googlebot – googlebot.json (crawl-***-***-***-***.googlebot.com or geo-crawl-***-***-***-***.geo.googlebot.com)
- Special-case crawlers – special-crawlers.json (rate-limited-proxy-***-***-***-***.google.com)
- User-triggered fetchers – user-triggered-fetchers.json (***-***-***-***.gae.googleusercontent.com)
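The reverse DNS masks above suggest the standard way to confirm a visitor is genuinely a Google crawler: do a reverse DNS lookup on the IP, check the hostname falls under a Google-owned domain, then forward-resolve that hostname to confirm it maps back to the same IP. A minimal Python sketch of that check (function names are my own, and the suffix list is an assumption based on the masks listed above):

```python
import socket

# Domain suffixes corresponding to the reverse DNS masks Google lists
# (assumed from the documentation above, not an exhaustive list).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")

def is_google_hostname(hostname: str) -> bool:
    """Check whether a reverse-DNS hostname falls under a Google-owned domain."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_google_crawler(ip: str) -> bool:
    """Two-step check: reverse DNS, then forward-confirm the hostname
    resolves back to the same IP (guards against spoofed PTR records)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
    except socket.herror:
        return False
    if not is_google_hostname(hostname):
        return False
    try:
        # Forward lookup must include the original IP.
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False
```

The forward-confirmation step matters: anyone controlling their own reverse DNS can make an IP resolve to a googlebot.com-looking name, but they cannot make Google's forward DNS point that name back at their IP.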
What’s new. Google updated the crawler overview section of its documentation; the rest of the page is mostly unchanged.
Why we care. Google likely made this change after seeing the reaction to the GoogleOther crawler it announced the other week. The updated documentation explains how Google’s crawlers behave, when they obey robots.txt, and how you can identify them.
If you want to block other crawlers without blocking Googlebot, you can now identify them more accurately.
The article Google explains its different crawler types first appeared on Search Engine Land.