Google New Updates: Google Documents Its Three Types Of Web Crawlers
Google has updated its Verifying Googlebot and other Google crawlers help document to add a new section describing the three categories of crawlers it operates: Googlebot, special-case crawlers, and user-triggered fetchers.
I believe this was done after we, including me, obsessed a bit over the new GoogleOther crawler. At the time, Gary Illyes from Google added, "Please don't overthink it, it's really that boring." But I do what I do, and I overthought it. So Gary did what he does and had a help document updated to explain this in more detail.
The help document says, "Google's crawlers fall into three categories."
(1) Googlebot: The main crawler for Google's search products; it always respects robots.txt rules. The reverse DNS mask is "crawl-***-***-***-***.googlebot.com or geo-crawl-***-***-***-***.geo.googlebot.com" and the list of IP ranges is in this googlebot.json file.
(2) Special-case crawlers: Crawlers that perform specific functions (such as AdsBot), which may or may not respect robots.txt rules. The reverse DNS mask is "rate-limited-proxy-***-***-***-***.google.com" and the list of IP ranges is in this special-crawlers.json file.
(3) User-triggered fetchers: Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on the request of a user. Because the fetch was requested by a user, these fetchers ignore robots.txt rules. The reverse DNS mask is "***-***-***-***.gae.googleusercontent.com" and the list of IP ranges is in this user-triggered-fetchers.json file.
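If you want to confirm that a visitor claiming to be Googlebot really is Googlebot, here is a minimal sketch of the two checks the help document describes: a forward-confirmed reverse DNS lookup against the masks above, and a lookup against the published IP ranges. Treat the googlebot.json URL and the sample IP address below as my assumptions from the current documentation; double-check them against Google's help page before relying on this.

```python
import ipaddress
import json
import socket
import urllib.request

# Assumed location of Google's published Googlebot IP ranges (verify
# against the current help document before using in production).
GOOGLEBOT_RANGES_URL = (
    "https://developers.google.com/static/search/apis/ipranges/googlebot.json"
)

def verify_by_reverse_dns(ip: str) -> bool:
    """Reverse-resolve the IP, check the hostname mask, then forward-confirm."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
    except OSError:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False  # hostname does not match Google's documented masks
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward DNS lookup
    except OSError:
        return False
    return ip in forward_ips  # forward lookup must map back to the same IP

def verify_by_ip_list(ip: str) -> bool:
    """Check the IP against the published googlebot.json ranges."""
    with urllib.request.urlopen(GOOGLEBOT_RANGES_URL) as resp:
        prefixes = json.load(resp)["prefixes"]
    addr = ipaddress.ip_address(ip)
    # Each entry carries either an "ipv4Prefix" or an "ipv6Prefix" key.
    return any(
        addr in ipaddress.ip_network(p.get("ipv4Prefix") or p["ipv6Prefix"])
        for p in prefixes
    )

if __name__ == "__main__":
    ip = "66.249.66.1"  # a sample address from a commonly seen Googlebot range
    print(verify_by_reverse_dns(ip), verify_by_ip_list(ip))
```

The same pattern applies to the other two categories: swap in the special-crawlers.json or user-triggered-fetchers.json ranges and the matching reverse DNS mask.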
Here is a screenshot of the new section in this help document: