Verifying Googlebot and other Google crawlers
You can verify whether a web crawler accessing your server really is a
Google crawler, such as Googlebot. This is useful if you're concerned that
spammers or other troublemakers are accessing your site while claiming to be Googlebot.
Google's crawlers fall into three categories:

| Type | Description | Reverse DNS mask | IP ranges |
|---|---|---|---|
| Common crawlers | The common crawlers used for Google's products (such as Googlebot). They always respect robots.txt rules for automatic crawls. | crawl-***-***-***-***.googlebot.com or geo-crawl-***-***-***-***.geo.googlebot.com | googlebot.json |
| Special-case crawlers | Crawlers that perform specific functions for Google products (such as AdsBot) where there's an agreement between the crawled site and the product about the crawl process. These crawlers may or may not respect robots.txt rules. | rate-limited-proxy-***-***-***-***.google.com | special-crawlers.json |
| User-triggered fetchers | Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on the request of a user. Because the fetch was requested by a user, these fetchers ignore robots.txt rules. Fetchers controlled by Google originate from IPs in the user-triggered-fetchers-google.json object and resolve to a google.com hostname. IPs in the user-triggered-fetchers.json object resolve to gae.googleusercontent.com hostnames. These IPs are used, for example, if a site running on Google Cloud (GCP) has a feature that requires fetching external RSS feeds on the request of the user of that site. | ***-***-***-***.gae.googleusercontent.com or google-proxy-***-***-***-***.google.com | user-triggered-fetchers.json and user-triggered-fetchers-google.json |
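The reverse DNS masks in the table can be expressed as patterns. The following is a minimal sketch, assuming that each `***` placeholder stands for one decimal octet of an IPv4 address (as in the `host` examples on this page). The function name `classify_hostname` and the category labels are hypothetical names chosen for illustration. A matching hostname is not proof on its own, because anyone who controls reverse DNS for an address can publish an arbitrary name; the forward DNS lookup described on this page is still required.

```python
import re
from typing import Optional

# Assumption: each "***" in the masks is one decimal octet (IPv4 only).
_OCTETS = r"\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}"

# Hypothetical category labels mapped to patterns built from the table's masks.
_MASKS = {
    "common": [
        re.compile(rf"^crawl-{_OCTETS}\.googlebot\.com$"),
        re.compile(rf"^geo-crawl-{_OCTETS}\.geo\.googlebot\.com$"),
    ],
    "special-case": [
        re.compile(rf"^rate-limited-proxy-{_OCTETS}\.google\.com$"),
    ],
    "user-triggered": [
        re.compile(rf"^{_OCTETS}\.gae\.googleusercontent\.com$"),
        re.compile(rf"^google-proxy-{_OCTETS}\.google\.com$"),
    ],
}


def classify_hostname(hostname: str) -> Optional[str]:
    """Return the category whose mask matches the hostname, or None."""
    # PTR records usually end with a trailing dot; DNS names are case-insensitive.
    name = hostname.rstrip(".").lower()
    for category, patterns in _MASKS.items():
        if any(pattern.match(name) for pattern in patterns):
            return category
    return None
```

Anchoring each pattern at both ends means that a lookalike name such as `crawl-66-249-66-1.googlebot.com.evil.example` does not match.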
There are two methods for verifying Google's crawlers:
- Manually: For one-off lookups, use command line tools. This method is
  sufficient for most use cases.
- Automatically: For large scale lookups, use an automatic solution to
  match a crawler's IP address against the list of published Googlebot IP addresses.
Use command line tools
- Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.
- Verify that the domain name is either googlebot.com, google.com, or googleusercontent.com.
- Run a forward DNS lookup on the domain name retrieved in the first step, using the host command on the retrieved domain name.
- Verify that the returned address is the same as the original accessing IP address from your logs.
Example 1:

    host 66.249.66.1
    1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

    host crawl-66-249-66-1.googlebot.com
    crawl-66-249-66-1.googlebot.com has address 66.249.66.1

Example 2:

    host 35.247.243.240
    240.243.247.35.in-addr.arpa domain name pointer geo-crawl-35-247-243-240.geo.googlebot.com.

    host geo-crawl-35-247-243-240.geo.googlebot.com
    geo-crawl-35-247-243-240.geo.googlebot.com has address 35.247.243.240

Example 3:

    host 66.249.90.77
    77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.

    host rate-limited-proxy-66-249-90-77.google.com
    rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
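The same reverse-then-forward check can be scripted. The following is a minimal sketch that uses Python's standard `socket` module in place of the `host` command; the function name `is_google_crawler` and its `reverse` and `forward` parameters are hypothetical, introduced here so that the resolvers can be swapped out (for example, in tests or when caching results). It is an illustration of the four steps, not an official implementation.

```python
import ipaddress
import socket
from typing import Callable, List

# Domains named in the verification steps on this page.
GOOGLE_DOMAINS = ("googlebot.com", "google.com", "googleusercontent.com")


def _reverse_lookup(ip: str) -> str:
    # Step 1: PTR lookup; gethostbyaddr returns (hostname, aliases, addresses).
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    # Step 3: collect every IPv4/IPv6 address the hostname resolves to.
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def is_google_crawler(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Return True only if the reverse and forward lookups agree."""
    try:
        target = ipaddress.ip_address(ip)
        hostname = reverse(ip).rstrip(".").lower()
    except (ValueError, OSError):
        return False  # Not an IP address, or no PTR record.
    # Step 2: the name must be one of the domains or a subdomain of one.
    if not any(hostname == d or hostname.endswith("." + d) for d in GOOGLE_DOMAINS):
        return False
    try:
        addresses = forward(hostname)
    except OSError:
        return False  # The hostname does not resolve.
    # Step 4: compare parsed addresses so that textual variants still match.
    for candidate in addresses:
        try:
            if ipaddress.ip_address(candidate.split("%")[0]) == target:
                return True
        except ValueError:
            continue
    return False
```

Calling `is_google_crawler("66.249.66.1")` performs live DNS queries, so the result depends on your resolver and network. For large log files, consider caching results per IP address.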
Use automatic solutions
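Alternatively, you can identify Googlebot by IP address by matching the crawler's IP address
to the lists of Google crawlers' and fetchers' IP ranges:
- Common crawlers like Googlebot: googlebot.json
- Special crawlers like AdsBot: special-crawlers.json
- User-triggered fetches (users): user-triggered-fetchers.json
- User-triggered fetches (Google): user-triggered-fetchers-google.json
For other Google IP addresses from which your site may be accessed (for example,
Apps Script), match the accessing IP address against the general
list of Google IP addresses (https://www.gstatic.com/ipranges/goog.json).
Note that the IP addresses in the JSON files are represented in
CIDR format.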
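Matching an address against CIDR ranges can be sketched with Python's standard `ipaddress` module. This example assumes that each JSON file contains a top-level `prefixes` array whose entries carry either an `ipv4Prefix` or an `ipv6Prefix` key; check that assumption against the file you download. The function names and `SAMPLE_DOCUMENT` are hypothetical, and the sample is hand-written for illustration rather than copied from a published file. Downloading the file is left out.

```python
import ipaddress
import json
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hand-written sample in the assumed layout; not real published data.
SAMPLE_DOCUMENT = """
{
  "creationTime": "2025-01-01T00:00:00.000000",
  "prefixes": [
    {"ipv6Prefix": "2001:4860:4801:10::/64"},
    {"ipv4Prefix": "66.249.66.0/27"}
  ]
}
"""


def parse_prefixes(document: str) -> List[Network]:
    """Turn the CIDR strings of a ranges document into network objects."""
    data = json.loads(document)
    networks: List[Network] = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_ranges(ip: str, networks: Iterable[Network]) -> bool:
    """Return True if the address falls inside any of the networks."""
    address = ipaddress.ip_address(ip)
    return any(
        network.version == address.version and address in network
        for network in networks
    )


networks = parse_prefixes(SAMPLE_DOCUMENT)
print(ip_in_ranges("66.249.66.1", networks))   # inside 66.249.66.0/27
print(ip_in_ranges("203.0.113.5", networks))   # outside both sample ranges
```

Parsing the prefixes once and reusing the resulting list keeps the per-request cost low. Because the published ranges can change, refresh the downloaded files periodically.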
Last updated 2025-08-04 UTC.