Managing crawling of faceted navigation URLs
Faceted navigation is a common website feature that lets visitors change how items
(for example, products, articles, or events) are displayed on a page. It's a popular and useful
feature, but its most common implementation, which is based on URL parameters, can generate
an infinite URL space that harms the site in a couple of ways:
Overcrawling: Because the URLs created for faceted navigation appear to be
novel, and crawlers can't determine whether a URL will be useful without crawling it
first, crawlers will typically access a very large number of faceted navigation URLs before
their processes determine that those URLs are in fact useless.
Slower discovery crawls: Stemming from the previous point, if crawling effort is spent
on useless URLs, crawlers have less time to spend on new, useful URLs.
A typical faceted navigation URL may contain various parameters in the query string related to the
properties of the items it filters for. For example:
https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny
Changing any of the URL parameters products, color, and
size shows a different set of items on the underlying page. This often means a
very large number of possible filter combinations, which translates to a very large number of
possible URLs. To save your resources, we recommend dealing with these URLs in one of the following
ways:
If you don't need the faceted navigation URLs potentially indexed, prevent crawling of these
URLs.
If you need the faceted navigation URLs potentially indexed, ensure that the URLs follow the
best practices outlined in the following section. Keep in mind that crawling faceted URLs tends
to cost sites large amounts of computing resources due to the sheer number of URLs and the
operations needed to render those pages.
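To see how quickly the URL space grows, here's a short sketch (the facet values are hypothetical) counting the distinct URLs a small facet set can generate when every filter is optional:

```python
from math import prod

# Hypothetical facet values; real sites often have far more.
facets = {
    "products": ["fish", "plants", "rocks"],
    "color": ["radioactive_green", "blue", "red", "clear"],
    "size": ["tiny", "small", "large"],
}

# Each facet is optional, so every facet contributes (values + 1) choices.
options_per_facet = [len(values) + 1 for values in facets.values()]
total_urls = prod(options_per_facet)

print(total_urls)  # 4 * 5 * 4 = 80 distinct crawlable URLs
```

Just ten filter values already yield 80 crawlable URLs; each additional facet multiplies the space further.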
Prevent crawling of faceted navigation URLs
If you want to save server resources and you don't need your faceted navigation URLs to show up in
Google Search, you can prevent crawling of these URLs in one of the following ways.
Use robots.txt to disallow crawling of faceted navigation URLs. Oftentimes
there's no good reason to allow crawling of filtered items, as it consumes server resources for
little or no benefit. Instead, allow crawling of just the individual items' pages, along with a
dedicated listing page that shows all products without filters applied.
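For example, a robots.txt file along these lines disallows the filter parameters while still allowing a dedicated unfiltered listing page:

```
user-agent: Googlebot
disallow: /*?*products=
disallow: /*?*color=
disallow: /*?*size=
allow: /*?products=all$
```

Alternatively, use URL fragments to specify filters. Google Search generally doesn't support URL fragments in crawling and indexing, so a filtering mechanism based on URL fragments has no impact on crawling (positive or negative). For example, instead of URL parameters, use https://example.com/items.shtm#products=fish&color=radioactive_green&size=tiny.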
Other ways to signal a preference for which faceted navigation URLs should (not) be crawled are
the rel="canonical" link element and the rel="nofollow" anchor
attribute. However, these methods are generally less effective in the long term than the
previously mentioned methods.
Using rel="canonical"
to specify which URL is the canonical version of a faceted navigation URL
may, over time, decrease the crawl volume of the non-canonical versions of those URLs. For example,
if you have 3 filtered page types, consider pointing the rel="canonical" to the
unfiltered version:
https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny
specifies <link rel="canonical" href="https://example.com/items.shtm?products=fish" >.
Using
the rel="nofollow"
attribute on anchors pointing to filtered results pages
may be beneficial, but keep in mind that every anchor pointing to a specific URL must have
the rel="nofollow" attribute for it to be effective.
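A minimal sketch of such an anchor, reusing the document's example URL:

```html
<!-- Every anchor to this filtered URL needs rel="nofollow" for the hint to be effective -->
<a href="https://example.com/items.shtm?products=fish&color=radioactive_green"
   rel="nofollow">Radioactive green fish</a>
```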
Ensure the faceted navigation URLs are optimal for the web
If you need your faceted navigation URLs to be potentially crawled and indexed, make sure you're
following these best practices to minimize the negative effects of crawling the large number of
potential URLs on your site. Keep in mind that having these URLs crawled means increased resource
usage on your server and, potentially, slower discovery of new URLs on your site.
Use the industry-standard URL parameter separator '&'. Characters
like the comma (,), semicolon (;), and brackets ([ and
]) are hard for crawlers to detect as parameter separators (because most often they're
not separators).
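The difference shows up directly in standard URL parsers; a quick sketch using Python's urllib (the comma-delimited variant is a hypothetical non-standard URL):

```python
from urllib.parse import parse_qs, urlparse

# Standard '&' separators: each filter is parsed as its own parameter.
standard = "https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny"
print(parse_qs(urlparse(standard).query))
# {'products': ['fish'], 'color': ['radioactive_green'], 'size': ['tiny']}

# Comma-delimited "parameters": everything collapses into one opaque value.
nonstandard = "https://example.com/items.shtm?products=fish,color:radioactive_green,size:tiny"
print(parse_qs(urlparse(nonstandard).query))
# {'products': ['fish,color:radioactive_green,size:tiny']}
```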
If you're encoding filters in the URL path, such as
/products/fish/green/tiny,
ensure that the logical order of the filters always stays the same and that no
duplicate filters can exist.
Return an HTTP 404 status code when a filter combination doesn't return
results.
If there are no green fish in the site's inventory, users as well as crawlers should receive a
"not found" error with the proper HTTP status code (404). This should also be the case
if the URL contains duplicate filters or otherwise nonsensical filter combinations, as well as
nonexistent pagination URLs. Similarly, if a filter combination has no results, don't redirect
to a generic "not found" error page. Instead, serve the "not found" error with the 404
HTTP status code at the URL where it was encountered. If you have a single-page app, this might
not be possible; follow the best practices for single-page apps.
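The no-results rule can be sketched as a small handler (the inventory and function are hypothetical):

```python
from http import HTTPStatus

# Hypothetical inventory; a real site would query its catalog.
INVENTORY = [
    {"products": "fish", "color": "radioactive_green", "size": "tiny"},
    {"products": "fish", "color": "blue", "size": "large"},
]

def filtered_listing(filters: dict[str, str]) -> tuple[int, list[dict]]:
    """Return (HTTP status, matching items) for a filter combination.
    Empty results get a 404 served at the requested URL rather than a
    redirect to a generic error page."""
    matches = [item for item in INVENTORY
               if all(item.get(k) == v for k, v in filters.items())]
    if not matches:
        return HTTPStatus.NOT_FOUND, []  # no green fish in stock -> 404
    return HTTPStatus.OK, matches
```

With this shape, crawlers hitting nonsensical filter combinations receive the 404 directly instead of discovering yet another 200-status URL.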