Do 404 errors hurt my site?
Monday, May 2, 2011
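So there you are, minding your own business, using Webmaster Tools to check out how awesome your site is... but wait! The Crawl errors page is full of 404 (Not found) errors! Is disaster imminent?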
Fear not, novice webmaster. Let's take a look at 404 errors and how they do (or do not) affect your site:
Q: Do the 404 errors reported in Webmaster Tools affect my site's ranking?
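A: 404 errors are a perfectly normal part of the web. The Internet is always changing: new content is born, old content dies, and when it dies it (ideally) returns a 404 HTTP response code. Search engines are aware of this; we have 404 errors on our own sites, and we find them all over the web. In fact, we actually prefer that, when you get rid of a page on your site, you make sure that it returns a proper 404 or 410 response code (rather than a soft 404). Keep in mind that in order for our crawler to see the HTTP response code of a URL, it has to be able to crawl that URL. If the URL is blocked by your robots.txt file, we won't be able to crawl it and see its response code. The fact that some URLs on your site no longer exist or return 404 errors does not affect how your site's other URLs (the ones that return 200 (Success) status codes) perform in our search results.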
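The robots.txt caveat can be checked offline before you rely on a 404 or 410 being seen. The following is a minimal sketch using Python's standard urllib.robotparser module; the robots.txt rules, the URLs, and the helper name is_crawlable are made-up examples, and Python's parser is not guaranteed to match Googlebot's own robots.txt handling in every detail, so treat the result as a rough first check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /old/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A removed page under /old/ is blocked, so a crawler could never see its 404.
blocked = is_crawlable(EXAMPLE_ROBOTS_TXT, "https://www.example.com/old/page")
# A page outside the blocked directory can be fetched and its status observed.
allowed = is_crawlable(EXAMPLE_ROBOTS_TXT, "https://www.example.com/awesome")
```

If a removed URL turns out to be blocked, the crawler has no way to learn that it now returns 404 or 410.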
Q: So 404 errors don't hurt my website at all?
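A: If some URLs on your site return 404, this fact alone does not hurt you or count against you in Google's search results. However, there may be other reasons to address certain types of 404 errors. For example, if some of the pages that return 404 are pages you actually care about, you should look into why we're seeing 404 errors when we crawl them! If you see a misspelling of a legitimate URL (www.example.com/awsome instead of www.example.com/awesome), it's likely that someone intended to link to you and simply made a typo. Instead of returning a 404, you could 301 redirect the misspelled URL to the correct URL and capture the intended traffic from that link. You can also make sure that, when users do land on a 404 page on your site, you help them find what they were looking for rather than just saying "404 Not found."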
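The typo redirect and the helpful 404 page can be sketched with Python's standard http.server module. This is an illustration under stated assumptions, not a recommended production setup: the tables TYPO_REDIRECTS and EXISTING_PAGES, the function route, and the class Handler are hypothetical names invented for this example, and a real site would normally configure this in its web server or CMS.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical site data: one real page and one known misspelling of it.
EXISTING_PAGES = {"/awesome": "<h1>Awesome</h1>"}
TYPO_REDIRECTS = {"/awsome": "/awesome"}

NOT_FOUND_BODY = (
    "<h1>404 Not found</h1>"
    "<p>Try the <a href='/'>home page</a> or the site search.</p>"
)


def route(path):
    """Return (status, extra_headers, body) for a request path."""
    if path in EXISTING_PAGES:
        return 200, {}, EXISTING_PAGES[path]
    if path in TYPO_REDIRECTS:
        # Permanent redirect from the misspelled URL to the correct one.
        return 301, {"Location": TYPO_REDIRECTS[path]}, ""
    # Helpful content for users, but still a real 404 status code.
    return 404, {}, NOT_FOUND_BODY


class Handler(BaseHTTPRequestHandler):
    """Serves the result of route() for GET requests."""

    def do_GET(self):
        status, headers, body = route(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, pass Handler to http.server.HTTPServer and call serve_forever(). Note that self.path includes any query string, which this sketch does not strip.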
Q: Tell me more about "soft 404" errors.
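A: A soft 404 is when a web server returns a response code other than 404 (or 410) for a URL that doesn't exist. A common example is when a site owner wants to return a pretty 404 page with helpful information for their users, and thinks that in order to serve content to users, they have to return a 200 response code. Not so! You can return a 404 response code while serving whatever content you want. Another example is when a site redirects any unknown URLs to its home page instead of returning 404 errors. Both of these cases can have negative effects on our understanding and indexing of your site, so we recommend making sure your server returns the proper response codes for nonexistent content. Keep in mind that just because a page says "404 Not Found" doesn't mean it's actually returning a 404 HTTP response code. Use the Fetch as Googlebot feature in Webmaster Tools to double-check. If you don't know how to configure your server to return the right response codes, check out your web host's help documentation.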
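The two soft 404 patterns described above can be expressed as a small classifier. This is a sketch under assumptions: it takes the status code and Location header of the first response for a URL you already know does not exist (redirects not followed), and the function name classify_missing_url and its labels are invented for this example. It is not how Google detects soft 404s.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def classify_missing_url(requested_url, status, location=None):
    """Classify a server's answer for a URL that is known not to exist.

    status is the HTTP status of the first response (redirects not followed);
    location is that response's Location header, if any.
    """
    if status in (404, 410):
        return "correct"
    if status == 200:
        # The page may say "Not found", but the status claims success.
        return "soft 404 (error page served as 200)"
    if status in REDIRECT_STATUSES and location:
        # Resolve relative Location values against the requested URL.
        target = urlsplit(urljoin(requested_url, location))
        if target.path in ("", "/"):
            return "soft 404 (redirect to home page)"
    return "needs review"
```

For example, a made-up URL such as https://www.example.com/no-such-page that answers 301 with Location: / falls into the second soft 404 category.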
Q: How do I know whether a URL should return 404, 301, or 410?
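A: When you remove a page from your site, think about whether that content is moving somewhere else, or whether you no longer plan to have that type of content on your site. If you're moving that content to a new URL, you should 301 redirect the old URL to the new URL. That way, when users come to the old URL looking for that content, they'll be automatically redirected to something relevant to what they were looking for. If you're getting rid of that content entirely and don't have anything on your site that would fill the same user need, then the old URL should return a 404 or 410. Currently Google treats 410 (Gone) the same as 404 (Not found), so it's immaterial to us whether you return one or the other.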
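The decision above reduces to one question per removed page: did the content move? A minimal sketch follows, assuming a hypothetical record of removed pages (REMOVED_PAGES) in which a new URL means "moved" and None means "gone for good"; the paths and the function name are invented for illustration.

```python
# Hypothetical record of pages removed from a site:
# a new URL means the content moved; None means it is gone for good.
REMOVED_PAGES = {
    "/2010/spring-sale": "/sales/spring",
    "/2009/old-contest": None,
}


def response_for_removed(path, gone_status=404):
    """Return (status, redirect_target) for a removed or unknown path."""
    if gone_status not in (404, 410):
        raise ValueError("gone_status must be 404 or 410")
    new_url = REMOVED_PAGES.get(path)
    if new_url:
        # Content moved: send users and crawlers to the new location.
        return 301, new_url
    # Content removed with no replacement (or the URL never existed).
    return gone_status, None
```

The gone_status parameter is there only because either 404 or 410 is acceptable for removed content.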
Q: Most of my 404 errors are for bizarre URLs that never existed on my site. What's up with that? Where did they come from?
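A: If Google finds a link somewhere on the web that points to a URL on your domain, it may try to crawl that link, whether any content actually exists there or not; and when it does, your server should return a 404 if there's nothing there to find. These links could be caused by someone making a typo when linking to you, by some type of misconfiguration (if the links are automatically generated, for example, by a CMS), or by Google's increased efforts to recognize and crawl links embedded in JavaScript or other embedded content. They may also be part of a quick check from our side to see how your server handles unknown URLs, to name just a few possibilities. If you see 404 errors reported in Webmaster Tools for URLs that don't exist on your site, you can safely ignore them. We don't know which URLs are important to you and which are supposed to return 404, so we show you all the 404 errors we found on your site and let you decide which, if any, require your attention.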
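Q: Someone has scraped my site and caused a bunch of 404 errors in the process. They're all "real" URLs with other code tacked on (for example, https://www.example.com/images/kittens.jpg" width="100" height="300" alt="kittens"/>). Will this hurt my site?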
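A: Generally you don't need to worry about "broken links" like this hurting your site. We understand that site owners have little to no control over people who scrape their site, or who link to them in strange ways. If you're a whiz with regular expressions, you could consider redirecting these URLs, but generally it's not worth worrying about. Remember that you can also file a takedown request when you believe someone is stealing original content from your website.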
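For readers who do want to try the regular-expression route, here is one possible sketch. It assumes the stray markup always begins at the first double quote after the real path, as in the kittens.jpg example, and that the request path arrives percent-encoded; the pattern TRAILING_MARKUP and the function redirect_target are hypothetical names for this illustration.

```python
import re
from urllib.parse import unquote

# Matches a clean path followed by a double quote and any leftover markup,
# e.g. /images/kittens.jpg" width="100" height="300" alt="kittens"/>
TRAILING_MARKUP = re.compile(r'^(?P<clean>/[^"\s]+)".*$')


def redirect_target(raw_path):
    """Return the clean path to 301 redirect to, or None if nothing is tacked on."""
    path = unquote(raw_path)  # request paths normally arrive percent-encoded
    match = TRAILING_MARKUP.match(path)
    return match.group("clean") if match else None
```

A server could answer 301 with the returned path when it is not None, and fall through to its normal handling otherwise.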
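Q: Last week I fixed all the 404 errors that Webmaster Tools reported, but they're still listed in my account. Does this mean I didn't fix them correctly? How long will it take for them to disappear?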
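A: Take a look at the "Detected" column on the Crawl errors page; this is the most recent date on which we detected each error. If the dates in that column are from before the time you fixed the errors, that means we haven't encountered these errors since that date. If the dates are more recent, it means we're continuing to see these 404 errors when we crawl.

After implementing a fix, you can check whether our crawler is seeing the new response code by using Fetch as Googlebot. Test a few URLs and, if they look good, these errors should soon start to disappear from your list of Crawl errors.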
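Before or alongside Fetch as Googlebot, a quick local spot check of a few URLs can catch obvious mistakes. The sketch below uses Python's standard http.client so that redirects are not followed and the first status code is reported as-is. It is only an approximation: it does not identify itself as Googlebot, and a server may answer different clients differently. The function names fetch_status and still_failing are invented for this example.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the HTTP status of the first response for url (no redirects followed)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def still_failing(urls, expected_statuses, fetch=fetch_status):
    """Return {url: status} for URLs whose status is not one of expected_statuses."""
    failures = {}
    for url in urls:
        status = fetch(url)
        if status not in expected_statuses:
            failures[url] = status
    return failures
```

The fetch parameter exists so the check can be exercised without network access, for instance by passing the get method of a dictionary of canned results.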
Q: Can I use Google's URL removal tool to make 404 errors disappear from my account faster?
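A: No. The URL removal tool removes URLs from Google's search results, not from your Webmaster Tools account. It's designed for urgent removal requests only, and using it isn't necessary when a URL already returns a 404, as such a URL will drop out of our search results naturally over time. See the bottom half of our blog post on URL removal (/search/blog/2010/05/url-removal-explained-part-iv-tracking) for more details on what the URL removal tool can and can't do for you.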
Still want to know more about 404 errors? Check out 404 week on our blog, or drop by our Webmaster Help Forum.
Posted by Susan Moskwa, Webmaster Trends Analyst