How we fought Search spam on Google - Webspam Report 2019
Tuesday, June 9, 2020
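Every search matters. That is why, whenever you come to Google Search to find relevant and useful information, it is our ongoing commitment to make sure you receive the highest-quality results possible.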
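Unfortunately, the web contains some disruptive behaviors and content, which we call "webspam", that can degrade the experience for people looking for helpful information. We have a number of teams working to keep webspam out of your search results, and staying ahead of the spammers is a constant challenge. At the same time, we continue to engage with webmasters to ensure they are following best practices and can find success on Search, making great content available on the open web.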
Looking back at last year, here is a snapshot of how we fought spam on Search in 2019 and how we supported the webmaster community.
Fighting spam at scale
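With hundreds of billions of webpages in our index serving billions of queries every day, it is perhaps not surprising that bad actors keep trying to manipulate search ranking. In fact, we observed that more than 25 billion of the pages we discover each day are spammy. That is a lot of spam, and it shows the scale and persistence of spammers and the lengths they are willing to go to. We are very serious about making sure that your chance of encountering spammy pages in Search is as small as possible. Our efforts have helped ensure that more than 99% of visits from our results lead to experiences without spam.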
Updates from last year
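In 2018, we reported that we had reduced user-generated spam by 80%, and we are happy to confirm that this type of abuse did not grow in 2019. Link spam continued to be a popular form of spam, but our team succeeded in containing its impact in 2019. Our systems caught more than 90% of link spam, and techniques such as paid links or link exchanges have been made less effective.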
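Hacked spam, while still a commonly observed challenge, has been more stable than in previous years. We continued to work on solutions to better detect it, notify affected webmasters and platforms, and help them recover their hacked websites.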
Spam trends
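One of our top priorities in 2019 was improving our spam-fighting capabilities through machine learning systems. Our machine learning solutions, combined with our proven and time-tested manual enforcement capability, have been instrumental in identifying spammy results and preventing them from being served to users.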
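In the last few years, we have observed an increase in spammy sites with auto-generated and scraped content that behave in ways that annoy or harm searchers, such as fake buttons, overwhelming ads, suspicious redirects, and malware. These websites are often deceptive and offer no real value to people. In 2019, we reduced the impact of this type of spam on Search users by more than 60% compared to 2018.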
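As we improve our capability and efficiency in catching spam, we continue to invest in reducing broader types of harm, such as scams and fraud. These sites trick people into thinking they are visiting an official or authoritative site, and in many cases people end up disclosing sensitive personal information, losing money, or infecting their devices with malware. We have been paying close attention to queries that are prone to scams and fraud, and we have worked to stay ahead of spam tactics in those spaces to protect users.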
Working with webmasters and developers for a better web
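Much of our work against spam relies on automated systems that detect spammy behavior, but those systems are not perfect and cannot catch everything. As someone who uses Search, you can also help us fight spam and other issues by reporting spam on Search, phishing, or malware. We received nearly 230,000 reports of search spam in 2019, and we were able to take action on 82% of the reports we processed. We appreciate all the reports you sent us and your help in keeping search results clean!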
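So what do we do when we get those reports or identify that something is not quite right? An important part of our work is notifying webmasters when we detect something wrong with their website. In 2019, we generated more than 90 million messages to website owners to let them know about issues that may affect their site's appearance in Search results and about potential improvements they could implement. Of all those messages, about 4.3 million were related to manual actions resulting from violations of our Webmaster Guidelines.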
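We are also always looking for ways to better help site owners. Many initiatives in 2019 aimed at improving communications, such as the new Search Console messages, Site Kit for WordPress sites, and Auto-DNS verification in the new Search Console. We hope these initiatives have given webmasters more convenient ways to get their sites verified and will continue to be helpful. We also hope they provide quicker access to news and enable webmasters to fix webspam or hacking issues more effectively and efficiently.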
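While we focused deeply on cleaning up spam, we also kept up with the evolution of the web and rethought how we wanted to treat "nofollow" links. Originally introduced as a means to help fight comment spam and annotate sponsored links, the "nofollow" attribute has come a long way. But we are not stopping there. We believe it is time for it to evolve further, just as our spam-fighting capability has evolved. We introduced two new link attributes, rel="sponsored" and rel="ugc", which give webmasters additional ways to identify the nature of particular links to Google Search. Along with rel="nofollow", we began treating these as hints to incorporate for ranking purposes. We are very excited to see that these new rel attributes were well received and adopted by webmasters around the world!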
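To make the three link attributes concrete, here is a minimal Python sketch, not Google's implementation, that reads a snippet of HTML and lists which of the values `sponsored`, `ugc`, and `nofollow` each link declares. The sample markup, the example.com URLs, and the names `RelCollector` and `collect_link_hints` are hypothetical and exist only for illustration. The sketch assumes that `rel` holds a space-separated list of tokens, so a single link can carry more than one value.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the URLs are placeholders, not real pages.
SAMPLE_HTML = """
<a href="https://example.com/product" rel="sponsored">Paid placement</a>
<a href="https://example.com/forum-post" rel="ugc nofollow">Link left in a comment</a>
<a href="https://example.com/unvetted" rel="nofollow">Page we do not vouch for</a>
<a href="https://example.com/about">Ordinary editorial link</a>
"""

# The three rel values discussed in the post.
LINK_HINTS = ("sponsored", "ugc", "nofollow")


class RelCollector(HTMLParser):
    """Collects an (href, hints) pair for every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel is treated as a space-separated token list (assumption stated above).
        tokens = (attr_map.get("rel") or "").lower().split()
        hints = [token for token in tokens if token in LINK_HINTS]
        self.links.append((attr_map.get("href"), hints))


def collect_link_hints(html):
    """Return a list of (href, hints) tuples for all links in the markup."""
    parser = RelCollector()
    parser.feed(html)
    parser.close()
    return parser.links


for href, hints in collect_link_hints(SAMPLE_HTML):
    print(href, hints or ["(no hint)"])
```

Running the script prints one line per link with the hints it declares. How Google Search weighs these hints is not covered here; the sketch only shows one way a site owner might audit their own markup.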
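Engaging with the community
As always, we are grateful for all the opportunities we had last year to connect with webmasters around the world, helping them improve their presence in Search and hearing their feedback. We delivered more than 150 online office hours, online events, and offline events in many cities across the globe to a wide range of audiences, including SEOs, developers, online marketers, and business owners. Among those events, we have been delighted by the momentum behind our Webmaster Conferences, held in 35 locations across 15 countries and in 12 languages, including the first Product Summit in Mountain View. While we are not currently able to host in-person events, we look forward to more of these events and virtual touchpoints in the future.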
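Webmasters continued to find solutions and tips in our Webmasters Help Community, which saw more than 30,000 threads in 2019 in more than a dozen languages. On YouTube, we launched #AskGoogleWebmasters as well as series such as SEO mythbusting to make sure your questions get answered and your uncertainties get clarified.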
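We know that our journey toward a better web with you is ongoing, and we would love to continue it with you in the year to come! So do keep in touch on Twitter, YouTube, the blog, or the Help Community, or see us in person at one of our conferences near you!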
Posted by Cherry Prommawin, Search Relations, and Duy Nguyen, Search Quality Analyst