Keep redacted information out of Google Search
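When you publish documents and images on the web, you may unintentionally publish information that is not meant to be directly visible to the public. In particular, information that you might not see, or that was intended to be redacted, can be included in some document formats and be visible to search engines.
Because search engines index public material on the web, including images, content that is not completely redacted can be findable in search engines. Assistive technologies like screen readers can make this seemingly "hidden" content more easily accessible, and common image understanding techniques like optical character recognition (OCR) similarly make it possible to search for this content.
Putting text in a tiny font, using a font color that matches the background, or covering text with an image may make the content invisible to the human eye, but none of these methods actually redacts it. Search engines can still index the content, and users can still find it.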
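To illustrate why visual concealment is not redaction, here is a minimal sketch of a plain text extractor built on Python's standard `html.parser`. It is not how any search engine is implemented; it only shows that a program reading markup sees the text regardless of styling. The class name `TextExtractor`, the function `extract_text`, and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects every text node, ignoring all styling (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Styling such as font size or color is never consulted here.
        if data.strip():
            self.parts.append(data.strip())


def extract_text(html: str) -> str:
    """Return all text found in the markup, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Sample page: the second paragraph is "hidden" with tiny white-on-white text.
page = (
    "<p>Public summary</p>"
    '<p style="font-size:1px;color:#fff;background:#fff">'
    "Internal phone: 555-0100</p>"
)
print(extract_text(page))
```

The "hidden" paragraph is extracted along with the visible one, which is why the text has to be deleted from the file rather than styled away.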
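Similarly, some document types include information in ways that are not immediately visible. They might include the document's change history, which lets users see text that has been redacted or altered. They might retain the full versions of images that were cropped or redacted. A file might also contain metadata that is not immediately visible but that lists the names of people who accessed or edited it.
All of this information can remain even when a document is exported or converted from one format to another. If you need to remove information from a file, make sure it is removed completely before the file is made public.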
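As one way to check for author names before publishing, the following sketch reads the core properties of a .docx file using only the Python standard library. It assumes the file is an Office Open XML package, where `docProps/core.xml` can hold the creator and last-modified-by fields; other formats store metadata differently, and this check does not cover change history or embedded images. The function name `read_author_metadata` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_author_metadata(path_or_file):
    """Return creator / last-modified-by names found in a .docx package."""
    with zipfile.ZipFile(path_or_file) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))

    found = {}
    fields = (("creator", "dc:creator"), ("last_modified_by", "cp:lastModifiedBy"))
    for label, tag in fields:
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

An empty result only means these two fields are absent; it is not evidence that the file is free of other identifying data.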
The best practices below describe how to properly redact information from documents so that it is neither indexed nor discoverable through Google Search.
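Edit and export images before embedding them
Google Search lists images that it finds across the web, whether they appear on web pages or are embedded in various document formats. Embedded images are sometimes edited using only the editing tools of the containing document. When an image is indexed separately from the document, that redaction can fail. For this reason, edit images before embedding them into a document, not after. In particular:
- Crop out unwanted information from images before embedding them into documents. Some document editing tools (such as word processors or slide creation tools) keep the uncropped version of any image that you use in the public version of the document, so review the tool's documentation thoroughly.
- Completely remove or obscure any text or other non-public parts of the image, because OCR systems may turn any text they find in an image into searchable text.
- Remove any undesired metadata.
After making these edits, export or save the updated images in a non-vector or flattened image file format, such as PNG or WEBP. This prevents the removed parts of the images from being inadvertently included in a public document.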
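The metadata step above can be sketched for PNG files with the Python standard library alone. This is a minimal illustration, not a complete redaction tool: it keeps only an assumed allowlist of chunks needed to render the pixels (`IHDR`, `PLTE`, `IDAT`, `IEND`, `tRNS`), drops everything else (including text chunks and color profiles, which may slightly change how colors display), and discards any bytes stored after `IEND`. It does not validate checksums and does nothing about sensitive content in the pixels themselves. The function name `strip_png_metadata` is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumed allowlist: chunks required to decode the image, plus transparency.
KEEP = {b"IHDR", b"PLTE", b"IDAT", b"IEND", b"tRNS"}


def strip_png_metadata(data: bytes) -> bytes:
    """Return a copy of the PNG containing only allowlisted chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")

    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated PNG chunk")
        if chunk_type in KEEP:
            out.append(data[pos:end])
        pos = end
        if chunk_type == b"IEND":
            break  # anything after IEND is not part of the image
    return b"".join(out)
```

Because kept chunks are copied byte for byte, their existing checksums stay valid and the pixel data is unchanged.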
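Edit or remove unwanted text before moving to a public file format
Before you generate the public document, remove any text that you don't want displayed in the final version of the file. Then convert the document to a public format that does not keep your previous change history. Here are more specific tips:
- Use proper document redaction tools if a file needs to have information redacted. For example, avoid placing black rectangles over text as a redaction method, because the text can still be included in the public document.
- Double-check the document metadata in the public file.
- Follow the document redaction best practices for the format that you are using (PDF, image, and so on).
- Consider information in the URL or file name itself. Even if part of a website is blocked by robots.txt, its URLs may be indexed in Search (without their content). Use hashes in URL parameters instead of email addresses or names.
- Consider using authentication to limit access to the redacted content. Serve the resulting login page with a noindex robots meta tag to block indexing.
- When publishing, make sure that the website is verified in Google Search Console. This allows quick removal action, if needed.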
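URLs and file names can be indexed on their own, so personal data is best kept out of them. One way to do that is sketched below: an email address in a URL parameter is replaced with a token derived from it. The sketch uses a keyed hash (HMAC-SHA256) rather than a plain hash, which is a choice made for this example, because a plain hash of an email address can be matched by hashing candidate addresses. The function names, the `uid` parameter, and the example URL are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def opaque_id(value: str, secret_key: bytes) -> str:
    """Derive a stable hex token from a personal identifier."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def build_document_url(base_url: str, email: str, secret_key: bytes) -> str:
    """Build a URL that carries the token instead of the email address."""
    query = urlencode({"uid": opaque_id(email, secret_key)})
    return f"{base_url}?{query}"


# Hypothetical usage; in practice the key would come from a secret store.
url = build_document_url(
    "https://example.com/report", "alice@example.com", b"replace-with-a-real-secret"
)
print(url)
```

The same input and key always produce the same token, so existing lookups keep working while the address itself never appears in the URL.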
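What to do if unredacted or improperly redacted documents are indexed in Search
- Remove the live document from the website or location where you published it.
- Use the Removals tool for the verified site to remove the documents in question from Google Search. Use a URL prefix if you need to remove many documents. For verified sites, a URL removal generally takes less than a day. This prevents the document from appearing in searches for the redacted content.
- Host the properly redacted document under a different URL. Because recrawling URLs and updating them in the search index can take some time, this ensures that any newly indexed version is the new document and not an older one. Update any links that point to these documents.
- Contact the webmasters of any other sites that may also be hosting the improperly redacted documents and ask them to take the documents down. Ask them to use the Removals tool in their Search Console account, or use the Outdated Content tool yourself to ask Google's systems to update the search results.
- Allow the URL removal requests to expire. A request expires once the URL has been updated in the Google Search index, or about 6 months after the request was submitted.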
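Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-04 UTC.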