Page Summary
- Google Safe Browsing v5 prioritizes real-time protection and user privacy with features like a Global Cache and an optional Oblivious HTTP relay.
- It offers flexible operation modes: Real-Time, Local List, and No-Storage Real-Time, catering to various client needs and resource constraints.
- URLs undergo canonicalization and are checked against both local and remote databases for efficient threat detection using hash prefix matching.
- The system is free for non-commercial use and requires clear user warnings with specific language and attribution to Google.
- Google Safe Browsing v5 protocol employs URL canonicalization, hash prefix matching, and various caching mechanisms to efficiently identify and protect against online threats.
The Safe Browsing APIs let your client applications perform real-time or list-based URL checks against Google's constantly updated lists of unsafe web resources. Examples of unsafe web resources are social engineering sites (phishing and deceptive sites) and sites that host malware or unwanted software. Any URL found on a Safe Browsing list is considered unsafe.
To determine if a URL is on any of the Safe Browsing lists, clients can use either urls.search or hashes.search.
What's New?
Data Freshness
Traditionally, Safe Browsing clients periodically download threat lists used to match against potential threats. As both threat volumes and velocity have increased over time, these local threat lists have become less effective against modern threats.
To close this gap, we introduce the ability to shift from the allow-by-default protocol previously available in V4 to a check-by-default protocol. This is done by letting clients download a list of likely-benign sites referred to as the "Global Cache". If a URL is not found in the Global Cache, the client should perform a check with the API to determine if the URL is a threat.
This ability to perform checks by default, together with data freshness improvements in the service, provides faster, near-real-time protection against new threats.
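The check-by-default decision described above can be sketched as follows. This is a minimal illustration, not the client procedure itself: it assumes the Global Cache is held in memory as a set of SHA-256 full hashes, and that the input is a single, already-canonicalized URL expression. The function names and the cache contents are hypothetical.

```python
import hashlib
from typing import Set


def full_hash(expression: str) -> bytes:
    """SHA-256 digest of one already-canonicalized URL expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()


def needs_api_check(expression: str, global_cache: Set[bytes]) -> bool:
    """Check-by-default: only a Global Cache hit lets the client skip the API."""
    return full_hash(expression) not in global_cache


# Hypothetical Global Cache contents, for illustration only.
global_cache = {full_hash("example.com/")}

print(needs_api_check("example.com/", global_cache))          # False
print(needs_api_check("unknown.example/page", global_cache))  # True
```

Under allow-by-default, the same lookup would run against a list of known threats and a miss would mean "no check needed"; here a miss means "ask the service".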
IP Privacy
The Safe Browsing API uses IP addresses only for essential networking needs and for anti-DoS purposes.
We introduced a companion API known as the Safe Browsing Oblivious HTTP Gateway API to enable additional privacy guarantees. This uses Oblivious HTTP to hide end users' IP addresses from Google. It works by having a non-colluding third party handle an encrypted version of the user request and then forward it to Google. As a result, the third party only has access to the IP addresses, and Google only has access to the content of the request. The third party operates an Oblivious HTTP Relay (such as the service offered by Fastly), and Google operates the Oblivious HTTP Gateway. This is an optional companion API. When it is used in conjunction with Google Safe Browsing, end users' IP addresses are no longer sent to Google.
See the Safe Browsing Oblivious HTTP Gateway API documentation for additional details.
Search Methods
Let's go over the different methods available to perform real-time checks of URLs.
urls.search
Lets client applications send URLs to the Safe Browsing service to check whether there are any threats associated with the URLs.
Advantages
- Simple URL checks: You send a request with the actual URLs, and the server responds with the URLs with their associated threats (if any).
Drawbacks
- No URL Confidentiality: The request contains the raw URLs being checked.
If these advantages and drawbacks fit your requirements, consider using urls.search, since it's simple to use.
Review the Terms of Service for this method, as they differ from those for hashes.search.
hashes.search
Lets client applications check whether there are known threats in a set of URLs without revealing the actual URLs to the service. This is done by only providing the hash prefix of each URL. The response will contain the full hashes of known threats that share the hash prefix.
Advantages
- URL Confidentiality: Only a 4-byte prefix of the hashed URL is in the request.
- Compatibility: Because the client handles URL canonicalization and hashing, this method integrates seamlessly with modes that use a local database, such as Real-Time Mode and Local List Mode.
Drawbacks
- Complex URL checks: You need to know how to canonicalize URLs, create suffix/prefix expressions, and compute SHA256 hashes, both to make requests to this method and to compare against the local copies of the unsafe lists or the Global Cache.
If you need to prioritize URL confidentiality and are interested in using the Real-Time Mode or Local List Mode, then hashes.search is the recommended approach.
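The hashing side of hashes.search can be sketched as below. The sketch skips canonicalization and suffix/prefix expression generation entirely and starts from one already-prepared expression. It assumes the prefix is sent as URL-safe base64 without padding, which matches the shape of the `hashPrefixes=WwuJdQ` example on this page but is not confirmed here. All function names are hypothetical.

```python
import base64
import hashlib
from typing import Iterable


def hash_prefix(expression: str, length: int = 4) -> bytes:
    """First `length` bytes of the SHA-256 digest of a URL expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:length]


def encode_prefix(prefix: bytes) -> str:
    """URL-safe base64 without padding (assumed wire format)."""
    return base64.urlsafe_b64encode(prefix).rstrip(b"=").decode("ascii")


def decode_prefix(text: str) -> bytes:
    """Inverse of encode_prefix: restore padding, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def is_known_threat(expression: str, returned_full_hashes: Iterable[bytes]) -> bool:
    """Compare the local full hash with the full hashes in a response."""
    local = hashlib.sha256(expression.encode("utf-8")).digest()
    return local in set(returned_full_hashes)


prefix = hash_prefix("example.com/")
print(len(prefix), len(encode_prefix(prefix)))  # 4 6
```

Only the 4-byte prefix leaves the client. The final comparison against the returned full hashes happens locally, which is what keeps the raw URL confidential.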
The HashList Methods
These methods allow clients to download and store hashed versions of the unsafe lists and the Global Cache. Clients can use this information to determine whether or not they should perform a real-time search of the URLs in question.
These methods are essential to the operation of the Real-Time Mode and the Local List Mode.
For additional information on how to utilize these methods, see the Local Database page.
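As a rough illustration of how a downloaded unsafe list can gate a real-time search, the sketch below assumes the local list is stored as a set of 4-byte SHA-256 hash prefixes (the `mw-4b` list name used on this page suggests this size, but the storage format is an assumption). The function name and list contents are hypothetical; the Local Database page describes the actual format.

```python
import hashlib
from typing import Set


def should_query_service(expression: str, local_unsafe_prefixes: Set[bytes]) -> bool:
    """A local prefix hit is only a candidate match, so it triggers a search."""
    prefix = hashlib.sha256(expression.encode("utf-8")).digest()[:4]
    return prefix in local_unsafe_prefixes


# Hypothetical local list containing the prefix of one expression.
local_unsafe_prefixes = {hashlib.sha256(b"bad.example/").digest()[:4]}

print(should_query_service("bad.example/", local_unsafe_prefixes))  # True
```

A prefix hit does not prove the URL is unsafe, because unrelated URLs can share a 4-byte prefix. The follow-up search returns full hashes so the client can confirm or dismiss the match.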
Example Requests
This section documents some examples of directly using the HTTP API to access Safe Browsing. It is generally recommended to use a generated language binding because it will automatically handle encoding and decoding in a convenient way. Please refer to the documentation for that binding.
Example HTTP request using urls.search:
GET https://safebrowsing.googleapis.com/v5alpha1/urls:search?key=INSERT_YOUR_API_KEY_HERE&urls=testsafebrowsing.appspot.com/
Example HTTP request using hashes.search:
GET https://safebrowsing.googleapis.com/v5/hashes:search?key=INSERT_YOUR_API_KEY_HERE&hashPrefixes=WwuJdQ
Example HTTP request using hashLists.batchGet:
GET https://safebrowsing.googleapis.com/v5/hashLists:batchGet?key=INSERT_YOUR_API_KEY_HERE&names=se&names=mw-4b
All of the response bodies are protocol-buffer-formatted payloads that you can decode.
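If you are not using a generated binding, the request URLs above can be assembled with the Python standard library. This sketch only builds the URLs and does not send them or decode the protocol-buffer responses. The helper name `build_request_url` is hypothetical; the paths, parameters, and placeholder key are copied from the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "https://safebrowsing.googleapis.com"
API_KEY = "INSERT_YOUR_API_KEY_HERE"  # placeholder, as in the examples above


def build_request_url(version: str, method: str, params: dict) -> str:
    # doseq=True expands list values into repeated parameters (names=a&names=b).
    # safe="/" leaves slashes in parameter values unescaped.
    query = urlencode(params, doseq=True, safe="/")
    return f"{BASE_URL}/{version}/{method}?{query}"


urls_search = build_request_url(
    "v5alpha1", "urls:search",
    {"key": API_KEY, "urls": ["testsafebrowsing.appspot.com/"]},
)
hashes_search = build_request_url(
    "v5", "hashes:search",
    {"key": API_KEY, "hashPrefixes": ["WwuJdQ"]},
)
hash_lists_batch_get = build_request_url(
    "v5", "hashLists:batchGet",
    {"key": API_KEY, "names": ["se", "mw-4b"]},
)

print(hash_lists_batch_get)
```

Passing lists for `urls`, `hashPrefixes`, and `names` keeps the helper uniform for methods that accept repeated parameters.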