AES-GCM-HKDF Streaming AEAD
This document formally defines the mathematical function represented by AES-GCM-HKDF Streaming keys, encoded in proto format as type.googleapis.com/google.crypto.tink.AesGcmHkdfStreamingKey.
This encryption is loosely based on HRRV15 [1]. For a security analysis, see HS20 [2].
Key and parameters
Keys are described by the following parts (all sizes in this document are in bytes):
- \(\mathrm{KeyValue}\), a byte string.
- \(\mathrm{CiphertextSegmentSize} \in \{1, 2, \ldots, 2^{31}-1\}\).
- \(\mathrm{DerivedKeySize} \in \{16, 32\}\).
- \(\mathrm{HkdfHashType} \in \{\mathrm{SHA1}, \mathrm{SHA256}, \mathrm{SHA512}\}\).
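Valid keys additionally satisfy the following properties:
- \(\mathrm{len}(\mathrm{KeyValue}) \geq \mathrm{DerivedKeySize}\).
- \(\mathrm{CiphertextSegmentSize} > \mathrm{DerivedKeySize} + 24\) (this equals \(\mathrm{len}(\mathrm{Header}) + 16\), as explained later).
Tink rejects keys that violate any of these properties, either when the key is parsed or when the corresponding primitive is created.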
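The validity rules above can be sketched as a small check. This is only an illustration: the `StreamingKey` container, its field names, and `validate_key` are hypothetical and are not Tink's API.

```python
from dataclasses import dataclass

TAG_SIZE = 16          # size of the AES-GCM tag, in bytes
NONCE_PREFIX_SIZE = 7  # size of NoncePrefix, in bytes


@dataclass(frozen=True)
class StreamingKey:
    """Hypothetical container for the key parts listed above."""
    key_value: bytes
    ciphertext_segment_size: int
    derived_key_size: int
    hkdf_hash_type: str  # "SHA1", "SHA256" or "SHA512"


def header_length(derived_key_size: int) -> int:
    # One length byte, then Salt (DerivedKeySize bytes), then NoncePrefix.
    return 1 + derived_key_size + NONCE_PREFIX_SIZE


def validate_key(key: StreamingKey) -> None:
    """Raise ValueError if the key violates any property from this section."""
    if key.derived_key_size not in (16, 32):
        raise ValueError("DerivedKeySize must be 16 or 32")
    if key.hkdf_hash_type not in ("SHA1", "SHA256", "SHA512"):
        raise ValueError("unsupported HkdfHashType")
    if not 1 <= key.ciphertext_segment_size <= 2**31 - 1:
        raise ValueError("CiphertextSegmentSize out of range")
    if len(key.key_value) < key.derived_key_size:
        raise ValueError("KeyValue is shorter than DerivedKeySize")
    # Equivalent to CiphertextSegmentSize > len(Header) + 16.
    minimum = header_length(key.derived_key_size) + TAG_SIZE
    if key.ciphertext_segment_size <= minimum:
        raise ValueError("CiphertextSegmentSize is too small")
```

The last check guarantees that the first ciphertext segment has room for the header, the tag, and at least one byte of plaintext.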
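Encryption function
To encrypt a message \(\mathrm{Msg}\) with associated data \(\mathrm{AssociatedData}\), we create a header, split the message into segments, encrypt each segment, and concatenate the encrypted segments.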
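Create the header
We pick a uniformly random string \(\mathrm{Salt}\) of length \(\mathrm{DerivedKeySize}\) and a uniformly random string \(\mathrm{NoncePrefix}\) of length 7.
We then set \(\mathrm{Header} := \mathrm{len}(\mathrm{Header}) \| \mathrm{Salt} \| \mathrm{NoncePrefix}\), where the length of the header is encoded as a single byte. Note that \(\mathrm{len}(\mathrm{Header}) = 1 + \mathrm{DerivedKeySize} + 7 \in \{24, 40\}\).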
Next, we use HKDF [3] with the hash function given by \(\mathrm{HkdfHashType}\) and inputs \(\mathrm{ikm} := \mathrm{KeyValue}\), \(\mathrm{salt} := \mathrm{Salt}\), and \(\mathrm{info} := \mathrm{AssociatedData}\), with output length \(\mathrm{DerivedKeySize}\). We call the result \(\mathrm{DerivedKey}\).
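A minimal sketch of the header construction and key derivation described above, using only the Python standard library. The function names `hkdf`, `make_header`, and `derive_key` are hypothetical; `hkdf` follows the extract-then-expand steps of RFC 5869 and is not Tink's implementation.

```python
import hashlib
import hmac
import os

# Hypothetical mapping from HkdfHashType to hashlib algorithm names.
HASH_NAMES = {"SHA1": "sha1", "SHA256": "sha256", "SHA512": "sha512"}


def hkdf(hash_name: str, ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF as in RFC 5869: extract a PRK, then expand it to `length` bytes."""
    digest_size = hashlib.new(hash_name).digest_size
    if length > 255 * digest_size:
        raise ValueError("requested output is too long")
    prk = hmac.new(salt, ikm, hash_name).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                       # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def make_header(derived_key_size: int):
    """Return (Header, Salt, NoncePrefix) with fresh random Salt and NoncePrefix."""
    salt = os.urandom(derived_key_size)
    nonce_prefix = os.urandom(7)
    header_len = 1 + derived_key_size + 7          # 24 or 40
    header = bytes([header_len]) + salt + nonce_prefix
    return header, salt, nonce_prefix


def derive_key(key_value: bytes, hkdf_hash_type: str, derived_key_size: int,
               salt: bytes, associated_data: bytes) -> bytes:
    """DerivedKey := HKDF(ikm=KeyValue, salt=Salt, info=AssociatedData)."""
    return hkdf(HASH_NAMES[hkdf_hash_type], key_value, salt,
                associated_data, derived_key_size)
```

Because \(\mathrm{AssociatedData}\) enters only through the HKDF info input, a different associated data value yields a different \(\mathrm{DerivedKey}\), which is why the per-segment AES-GCM calls can use empty associated data.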
Split the message
The message \(\mathrm{Msg}\) is next split into parts: \(\mathrm{Msg} = M_0 \| M_1 \| \cdots \| M_{n-1}\).
Their lengths are chosen to satisfy:
- \(\mathrm{len}(M_0) \in \{0,\ldots, \mathrm{CiphertextSegmentSize} - \mathrm{len}(\mathrm{Header}) - 16\}\).
- If \(n>1\), then \(\mathrm{len}(M_1), \ldots, \mathrm{len}(M_{n-1}) \in \{1,\ldots, \mathrm{CiphertextSegmentSize} - 16\}\).
- If \(n>1\), then \(\mathrm{len}(M_{0}), \ldots, \mathrm{len}(M_{n-2})\) must have maximal length according to the above two constraints.
\(n\) may be at most \(2^{32}\); otherwise, encryption fails.
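The splitting rule can be sketched as follows. The function name `split_message` is hypothetical, and the sketch assumes the key has already been validated, so that the first segment has room for at least one byte.

```python
TAG_SIZE = 16  # size of the AES-GCM tag, in bytes


def split_message(msg: bytes, ciphertext_segment_size: int, header_len: int) -> list:
    """Split msg into M_0, ..., M_{n-1} following the length rules above."""
    first_capacity = ciphertext_segment_size - header_len - TAG_SIZE
    other_capacity = ciphertext_segment_size - TAG_SIZE
    if first_capacity < 1:
        raise ValueError("CiphertextSegmentSize is too small for this header")

    # M_0 always exists, even for an empty message.
    segments = [msg[:first_capacity]]
    pos = first_capacity
    # Later segments exist only while plaintext remains, so each has length >= 1,
    # and every segment except the last is filled to its maximal length.
    while pos < len(msg):
        segments.append(msg[pos:pos + other_capacity])
        pos += other_capacity

    if len(segments) > 2**32:
        raise ValueError("too many segments; encryption fails")
    return segments
```

For example, with \(\mathrm{CiphertextSegmentSize} = 64\) and \(\mathrm{len}(\mathrm{Header}) = 24\), a 100-byte message is split into segments of 24, 48, and 28 bytes.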
Encrypt the blocks
To encrypt segment \(M_i\), we compute \(\mathrm{IV}_i := \mathrm{NoncePrefix} \| \mathrm{i} \| b\), where \(\mathrm{i}\) is the index \(i\) encoded as 4 bytes in big-endian order, and the byte \(b\) is 0x00 if \(i < n-1\) and 0x01 otherwise.
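We then encrypt \(M_i\) using AES-GCM [4], where the key is \(\mathrm{DerivedKey}\), the initialization vector is \(\mathrm{IV}_i\), and the associated data is the empty string. \(C_i\) is the result of this encryption (i.e. the concatenation of \(C\) and \(T\) in section 5.2.1.2 of the AES-GCM reference).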
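The IV construction described above can be sketched as a short helper. The name `segment_iv` is hypothetical. The AES-GCM call itself is left out because it needs a cryptography library outside the standard library.

```python
import struct


def segment_iv(nonce_prefix: bytes, i: int, is_last: bool) -> bytes:
    """Build IV_i = NoncePrefix || i (4 bytes, big-endian) || b."""
    if len(nonce_prefix) != 7:
        raise ValueError("NoncePrefix must be 7 bytes")
    if not 0 <= i < 2**32:
        raise ValueError("segment index must fit in 4 bytes")
    # b is 0x01 only for the final segment (i == n - 1), 0x00 otherwise.
    last_flag = b"\x01" if is_last else b"\x00"
    return nonce_prefix + struct.pack(">I", i) + last_flag
```

The result is always 7 + 4 + 1 = 12 bytes. Since the final-segment flag is part of the IV, a segment encrypted as a middle segment does not authenticate as a last segment, and vice versa.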
Concatenate the encrypted segments
Finally, all segments are concatenated as \(\mathrm{Header} \| C_0 \| \cdots \| C_{n-1}\), which is the final ciphertext.
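Decryption
Decryption inverts the encryption. We use the header to obtain \(\mathrm{Salt}\) (from which \(\mathrm{DerivedKey}\) is re-derived) and \(\mathrm{NoncePrefix}\), and decrypt each segment of ciphertext individually.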
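APIs may (and typically do) allow random access, or access to the beginning of a file without inspecting the end of the file. This is intentional, since it is possible to decrypt \(M_i\) from \(C_i\) without decrypting all previous and remaining ciphertext blocks.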
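Random access works because segment boundaries in the ciphertext are predictable. From the length rules above, \(\mathrm{Header} \| C_0\) occupies exactly \(\mathrm{CiphertextSegmentSize}\) bytes when \(M_0\) is maximal, and so does every later \(C_i\) except possibly the last. The sketch below locates \(C_i\) under that assumption; the function name `ciphertext_segment_range` is hypothetical.

```python
def ciphertext_segment_range(i: int, ciphertext_segment_size: int,
                             header_len: int, ciphertext_len: int):
    """Return (start, end, is_last) for C_i within the full ciphertext.

    Assumes a well-formed ciphertext, so every segment except the last
    is filled to its maximal length.
    """
    if i < 0:
        raise IndexError("segment index must be non-negative")
    # C_0 follows the header; C_i for i >= 1 starts at a multiple of the
    # segment size because Header || C_0 fills the first full slot.
    start = header_len if i == 0 else i * ciphertext_segment_size
    if start >= ciphertext_len:
        raise IndexError("segment index is past the end of the ciphertext")
    end = min((i + 1) * ciphertext_segment_size, ciphertext_len)
    # The segment that reaches the end of the ciphertext is the last one,
    # so it must be decrypted with b = 0x01 in its IV.
    return start, end, end == ciphertext_len
```

For example, with \(\mathrm{CiphertextSegmentSize} = 64\), \(\mathrm{len}(\mathrm{Header}) = 24\), and a 100-byte message, the ciphertext is 24 + 100 + 3·16 = 172 bytes long, and the three segments occupy the byte ranges [24, 64), [64, 128), and [128, 172).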
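However, APIs should be careful not to let users confuse end-of-file errors with decryption errors: in both cases the API probably has to return an error, and ignoring the difference can allow an adversary to effectively truncate files.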
Serialization and parsing of keys
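To serialize a key in the "Tink Proto" format, we first map the parameters in the obvious way into the proto given in aes_gcm_hkdf_streaming.proto. The field version needs to be set to 0. We then serialize this using normal proto serialization, and embed the resulting string in the value field of a KeyData proto. We set the type_url field to type.googleapis.com/google.crypto.tink.AesGcmHkdfStreamingKey. We then set key_material_type to SYMMETRIC, and embed this into a keyset. We usually set output_prefix_type to RAW. The exception is that if the key was parsed with a different value set for output_prefix_type, Tink may write either RAW or the previous value.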
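To parse a key, we reverse the above process (in the usual way when parsing protos). The field key_material_type is ignored. The value of output_prefix_type can either be ignored, or keys whose output_prefix_type differs from RAW can be rejected. Keys whose version differs from 0 must be rejected.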
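The parsing rules can be summarized as a small check on already-decoded fields. This is a sketch only: `check_parsed_key` and its `strict_prefix` flag are hypothetical, proto decoding is assumed to have happened elsewhere, and the type_url comparison is an assumption that follows from reversing the serialization steps.

```python
TYPE_URL = "type.googleapis.com/google.crypto.tink.AesGcmHkdfStreamingKey"


def check_parsed_key(type_url: str, version: int, output_prefix_type: str,
                     strict_prefix: bool = False) -> None:
    """Apply the parsing rules above to fields decoded from the protos.

    key_material_type is deliberately not a parameter: it is ignored.
    """
    if type_url != TYPE_URL:
        raise ValueError("unexpected type_url")
    # Keys with a version other than 0 must be rejected.
    if version != 0:
        raise ValueError("unsupported key version")
    # A parser may ignore output_prefix_type, or reject anything but RAW.
    if strict_prefix and output_prefix_type != "RAW":
        raise ValueError("output_prefix_type must be RAW")
```

With `strict_prefix=False` the function models a parser that ignores output_prefix_type; with `strict_prefix=True` it models one that rejects non-RAW keys. Both behaviours are permitted by the text above.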
Known issues
Implementations of the above encryption function are not expected to be fork safe. See Fork Safety.
References
1. Hoang, Reyhanitabar, Rogaway, Vizar, 2015. Online authenticated-encryption and its nonce-reuse misuse-resistance. CRYPTO 2015. https://eprint.iacr.org/2015/189
2. Hoang, Shen, 2020. Security of Streaming Encryption in Google's Tink Library. https://eprint.iacr.org/2020/1019
3. RFC 5869. HMAC-based Extract-and-Expand Key Derivation Function (HKDF). https://www.rfc-editor.org/rfc/rfc5869
4. NIST SP 800-38D. Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC. https://csrc.nist.gov/pubs/sp/800/38/d/final
Last updated (UTC): 2025-07-25.