idnits 2.17.1 draft-ietf-avt-srtp-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: As the rollover counter is 32 bits long, the maximum number of packets in any given SRTP session is 2^48 = 281,474,976,710,656. 
After that number of SRTP packets have been sent, the sender MUST not send any more packets with that cryptographic context. This limitation enforces a security benefit by providing an upper bound on the amount of traffic that can pass before cryptographic keys are changed. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: where TS (Timestamp, 32 bits), SEQ (Sequence Number, 16 bits), M (Marker Bit, 1 bit), PT (Payload Type, 7 bits), and SSRC (Synchronization Source, 32 bits) are taken from the current RTP header. ROC is the 32-bit rollover counter from the identified context. FLAG is a 8-bit value which is used to signal additional information. Currently, the only value defined (for RTP) is FLAG = 00..0. The value 00..01 is reserved for RTCP and MUST not be used with RTP. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: FLAG is a 8-bit value which is used to signal additional information. Currently, the only value defined (for RTCP) is FLAG = 00..01. The value 0..0 is reserved for RTP and MUST not be used for RTCP. This allows to use the same key for related RTP and RTCP flows (being the IV unique). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. 
(See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 2001) is 8471 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'BR98' is mentioned on line 533, but not defined == Missing Reference: 'B96' is mentioned on line 934, but not defined == Missing Reference: 'Bi96' is mentioned on line 955, but not defined == Unused Reference: 'ES3E' is defined on line 1140, but no explicit reference was found in the text == Unused Reference: 'LRW00' is defined on line 1158, but no explicit reference was found in the text == Unused Reference: 'R92' is defined on line 1177, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'AES' -- Possible downref: Non-RFC (?) normative reference: ref. 'BCNN00' -- Possible downref: Non-RFC (?) normative reference: ref. 'BF00' -- Possible downref: Non-RFC (?) normative reference: ref. 'C99' -- Possible downref: Non-RFC (?) normative reference: ref. 'ES3D' -- Possible downref: Non-RFC (?) normative reference: ref. 'ES3E' -- Possible downref: Non-RFC (?) normative reference: ref. 'HAC' -- Possible downref: Non-RFC (?) normative reference: ref. 'H80' ** Obsolete normative reference: RFC 2401 (ref. 'KA98a') (Obsoleted by RFC 4301) -- Possible downref: Non-RFC (?) normative reference: ref. 'KBHHKR00' -- Possible downref: Non-RFC (?) normative reference: ref. 'LRW00' -- Possible downref: Non-RFC (?) normative reference: ref. 'M00' -- Possible downref: Non-RFC (?) normative reference: ref. 'MF00' -- Possible downref: Non-RFC (?) normative reference: ref. 'MF00b' -- Possible downref: Non-RFC (?) normative reference: ref. 'R92' -- Possible downref: Non-RFC (?) 
normative reference: ref. 'RC94' -- Possible downref: Non-RFC (?) normative reference: ref. 'RC98' == Outdated reference: A later version (-05) exists of draft-rescorla-sec-cons-00 -- Possible downref: Normative reference to a draft: ref. 'RK99' -- Possible downref: Non-RFC (?) normative reference: ref. 'S96' ** Obsolete normative reference: RFC 1889 (ref. 'SCFJ96') (Obsoleted by RFC 3550) Summary: 7 errors (**), 0 flaws (~~), 11 warnings (==), 20 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force Rolf Blom, Ericsson 3 AVT Working Group Elisabetta Carrara, Ericsson 4 INTERNET-DRAFT David A. McGrew, Cisco 5 Expires: July 2001 Mats Naslund, Ericsson 6 Karl Norrman, Ericsson 7 David Oran, Cisco 9 February 2001 11 The Secure Real Time Transport Protocol 12 14 Status of this memo 16 This document is an Internet-Draft and is in full conformance with 17 all provisions of Section 10 of RFC2026. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that other 21 groups may also distribute working documents as Internet-Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or cite them other than as "work in progress". 
28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 Abstract 36 This document describes the Secure Real Time Transport Protocol 37 (SRTP), a profile of the Real Time Transport Protocol (RTP) which can 38 provide privacy, message authentication, replay protection, and 39 implicit header authentication. 41 SRTP can achieve high throughput and low packet expansion by using an 42 additive stream cipher for encryption, a universal hashing based 43 function for message authentication, and an 'implicit' index for 44 sequencing based on the RTP sequence number. 46 In addition, SRTP proves to be a suitable protection for heterogeneous 47 environments, i.e. environments including both wired and wireless 48 links. 50 TABLE OF CONTENTS 52 1. Notational Conventions.........................................2 53 2. Goals..........................................................3 54 3. SRTP Overview..................................................4 55 3.1 SRTP Cryptographic Contexts...................................5 56 3.2 Mapping SRTP Packets to Cryptographic Contexts................5 57 3.3 SRTP Packet Processing........................................6 58 3.4 Cryptographic Algorithms......................................7 59 4. Synchronization................................................8 60 4.1. IV Formation for Implicit Header Authentication .............9 61 5. Replay Protection.............................................10 62 6. Encryption....................................................10 63 6.1 Defined Ciphers..............................................11 64 6.1.1. Counter Mode AES..........................................11 65 6.1.2. AES in f8-Mode............................................12 66 6.1.3. NULL Cipher...............................................13 67 7. 
Message Authentication........................................13 68 7.1 Default MAC: UMAC............................................14 69 8. SRTP Parameters...............................................14 70 9. Secure RTCP...................................................15 71 10. Rationale....................................................17 72 10.1 Synchronization.............................................18 73 10.2 Replay Protection...........................................18 74 10.3 Source Origin Authentication................................18 75 10.4. Choice of Encryption Transform.............................19 76 11. Security Considerations......................................20 77 11.1. SSRC collision.............................................21 78 11.2. Confidentiality of the RTP Payload.........................21 79 11.3. Confidentiality of the RTP Header..........................22 80 11.4. Integrity of RTP headers...................................22 81 12. Multicast and Multi-unicast..................................22 82 13. Acknowledgements.............................................23 83 14. Author's Addresses...........................................23 84 15. References...................................................23 85 APPENDIX A: Test Vectors.........................................25 87 1. Notational Conventions 89 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 90 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 91 document are to be interpreted as described in RFC-2119 [B97]. 93 By convention, the most left bit (byte) is the most significant one. 94 By XOR we mean bitwise addition modulo 2 of binary strings, and || 95 denotes concatenation. E.g. if C = A || B, then the most significant 96 bits of C are the same as those of A, and the least significant bits 97 of C equals those of B. 99 2. 
Goals 101 The security goals for SRTP are to ensure: 103 * the privacy of the RTP payload, 105 * the authentication of the entire RTP packet, including protection 106 against replayed RTP packets, and 108 * implicit authentication of the header. 110 Each of the security services described above is optional. Any 111 combination of options can be provided, except the single option of 112 implicit header authentication. 114 Source origin authentication (e.g., digitally signed packets) may be 115 desirable in some situations, but this goal is deferred from 116 consideration in this document. See Section 10.3 for a discussion on 117 this point. 119 Other goals for the protocol are: 121 * a low computational cost, 123 * a low footprint (i.e., small code size and data memory for key 124 schedules and replay lists), 126 * limited packet expansion, 128 * no error propagation (e.g., changing a single bit of an SRTP packet 129 should change no more than one bit of the corresponding RTP packet), 131 * the preservation of RTP header compression efficiency, 133 * the ability for cryptographic keys to be used by multiple RTP sessions 134 simultaneously, 136 * independence from the underlying transport used by RTP. 138 These properties ensure that SRTP is a suitable protection scheme 139 for both wired and wireless scenarios. 141 3. SRTP Overview 143 RTP is the Real Time Transport Protocol [SCFJ96]. We define SRTP as a profile 144 of RTP, in an analogous way to RFC1890 which defines the audio/video 145 profile for RTP. Conceptually, we consider a 'bump in the stack' 146 implementation which resides between the RTP application and the 147 transport layer, which intercepts RTP packets and then forwards an 148 equivalent SRTP packet on the sending side, and which intercepts SRTP 149 packets and passes an equivalent RTP packet up the stack on the 150 receiving side. 
152 0 1 2 3 153 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 154 +-->+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 155 | |V=2|P|X| CC |M| PT | sequence number | 156 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 157 | | timestamp | 158 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 159 | | synchronization source (SSRC) identifier | 160 | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 161 | | contributing source (CSRC) identifiers | 162 | | .... | 163 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 164 | | RTP extension (optional) | 165 | +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 166 | | | | 167 | | | payload | 168 | | | .... | 169 +-+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 170 | | | authentication tag (optional) | 171 | | | 172 | | | .... | 173 | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 174 | | 175 | +- Encrypted Portion 176 +---- Authenticated Portion 178 Figure 1. The format of an SRTP packet. 180 The format of an SRTP packet is illustrated in Figure 1. The optional 181 authentication tag is the only field defined by SRTP that is not in 182 RTP. It provides data origin authentication of the header and 183 payload, and it indirectly provides replay protection by 184 authenticating the sequence number. The Encrypted Portion of an 185 SRTP packet consists of the RTP payload of the equivalent RTP packet. 186 The Authenticated Portion of an SRTP packet consists of the entire 187 equivalent RTP packet. 189 3.1 SRTP Cryptographic Contexts 191 Each SRTP session requires the sender and receiver to maintain 192 cryptographic state information. This information is called the 193 cryptographic context, and it consists of: 195 * an encryption key k_e, and a optionally "salting key" k_s. These 196 keys must be randomly and independently chosen. 
198 * a 32-bit rollover counter r (which records how many times the 199 16-bit RTP sequence number has been reset to zero after passing 200 through 65,535), 202 * an 8-bit FLAG used to signal additional information, 204 * the mode of operation for the encryption scheme, and 206 * the cipher. 208 In addition, when authentication and replay protection are provided: 210 * a message authentication key k_a, 212 * a sequence number s_l (which is the last received and authenticated 213 sequence number for the receiver, and is the last sequence number 214 sent for the sender), and 216 * a replay list L (maintained by the receiver only). 218 3.2 Mapping SRTP Packets to Cryptographic Contexts 220 In this section we define the mapping of RTP and SRTP packets to the 221 cryptographic contexts used to protect them. 223 The RTP synchronization source (SSRC) identifier is used, along with 224 the RTP transport address (e.g., the Destination IP Address and Port 225 Number) by a receiver to identify the proper cryptographic context 226 for each packet. 228 Recall that an RTP session is defined [SCFJ96] by a pair of 229 destination Transport Addresses (one network address plus a port pair 230 for RTP and RTCP), and that a multimedia session is defined as a 231 collection of RTP sessions. For example, a particular multimedia 232 session could include an audio RTP session, a video RTP session, and 233 a text RTP session. 235 An SSRC identifier is unique inside an RTP session, and all packets 236 with the same SSRC form part of the same timing and sequence number 237 space. Thus, the SSRC field and transport address information can be 238 used by an SRTP receiver (or by a bump in the stack implementation on 239 the sender's side) to identify the proper cryptographic context 240 within that session. Note though that, for instance in a multicast 241 scenario, the RTP anti-collision mechanism for SSRCs may force these 242 identifiers to change over time, see discussion in Section 12. 
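
As an illustration of the mapping described in this section, the following Python sketch keeps cryptographic contexts in a dictionary keyed by the SSRC and the destination transport address. The class name, field names, and lookup function are hypothetical and chosen for this example only; SRTP does not define them, and a real implementation may organize this state differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Lookup key: (SSRC, destination IP address, destination port).
ContextKey = Tuple[int, str, int]


@dataclass
class CryptoContext:
    """Hypothetical container for the state listed in Section 3.1."""
    k_e: bytes                      # encryption key
    k_s: bytes = b""                # optional salting key
    roc: int = 0                    # 32-bit rollover counter r
    flag: int = 0                   # 8-bit FLAG (all zeros for RTP)
    k_a: Optional[bytes] = None     # message authentication key, if used
    s_l: int = 0                    # last sequence number
    replay_list: Set[int] = field(default_factory=set)  # receiver only


contexts: Dict[ContextKey, CryptoContext] = {}


def find_context(ssrc: int, dst_ip: str, dst_port: int) -> Optional[CryptoContext]:
    """Return the context for a packet, or None if none is registered."""
    return contexts.get((ssrc, dst_ip, dst_port))
```

A packet for which find_context returns None has no cryptographic context; how such packets are handled is outside the scope of this sketch.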
244 SRTP may allow the different RTP sessions to use identical 245 cryptographic keys. This is possible if the design of the 246 synchronization mechanism (i.e., the IV in the case of the F8 and 247 Counter Modes) avoids keystream re-use (the two-time pad, Section 11) 248 and with uniqueness requirements on SSRC beyond that dictated by the 249 RTP standard, see Section 12. However, different multimedia sessions 250 SHOULD use different keys. 252 The authentication and encryption keys of each context MUST remain 253 fixed for the duration of that context. This ensures that incorrect 254 keys will not be used by the receiver due to a synchronization error. 256 3.3 SRTP Packet Processing 258 When Generic Forward Error Correction is performed as specified in 259 RFC 2733, then the security processing takes place before FEC on the 260 sender's side, and after FEC on the receiver's side. 262 To construct a proper SRTP packet, given an RTP packet, the sender 263 does the following: 265 1. Determine which cryptographic context to use by checking the 266 SSRC field of the RTP packet, and the Transport Address information 267 of that packet (e.g., the Destination IP Address and Port Number). 269 2. Determine the index of the SRTP packet as described in Section 4, 270 using the rollover counter in the cryptographic context and the 271 sequence number in the RTP packet. Form the current initialization 272 vector (IV). If Implicit Header Authentication is provided, this can 273 be done as described in Section 4.1. 275 3. Encrypt the Encrypted Portion of the packet, as described in 276 Section 6, using the IV determined in Step 2 and the encryption key 277 and salting key in the context found in Step 1. 279 4. If authentication is provided, compute the authentication tag for 280 the Authenticated Portion of the packet, as described in Section 7, 281 using the index determined in Step 2 and the authentication key in 282 the context found in Step 1. 
Note that the Encrypted Portion is 283 encrypted before the authentication tag is computed. 285 To authenticate and decrypt a SRTP packet, the receiver does the 286 following: 288 1. Determine which cryptographic context to use by checking the 289 SSRC field of the RTP packet and the transport address information of 290 the underlying transport header (e.g., the Destination IP Address and 291 Port Number). 293 2. Determine the index of the SRTP packet from the rollover counter 294 in the cryptographic context and the sequence number in the RTP 295 packet, as described in Section 4. Form the current IV in the same 296 way as done in Step 2 in the encryption process. 298 3. If authentication is provided, check the Replay List to ensure 299 that no packet with that index has been received and authenticated 300 before, as described in Section 5. If that index is in the list, then 301 the packet has been replayed and is invalid. It MUST be discarded, 302 and the event SHOULD be logged. 304 Compute the authentication tag for the Authenticated Portion of the 305 packet, as described in Section 7, using the index determined in Step 306 2 and the authentication key in the context found in Step 1. Note 307 that the Encrypted Portion is not decrypted before the authentication 308 tag is computed. 310 If the authentication tag that is computed matches that in the SRTP 311 packet, then the packet is accepted and the index is added to the 312 Replay List. Otherwise, the packet is invalid: it MUST be discarded, 313 and the event SHOULD be logged. 315 4. Decrypt the Encrypted Portion of the packet, as described in 316 Section 6, using the IV determined in Step 2 and the encryption key 317 and salting key in the context found in Step 1. 319 The processing occurring when replay protection is activated has been 320 chosen to maximize resistance to denial of service attacks (i.e., to 321 minimize the receiver's effort in processing spurious packets). 
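
The replay check in Step 3 of the receiver processing can be sketched with a sliding-window bitmap. The Python below is a minimal illustration only: the class and method names are hypothetical, and the window is measured back from the highest authenticated index, which is an assumption of this sketch and not a statement of the normative rule.

```python
class ReplayWindow:
    """Sliding-window replay list over packet indices (hypothetical helper)."""

    def __init__(self, size: int = 64):
        if size < 64:
            raise ValueError("window size must be at least 64")
        self.size = size
        self.highest = -1   # highest authenticated index so far
        self.bitmap = 0     # bit k set => index (highest - k) was accepted

    def is_replay(self, index: int) -> bool:
        """True if the index was accepted before or is older than the window."""
        if index > self.highest:
            return False
        offset = self.highest - index
        if offset >= self.size:
            return True     # too old: assumed to have been received
        return bool((self.bitmap >> offset) & 1)

    def mark_authenticated(self, index: int) -> None:
        """Record an index; call only after the authentication tag verified."""
        if index > self.highest:
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
        else:
            offset = self.highest - index
            if offset < self.size:
                self.bitmap |= 1 << offset
```

The list is updated only after the tag has verified, so a forged packet cannot advance the window.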
323 3.4 Cryptographic Algorithms 325 Default encryption and authentication algorithms are specified in 326 Sections 6.1 and 7.1. While there are numerous encryption and message 327 authentication algorithms that can be used in SRTP, we define default 328 algorithms in order to avoid the complexity of specifying the 329 encodings for the signaling of algorithm and parameter identifiers. 331 4. Synchronization 333 SRTP implementations use an 'implicit' packet index for sequencing. 334 Receiver-side implementations use the RTP sequence number to 335 reconstruct the correct index (that is, location in the sequence of 336 all RTP packets). The index is defined as s + r * 65,536, where the 337 sequence number is s and the rollover counter is r. 339 A robust approach for the proper use of a rollover counter requires 340 that its handling and use be well defined. In particular, out-of- 341 order RTP packets with sequence numbers close to 65,536 or zero must 342 be properly dealt with. 344 A receiver reconstructs the index i of a packet with sequence number 345 s using the estimate 347 i = 65,536 * t + s, 349 where t is chosen from the set { r-1, r, r+1 } such that i is closest 350 to the value 65,536 * r + s_l. If the value r+1 is used, then the 351 rollover counter r in the cryptographic context is incremented by 352 one. 354 The pseudocode for the algorithm to process a packet with sequence 355 number s follows: 357 if (s_l < 32,768) 358 if (s - s_l > 32,768) 359 set i to s + 65,536 * (r-1) 360 else 361 set i to s + 65,536 * r 362 endif 363 else 364 if (s_l - 32,768 > s) 365 set r to r + 1 366 endif 367 set i to s + r * 65,536 368 endif 369 set s_l to s 371 The index i is used in replay protection (Section 5) when 372 authentication is provided, in encryption (Section 6), and in message 373 authentication (Section 7). 375 This algorithm should be extended by using the information in the 376 authenticated RTCP reports. 
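
For readers who prefer runnable code, the pseudocode above can be transliterated into Python as follows. This is a literal rendering and not a hardened implementation: the function name and the return convention (a tuple of the index, the updated rollover counter, and the updated s_l) are choices made for this sketch, and no checks are added beyond those in the pseudocode (for example, the case r = 0 with a late packet is not treated specially).

```python
SEQ_MOD = 65536   # 2^16, size of the RTP sequence number space
HALF = 32768      # 2^15, half of the sequence number space


def estimate_index(s: int, s_l: int, r: int):
    """Process sequence number s and return (i, r, s_l) after the update.

    s   -- 16-bit sequence number of the current packet
    s_l -- last sequence number stored in the cryptographic context
    r   -- rollover counter stored in the cryptographic context
    """
    if s_l < HALF:
        if s - s_l > HALF:
            # Late packet that was sent before the most recent wrap.
            i = s + SEQ_MOD * (r - 1)
        else:
            i = s + SEQ_MOD * r
    else:
        if s_l - HALF > s:
            # The sequence number has wrapped past 65,535.
            r = r + 1
        i = s + r * SEQ_MOD
    return i, r, s
```

For example, with s_l = 65,530 and r = 0, a packet with s = 5 is taken to follow a wrap, so the rollover counter becomes 1 and the index is 5 + 65,536 = 65,541.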
378 When RTP authentication is not present, robust synchronization is not 379 possible. In this case, transmission errors or an active attacker may 380 force the receiver to erroneously update its rollover counter and 381 thus to become completely out of sync. It is not possible to protect 382 against active attackers in such a case, but it is possible to have an 383 update policy for the rollover counter which, except in rare cases, 384 is robust with respect to random bit errors. 386 As the rollover counter is 32 bits long and the RTP sequence number is 16 bits long, the maximum number of 387 packets in any given SRTP session is 2^48 = 281,474,976,710,656. 388 After that number of SRTP packets have been sent, the sender MUST 389 NOT send any more packets with that cryptographic context. This 390 limitation provides a security benefit by placing an upper bound on 391 the amount of traffic that can pass before cryptographic keys are 392 changed. 394 Other approaches to sequencing were considered and rejected; please 395 see Section 10.1 for our rationale. 397 4.1. IV Formation for Implicit Header Authentication 399 There may be several alternatives for the Initialization Vector (IV) 400 formation. To guarantee synchronization and avoid keystream re-use, 401 we only require the SSRC, rollover counter and sequence number, or 402 some function thereof (possibly combined with re-keying mechanisms), 403 to be part of the IV. Below, we give a concrete proposal which also 404 provides 'implicit' header authentication, and works with every 405 cipher having at least 128-bit block size. This particular solution 406 also gives a high degree of agreement between bit ordering in the RTP 407 packet header and the IV, simplifying data copying. 409 When implicit header authentication is provided, data from each RTP 410 packet to be encrypted and transmitted must be included in the IV. 411 This IV shall be computed and supplied as input to the ciphering 412 algorithm. 
This shall be done by taking information from said RTP 413 packet, the FLAG, and the rollover counter value, and computing the 414 128-bit IV: 416 IV = ROC || FLAG || M || PT || SEQ || TS || SSRC 418 where TS (Timestamp, 32 bits), SEQ (Sequence Number, 16 bits), M 419 (Marker Bit, 1 bit), PT (Payload Type, 7 bits), and SSRC 420 (Synchronization Source, 32 bits) are taken from the current RTP 421 header. ROC is the 32-bit rollover counter from the identified 422 context. FLAG is an 8-bit value which is used to signal additional 423 information. Currently, the only value defined (for RTP) is FLAG = 424 00..0. The value 00..01 is reserved for RTCP and MUST NOT be used 425 with RTP. 427 With this IV formation, the number of SRTP packets encrypted with any 428 fixed encryption key MUST therefore be no more than 2^48. Otherwise, 429 the size of the ROC ..||..SEQ .. field will not be large enough to 430 avoid keystream reuse. 432 5. Replay Protection 434 A packet is 'replayed' when it is stored by an adversary, and then 435 re-injected into the network. SRTP provides protection against such 436 attacks whenever authentication is provided, through the storage of 437 the indices of the most recently received and authenticated packets. 439 Each SRTP receiver maintains a Replay List, which conceptually 440 contains the indices of all of the packets which have been received 441 and authenticated. In practice, the list can use a 'sliding window' 442 approach, so that a fixed amount of storage suffices for replay 443 protection. SRTP packet indices which are less than s_l * 65,536 - 444 SRTP-WINDOW-SIZE MAY be assumed to have been received, where 445 SRTP-WINDOW-SIZE is a parameter that MUST be at least 64, and which MAY 446 be set to a higher value. 448 The Replay List can be efficiently implemented by using a bitmap to 449 represent which packets have been received, as described in the 450 Security Architecture for IP [KA98a]. 452 6. 
Encryption 454 Encryption uses a 'seekable' additive stream cipher, following the 455 Stream Cipher ESP [sc-esp]. The stream ciphers that can be used must 456 be able to efficiently seek to arbitrary locations in their 457 keystream. Ciphers that can do this include SEAL [RC94, RC98], 458 LEVIATHAN [MF00b], and any block cipher run in suitable mode. In 459 particular, AES in counter mode will provide good security, 460 reasonable performance, and conform to emerging U.S. Federal 461 standards. Another mode which fulfils the requirements is f8 mode 462 [ES3D], used together with AES. 464 SRTP encryption consists of generating a keystream segment 465 corresponding to the index of the packet, and then bitwise exclusive- 466 oring that keystream segment into the RTP packet, starting at the 467 first bit of the RTP payload. Decryption is then done the same way, 468 but swapping the roles of the plaintext and ciphertext. The 469 definition of how the keystream is generated, given the index, 470 depends on the cipher and its mode of operation. 472 Such a cipher shows features which are desired in a general scenario, 473 e.g. low computational cost, and speed. It also shows properties 474 which fulfil additional requirements posed by the cellular 475 environment [BCNN00], i.e. preservation of RTP header compression 476 efficiency, and absence of error propagation and message expansion. 478 Hence, we conclude that the proposed profile can be applied to the 479 most general heterogenous environment. 481 6.1 Defined Ciphers 483 The default cipher is the Advanced Encryption Standard (AES), and we 484 define two modes of running AES, Counter Mode AES and AES in f8-Mode. 485 Both of these modes provide implicit header authentication through 486 the use of the IV formation described in Section 4.1. The NULL cipher 487 is also defined, to be used when encryption is not required. 489 6.1.1. 
Counter Mode AES 491 The default cipher SHALL be AES used in the Segmented Integer Counter 492 Mode (SICM) [M00], with a 128-bit key size and a 128-bit block size. 494 Conceptually, counter mode consists of encrypting successive 495 integers. The actual definition is somewhat more complicated, in 496 order to avoid 128-bit integer arithmetic and to randomize the 497 starting point of the integer sequence. Each packet is encrypted with 498 a distinct keystream segment, which is computed as follows. 500 The 128-bit block is divided into three parts: a 64-bit segment 501 prefix, a 32-bit block index, which is incremented to generate a 502 keystream segment, and a 32-bit segment suffix. The segment 503 prefix/suffix pair is unique for each keystream segment. 505 A keystream segment is the concatenation of the output blocks of the 506 cipher in encrypt mode, in which the block indices are in increasing 507 order. Symbolically, each keystream segment looks like 509 E(A || B || C) || E(A || B + 1 mod 2^32 || C) || E(A || B + 2 mod 510 2^32 || C) .. 512 where A, B, and C are segment prefix, block index, and segment 513 suffix, respectively, determined as given below. 515 The offsets are computed from the salting key k_s and the IV (from 516 Section 4.1) by exclusive-oring k_s and the IV, and setting A to the 517 first 64 bits of the result, B to the following 32 bits, and C to the 518 remaining 32 bits of the result. Symbolically, 520 A || B || C = IV XOR k_s. 522 If k_s is less than 128 bits long, then k_s is concatenated with 523 itself as many times as needed in order to form the salt which is 524 added to the IV. If no salting key is used, this is interpreted as 525 k_s = 0. 527 Note that the segment prefix/suffix pair is distinct for each packet 528 which is encrypted, thus ensuring that keystream segments are 529 distinct and non-overlapping. 
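
The following Python sketch illustrates how the cipher input blocks for one keystream segment might be derived from the IV of Section 4.1 and the salting key. All function names are hypothetical. Truncating a repeated k_s whose length does not divide 128 bits is an assumption of this sketch, as is the big-endian packing of the fields. The AES encryption of each block under k_e is deliberately left out, since it would require a cipher library.

```python
import struct


def form_iv(roc: int, flag: int, m: int, pt: int,
            seq: int, ts: int, ssrc: int) -> bytes:
    """IV = ROC || FLAG || M || PT || SEQ || TS || SSRC (128 bits)."""
    m_pt = ((m & 1) << 7) | (pt & 0x7F)   # marker bit followed by 7-bit PT
    return struct.pack(">IBBHII", roc, flag, m_pt, seq, ts, ssrc)


def expand_salt(k_s: bytes) -> bytes:
    """Repeat k_s up to 128 bits; an empty key is read as k_s = 0."""
    if not k_s:
        return bytes(16)
    # Assumption: the repeated key is truncated to exactly 16 bytes.
    return (k_s * (16 // len(k_s) + 1))[:16]


def counter_blocks(iv: bytes, k_s: bytes, n_blocks: int):
    """Yield the cipher inputs A || (B + j mod 2^32) || C for j = 0, 1, .."""
    x = bytes(a ^ b for a, b in zip(iv, expand_salt(k_s)))
    a = x[:8]                                # 64-bit segment prefix
    b = int.from_bytes(x[8:12], "big")       # 32-bit block index
    c = x[12:]                               # 32-bit segment suffix
    for j in range(n_blocks):
        yield a + ((b + j) % 2**32).to_bytes(4, "big") + c
```

Encrypting each yielded block with AES under k_e and concatenating the outputs would give the keystream segment, which is then XORed into the RTP payload.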
531 The restriction on the maximunm number of RTP packets above ensures 532 the security of the encryption method by limiting the effectiveness 533 of probabilistic attacks [BR98]. 535 The AES has a block size of 128 bits, so 2^32 output blocks are 536 sufficient to generate the 2^7 * 2^32 = 549755813888 bits of 537 keystream needed to encrypt the largest possible RTP packet. 539 6.1.2. AES in f8-Mode 541 To encrypt UMTS (Universal Mobile Telecommunications System, as 3G 542 networks) data, a solution (see [ES3D]) known as the f8-algorithm has 543 been developed. On a high level, the proposed scheme is a variant of 544 Output Feedback Mode (OFB) [HAC], with a more elaborate 545 initialization and feedback function. As in normal OFB, the core 546 consists of a block cipher. We define the use of AES as default block 547 cipher to be used in f8-Mode for RTP encryption, with 128-bit key and 548 block size. 550 Figure 2 shows the structure of an arbitrary b-bit block size cipher, 551 E, running in what we shall call "f8-mode of operation". 553 | 554 | 555 \|/ 556 +------+ 557 | | 558 --->| E | 559 | | | 560 | +------+ 561 | | 562 m --> * |--------------------------- ... -------| 563 _____ | IV' | | | | 564 | | ct=1 --> * ct=2 --> * ... ct=L-1 --> * 565 | | | | | 566 | | --> * --> * ... --> * 567 | \|/ | \|/ | \|/ | \|/ 568 | +------+ | +------+ | +------+ | +------+ 569 | | | | | | | | | | | | 570 k -------->| E | | | E | | | E | | | E | 571 | | | | | | | | | | | 572 +------+ | +------+ | +------+ | +------+ 573 | | | | | | | 574 |------ |-------- | ... ---- | 575 | | | | 576 \|/ \|/ \|/ \|/ 578 S(0) S(1) S(2) . . . S(L-1) 580 Figure 2. f8-mode of operation (asterisk, *, denotes bitwise XOR). 582 Let E(k,B) be the 128-bit output of E in encrypt mode when applied to 583 the 128-bit key k and 128-bit plaintext block B. Let ct, IV, IV', 584 S(j), and m denote 128-bit blocks, determined below. 
The S() keystream for an n-bit message is defined by setting IV' =
E(k XOR m, IV), and ct = S(-1) = 00..0. For j = 0,1,.., L-1 where L =
n/128 (rounded up to nearest integer) compute

   S(j) = E(k, IV' XOR ct XOR S(j-1)),     (Eq. 1)
   ct = ct + 1 mod 2^128                   (Eq. 2)

Notice that the IV (as defined in Section 4.1) is not used directly.
Instead it is fed through E under another key to produce an internal,
"salted" value (denoted IV') to prevent an attacker from gaining
known input/output pairs, and the role of the internal counter is to
prevent short keystream cycles. The value of the key mask m is
defined to be

   m = k_s || 0x555..5,

i.e. the salting key, padded with the binary pattern 0101.. to
fill the 128-bit key size. (If no salting key is used, m = 0x55..5.)

The maximum allowable packet size can be determined as follows.
The AES has a block size of 128 bits. Assuming that AES behaves like
a random function, it is (heuristically) secure to generate about
2^64 output blocks, which is sufficient to generate 2^71 bits of
keystream. In practice though, the counter ct above will often be
sufficient if implemented as a 16- or 32-bit counter. In fact, for
some security margin, other methods SHOULD be used if packets of size
exceeding 2^32 * 128 = 549755813888 bits are to be encrypted.

6.1.3. NULL Cipher

The NULL cipher is used when no confidentiality is requested. It
simply copies the plaintext input into the ciphertext output.

7. Message Authentication

Message integrity and authentication (hereafter referred to as just
"authentication") are optional functions provided by SRTP.
Authentication can be provided by any message authentication code,
though the default value is UMAC [KBHHKR00].

The authentication tag is computed by applying the UMAC function to
the Authenticated Portion of the SRTP packet.
The authentication tag is appended to the RTP packet. This expansion
of the RTP packet may cause the packet size to exceed the Maximum
Transmission Unit (MTU) of a network interface on its path,
especially in circumstances when the application is attempting to
'optimize' the size of packets. MTU path discovery SHOULD be used to
avoid this problem.

Authentication SHOULD be provided by SRTP. Authentication is optional
because, while the function is typically highly desired, there are
certain cases (notably in the cellular environment) where it has an
impact in terms of cost, as motivated in [BCNN00]. In those cases, it
is up to the user security profile to request authentication.

7.1 Default MAC: UMAC

The default message authentication code is UMAC [KBHHKR00], which
has proven security properties and is quite fast. Furthermore, it
can be used with short (e.g., two or four byte) authentication tags,
as well as larger tags.

UMAC is a parameterized algorithm (see Section 2.1 of [KBHHKR00]).
The default selection of UMAC parameters for SRTP is:

   WORD-LEN              2
   UMAC-OUTPUT-LEN       4
   L1-KEY-LEN            128
   UMAC-KEY-LEN          16
   ENDIAN-FAVORITE       BIG
   L1-OPERATIONS-SIGN    SIGNED

This choice of parameters is intended to work well on low-power
processors, to minimize packet expansion, and to minimize the size of
the cryptographic context. The WORD-LEN of two will work well on
16-bit and higher processors. The packet expansion is determined by
the UMAC-OUTPUT-LEN to be only four bytes. The storage requirement,
per cryptographic context, is 144 bytes. These parameters ensure a
forgery probability of no greater than 1/2^30 for each individual
packet. Please see the security considerations section in [KBHHKR00]
and the references therein for a more detailed discussion.

8. SRTP Parameters

The SRTP-WINDOW-SIZE is defined to be at least 64 (Section 5).

The currently defined modes are Counter Mode (default), f8 Mode
(Section 6), and the NULL Cipher. The default cipher is AES (Section
6), used with a block size and encryption key size of 128 bits.

9. Secure RTCP

Secure RTCP follows the definition of Secure RTP, but defines the
index and IV differently. In order to differentiate these quantities,
we refer to them as the SRTCP index and IV.

SRTCP is defined as a profile of RTCP, and it adds two new fields
to the RTCP packet definition, the SRTCP index and the authentication
tag. Those fields are appended to an RTCP packet in order to form an
equivalent SRTCP packet, so that they follow any other
profile-specific extensions. An SRTCP packet is illustrated in
Figure 3.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-->+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   |V=2|P|    RC   |   PT=SR=200   |             length            |
|   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   |                         SSRC of sender                        |
| +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| | |                              ...                              |
| | |                          sender info                          |
| | |                              ...                              |
| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | |                              ...                              |
| | |                         report block 1                        |
| | |                              ...                              |
| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | |                              ...                              |
| | |                         report block 2                        |
| | |                              ...                              |
| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | |                                                               |
| | |                              ...                              |
| | |                                                               |
| | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| | |                              ...                              |
| | |                  profile-specific extensions                  |
| | |                              ...                              |
| +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| | |                          SRTCP index                          |
+-|>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | |                              ...                              |
| | |                       authentication tag                      |
| | |                              ...                              |
| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| +-- Encrypted Portion
+---- Authenticated Portion

Figure 3. The format of a Secure RTCP packet, after Section 6.3.1 of
[SCFJ96]. In this case, the underlying RTCP packet is a sender report
packet; the SRTCP format is identical for other RTCP packet types.

The SRTCP index is a 32-bit value. As we allow both encrypted and
non-encrypted packets belonging to the same flow (see discussion
below), indices with their most significant bit set to '1' are
reserved for encrypted packets, and indices with most significant bit
set to '0' are used for non-encrypted packets. With this restriction,
the remaining bits are set to zero before the first SRTCP packet is
sent, and are incremented by one after each SRTCP packet is sent.
Except for differences in the most significant bit, SRTCP indices
form a strictly increasing sequence. The index is explicitly included
in each packet, in contrast to the 'implicit' index approach used for
SRTP.

SRTCP packet processing is identical to that of SRTP packet
processing, with the following changes:

* SRTCP replay protection is as defined in Section 5, but using the
SRTCP index as the index i.

* SRTCP encryption is as defined in Section 6, but using the
definition of the SRTCP Encrypted Portion as defined in this
section, using the SRTCP index as the index i, and the IV as defined
in this section.

* The SRTCP authentication tag is defined as in Section 7, but
applying the UMAC function to the Authenticated Portion of the SRTCP
packet as defined in this section, and using the SRTCP index as the
index i.

* SRTCP decryption is performed as in Section 6, but only if the
SRTCP index has its most significant bit equal to 1.
If so, the
encrypted portion is decrypted, using the SRTCP index as the index i,
and the IV as defined in this section. In case the most significant
bit of the index is 0, the payload is simply copied.

The IV for ciphers using 128-bit block size is formed in the
following way:

   IV = SRTCP index || FLAG || PT || 0..0 || SSRC

where PT (Payload Type, 8 bits) and SSRC (Synchronization Source, 32
bits) are taken from the first header in the RTCP compound packet.
SRTCP index is the 32-bit index added to the packet. A pad of 48
zeros is inserted between the PT and the SSRC.

FLAG is an 8-bit value which is used to signal additional information.
Currently, the only value defined (for RTCP) is FLAG = 00..01. The
value 0..0 is reserved for RTP and MUST NOT be used for RTCP. This
allows the same key to be used for related RTP and RTCP flows, since
the IV remains unique.

This IV is then treated in the same way as defined in Section 6,
according to the chosen encryption mode.

The encryption prefix (Section 6.1 of [SCFJ96]), which is a random
32-bit quantity intended to improve privacy, SHOULD NOT be used. This
is because SRTP encryption uses an additive stream cipher, and thus
the prefix offers no benefit.

The maximum number of SRTCP packets is limited to 2^31 =
2,147,483,648. The last RTCP packet MUST contain an RTCP BYE. SRTCP
senders MUST send an RTCP BYE in the final packet, if the maximum
number of SRTCP packets is reached. Similarly, SRTCP receivers MUST
act as though the last RTCP packet included a BYE, even if no BYE was
included in the packet, if the maximum number of SRTCP packets is
reached.

Authentication MUST be used for RTCP, since it is the control
protocol (e.g., it has a BYE packet). Moreover, the cost of RTCP
authentication is not of the same order as that of RTP
authentication, since the session bandwidth allocated to RTCP is
recommended to be 5%.
However,
when adding authentication to RTCP, the overhead in bandwidth SHOULD
be considered (it will be more than 5%).

A compound RTCP packet may be split into two lower-layer
packets, one to be encrypted and one to be sent in the clear, as
described in Section 9.1 of [SCFJ96].

Encryption/non-encryption is signaled by the most significant bit of
the SRTCP index as described above.

10. Rationale

SRTP achieves high throughput and low packet expansion by using fast
stream ciphers for encryption, an implicit index for synchronization,
and universal hash functions for message authentication. SRTP is
suitable for the most general scenario and also fits the most
demanding one, conversational multimedia over wireless, since it has
the necessary robustness properties.

Only a single header extension may be appended to the RTP data
header, so the use of a header extension for SRTP was avoided. SRTP
and SRTCP are defined as profiles of RTP and RTCP, respectively.

10.1 Synchronization

RTP runs over unreliable transport. Thus, maintaining synchronization
of the cryptographic context between the sender and receiver is a
conspicuous challenge. Because of the requirement to minimize packet
expansion, no explicit sequencing information should be added. RTP
packets contain two fields for synchronization purposes, the
timestamp and the sequence number. The timestamp field could be used
for cryptographic synchronization in some circumstances. However,
this field is not appropriate for such use. From [SCFJ96]:

   Several consecutive RTP packets may have equal timestamps if they are
   (logically) generated at once, e.g., belong to the same video frame.
   Consecutive RTP packets may contain timestamps that are not monotonic
   if the data is not transmitted in the order it was sampled, as in the
   case of MPEG interpolated video frames.
The RTP sequence number might be directly used as a unique identifier
for SRTP packets. However, it has only sixteen bits, which would
limit the duration of an SRTP security association to only 65,536
packets, thus requiring periodic rekeying.

The 'implicit index' approach works as long as packet reordering and
loss are not too great. In particular, 32,768 packets would
need to be lost, or a packet would need to be 32,768 packets out of
sequence in order for synchronization to be lost. Such drastic loss
or reordering is likely to disrupt the RTP application itself.

When a participant joins an SRTP session while that session is in
progress, the entire cryptographic context except for the replay
list is sent to that participant. This step is essential for
security. See also Section 12.

10.2 Replay Protection

Replay protection is undoubtedly important for multimedia data, and
SHOULD be provided. Otherwise, it would be possible for an adversary
to perform simple manipulations on data that subvert security. For
example, in a voice application, the phrase "yes" could be
substituted for "no" if replay protection were not present. However,
there are certain scenarios, e.g. conversational multimedia, where it
may be difficult to perform such attacks. Moreover, to be
useful, replay protection needs to be based on an authentication
mechanism (i.e., authentication of the sequence number of the RTP
header), and this has a cost when cellular links are involved on the
path.

10.3 Source Origin Authentication

'Source origin authentication' was listed as an option in the
security goals, not because it is not an appropriate goal, but
because it may not be achievable.
This goal may be desirable in some
circumstances, such as multicast environments in which the sender
is more trusted than the receivers, or when translators or mixers
(Section 2.3 of [SCFJ96]) are used. However, it is not clear that
this capability can always be provided, as mixers and translators can
change the payload. Furthermore, this security service essentially
requires digital signatures (at least if collusion resistance is
required [BF00]).

Two examples of the multicast scenario mentioned above are a
military commander addressing his troops over RTP, and financial
market data sent over RTP. In these situations, a 'stream signing'
method can provide digital signatures on the entire RTP packets. An
extensive literature on such methods is developing, and it is
reasonable to expect that one of these methods can be reduced to
practice and specified for RTP. This suggests that it should be left
as an option in the current specification. A future effort can define
a stream signing method as an authentication type for RTP, which
could be used as a replacement for a message integrity transform.

Examples of the mixer and translator scenarios include a translator
re-encoding data at a lower rate or in a different encoding, and a
mixer combining the audio streams of multiple speakers in a
teleconference. In these cases, it is not clear that meaningful
source origin authentication is possible, as the data that is
received is not the same as the data that is signed. If the
translator is trusted by the receivers, then it could sign or re-sign
the data streams, but this scenario may not be prevalent.
It may be
possible to devise a signing scheme that authenticates the source but
not the content (enabling the receivers to know that "John is one of
the people talking", but not providing authentication on who said
what) by signing the concatenation of the Contributing source (CSRC)
field and some sequencing information (e.g., a timestamp or sequence
number), but such schemes require synchronization between the
senders. This synchronization is not required by the RTP protocol
itself, and may be difficult or impossible to arrange.

10.4 Choice of Encryption Transform

When adopting a block cipher mode to produce keystreams, the central
ingredient is the block cipher which is its core. As far as modern
cryptology knows, the security basically stands (and falls) with the
security of the block cipher. This means that if a weakness is found,
replacing the block cipher with a new one will most likely remedy the
security problems. We define AES (Rijndael) [AES] as the default
block cipher, as it is widely believed to be secure.

11. Security Considerations

The security of UMAC is well understood, and is described in
[KBHHKR00].

Additive ciphers do not provide any security service other than
privacy. In particular, they do not provide message authentication
(see [RK99] or [S96] for a discussion of this security service).
However, SRTP uses a message authentication code to provide that
security service.

By using 'seekable' stream ciphers, SRTP avoids the denial of service
attacks that are possible on stream ciphers that lack this property
(these attacks are described in Section 3.4 of [B96]).

No bit of keystream in an additive stream cipher should ever be used
to encrypt multiple distinct plaintext bits. Such keystream reuse
(jokingly called a 'two-time pad' system by cryptographers) can
seriously compromise security.
The NSA's VENONA project [C99]
provides a historical example of such a compromise. In SRTP, a
'two-time pad' is avoided by requiring the key or the IV to be unique.

An SSRC is mapped to a unique crypto context. Multiple crypto
contexts may contain identical keys; in this case, each context
together with data from the RTP header MUST produce a unique IV
(which is typically assured by plugging the unique SSRC in the IV).

If manual keying is used, two different cryptographic contexts might
accidentally use the same encryption key with non-negligible
probability, through manual error or procedural inadequacies. Thus,
manual keying SHOULD NOT be used for SRTP (or SRTCP).

An additive stream cipher is vulnerable to attacks that use
statistical knowledge about the plaintext source to enable key
collision and time-memory tradeoff attacks [MF00,H80,Bi96]. These
attacks take advantage of commonalities among plaintexts, and provide
a way for a cryptanalyst to amortize the computational effort of
decryption over many keys, thus reducing the effective key size of
the cipher. A detailed analysis of these attacks and their
applicability to the encryption of Internet traffic is provided in
[MF00]. In summary, the effective key size of SRTP when used in a
security system in which m distinct keys are used, is equal to the
key size of the cipher less the logarithm (base two) of m. Protection
against such attacks can be provided simply by increasing the size of
the keys used, which here can be accomplished by the use of the
"salting key".

In order to provide an effective key size of n bits in a deployment
in which 2^m SRTP/SRTCP cryptographic contexts will be created, the
true key size will need to be n+m bits.
The value of m SHOULD be 32
bits for networks with 50,000 connections (fully meshed networks
with up to 200 devices), and SHOULD be 64 bits for networks with
49e+12 connections (fully meshed networks with up to 7,000,000
devices). These choices of m ensure that key collision attacks
amortized over a ten year period offer no advantage over exhaustive
search, when new SRTP keys are established for every connection
every hour (note that such an attack requires the storage of all
network traffic over the ten year period). These choices will suffice
for many networks, though SRTP deployments with more stringent
security requirements will need to make a detailed assessment of
those requirements with respect to the attacks described in [MF00].

Implementations SHOULD use keys that are as large as possible. Please
note that in many cases increasing the key size of a cipher does not
affect the throughput of that cipher.

It is an important point that the m bits of 'extra' key provided to
thwart these attacks need not be private. In jurisdictions with
mandated limits on the length of a secret key, the additional key
bits could be made public. This is because those bits are
functionally equivalent to the 'salt' that is used to protect
passwords from dictionary attacks. The fact that the 'extra' key bits
are distinct for many different keys defeats the key collision and
time-memory tradeoff attacks by reducing the number of keys over
which cryptanalytic computation can be amortized.

Note that other security protocols which use additive ciphers for the
encryption of Internet traffic (e.g., SSL, TLS, SSH, IPSEC) are also
vulnerable to the attacks described in [MF00]. Those attacks are
generic to additive encryption of redundant plaintext, and are not
particular to SRTP.

11.1 SSRC collision

Assume that two or more communication parties use the same key.
Though RTP implements an SSRC collision detection mechanism, it is
impossible to guarantee that two parties do not accidentally choose
the same SSRC and send a few packets before the collision is
detected. In a very unfortunate case, the IV formation in Section 4.1
could in fact make the keystreams collide, resulting in a 'two-time
pad'. This is probably a bigger problem in the case of group
communication when a single group key is desired. See also some
administrative issues with SSRC collisions in Section 12.

11.2. Confidentiality of the RTP Payload

It is important to be aware that, as with any stream cipher, the
exact length of the payload is revealed by the encryption. This means
that it may be possible to deduce certain "formatting bits" of the
payload, as the length of the CODEC output might vary due to certain
parameter settings etc. This, in turn, implies that the corresponding
bit of the keystream can be deduced. However, if the stream cipher is
secure, knowledge of a few bits of the keystream will not aid an
attacker in predicting the following keystream bits. Thus, the
payload length (and information deducible from this) will leak, but
nothing else.

11.3. Confidentiality of the RTP Header

With our proposal, RTP headers are sent in the clear to allow for
header compression. This means that data such as payload type,
synchronization source identifier, and timestamp are available to an
eavesdropper. Moreover, since RTP allows for future extensions of
headers, we cannot foresee what kind of possibly sensitive
information might also be "leaked".

Our proposal is a low-cost method, which allows header compression to
reduce bandwidth. It is up to the endpoints' policies to decide about
the security scheme to employ. If the header compression is omitted,
other solutions might be applicable, e.g. [sc-esp].
In other words,
we provide a solution that works in the most general scenario, even
in the most demanding one (like conversational multimedia over
low-bandwidth, unreliable media). Of course the solution will then
also work in less restricted environments, but we suggest that if one
really needs to protect headers, and is allowed to do so by the
surrounding environment, then one should also look at alternatives.
In addition, we strongly recommend the use of profiles to select the
right trade-off for the required level of security.

11.4 Integrity of RTP headers

The IV formation in Section 4.1, which depends on the RTP header,
provides an 'implicit' authentication of that header, which is useful
when the authentication option is not present. This is because any
attacks which modify the header of such a packet will cause the SRTP
receiver to use an incorrect IV in the decryption step, with the
result that the decrypted RTP payload will be essentially random.

12. Multicast and Multi-unicast

The scheme described here can be used when a single, unique key (a
single pair consisting of an encryption group key and an
authentication group key) is to be used inside a multimedia session,
for low-complexity key management. However, it then becomes necessary
to have a way to assure that each SSRC is unique inside that
multimedia session. This is a light and feasible solution in several
scenarios, e.g. one sender only, streaming, and unicast.

In multicast and multi-unicast, to use the same group key for the
multimedia session, there should be a way to guarantee uniqueness of
the SSRC before sending starts. Otherwise, the triggering of the
anti-collision mechanism will ask for a change in the SSRCs of the
parties that happened to have the same SSRC, making it difficult to
locate the right context.
The remaining problem is how to address the context database after
the anti-collision algorithm has changed the SSRCs. Section 3.3
defines the use of SSRC and Transport Address of that packet as
selectors to the database. In case of UDP, the unchanged transport
address can be a good indicator that a collision, followed by
anti-collision triggering, has happened. So, simply try decryptions
until an RTCP message confirms the change in the SSRC on that
transport address and then update the database selector triplet.

If the requirement of unique SSRC inside that multimedia session
cannot be guaranteed (e.g., for large groups), then a unique key per
sender should be used. The additional requirement is to have SSRC
unique per sender, which appears to be feasible enough. However, the
same consideration on the anti-collision algorithm triggering
applies.

13. Acknowledgements

The authors would like to thank Brian Weis and Magnus Westerlund for
their reviews and comments.

14. Authors' Addresses

Questions and comments about this memo can be directed to:

   David A. McGrew
   David Oran
   Cisco Systems, Inc.
   San Jose, CA 95134-1706 USA
   mcgrew@cisco.com, oran@cisco.com

   Rolf Blom
   Elisabetta Carrara
   Mats Naslund
   Karl Norrman
   Ericsson Research
   {rolf.blom, elisabetta.carrara, mats.naslund,
   karl.norrman}@era.ericsson.se

15. References

[AES] NIST, "Advanced Encryption Standard (AES)",
http://csrc.nist.gov/encryption/aes/

[B97] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997.

[BCNN00] Blom, R., Carrara, E., Naslund, M., and Norrman, K.,
"Conversational Multimedia Security in 3G Networks", Internet Draft,
November 2000, .

[BF00] Boneh, D., and Franklin, M., "Message Authentication in a
Multicast Environment", the Proceedings of the Seventh Annual
Workshop on Selected Areas in Cryptography (SAC 2000),
Springer-Verlag.

[C99] Crowell, W. P., "Introduction to the VENONA Project",
http://www.nsa.gov:8080/docs/venona/index.html.

[ES3D] ETSI SAGE 3GPP Standard Algorithms Task Force, "Security
Algorithms Group of Experts (SAGE); General Report on the Design,
Specification and Evaluation of 3GPP Standard Confidentiality and
Integrity Algorithms", Public report, Draft Version 1.0, Dec 1999.

[ES3E] ETSI SAGE 3GPP Standard Algorithms Task Force, "Security
Algorithms Group of Experts (SAGE) Report on the Evaluation of 3GPP
Standard Confidentiality and Integrity Algorithms", Public report,
Draft Version 1.0, Dec 1999.

[HAC] Menezes, A., Van Oorschot, P., and Vanstone, S., "Handbook of
Applied Cryptography", CRC Press, 1997, ISBN 0-8493-8523-7.

[H80] Hellman, M. E., "A cryptanalytic time-memory trade-off", IEEE
Transactions on Information Theory, July 1980, pp. 401-406.

[KA98a] Kent, S., and R. Atkinson, "Security Architecture for IP",
RFC 2401, November 1998.

[KBHHKR00] Krovetz, T., Black, J., Halevi, S., Hevia, A., Krawczyk,
H., Rogaway, P., "UMAC: Message Authentication Code using Universal
Hashing", Internet Draft, October 2000, .

[LRW00] Lipmaa, H., Rogaway, P., and Wagner, D., "Comments to NIST
Concerning AES Modes of Operation: CTR-Mode Encryption", NIST
Workshop on AES Modes of Operation,
http://csrc.nist.gov/encryption/aes/modes/lipmaa-ctr.pdf

[M00] McGrew, D., "Segmented Integer Counter Mode: Specification
and Rationale", NIST Workshop on AES Modes of Operation,
http://www.mindspring.com/~dmcgrew/sic-mode.pdf

[MF00] McGrew, D., and Fluhrer, S., "Attacks on Encryption of
Redundant Plaintext and Implications on Internet Security", the
Proceedings of the Seventh Annual Workshop on Selected Areas in
Cryptography (SAC 2000), Springer-Verlag.

[MF00b] McGrew, D., and Fluhrer, S., "The Stream Cipher LEVIATHAN:
Specification and Supporting Documentation", Submission to the New
European Schemes for Signatures, Integrity, and Encryption (NESSIE)
Process, October 2000, http://www.cryptonessie.org/.

[R92] Rueppel, R., "Stream Ciphers", Chapter 2 of Simmons, G.,
"Contemporary Cryptology: the Science of Information Integrity,"
1992, IEEE Press.

[RC94] Rogaway, P. and Coppersmith, D., "A Software-Optimized
Encryption Algorithm", Proceedings of the 1994 Fast Software
Encryption Workshop, Lecture Notes In Computer Science, Volume 809,
Springer-Verlag, 1994, pp. 56-63.

[RC98] Rogaway, P. and Coppersmith, D., "A Software-Optimized
Encryption Algorithm", Journal of Cryptology, Volume 11, Number 4,
Springer-Verlag, 1998, Pages 273-287. Also available on the Internet
at http://www.cs.ucdavis.edu/~rogaway/papers/seal-abstract.html.

[RK99] Rescorla, E., and Korver, B., "Guidelines for Writing RFC
Text on Security Considerations," draft-rescorla-sec-cons-00.txt

[S96] Schneier, B. "Applied Cryptography: Protocols, Algorithms,
and Source Code in C", Wiley, 1996.

[sc-esp] McGrew, D., Fluhrer, S., Peyravian, M., "The Stream Cipher
Encapsulating Security Payload", Internet Draft, July 2000

[SCFJ96] Schulzrinne, H., Casner, S., Frederick, R., Jacobson, V.,
"RTP: A Transport Protocol for Real-Time Applications", IETF Request
For Comments RFC 1889.

Appendix

A. Test vectors

We include in the following some test vectors for f8-AES.

key:
   234829008467be186c3de14aae72d62c

salting key || 0x555... :
   32f2870d555555555555555555555555

AES-internal expanded key:
   23482900 8467be18 6c3de14a ae72d62c
   62be58e4 e6d9e6fc 8ae407b6 2496d19a
   f080e0d2 1659062e 9cbd0198 b82bd002
   05f097be 13a99190 8f149008 373f400a
   78f9f024 6b5061b4 e444f1bc d37bb1b6
   4931be42 2261dff6 c6252e4a 155e9ffc
   31ea0e1b 138bd1ed d5aeffa7 c0f0605b
   fd3a37a1 eeb1e64c 3b1f19eb fbef79b0
   a28cd0ae 4c3d36e2 77222f09 8ccd56b9
   043d86ca 4800b028 3f229f21 b3efc998
   ede0c0a7 a5e0708f 9ac2efae 292d2636

AES-internal expanded salting key || 555...:
   32f2870d 55555555 55555555 55555555
   cf0e7bf1 9a5b2ea4 cf0e7bf1 9a5b2ea4
   f43f3249 6e641ced a16a671c 3b3149b8
   37045eab 59604246 f80a255a c33b6ce2
   dd54c685 843484c3 7c3ea199 bf05cd7b
   a6e9e78d 22dd634e 5ee3c2d7 e1e60fac
   089f7675 2a42153b 74a1d7ec 9547d840
   e8fe7f5f c2bc6a64 b61dbd88 235a65c8
   d6b39779 140ffd1d a2124095 8148255d
   9f8cdb75 8b832668 299166fd a8d943a0
   9c963bb7 17151ddf 3e847b22 965d3882

RTP-packet header fields:
   version      = 2
   padding      = 0
   extension    = 0
   CSRC count   = 0
   marker bit   = 0
   payload type = 6e
   sequence no. = 5cba
   timestamp    = 50681de5
   SSRC         = 5c621599

Data from Cryptographic context:
   FLAG             = 0
   Rollover counter = d462564a

IV:
   d462564a006e5cba50681de55c621599

IV':
   4fee844eedb458a3e2b0c7ed43888cc1

Encryption of bits 0 to 127:

   ct: 0
   S(-1)                   : 00000000000000000000000000000000
   S(-1) XOR IV'           : 4fee844eedb458a3e2b0c7ed43888cc1
   S(-1) XOR IV' XOR ct    : 4fee844eedb458a3e2b0c7ed43888cc1
   plain text P[0..127]    : 6e915f07cd6f1c0d44afaab4961c7d31
   final keystream S(0)    : b2d3b3d7e16092de379e33b350582e63
   cipher text C[0..127]   : dc42ecd02c0f8ed373319907c6445352

Encryption of bits 128 to 255:

   ct: 1
   S(0)                    : b2d3b3d7e16092de379e33b350582e63
   S(0) XOR IV'            : fd3d37990cd4ca7dd52ef45e13d0a2a2
   S(0) XOR IV' XOR ct     : fd3d37990cd4ca7dd52ef45e13d0a2a3
   plain text P[128..255]  : 7b9daad84352a6d4bcdf501a560832a0
   final keystream S(1)    : b1ce287dc53c1975de3d7d0500f780ba
   cipher text C[128..255] : ca5382a5866ebfa162e22d1f56ffb21a

------------------------------------------------------------

This Internet-Draft expires in July 2001.
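
The parts of the f8-AES test vector in Appendix A that do not involve
AES itself can be cross-checked with a short Python sketch. This is
only an illustration: the helper names xor and f8_cipher_input are
hypothetical, the field order used to assemble the IV (ROC, FLAG,
marker bit and payload type, sequence number, timestamp, SSRC) is
inferred from the IV value printed in the test vector, and the values
of IV', S(0), and S(1) are copied from the vector rather than
recomputed with AES.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def f8_cipher_input(iv_prime: bytes, ct: int, s_prev: bytes) -> bytes:
    """Input block to E in Eq. 1: IV' XOR ct XOR S(j-1)."""
    return xor(xor(iv_prime, ct.to_bytes(16, "big")), s_prev)


# IV assembled from the header fields and context data of the test vector.
# Assumption: field order inferred from the printed IV.
iv = (
    (0xD462564A).to_bytes(4, "big")    # rollover counter
    + bytes([0x00])                    # FLAG
    + bytes([(0 << 7) | 0x6E])         # marker bit (0) and payload type
    + (0x5CBA).to_bytes(2, "big")      # sequence number
    + (0x50681DE5).to_bytes(4, "big")  # timestamp
    + (0x5C621599).to_bytes(4, "big")  # SSRC
)

# Key mask m = k_s || 0x55..5, padded to 128 bits.
m = bytes.fromhex("32f2870d") + b"\x55" * 12

# Values copied from the test vector (AES outputs are not recomputed here).
iv_prime = bytes.fromhex("4fee844eedb458a3e2b0c7ed43888cc1")
s = [bytes.fromhex("b2d3b3d7e16092de379e33b350582e63"),
     bytes.fromhex("b1ce287dc53c1975de3d7d0500f780ba")]
p = [bytes.fromhex("6e915f07cd6f1c0d44afaab4961c7d31"),
     bytes.fromhex("7b9daad84352a6d4bcdf501a560832a0")]
c = [bytes.fromhex("dc42ecd02c0f8ed373319907c6445352"),
     bytes.fromhex("ca5382a5866ebfa162e22d1f56ffb21a")]

checks = {
    "IV": iv.hex() == "d462564a006e5cba50681de55c621599",
    "m": m.hex() == "32f2870d" + "55555555" * 3,
    "cipher input, ct=0": f8_cipher_input(iv_prime, 0, bytes(16)) == iv_prime,
    "cipher input, ct=1": f8_cipher_input(iv_prime, 1, s[0]).hex()
    == "fd3d37990cd4ca7dd52ef45e13d0a2a3",
    "ciphertext block 0": xor(p[0], s[0]) == c[0],
    "ciphertext block 1": xor(p[1], s[1]) == c[1],
}
for name, ok in checks.items():
    print(name, "ok" if ok else "MISMATCH")
```

Reproducing IV', S(0), and S(1) themselves additionally requires an
AES-128 implementation keyed with k XOR m and with k, respectively,
which is outside the scope of this sketch.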