idnits 2.17.1 

draft-ietf-avt-srtp-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     As the rollover counter is 32 bits long, the maximum number of
     packets in any given SRTP session is 2^48 = 281,474,976,710,656. After
     that number of SRTP packets have been sent, the sender MUST not send any
     more packets with that cryptographic context. This limitation enforces a
     security benefit by providing an upper bound on the amount of traffic
     that can pass before cryptographic keys are changed.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     where TS (Timestamp, 32 bits), SEQ (Sequence Number, 16 bits), M
     (Marker Bit, 1 bit), PT (Payload Type, 7 bits), and SSRC (Synchronization
     Source, 32 bits) are taken from the current RTP header. ROC is the 32-bit
     rollover counter from the identified context. FLAG is a 8-bit value which
     is used to signal additional information. Currently, the only value
     defined (for RTP) is FLAG = 00..0. The value 00..01 is reserved for RTCP
     and MUST not be used with RTP.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     FLAG is a 8-bit value which is used to signal additional
     information. Currently, the only value defined (for RTCP) is FLAG =
     00..01. The value 0..0 is reserved for RTP and MUST not be used for RTCP.
     This allows to use the same key for related RTP and RTCP flows (being the
     IV unique).

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 2001) is 8471 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'BR98' is mentioned on line 533, but not defined

  == Missing Reference: 'B96' is mentioned on line 934, but not defined

  == Missing Reference: 'Bi96' is mentioned on line 955, but not defined

  == Unused Reference: 'ES3E' is defined on line 1140, but no explicit
     reference was found in the text

  == Unused Reference: 'LRW00' is defined on line 1158, but no explicit
     reference was found in the text

  == Unused Reference: 'R92' is defined on line 1177, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'AES'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'BCNN00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'BF00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'C99'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ES3D'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ES3E'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'HAC'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'H80'

  ** Obsolete normative reference: RFC 2401 (ref. 'KA98a') (Obsoleted by RFC
     4301)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'KBHHKR00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'LRW00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'M00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MF00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MF00b'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'R92'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'RC94'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'RC98'

  == Outdated reference: A later version (-05) exists of
     draft-rescorla-sec-cons-00

  -- Possible downref: Normative reference to a draft: ref. 'RK99' 

  -- Possible downref: Non-RFC (?) normative reference: ref. 'S96'

  ** Obsolete normative reference: RFC 1889 (ref. 'SCFJ96') (Obsoleted by RFC
     3550)


     Summary: 7 errors (**), 0 flaws (~~), 11 warnings (==), 20 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                      Rolf Blom, Ericsson
3	AVT Working Group                           Elisabetta Carrara, Ericsson
4	INTERNET-DRAFT                                    David A. McGrew, Cisco
5	Expires: July 2001                                Mats Naslund, Ericsson
6	                                                  Karl Norrman, Ericsson
7	                                                       David Oran, Cisco

9	                                                           February 2001

11	                The Secure Real Time Transport Protocol
12	                      <draft-ietf-avt-srtp-00.txt>

14	Status of this memo

16	   This document is an Internet-Draft and is in full conformance with
17	   all provisions of Section 10 of RFC2026.

19	   Internet-Drafts are working documents of the Internet Engineering
20	   Task Force (IETF), its areas, and its working groups. Note that other
21	   groups may also distribute working documents as Internet-Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time. It is inappropriate to use Internet-Drafts as reference
26	   material or cite them other than as "work in progress".

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/lid-abstracts.txt

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html

34	Abstract

36	   This document describes the Secure Real Time Transport Protocol
37	   (SRTP), a profile of the Real Time Transport Protocol (RTP) which can
38	   provide privacy, message authentication, replay protection, and
39	   implicit header authentication.

41	   SRTP can achieve high throughput and low packet expansion by using an
42	   additive stream cipher for encryption, a universal hashing based
43	   function for message authentication, and an 'implicit' index for
44	   sequencing based on the RTP sequence number.

46	   In addition, SRTP is a suitable protection scheme for heterogeneous
47	   environments, i.e., environments including both wired and wireless
48	   links.

50	TABLE OF CONTENTS

52	   1. Notational Conventions
53	   2. Goals
54	   3. SRTP Overview
55	   3.1 SRTP Cryptographic Contexts
56	   3.2 Mapping SRTP Packets to Cryptographic Contexts
57	   3.3 SRTP Packet Processing
58	   3.4 Cryptographic Algorithms
59	   4. Synchronization
60	   4.1. IV Formation for Implicit Header Authentication
61	   5. Replay Protection
62	   6. Encryption
63	   6.1 Defined Ciphers
64	   6.1.1. Counter Mode AES
65	   6.1.2. AES in f8-Mode
66	   6.1.3. NULL Cipher
67	   7. Message Authentication
68	   7.1 Default MAC: UMAC
69	   8. SRTP Parameters
70	   9. Secure RTCP
71	   10. Rationale
72	   10.1 Synchronization
73	   10.2 Replay Protection
74	   10.3 Source Origin Authentication
75	   10.4. Choice of Encryption Transform
76	   11. Security Considerations
77	   11.1. SSRC collision
78	   11.2. Confidentiality of the RTP Payload
79	   11.3. Confidentiality of the RTP Header
80	   11.4. Integrity of RTP headers
81	   12. Multicast and Multi-unicast
82	   13. Acknowledgements
83	   14. Author's Addresses
84	   15. References
85	   APPENDIX A: Test Vectors

87	1. Notational Conventions

89	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
90	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
91	   document are to be interpreted as described in RFC-2119 [B97].

93	   By convention, the leftmost bit (byte) is the most significant one.
94	   By XOR we mean bitwise addition modulo 2 of binary strings, and ||
95	   denotes concatenation. E.g., if C = A || B, then the most significant
96	   bits of C are the same as those of A, and the least significant bits
97	   of C equal those of B.

99	2. Goals

101	   The security goals for SRTP are to ensure:

103	   * the privacy of the RTP payload,

105	   * the authentication of the entire RTP packet, including protection
106	   against replayed RTP packets, and

108	   * implicit authentication of the header.

110	   Each of the security services described above is optional. Any
111	   combination of services can be provided, except implicit header
112	   authentication on its own.

114	   Source origin authentication (e.g., digitally signed packets) may be
115	   desirable in some situations, but this goal is deferred from
116	   consideration in this document. See Section 10.3 for a discussion on
117	   this point.

119	   Other goals for the protocol are:

121	   * a low computational cost,

123	   * a low footprint (i.e., small code size and data memory for key
124	     schedules and replay lists),

126	   * limited packet expansion,

128	   * no error propagation (e.g., changing a single bit of an SRTP packet
129	   should change no more than one bit of the corresponding RTP packet),

131	   * the preservation of RTP header compression efficiency,

133	   * to allow cryptographic keys to be used by multiple RTP sessions
134	   simultaneously,

136	   * independence from the underlying transport used by RTP.

138	   These properties ensure that SRTP is a suitable protection scheme
139	   for both wired and wireless scenarios.

141	3. SRTP Overview

143	   RTP is the Real Time Transport Protocol [SCFJ96]. We define SRTP as
144	   a profile of RTP, analogous to RFC1890, which defines the audio/video
145	   profile for RTP. Conceptually, we consider a 'bump in the stack'
146	   implementation which resides between the RTP application and the
147	   transport layer, which intercepts RTP packets and then forwards an
148	   equivalent SRTP packet on the sending side, and which intercepts SRTP
149	   packets and passes an equivalent RTP packet up the stack on the
150	   receiving side.

152	          0                   1                   2                   3
153	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
154	   +-->+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
155	   |   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
156	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
157	   |   |                           timestamp                           |
158	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
159	   |   |           synchronization source (SSRC) identifier            |
160	   |   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
161	   |   |            contributing source (CSRC) identifiers             |
162	   |   |                               ....                            |
163	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
164	   |   |                   RTP extension (optional)                    |
165	   | +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
166	   | | |                                                               |
167	   | | |                           payload                             |
168	   | | |                             ....                              |
169	   +-+>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
170	   | | |                     authentication tag (optional)             |
171	   | | |                                                               |
172	   | | |                             ....                              |
173	   | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
174	   | |
175	   | +- Encrypted Portion
176	   +---- Authenticated Portion

178	                    Figure 1.  The format of an SRTP packet.

180	   The format of an SRTP packet is illustrated in Figure 1. The optional
181	   authentication tag is the only field defined by SRTP that is not in
182	   RTP. It provides data origin authentication of the header and
183	   payload, and it indirectly provides replay protection by
184	   authenticating the sequence number. The Encrypted Portion of an
185	   SRTP packet consists of the RTP payload of the equivalent RTP packet.
186	   The Authenticated Portion of an SRTP packet consists of the entire
187	   equivalent RTP packet.
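
The layout in Figure 1 can be illustrated with a short Python sketch that
splits a received SRTP packet into its RTP header, Encrypted Portion, and
authentication tag. The function name split_srtp_packet and the tag_len
parameter are hypothetical; this draft does not fix the tag length here, so
the caller is assumed to know it from the cryptographic context. The header
length rules (12 fixed bytes, 4 bytes per CSRC, optional extension) follow
the RTP header format.

```python
import struct


def split_srtp_packet(packet: bytes, tag_len: int = 0):
    """Return (header, encrypted_portion, auth_tag) for one SRTP packet.

    tag_len is an assumed, externally supplied tag length in bytes.
    """
    if len(packet) < 12 + tag_len:
        raise ValueError("packet too short")
    first = packet[0]
    cc = first & 0x0F                # CSRC count
    has_ext = bool(first & 0x10)     # X bit
    header_len = 12 + 4 * cc
    if has_ext:
        if len(packet) < header_len + 4:
            raise ValueError("truncated extension header")
        # Extension header: 16-bit id, then 16-bit length in 32-bit words.
        (ext_words,) = struct.unpack_from("!H", packet, header_len + 2)
        header_len += 4 + 4 * ext_words
    end = len(packet) - tag_len
    if header_len > end:
        raise ValueError("header longer than packet")
    return packet[:header_len], packet[header_len:end], packet[end:]
```

The Authenticated Portion is then the header followed by the Encrypted
Portion, i.e. everything except the tag itself.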

189	3.1 SRTP Cryptographic Contexts

191	   Each SRTP session requires the sender and receiver to maintain
192	   cryptographic state information. This information is called the
193	   cryptographic context, and it consists of:

195	   * an encryption key k_e and, optionally, a "salting key" k_s. These
196	   keys must be randomly and independently chosen.

198	   * a 32-bit rollover counter r (which records how many times the
199	   16-bit RTP sequence number has been reset to zero after passing
200	   through 65,535),

202	   * an 8-bit FLAG used to signal additional information,

204	   * the mode of operation for the encryption scheme, and

206	   * the cipher.

208	   In addition, when authentication and replay protection are provided:

210	   * a message authentication key k_a,

212	   * a sequence number s_l (which is the last received and authenticated
213	     sequence number for the receiver, and is the last sequence number
214	     sent for the sender), and

216	   * a replay list L (maintained by the receiver only).
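
As an illustration only, the state listed above could be grouped into a
single record. The class name CryptoContext, the field names, and the string
labels for cipher and mode are assumptions made for this sketch; the draft
defines the contents of the context, not its representation.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class CryptoContext:
    """Illustrative container for the per-session cryptographic state."""
    k_e: bytes                     # encryption key
    cipher: str = "AES"            # the cipher (free-form label in this sketch)
    mode: str = "counter"          # mode of operation, e.g. "counter" or "f8"
    k_s: Optional[bytes] = None    # optional salting key
    r: int = 0                     # 32-bit rollover counter
    flag: int = 0x00               # 8-bit FLAG
    # Used only when authentication and replay protection are provided:
    k_a: Optional[bytes] = None    # message authentication key
    s_l: int = 0                   # last sequence number sent / authenticated
    replay_list: Set[int] = field(default_factory=set)  # receiver only


ctx = CryptoContext(k_e=bytes(16), k_s=bytes(8), k_a=bytes(16))
```

A set is used for the replay list only for brevity; Section 5 describes a
fixed-size sliding window instead.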

218	3.2 Mapping SRTP Packets to Cryptographic Contexts

220	   In this section we define the mapping of RTP and SRTP packets to the
221	   cryptographic contexts used to protect them.

223	   The RTP synchronization source (SSRC) identifier is used, along with
224	   the RTP transport address (e.g., the Destination IP Address and Port
225	   Number) by a receiver to identify the proper cryptographic context
226	   for each packet.

228	   Recall that an RTP session is defined [SCFJ96] by a pair of
229	   destination Transport Addresses (one network address plus a port pair
230	   for RTP and RTCP), and that a multimedia session is defined as a
231	   collection of RTP sessions. For example, a particular multimedia
232	   session could include an audio RTP session, a video RTP session, and
233	   a text RTP session.

235	   An SSRC identifier is unique inside an RTP session, and all packets
236	   with the same SSRC form part of the same timing and sequence number
237	   space. Thus, the SSRC field and transport address information can be
238	   used by an SRTP receiver (or by a bump in the stack implementation on
239	   the sender's side) to identify the proper cryptographic context
240	   within that session. Note though that, for instance in a multicast
241	   scenario, the RTP anti-collision mechanism for SSRCs may force these
242	   identifiers to change over time, see discussion in Section 12.
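
A minimal sketch of this lookup, assuming contexts are kept in a table keyed
by the SSRC and the destination transport address. The names contexts,
register_context, and find_context are hypothetical, and the context itself
is represented by a plain dict.

```python
from typing import Dict, Tuple

# (SSRC, destination IP address, destination port) -> context object
ContextKey = Tuple[int, str, int]
contexts: Dict[ContextKey, dict] = {}


def register_context(ssrc: int, dst_ip: str, dst_port: int, ctx: dict) -> None:
    """Store the context for one SSRC within one RTP session."""
    contexts[(ssrc & 0xFFFFFFFF, dst_ip, dst_port)] = ctx


def find_context(ssrc: int, dst_ip: str, dst_port: int) -> dict:
    """Return the context for this packet; raise KeyError if none is known."""
    return contexts[(ssrc & 0xFFFFFFFF, dst_ip, dst_port)]
```

If the RTP collision mechanism forces an SSRC to change, an entry keyed this
way no longer matches, which is the situation discussed in Section 12.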

244	   SRTP may allow different RTP sessions to use identical
245	   cryptographic keys. This is possible if the design of the
246	   synchronization mechanism (i.e., the IV in the case of the f8 and
247	   Counter Modes) avoids keystream re-use (the two-time pad, Section 11)
248	   and if SSRC uniqueness requirements beyond those dictated by the
249	   RTP standard are met, see Section 12. However, different multimedia
250	   sessions SHOULD use different keys.

252	   The authentication and encryption keys of each context MUST remain
253	   fixed for the duration of that context. This ensures that incorrect
254	   keys will not be used by the receiver due to a synchronization error.

256	3.3 SRTP Packet Processing

258	   When Generic Forward Error Correction is performed as specified in
259	   RFC 2733, then the security processing takes place before FEC on the
260	   sender's side, and after FEC on the receiver's side.

262	   To construct a proper SRTP packet, given an RTP packet, the sender
263	   does the following:

265	   1. Determine which cryptographic context to use by checking the
266	   SSRC field of the RTP packet, and the Transport Address information
267	   of that packet (e.g., the Destination IP Address and Port Number).

269	   2. Determine the index of the SRTP packet as described in Section 4,
270	   using the rollover counter in the cryptographic context and the
271	   sequence number in the RTP packet. Form the current initialization
272	   vector (IV). If Implicit Header Authentication is provided, this can
273	   be done as described in Section 4.1.

275	   3. Encrypt the Encrypted Portion of the packet, as described in
276	   Section 6, using the IV determined in Step 2 and the encryption key
277	   and salting key in the context found in Step 1.

279	   4. If authentication is provided, compute the authentication tag for
280	   the Authenticated Portion of the packet, as described in Section 7,
281	   using the index determined in Step 2 and the authentication key in
282	   the context found in Step 1. Note that the Encrypted Portion is
283	   encrypted before the authentication tag is computed.

285	   To authenticate and decrypt an SRTP packet, the receiver does the
286	   following:

288	   1. Determine which cryptographic context to use by checking the
289	   SSRC field of the RTP packet and the transport address information of
290	   the underlying transport header (e.g., the Destination IP Address and
291	   Port Number).

293	   2. Determine the index of the SRTP packet from the rollover counter
294	   in the cryptographic context and the sequence number in the RTP
295	   packet, as described in Section 4. Form the current IV in the same
296	   way as done in Step 2 in the encryption process.

298	   3. If authentication is provided, check the Replay List to ensure
299	   that no packet with that index has been received and authenticated
300	   before, as described in Section 5. If that index is in the list, then
301	   the packet has been replayed and is invalid. It MUST be discarded,
302	   and the event SHOULD be logged.

304	   Compute the authentication tag for the Authenticated Portion of the
305	   packet, as described in Section 7, using the index determined in Step
306	   2 and the authentication key in the context found in Step 1. Note
307	   that the Encrypted Portion is not decrypted before the authentication
308	   tag is computed.

310	   If the authentication tag that is computed matches that in the SRTP
311	   packet, then the packet is accepted and the index is added to the
312	   Replay List. Otherwise, the packet is invalid: it MUST be discarded,
313	   and the event SHOULD be logged.

315	   4. Decrypt the Encrypted Portion of the packet, as described in
316	   Section 6, using the IV determined in Step 2 and the encryption key
317	   and salting key in the context found in Step 1.

319	   The processing occurring when replay protection is activated has been
320	   chosen to maximize resistance to denial of service attacks (i.e., to
321	   minimize the receiver's effort in processing spurious packets).
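
The ordering of the receiver steps can be sketched as follows. This is not
the transform defined in this document: toy_keystream and toy_tag are
placeholder functions built from SHA-256 and HMAC-SHA256 purely so that the
example runs with the Python standard library, and all names and the 4-byte
tag length are assumptions. Only the order of operations (replay check, tag
check over the still-encrypted packet, then decryption) reflects the text.

```python
import hashlib
import hmac


def toy_keystream(k_e: bytes, iv: bytes, n: int) -> bytes:
    # Placeholder keystream generator (NOT the AES modes of Section 6.1).
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(k_e + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_tag(k_a: bytes, index: int, data: bytes, tag_len: int = 4) -> bytes:
    # Placeholder MAC: truncated HMAC-SHA256 standing in for the default MAC.
    msg = index.to_bytes(6, "big") + data
    return hmac.new(k_a, msg, hashlib.sha256).digest()[:tag_len]


def receive(header, ciphertext, tag, index, iv, k_e, k_a, replay_list):
    """Return the decrypted payload, or None if the packet is discarded."""
    if index in replay_list:                      # Step 3: replay check first
        return None
    expected = toy_tag(k_a, index, header + ciphertext, len(tag))
    if not hmac.compare_digest(expected, tag):    # tag covers the ciphertext
        return None
    replay_list.add(index)                        # accept, remember the index
    ks = toy_keystream(k_e, iv, len(ciphertext))  # Step 4: decrypt last
    return bytes(c ^ k for c, k in zip(ciphertext, ks))
```

A replayed or forged packet is rejected before any keystream is generated,
which is the property the preceding paragraph refers to.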

323	3.4 Cryptographic Algorithms

325	   Default encryption and authentication algorithms are specified in
326	   Sections 6.1 and 7.1. While there are numerous encryption and message
327	   authentication algorithms that can be used in SRTP, we define default
328	   algorithms in order to avoid the complexity of specifying the
329	   encodings for the signaling of algorithm and parameter identifiers.

331	4. Synchronization

333	   SRTP implementations use an 'implicit' packet index for sequencing.
334	   Receiver-side implementations use the RTP sequence number to
335	   reconstruct the correct index (that is, location in the sequence of
336	   all RTP packets). The index is defined as s + r * 65,536, where the
337	   sequence number is s and the rollover counter is r.

339	   A robust approach for the proper use of a rollover counter requires
340	   that its handling and use be well defined. In particular, out-of-
341	   order RTP packets with sequence numbers close to 65,536 or zero must
342	   be properly dealt with.

344	   A receiver reconstructs the index i of a packet with sequence number
345	   s using the estimate

347	   i = 65,536 * t + s,

349	   where t is chosen from the set { r-1, r, r+1 } such that i is closest
350	   to the value 65,536 * r + s_l. If the value r+1 is used, then the
351	   rollover counter r in the cryptographic context is incremented by
352	   one.

354	   The pseudocode for the algorithm to process a packet with sequence
355	   number s follows:

357	      if (s_l < 32,768)
358	         if (s - s_l > 32,768)
359	            set i to s + 65,536 * (r-1)
360	         else
361	            set i to s + 65,536 * r
362	         endif
363	      else
364	         if (s_l - 32,768 > s)
365	            set r to r + 1
366	         endif
367	         set i to s + r * 65,536
368	      endif
369	      set s_l to s
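
For illustration, the pseudocode above can be transcribed directly into
Python. The function name estimate_index and the choice to return the
updated state as a tuple are assumptions of this sketch; the branch
structure is unchanged.

```python
def estimate_index(s: int, s_l: int, r: int):
    """Process sequence number s; return (i, r, s_l) after the update."""
    if s_l < 32768:
        if s - s_l > 32768:
            i = s + 65536 * (r - 1)   # late packet from before the last wrap
        else:
            i = s + 65536 * r
    else:
        if s_l - 32768 > s:
            r = r + 1                 # sequence number has wrapped
        i = s + 65536 * r
    return i, r, s                    # s becomes the new s_l
```

For example, with r = 0 and s_l = 65,530, a packet with s = 3 is taken to
follow a wrap: the rollover counter becomes 1 and the index is 65,539.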

371	   The index i is used in replay protection (Section 5) when
372	   authentication is provided, in encryption (Section 6), and in message
373	   authentication (Section 7).

375	   This algorithm should be extended by using the information in the
376	   authenticated RTCP reports.

378	   When RTP authentication is not present, robust synchronization is not
379	   possible. In this case, transmission errors or an active attacker may
380	   force the receiver to erroneously update its rollover counter and
381	   thus to become completely out of sync. It is not possible to protect
382	   against active attackers in such a case, but it is possible to have
383	   an update policy for the rollover counter which, except in rare
384	   cases, is robust with respect to random bit errors.

386	   As the rollover counter is 32 bits long and the RTP sequence number
387	   is 16 bits long, the maximum number of packets in any given SRTP
388	   session is 2^48 = 281,474,976,710,656. After that number of SRTP
389	   packets have been sent, the sender MUST NOT send any more packets
390	   with that cryptographic context. This limitation provides a security
391	   benefit by placing an upper bound on the amount of traffic that can
392	   pass before cryptographic keys are changed.

394	   Other approaches to sequencing were considered and rejected; please
395	   see Section 10.1 for our rationale.

397	4.1. IV Formation for Implicit Header Authentication

399	   There may be several alternatives for the Initialization Vector (IV)
400	   formation. To guarantee synchronization and avoid keystream re-use,
401	   we only require the SSRC, rollover counter and sequence number, or
402	   some function thereof (possibly combined with re-keying mechanisms),
403	   to be part of the IV. Below, we give a concrete proposal which also
404	   provides 'implicit' header authentication, and works with every
405	   cipher having at least 128-bit block size. This particular solution
406	   also gives a high degree of agreement between bit ordering in the RTP
407	   packet header and the IV, simplifying data copying.

409	   When implicit header authentication is provided, data from each RTP
410	   packet to be encrypted and transmitted must be included in the IV.
411	   This IV shall be computed and supplied as input to the ciphering
412	   algorithm. This shall be done by taking information from said RTP
413	   packet, the FLAG, and the rollover counter value, and computing the
414	   128-bit IV:

416	    IV = ROC || FLAG || M || PT || SEQ || TS || SSRC

418	   where TS (Timestamp, 32 bits), SEQ (Sequence Number, 16 bits), M
419	   (Marker Bit, 1 bit), PT (Payload Type, 7 bits), and SSRC
420	   (Synchronization Source, 32 bits) are taken from the current RTP
421	   header. ROC is the 32-bit rollover counter from the identified
422	   context. FLAG is an 8-bit value which is used to signal additional
423	   information. Currently, the only value defined (for RTP) is FLAG =
424	   00..0. The value 00..01 is reserved for RTCP and MUST NOT be used
425	   with RTP.
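
The field widths above add up to 32 + 8 + 1 + 7 + 16 + 32 + 32 = 128 bits.
A minimal sketch of the packing, assuming network byte order and a
hypothetical helper named form_iv:

```python
import struct


def form_iv(roc: int, flag: int, marker: int, pt: int,
            seq: int, ts: int, ssrc: int) -> bytes:
    """Pack ROC || FLAG || M || PT || SEQ || TS || SSRC into 16 bytes."""
    m_pt = ((marker & 0x1) << 7) | (pt & 0x7F)   # M and PT share one byte
    return struct.pack(
        "!IBBHII",
        roc & 0xFFFFFFFF,
        flag & 0xFF,
        m_pt,
        seq & 0xFFFF,
        ts & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


iv = form_iv(roc=1, flag=0, marker=1, pt=96,
             seq=0x1234, ts=0xAABBCCDD, ssrc=0x01020304)
```

The last eleven bytes of the IV (M || PT through SSRC) appear in the same
order as bytes 1 to 11 of the RTP header, so they could be copied from the
header in one step.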

427	   With this IV formation, the number of SRTP packets encrypted with any
428	   fixed encryption key MUST therefore be no more than 2^48. Otherwise,
429	   the combined size of the ROC and SEQ fields (48 bits) will not be
430	   large enough to avoid keystream reuse.

432	5. Replay Protection

434	   A packet is 'replayed' when it is stored by an adversary, and then
435	   re-injected onto the network. SRTP provides protection against such
436	   attacks whenever authentication is provided, through the storage of
437	   the indices of the most recently received and authenticated packets.

439	   Each SRTP receiver maintains a Replay List, which conceptually
440	   contains the indices of all of the packets which have been received
441	   and authenticated. In practice, the list can use a 'sliding window'
442	   approach, so that a fixed amount of storage suffices for replay
443	   protection. SRTP packet indices which are less than s_l + 65,536 * r -
444	   SRTP-WINDOW-SIZE MAY be assumed to have been received, where
445	   SRTP-WINDOW-SIZE is a parameter that MUST be at least 64, and which
446	   MAY be set to a higher value.

448	   The Replay List can be efficiently implemented by using a bitmap to
449	   represent which packets have been received, as described in the
450	   Security Architecture for IP [KA98a].
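
A bitmap-based Replay List might look like the sketch below. The class name
ReplayWindow and its methods are hypothetical, and the window is anchored at
the highest index accepted so far, which is an assumption of this sketch.
Indices older than the window are treated as already received.

```python
class ReplayWindow:
    """Sliding-window replay list over packet indices."""

    def __init__(self, size: int = 64):
        if size < 64:
            raise ValueError("window size must be at least 64")
        self.size = size
        self.top = -1      # highest index accepted so far
        self.bitmap = 0    # bit k set => index (top - k) was accepted

    def is_replay(self, index: int) -> bool:
        if index > self.top:
            return False
        offset = self.top - index
        if offset >= self.size:
            return True    # too old: assumed to have been received
        return bool((self.bitmap >> offset) & 1)

    def mark_received(self, index: int) -> None:
        """Record an index; call only after the packet has authenticated."""
        if index > self.top:
            shift = index - self.top
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.top = index
        else:
            offset = self.top - index
            if offset < self.size:
                self.bitmap |= 1 << offset
```

Storage is fixed at one integer of SRTP-WINDOW-SIZE bits plus the highest
index, regardless of how many packets have been received.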

452	6. Encryption

454	   Encryption uses a 'seekable' additive stream cipher, following the
455	   Stream Cipher ESP [sc-esp]. The stream ciphers that can be used must
456	   be able to efficiently seek to arbitrary locations in their
457	   keystream. Ciphers that can do this include SEAL [RC94, RC98],
458	   LEVIATHAN [MF00b], and any block cipher run in suitable mode. In
459	   particular, AES in counter mode will provide good security,
460	   reasonable performance, and conform to emerging U.S. Federal
461	   standards. Another mode which fulfils the requirements is f8 mode
462	   [ES3D], used together with AES.

464	   SRTP encryption consists of generating a keystream segment
465	   corresponding to the index of the packet, and then bitwise exclusive-
466	   oring that keystream segment into the RTP packet, starting at the
467	   first bit of the RTP payload. Decryption is then done the same way,
468	   but swapping the roles of the plaintext and ciphertext. The
469	   definition of how the keystream is generated, given the index,
470	   depends on the cipher and its mode of operation.
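
Independently of how the keystream segment is produced, the encryption step
itself is a bitwise XOR. The sketch below uses a hypothetical helper named
xor_keystream and takes the keystream segment as an input.

```python
def xor_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR a payload with a keystream segment of at least equal length."""
    if len(keystream) < len(data):
        raise ValueError("keystream segment too short")
    return bytes(d ^ k for d, k in zip(data, keystream))


segment = bytes(range(16))           # stand-in keystream segment
payload = b"RTP payload"
ciphertext = xor_keystream(payload, segment)
recovered = xor_keystream(ciphertext, segment)
```

Applying the same operation twice restores the payload, and flipping one
ciphertext bit flips exactly one plaintext bit, which matches the goal of no
error propagation stated in Section 2.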

472	   Such a cipher shows features which are desired in a general scenario,
473	   e.g. low computational cost, and speed. It also shows properties
474	   which fulfil additional requirements posed by the cellular
475	   environment [BCNN00], i.e. preservation of RTP header compression
476	   efficiency, and absence of error propagation and message expansion.

478	   Hence, we conclude that the proposed profile can be applied to the
479	   most general heterogeneous environment.

481	6.1 Defined Ciphers

483	   The default cipher is the Advanced Encryption Standard (AES), and we
484	   define two modes of running AES, Counter Mode AES and AES in f8-Mode.
485	   Both of these modes provide implicit header authentication through
486	   the use of the IV formation described in Section 4.1. The NULL cipher
487	   is also defined, to be used when encryption is not required.

489	6.1.1. Counter Mode AES

491	   The default cipher SHALL be AES used in the Segmented Integer Counter
492	   Mode (SICM) [M00], with a 128-bit key size and a 128-bit block size.

494	   Conceptually, counter mode consists of encrypting successive
495	   integers. The actual definition is somewhat more complicated, in
496	   order to avoid 128 bit integer arithmetic and to randomize the
497	   starting point of the integer sequence. Each packet is encrypted with
498	   a distinct keystream segment, which is computed as follows.

500	   The 128-bit block is divided into three parts: a 64-bit segment
501	   prefix, a 32-bit block index, which is incremented to generate a
502	   keystream segment, and a 32-bit segment suffix. The segment
503	   prefix/suffix pair is unique for each keystream segment.

505	   A keystream segment is the concatenation of the output blocks of the
506	   cipher in encrypt mode, in which the block indices are in increasing
507	   order. Symbolically, each keystream segment looks like

509	   E(A || B || C) || E(A || B + 1 mod 2^32 || C) || E(A || B + 2 mod
510	   2^32 || C) ..

512	   where A, B, and C are segment prefix, block index, and segment
513	   suffix, respectively, determined as given below.

515	   The values A, B, and C are computed from the salting key k_s and the
516	   IV (from Section 4.1) by exclusive-oring k_s and the IV, and setting
517	   A to the first 64 bits of the result, B to the following 32 bits, and
518	   C to the remaining 32 bits of the result. Symbolically,

520	   A || B || C = IV XOR k_s.

522	   If k_s is less than 128 bits long, then k_s is concatenated with
523	   itself as many times as needed in order to form the salt which is
524	   added to the IV. If no salting key is used, this is interpreted as
525	   k_s = 0.

527	   Note that the segment prefix/suffix pair is distinct for each packet
528	   which is encrypted, thus ensuring that keystream segments are
529	   distinct and non-overlapping.
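
	   A minimal sketch of the keystream segment computation follows. It
	   is not a normative implementation. The block cipher E is passed in
	   as a parameter, and toy_block_cipher is a hash-based stand-in (NOT
	   AES) used only so that the example runs with the standard library.
	   Big-endian encoding of the block index, and truncating a repeated
	   k_s to 128 bits, are assumptions of this sketch.

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit block size


def toy_block_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(key, block). This is NOT AES; illustration only."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]


def expand_salt(k_s: bytes) -> bytes:
    """Repeat k_s to fill 128 bits; no salting key is treated as k_s = 0."""
    if not k_s:
        return bytes(BLOCK_BYTES)
    reps = -(-BLOCK_BYTES // len(k_s))  # ceiling division
    return (k_s * reps)[:BLOCK_BYTES]   # truncation is an assumption


def counter_keystream(key, iv, k_s, n_bytes, E=toy_block_cipher):
    """Return n_bytes of E(A||B||C) || E(A||B+1 mod 2^32||C) || ..."""
    if len(iv) != BLOCK_BYTES:
        raise ValueError("IV must be 128 bits")
    x = bytes(i ^ s for i, s in zip(iv, expand_salt(k_s)))
    a = x[:8]                             # 64-bit segment prefix
    b = int.from_bytes(x[8:12], "big")    # 32-bit block index
    c = x[12:]                            # 32-bit segment suffix
    out = bytearray()
    j = 0
    while len(out) < n_bytes:
        block_index = (b + j) % 2**32
        out += E(key, a + block_index.to_bytes(4, "big") + c)
        j += 1
    return bytes(out[:n_bytes])
```

	   Passing a real AES-128 encryption function as E would give the
	   mode described above; the structure of the loop does not change.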

531	   The restriction on the maximum number of RTP packets above ensures
532	   the security of the encryption method by limiting the effectiveness
533	   of probabilistic attacks [BR98].

535	   The AES has a block size of 128 bits, so 2^32 output blocks are
536	   sufficient to generate the 2^7 * 2^32 = 549755813888 bits of
537	   keystream needed to encrypt the largest possible RTP packet.

539	6.1.2. AES in f8-Mode

541	   To encrypt data in UMTS (Universal Mobile Telecommunications System)
542	   3G networks, a solution (see [ES3D]) known as the f8-algorithm has
543	   been developed. On a high level, the proposed scheme is a variant of
544	   Output Feedback Mode (OFB) [HAC], with a more elaborate
545	   initialization and feedback function. As in normal OFB, the core
546	   consists of a block cipher. We define the use of AES as default block
547	   cipher to be used in f8-Mode for RTP encryption, with 128-bit key and
548	   block size.

550	   Figure 2 shows the structure of an arbitrary b-bit block size cipher,
551	   E, running in what we shall call "f8-mode of operation".

553	                    |
554	                    |
555	                   \|/
556	                +------+
557	                |      |
558	            --->|  E   |
559	           |    |      |
560	           |    +------+
561	           |        |
562	     m --> *        |---------------------------  ...     -------|
563	   _____   |    IV' |           |             |                  |
564	           |        |  ct=1 --> *    ct=2 --> *   ... ct=L-1 --> *
565	           |        |           |             |                  |
566	           |        |       --> *         --> *   ...        --> *
567	           |       \|/     |   \|/       |   \|/            |   \|/
568	           |    +------+   | +------+    | +------+         | +------+
569	           |    |      |   | |      |    | |      |         | |      |
570	     k -------->|  E   |   | |  E   |    | |  E   |         | |  E   |
571	                |      |   | |      |    | |      |         | |      |
572	                +------+   | +------+    | +------+         | +------+
573	                    |      |    |        |    |             |    |
574	                    |------     |--------     |    ...  ----     |
575	                    |           |             |                  |
576	                   \|/         \|/           \|/                \|/

578	                   S(0)        S(1)          S(2)  . . .       S(L-1)

580	   Figure 2. f8-mode of operation (asterisk, *, denotes bitwise XOR).

582	   Let E(k,B) be the 128-bit output of E in encrypt mode when applied to
583	   the 128-bit key k and 128-bit plaintext block B. Let ct, IV, IV',
584	   S(j), and m denote 128-bit blocks, determined below.

586	   The S() keystream for an n-bit message is defined by setting IV' =
587	   E(k XOR m, IV), and ct = S(-1) = 00..0. For j = 0,1,.., L-1 where L =
588	   n/128 (rounded up to nearest integer) compute

590	         S(j) = E(k,IV' XOR ct XOR S(j-1)),            (Eq. 1)
591	         ct   = ct + 1 mod 2^128                       (Eq. 2)

593	   Notice that the IV (as defined in Section 4.1) is not used directly.
594	   Instead it is fed through E under another key to produce an internal,
595	   "salted" value (denoted IV') to prevent an attacker from gaining
596	   known input/output pairs, and the role of the internal counter is to
597	   prevent short keystream cycles. The value of the key mask m is
598	   defined to be

600	     m = k_s || 0x555..5,

602	   i.e. the salting key, padded with the binary pattern 0101.. to
603	   fill the 128-bit key size. (If no salting key is used, m = 0x55..5.)

605	   The maximum allowable packet size can be determined as follows.
606	   The AES has a block size of 128 bits. Assuming that AES behaves like
607	   a random function, it is (heuristically) secure to generate about
608	   2^64 output blocks, which is sufficient to generate 2^71 bits of
609	   keystream. In practice though, the counter ct above will often be
610	   sufficient if implemented as a 16- or 32-bit counter. In fact, for
611	   some security margin, other methods SHOULD be used if packets of size
612	   exceeding 2^32 * 128 = 549755813888 bits are to be encrypted.
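
	   The f8 keystream definition (Eq. 1 and Eq. 2) can be sketched as
	   follows. This is an illustration under stated assumptions, not a
	   normative implementation: toy_block_cipher is a hash-based
	   stand-in for E (NOT AES), the counter ct is encoded big-endian,
	   and the function returns whole 128-bit blocks S(0)..S(L-1).

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit block and key size


def toy_block_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(key, block). This is NOT AES; illustration only."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]


def xor_bytes(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))


def f8_keystream(k, iv, k_s, n_bits, E=toy_block_cipher):
    """Return S(0) || ... || S(L-1) for an n-bit message."""
    if len(k) != BLOCK_BYTES or len(iv) != BLOCK_BYTES:
        raise ValueError("key and IV must be 128 bits")
    if len(k_s) > BLOCK_BYTES:
        raise ValueError("salting key longer than 128 bits")
    m = k_s + b"\x55" * (BLOCK_BYTES - len(k_s))  # m = k_s || 0x55..5
    iv_prime = E(xor_bytes(k, m), iv)             # IV' = E(k XOR m, IV)
    ct = 0                                        # ct = 00..0
    s_prev = bytes(BLOCK_BYTES)                   # S(-1) = 00..0
    num_blocks = -(-n_bits // 128)                # L = n/128, rounded up
    blocks = []
    for _ in range(num_blocks):
        ct_block = ct.to_bytes(BLOCK_BYTES, "big")
        # Eq. 1: S(j) = E(k, IV' XOR ct XOR S(j-1))
        s_prev = E(k, xor_bytes(xor_bytes(iv_prime, ct_block), s_prev))
        blocks.append(s_prev)
        ct = (ct + 1) % 2**128                    # Eq. 2
    return b"".join(blocks)
```

	   A caller would truncate the result to the message length before
	   exclusive-oring it into the payload.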

614	6.1.3. NULL Cipher

616	   The NULL cipher is used when no confidentiality is requested. It
617	   simply copies the plaintext input into the ciphertext output.

619	7. Message Authentication

621	   Message integrity and authentication (hereafter referred to as just
622	   "authentication") are optional functions provided by SRTP.
623	   Authentication can be provided by any message authentication code,
624	   though the default value is UMAC [KBHHKR00].

626	   The authentication tag is computed by applying the UMAC function to
627	   the Authenticated Portion of the SRTP packet.

629	   The authentication tag is appended to the RTP packet. This expansion
630	   of the RTP packet may cause the packet size to exceed the Maximum
631	   Transmission Unit (MTU) of a network interface on its path,
632	   especially in circumstances when the application is attempting to
633	   'optimize' the size of packets. Path MTU discovery SHOULD be used to
634	   avoid this problem.

636	   Authentication SHOULD be provided by SRTP. Authentication is
637	   optional because, while the function is typically highly desired,
638	   there are certain cases (notably in the cellular environment) where
639	   it has an impact in terms of cost, as discussed in [BCNN00]. In
640	   those cases, it is up to the user security profile to request
641	   authentication.

643	7.1 Default MAC: UMAC

645	   The default message authentication code is UMAC [KBHHKR00], which
646	   has proven security properties and is quite fast. Furthermore, it
647	   can be used with short (e.g., two or four byte) authentication tags,
648	   as well as larger tags.

650	   UMAC is a parameterized algorithm (see Section 2.1 of [KBHHKR00]).
651	   The default selection of UMAC parameters for SRTP is:

653	      WORD-LEN              2
654	      UMAC-OUTPUT-LEN       4
655	      L1-KEY-LEN            128
656	      UMAC-KEY-LEN          16
657	      ENDIAN-FAVORITE       BIG
658	      L1-OPERATIONS-SIGN    SIGNED

660	   This choice of parameters is intended to work well on low-power
661	   processors, to minimize packet expansion, and to minimize the size of
662	   the cryptographic context. The WORD-LEN of two will work well on 16
663	   bit and higher processors. The packet expansion is determined by the
664	   UMAC-OUTPUT-LEN to be only four bytes. The storage requirement, per
665	   cryptographic context, is 144 bytes. These parameters ensure a
666	   forgery probability of no greater than 1/2^30 for each individual
667	   packet. Please see the security considerations section in [KBHHKR00]
668	   and the references therein for a more detailed discussion.

670	8. SRTP Parameters

672	   The SRTP-WINDOW-SIZE is defined to be at least 64 (Section 5).

674	   The currently defined modes are Counter Mode (default), f8 Mode
675	   (Section 6), and the NULL Cipher. The default cipher is AES (Section
676	   6), used with a block- and encryption key size of 128 bits.

678	9. Secure RTCP

680	   Secure RTCP follows the definition of Secure RTP, but defines the
681	   index and IV differently. In order to differentiate these quantities,
682	   we refer to them as the SRTCP index and IV.

684	   SRTCP is defined as a profile of RTCP, and it adds two new fields
685	   to the RTCP packet definition, the SRTCP index and the authentication
686	   tag. Those fields are appended to an RTCP packet in order to form an
687	   equivalent SRTCP packet, so that they follow any other profile-
688	   specific extensions. An SRTCP packet is illustrated in Figure 3.

690	        0                   1                   2                   3
691	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
692	   +-->+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
693	   |   |V=2|P|    RC   |   PT=SR=200   |             length            |
694	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
695	   |   |                         SSRC of sender                        |
696	   | +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
697	   | | |                              ...                              |
698	   | | |                          sender info                          |
699	   | | |                              ...                              |
700	   | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
701	   | | |                              ...                              |
702	   | | |                         report block 1                        |
703	   | | |                              ...                              |
704	   | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
705	   | | |                              ...                              |
706	   | | |                         report block 2                        |
707	   | | |                              ...                              |
708	   | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
709	   | | |                                                               |
710	   | | |                              ...                              |
711	   | | |                                                               |
712	   | | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
713	   | | |                              ...                              |
714	   | | |                  profile-specific extensions                  |
715	   | | |                              ...                              |
716	   | +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
717	   | | |                           SRTCP index                         |
718	   +-|>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
719	   | | |                              ...                              |
720	   | | |                       authentication tag                      |
721	   | | |                              ...                              |
722	   | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
723	   | |
724	   | +-- Encrypted Portion
725	   +---- Authenticated Portion

727	   Figure 3.  The format of a Secure RTCP packet, after Section 6.3.1 of
728	   [SCFJ96]. In this case, the underlying RTCP packet is a sender report
729	   packet; the SRTCP format is identical for other RTCP packet types.

731	   The SRTCP index is a 32-bit value. As we allow both encrypted and
732	   non-encrypted packets belonging to the same flow (see discussion
733	   below), indices with their most significant bit set to '1' are
734	   reserved for encrypted packets, and indices with most significant bit
735	   set to '0' are used for non-encrypted packets. With this restriction,
736	   the remaining 31 bits form a counter that is set to zero before the
737	   first SRTCP packet is sent, and is incremented by one after each
738	   SRTCP packet is sent. Except for differences in the most significant
739	   bit, SRTCP indices form a strictly increasing sequence. The index is
740	   explicitly included in each packet, in contrast to the 'implicit'
741	   index approach used for SRTP.
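
	   The layout of the 32-bit SRTCP index can be illustrated with the
	   sketch below. The helper names are hypothetical and are not part
	   of this specification.

```python
ENCRYPTED_BIT = 1 << 31  # most significant bit of the 32-bit index


def make_srtcp_index(counter: int, encrypted: bool) -> int:
    """Combine the 31-bit packet counter with the encryption bit."""
    if not 0 <= counter < 2**31:
        raise ValueError("counter must fit in 31 bits")
    return (ENCRYPTED_BIT | counter) if encrypted else counter


def is_encrypted(index: int) -> bool:
    """True if the most significant bit marks an encrypted packet."""
    return bool(index & ENCRYPTED_BIT)


def counter_of(index: int) -> int:
    """Return the 31-bit counter with the encryption bit masked off."""
    return index & (ENCRYPTED_BIT - 1)
```

	   A sender would start the counter at zero and add one after each
	   SRTCP packet, choosing the encryption bit for each packet.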

743	   SRTCP packet processing is identical to that of SRTP packet
744	   processing, with the following changes:

746	   * SRTCP replay protection is as defined in Section 5, but using the
747	   SRTCP index as the index i.

749	   * SRTCP encryption is as defined in Section 6, but using the
750	   definition of the SRTCP Encrypted Portion as defined in this
751	   section, using the SRTCP index as the index i, and the IV as defined
752	   in this section.

754	   * The SRTCP authentication tag is defined as in Section 7, but
755	   applying the UMAC function to the Authenticated Portion of the SRTCP
756	   packet as defined in this section, and using the SRTCP index as the
757	   index i.

759	   * SRTCP decryption is performed as in Section 6, but only if the
760	   SRTCP index has its most significant bit equal to 1. If so, the
761	   encrypted portion is decrypted, using the SRTCP index as the index i,
762	   and the IV as defined in this section. In case the most significant
763	   bit of the index is 0, the payload is simply copied.

765	   The IV for ciphers using 128-bit block size is formed in the
766	   following way:

768	   IV = SRTCP index || FLAG || PT || 0..0 || SSRC

770	   where PT (Payload Type, 8 bit), and SSRC (Synchronization Source, 32
771	   bits) are taken from the first header in the RTCP compound packet.
772	   SRTCP index is the added 32-bit index to the packet. A pad of 48
773	   zeros is inserted between the PT and the SSRC.

775	   FLAG is an 8-bit value used to signal additional information.
776	   Currently, the only value defined (for RTCP) is FLAG = 00..01. The
777	   value 0..0 is reserved for RTP and MUST NOT be used for RTCP. This
778	   allows the same key to be used for related RTP and RTCP flows, since
779	   the IVs remain distinct.
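
	   As a sketch only, the 128-bit SRTCP IV layout can be packed as
	   shown below. Network (big-endian) byte order and the names
	   srtcp_iv and FLAG_RTCP are assumptions made for this example.

```python
import struct

FLAG_RTCP = 0x01  # FLAG = 00..01, the only value defined for RTCP


def srtcp_iv(srtcp_index: int, pt: int, ssrc: int,
             flag: int = FLAG_RTCP) -> bytes:
    """Pack SRTCP index || FLAG || PT || 48 zero bits || SSRC."""
    # "!" selects network byte order without padding:
    # I = 32-bit index, B = 8-bit FLAG, B = 8-bit PT,
    # 6x = six zero pad bytes (48 bits), I = 32-bit SSRC.
    return struct.pack("!IBB6xI", srtcp_index, flag, pt, ssrc)
```

	   For a sender report (PT = 200), the second 32-bit word of the
	   resulting IV begins with the octets 0x01 and 0xC8.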

781	   Then this IV is treated in the same way as defined in Section 6,
782	   according to the chosen encryption mode.

784	   The encryption prefix (Section 6.1 of [SCFJ96]), which is a random
785	   32-bit quantity intended to improve privacy, SHOULD NOT be used. This
786	   is because SRTP encryption uses an additive stream cipher, and thus
787	   the prefix offers no benefit.

789	   The maximum number of SRTCP packets is limited to 2^31 =
790	   2,147,483,648. The last RTCP packet MUST contain an RTCP BYE. SRTCP
791	   senders MUST send an RTCP BYE in the final packet, if the maximum
792	   number of SRTCP packets is reached. Similarly, SRTCP receivers MUST
793	   act as though the last RTCP packet included a BYE, even if no BYE was
794	   included in the packet, if the maximum number of SRTCP packets is
795	   reached.

797	   Authentication MUST be used for RTCP, since it is the control
798	   protocol (e.g., it has a BYE packet). Moreover, the cost of RTCP
799	   authentication is not of the same order as that of RTP, since the
800	   session bandwidth allocated to RTCP is recommended to be 5%. However,
801	   when adding authentication to RTCP, the overhead in bandwidth SHOULD
802	   be considered (it will be more than 5%).

804	   It is allowed to split a compound RTCP packet into two lower-layer
805	   packets, one to be encrypted and one to be sent in the clear, as
806	   described in Section 9.1 of [SCFJ96].

808	   Encryption/non-encryption is signaled by the most significant bit of
809	   the SRTCP index as described above.

811	10. Rationale

813	   SRTP achieves high throughput and low packet expansion by using fast
814	   stream ciphers for encryption, an implicit index for synchronization,
815	   and universal hash functions for message authentication. SRTP is
816	   a suitable choice for the most general scenario, and also fits the
817	   most demanding one, conversational multimedia over wireless, as it
818	   has the necessary robustness properties.

820	   Only a single header extension may be appended to the RTP data
821	   header, so the use of a header extension for SRTP was avoided. SRTP
822	   and SRTCP are defined as profiles of RTP and RTCP, respectively.

824	10.1 Synchronization

826	   RTP runs over unreliable transport. Thus, maintaining synchronization
827	   of the cryptographic context between the sender and receiver is a
828	   conspicuous challenge. Because of the requirement to minimize packet
829	   expansion, no explicit sequencing information should be added. RTP
830	   packets contain two fields for synchronization purposes, the
831	   timestamp and the sequence number. The timestamp field could be used
832	   for cryptographic synchronization in some circumstances. However,
833	   in general this field is not appropriate for such use. From [SCFJ96]:

835	   Several consecutive RTP packets may have equal timestamps if they are
836	   (logically) generated at once, e.g., belong to the same video frame.
837	   Consecutive RTP packets may contain timestamps that are not monotonic
838	   if the data is not transmitted in the order it was sampled, as in the
839	   case of MPEG interpolated video frames.

841	   The RTP sequence number might be directly used as a unique identifier
842	   for SRTP packets. However, it has only sixteen bits, which would
843	   limit the duration of an SRTP security association to only 65,536
844	   packets, and would therefore require periodic rekeying.

846	   The 'implicit index' approach works as long as the reorder and loss
847	   of the packets is not too great. In particular, 32,768 packets would
848	   need to be lost, or a packet would need to be 32,768 packets out of
849	   sequence in order for synchronization to be lost. Such drastic loss
850	   or reorder is likely to disrupt the RTP application itself.
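
	   One possible way for a receiver to recover the implicit index is
	   sketched below. It is an assumption of this example, not a
	   definition taken from this memo: the receiver picks the 48-bit
	   index (32-bit rollover counter followed by the 16-bit sequence
	   number) that lies closest to the highest index seen so far. The
	   function name estimate_index is hypothetical.

```python
def estimate_index(seq: int, highest_index: int) -> int:
    """Guess the 48-bit index of a packet from its 16-bit sequence number.

    highest_index is the largest index accepted so far. The candidate
    rollover counters are the current one and its two neighbours.
    """
    if not 0 <= seq < 2**16:
        raise ValueError("sequence number must fit in 16 bits")
    roc = highest_index >> 16
    candidates = [
        ((roc + d) << 16) | seq
        for d in (-1, 0, 1)
        if 0 <= roc + d < 2**32
    ]
    # Choose the candidate nearest to the highest index seen so far.
    return min(candidates, key=lambda i: abs(i - highest_index))
```

	   With such a rule, the guess is correct whenever the packet is
	   fewer than 32,768 positions away from the highest index seen.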

852	   When a participant joins an SRTP session while that session is in
853	   progress, the entire cryptographic context except for the replay
854	   list is sent to that participant. This step is essential for
855	   security. See also Section 12.

857	10.2 Replay Protection

859	   Replay protection is undoubtedly important for multimedia data, and
860	   SHOULD be provided. Otherwise, it would be possible for an adversary
861	   to perform simple manipulations on data that subverted security. For
862	   example, in a voice application, the phrase "yes" could be
863	   substituted for "no" if replay protection were not present. However,
864	   there are certain scenarios, e.g. conversational multimedia, where it
865	   may be difficult to perform this kind of attack. Moreover, to be
866	   useful, replay protection needs to be based on an authentication
867	   mechanism (i.e., authentication of the sequence number of the RTP
868	   header), and this has a cost when cellular links are involved on the
869	   path.

871	10.3 Source Origin Authentication
872	   'Source origin authentication' was listed as an option in the
873	   security goals, not because it is not an appropriate goal, but
874	   because it may not be achievable. This goal may be desirable in some
875	   circumstances, such as multicast environments in which the sender
876	   is more trusted than the receivers, or when translators or mixers
877	   (Section 2.3 of [SCFJ96]) are used. However, it is not clear that
878	   this capability can always be provided, as mixers and translators can
879	   change the payload. Furthermore, this security service essentially
880	   requires digital signatures (at least if collusion resistance is
881	   required [BF00]).

883	   Two examples of the multicast scenario mentioned above are a
884	   military commander addressing his troops over RTP, and financial
885	   market data sent over RTP. In these situations, a 'stream signing'
886	   method can provide digital signatures on the entire RTP packets. An
887	   extensive literature on such methods is developing, and it is
888	   reasonable to expect that one of these methods can be reduced to
889	   practice and specified for RTP. This suggests that it should be left
890	   as an option in the current specification. A future effort can define
891	   a stream signing method as an authentication type for RTP, which
892	   could be used as a replacement for a message integrity transform.

894	   Examples of the mixer and translator scenarios include a translator
895	   re-encoding data at a lower rate or in a different encoding, and a
896	   mixer combining the audio streams of multiple speakers in a
897	   teleconference. In these cases, it is not clear that meaningful
898	   source origin authentication is possible, as the data that is
899	   received is not the same as the data that is signed. If the
900	   translator is trusted by the receivers, then it could sign or re-sign
901	   the data streams, but this scenario may not be prevalent. It may be
902	   possible to devise a signing scheme that authenticates the source but
903	   not the content (enabling the receivers to know that "John is one of
904	   the people talking", but not providing authentication on who said
905	   what) by signing the concatenation of the Contributing source (CSRC)
906	   field and some sequencing information (e.g., a timestamp or sequence
907	   number), but such schemes require synchronization between the
908	   senders. This synchronization is not required by the RTP protocol
909	   itself, and may be difficult or impossible to arrange.

911	10.4 Choice of Encryption Transform

913	   When adopting a block cipher mode to produce keystreams, the central
914	   ingredient is the block cipher which is its core. As far as modern
915	   cryptology knows, the security basically stands (and falls) with the
916	   security of the block cipher. This means that if a weakness is found,
917	   replacing the block cipher with a new one will most likely remedy the
918	   security problems. We define AES (Rijndael) [AES] as default block
919	   cipher, as it is widely believed to be secure.

921	11. Security Considerations

923	   The security of UMAC is well understood, and is described in
924	   [KBHHKR00].

926	   Additive ciphers do not provide any security service other than
927	   privacy. In particular, they do not provide message authentication
928	   (see [RK99] or [S96] for a discussion of this security service).
929	   However, SRTP uses a message authentication code to provide that
930	   security service.

932	   By using 'seekable' stream ciphers, SRTP avoids the denial of service
933	   attacks that are possible on stream ciphers that lack this property
934	   (these attacks are described in Section 3.4 of [B96]).

936	   No bit of keystream in an additive stream cipher should ever be used
937	   to encrypt multiple distinct plaintext bits. Such keystream reuse
938	   (jokingly called a 'two-time pad' system by cryptographers), can
939	   seriously compromise security. The NSA's VENONA project [C99]
940	   provides a historical example of such a compromise. In SRTP, a 'two-
941	   time pad' is avoided by requiring the key or the IV to be unique.

943	   An SSRC is mapped to a unique crypto context. Multiple crypto
944	   contexts may contain identical keys; in this case, each context
945	   together with data from the RTP header MUST produce a unique IV
946	   (which is typically assured by plugging the unique SSRC in the IV).

948	   If manual keying is used, two different cryptographic contexts might
949	   accidentally use the same encryption key with non-negligible
950	   probability, through manual error or procedural inadequacies. Thus,
951	   manual keying SHOULD NOT be used for SRTP (or SRTCP).

953	   An additive stream cipher is vulnerable to attacks that use
954	   statistical knowledge about the plaintext source to enable key
955	   collision and time-memory tradeoff attacks [MF00,H80,Bi96]. These
956	   attacks take advantage of commonalities among plaintexts, and provide
957	   a way for a cryptanalyst to amortize the computational effort of
958	   decryption over many keys, thus reducing the effective key size of
959	   the cipher. A detailed analysis of these attacks and their
960	   applicability to the encryption of Internet traffic is provided in
961	   [MF00]. In summary, the effective key size of SRTP when used in a
962	   security system in which m distinct keys are used, is equal to the
963	   key size of the cipher less the logarithm (base two) of m. Protection
964	   against such attacks can be provided simply by increasing the size of
965	   the keys used, which here can be accomplished by the use of the
966	   "salting key".

968	   In order to provide an effective key size of n bits in a deployment
969	   in which 2^m SRTP/SRTCP cryptographic contexts will be created, the
970	   true key size will need to be n+m bits. The value of m SHOULD be 32
971	   bits for networks with 50,000 connections (fully meshed networks
972	   with up to 200 devices), and SHOULD be 64 bits for networks with
973	   49e+12 connections (fully meshed networks with up to 7,000,000
974	   devices). These choices of m ensure that key collision attacks
975	   amortized over a ten year period offer no advantage over exhaustive
976	   search, when new SRTP keys are established for every connection
977	   every hour (note that such an attack requires the storage of all
978	   network traffic over the ten year period). These choices will suffice
979	   for many networks, though SRTP deployments with more stringent
980	   security requirements will need to make a detailed assessment of
981	   those requirements with respect to the attacks described in [MF00].
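
	   The arithmetic behind these figures can be checked with a short
	   sketch. The function names and the 365-day year are assumptions
	   of this example.

```python
import math

HOURS_IN_TEN_YEARS = 24 * 365 * 10  # one new key per connection per hour


def extra_key_bits(connections: int,
                   rekeys: int = HOURS_IN_TEN_YEARS) -> float:
    """log2 of the number of cryptographic contexts created in total."""
    return math.log2(connections * rekeys)


def effective_key_bits(key_bits: int, contexts: int) -> float:
    """Key size of the cipher less the base-two logarithm of the
    number of distinct keys in use."""
    return key_bits - math.log2(contexts)


# 50,000 connections rekeyed hourly for ten years.
m_small_network = extra_key_bits(50_000)

# A 128-bit key shared across 2^32 contexts without any salting key.
unsalted = effective_key_bits(128, 2**32)
```

	   For the 50,000-connection case the result is close to 32 bits,
	   and a 128-bit key used across 2^32 contexts without a salting key
	   has an effective size of 96 bits.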

983	   Implementations SHOULD use keys that are as large as possible. Please
984	   note that in many cases increasing the key size of a cipher does not
985	   affect the throughput of that cipher.

987	   It is an important point that the m bits of 'extra' key provided to
988	   thwart these attacks need not be private. In jurisdictions with
989	   mandated limits on the length of a secret key, the additional key
990	   bits could be made public. This is because those bits are
991	   functionally equivalent to the 'salt' that is used to protect
992	   passwords from dictionary attacks. The fact that the 'extra' key bits
993	   are distinct for many different keys defeats the key collision and
994	   time-memory tradeoff attacks by reducing the number of keys over
995	   which cryptanalytic computation can be amortized.

997	   Note that other security protocols which use additive ciphers for the
998	   encryption of Internet traffic (e.g., SSL, TLS, SSH, IPSEC) are also
999	   vulnerable to the attacks described in [MF00]. Those attacks are
1000	   generic to additive encryption of redundant plaintext, and are not
1001	   particular to SRTP.

1003	11.1 SSRC collision

1005	   Assume that two or more communication parties use the same key.
1006	   Though RTP implements an SSRC collision detection mechanism, it is
1007	   impossible to guarantee that two parties do not accidentally choose
1008	   the same SSRC and send a few packets before the collision is
1009	   detected. In a very unfortunate case, the IV formation in Section 4.1
1010	   could in fact make the keystreams collide, yielding a 'two-time pad'.
1011	   This is probably a bigger problem in the case of group communication
1012	   when a single group key is desired. See also the administrative
1013	   issues with SSRC collisions discussed in Section 12.

1015	11.2. Confidentiality of the RTP Payload

1017	   It is important to be aware that, as with any stream cipher, the
1018	   exact length of the payload is revealed by the encryption. This means
1019	   that it may be possible to deduce certain "formatting bits" of the
1020	   payload, as the length of the CODEC output might vary due to certain
1021	   parameter settings etc. This, in turn, implies that the corresponding
1022	   bits of the keystream can be deduced. However, if the stream cipher
1023	   is secure, knowledge of a few bits of the keystream will not aid an
1024	   attacker in predicting the following keystream bits. Thus, the
1025	   payload length (and information deducible from this) will leak, but
1026	   nothing else.

1028	11.3. Confidentiality of the RTP Header

1030	   With our proposal, RTP headers are sent in the clear to allow for
1031	   header compression. This means that data such as payload type,
1032	   synchronization source identifier, and timestamp are available to an
1033	   eavesdropper. Moreover, since RTP allows for future extensions of
1034	   headers, we cannot foresee what kind of possibly sensitive
1035	   information might also be "leaked".

1037	   Our proposal is a low-cost method, which allows header compression to
1038	   reduce bandwidth. It is up to the endpoints' policies to decide about
1039	   the security scheme to employ. If the header compression is omitted,
1040	   other solutions might be applicable, e.g. [sc-esp]. In other words,
1041	   we provide a solution that works in the most general scenario, even
1042	   in the most demanding one (like conversational multimedia over low-
1043	   bandwidth, unreliable media). Of course the solution will then also
1044	   work in less restricted environments, but we suggest that if one
1045	   really needs to protect headers, and is allowed to do so by the
1046	   surrounding environment, one should also look at alternatives. In
1047	   addition, we strongly recommend the use of profiles to select the
1048	   right trade-off for the required level of security.

1050	11.4 Integrity of RTP headers

1052	   The IV formation in Section 4.1, which depends on the RTP header,
1053	   provides an 'implicit' authentication of that header, which is useful
1054	   when the authentication option is not present. This is because any
1055	   attacks which modify the header of such a packet will cause the SRTP
1056	   receiver to use an incorrect IV in the decryption step, with the
1057	   result that the decrypted RTP payload will be essentially random.

1059	12. Multicast and Multi-unicast

1061	   The scheme described here can be used in case a single, unique key (a
1062	   single pair, encryption group key and authentication group key) is to
1063	   be used inside a multimedia session, for low-complexity key
1064	   management. However, it then becomes necessary to have a way to
1065	   assure that each SSRC is unique inside that multimedia session. This
1066	   is a light and feasible solution in several scenarios, e.g. one
1067	   sender only, streaming, and unicast.

1069	   In multicast and multi-unicast, to use the same group key for the
1070	   multimedia session, there should be a way to guarantee uniqueness of
1071	   the SSRC before sending starts. Otherwise, the triggering of the
1072	   anti-collision mechanism will require a change in the SSRCs of the
1073	   parties that happened to have the same SSRC, making it difficult to
1074	   point to the right context.

1076	   The remaining problem is how to address the context database after
1077	   the anti-collision algorithm has changed the SSRCs. Section 3.3
1078	   defines the SSRC and transport address of a packet as selectors
1079	   into the database. With UDP, an unchanged transport address is a
1080	   good indicator that a collision, followed by anti-collision
1081	   triggering, has happened. The receiver can therefore try decryptions
1082	   until an RTCP message confirms the new SSRC on that transport
1083	   address, and then update the database selector triplet.
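
   One possible shape for such a database is sketched below. The memo
   does not prescribe a data structure, so the class and method names,
   the (SSRC, address, port) layout of the triplet, and the example
   addresses are all assumptions. The sketch covers only the
   bookkeeping: finding the contexts bound to a transport address so
   that they can be tried in turn, and re-keying an entry once RTCP has
   confirmed the new SSRC.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Selector = Tuple[int, str, int]  # (SSRC, address, port): assumed layout


@dataclass
class CryptoContext:
    """Hypothetical per-stream state; a real context holds much more."""
    session_key: bytes
    rollover_counter: int = 0


class ContextDatabase:
    def __init__(self) -> None:
        self._contexts: Dict[Selector, CryptoContext] = {}

    def add(self, ssrc: int, addr: str, port: int, ctx: CryptoContext) -> None:
        self._contexts[(ssrc, addr, port)] = ctx

    def lookup(self, ssrc: int, addr: str, port: int) -> Optional[CryptoContext]:
        return self._contexts.get((ssrc, addr, port))

    def candidates(self, addr: str, port: int) -> List[Tuple[int, CryptoContext]]:
        """Contexts already bound to this transport address, to be tried in
        turn when a packet arrives with an SSRC that has no entry."""
        return [(s, c) for (s, a, p), c in self._contexts.items()
                if (a, p) == (addr, port)]

    def rebind(self, old_ssrc: int, new_ssrc: int, addr: str, port: int) -> None:
        """Move a context to its new SSRC once RTCP has confirmed the change."""
        self._contexts[(new_ssrc, addr, port)] = self._contexts.pop(
            (old_ssrc, addr, port))


db = ContextDatabase()
db.add(0x5c621599, "192.0.2.10", 5004, CryptoContext(b"\x00" * 16))

# A packet arrives from the same transport address with an unknown SSRC.
new_ssrc = 0x1234abcd
missing = db.lookup(new_ssrc, "192.0.2.10", 5004)   # no entry yet
to_try = db.candidates("192.0.2.10", 5004)          # trial-decrypt with these

# After an RTCP message confirms the SSRC change, update the selector.
db.rebind(0x5c621599, new_ssrc, "192.0.2.10", 5004)
```

   Keying the table on the full triplet keeps the normal lookup a
   single dictionary access; the linear scan in candidates() is only
   needed in the rare collision case.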

1085	   If SSRC uniqueness within the multimedia session cannot be
1086	   guaranteed (e.g., for large groups), then a unique key per sender
1087	   should be used. The additional requirement is that SSRCs be unique
1088	   per sender, which appears to be feasible. However, the same
1089	   consideration regarding triggering of the anti-collision algorithm
1090	   applies.

1092	13. Acknowledgements

1094	   The authors would like to thank Brian Weis and Magnus Westerlund for
1095	   their reviews and comments.

1097	14. Authors' Addresses

1099	    Questions and comments about this memo can be directed to:

1101	      David A. McGrew
1102	      David Oran
1103	      Cisco Systems, Inc.
1104	      San Jose, CA 95134-1706 USA
1105	      mcgrew@cisco.com, oran@cisco.com

1107	      Rolf Blom
1108	      Elisabetta Carrara
1109	      Mats Naslund
1110	      Karl Norrman
1111	      Ericsson Research
1112	      {rolf.blom, elisabetta.carrara, mats.naslund,
1113	      karl.norrman}@era.ericsson.se

1115	15. References

1117	   [AES] NIST, "Advanced Encryption Standard (AES)",
1118	   http://csrc.nist.gov/encryption/aes/

1120	   [B97]   Bradner, S., "Key words for use in RFCs to Indicate
1121	   Requirement Levels", RFC 2119, March 1997.

1123	   [BCNN00]  Blom, R., Carrara, E., Naslund, M., and Norrman, K.,
1124	   "Conversational Multimedia Security in 3G Networks", Internet Draft,
1125	   November 2000, <draft-blom-cmsec-3g-00.txt>.

1127	   [BF00] Boneh, D., and Franklin, M., "Message Authentication in a
1128	   Multicast Environment", the Proceedings of the Seventh Annual
1129	   Workshop on Selected Areas in Cryptography (SAC 2000), Springer-
1130	   Verlag.

1132	   [C99]   Crowell, W. P., "Introduction to the VENONA Project",
1133	   http://www.nsa.gov:8080/docs/venona/index.html.

1135	   [ES3D] ETSI SAGE 3GPP Standard Algorithms Task Force, "Security
1136	   Algorithms Group of Experts (SAGE); General Report on the Design,
1137	   Specification and Evaluation of 3GPP Standard Confidentiality and
1138	   Integrity Algorithms", Public report, Draft Version 1.0, Dec 1999.

1140	   [ES3E] ETSI SAGE 3GPP Standard Algorithms Task Force, "Security
1141	   Algorithms Group of Experts (SAGE) Report on the Evaluation of 3GPP
1142	   Standard Confidentiality and Integrity Algorithms", Public report,
1143	   Draft Version 1.0, Dec 1999.

1145	   [HAC]  Menezes, A., Van Oorschot, P., and Vanstone, S., "Handbook of
1146	   Applied Cryptography", CRC Press, 1997, ISBN 0-8493-8523-7.

1148	   [H80]   Hellman, M. E., "A cryptanalytic time-memory trade-off", IEEE
1149	   Transactions on Information Theory, July 1980, pp. 401-406.

1151	   [KA98a] Kent, S., and R. Atkinson, "Security Architecture for IP",
1152	   RFC 2401, November 1998.

1154	   [KBHHKR00] Krovetz, T., Black, J., Halevi, S., Hevia, A., Krawczyk,
1155	   H., Rogaway, P., "UMAC: Message Authentication Code using Universal
1156	   Hashing", Internet Draft, October 2000, <draft-krovetz-umac-01.txt>.

1158	   [LRW00] Lipmaa, H., Rogaway, P., and Wagner, D., "Comments to NIST
1159	   Concerning AES Modes of Operation: CTR-Mode Encryption", NIST
1160	   Workshop on AES Modes of Operation,
1161	   http://csrc.nist.gov/encryption/aes/modes/lipmaa-ctr.pdf

1163	   [M00]   McGrew, D., "Segmented Integer Counter Mode: Specification
1164	   and Rationale", NIST Workshop on AES Modes of Operation,
1165	   http://www.mindspring.com/~dmcgrew/sic-mode.pdf

1167	   [MF00]  McGrew, D., and Fluhrer, S., "Attacks on Encryption of
1168	   Redundant Plaintext and Implications on Internet Security", the
1169	   Proceedings of the Seventh Annual Workshop on Selected Areas in
1170	   Cryptography (SAC 2000), Springer-Verlag.

1172	   [MF00b] McGrew, D., and Fluhrer, S., "The Stream Cipher LEVIATHAN:
1173	   Specification and Supporting Documentation", Submission to the New
1174	   European Schemes for Signatures, Integrity, and Encryption (NESSIE)
1175	   Process, October 2000, http://www.cryptonessie.org/.

1177	   [R92]   Rueppel, R., "Stream Ciphers", Chapter 2 of Simmons, G.,
1178	   "Contemporary Cryptology: the Science of Information Integrity,"
1179	   1992, IEEE Press.

1181	   [RC94]  Rogaway, P. and Coppersmith, D., "A Software-Optimized
1182	   Encryption Algorithm", Proceedings of the 1994 Fast Software
1183	   Encryption Workshop, Lecture Notes In Computer Science, Volume 809,
1184	   Springer-Verlag, 1994, pp. 56-63.

1186	   [RC98]  Rogaway, P. and Coppersmith, D., "A Software-Optimized
1187	   Encryption Algorithm", Journal of Cryptology, Volume 11, Number 4,
1188	   Springer-Verlag, 1998, Pages 273-287.  Also available on the Internet
1189	   at http://www.cs.ucdavis.edu/~rogaway/papers/seal-abstract.html.

1191	   [RK99]  Rescorla, E., and Korver, B., "Guidelines for Writing RFC
1192	   Text on Security Considerations", <draft-rescorla-sec-cons-00.txt>.

1194	   [S96]   Schneier, B. "Applied Cryptography: Protocols, Algorithms,
1195	   and Source Code in C", Wiley, 1996.

1197	   [sc-esp] McGrew, D., Fluhrer, S., Peyravian, M.,  "The Stream Cipher
1198	   Encapsulating Security Payload", Internet Draft, July 2000.

1200	   [SCFJ96] Schulzrinne, H., Casner, S., Frederick, R., Jacobson, V.,
1201	   "RTP: A Transport Protocol for Real-Time Applications", RFC 1889,
1202	   January 1996.

1204	Appendix

1206	A. Test vectors

1208	   We include in the following some test vectors for f8-AES.

1210	   key:
1211	     234829008467be186c3de14aae72d62c

1213	   salting key || 0x555... :
1214	     32f2870d555555555555555555555555

1216	   AES-internal expanded key:
1217	     23482900 8467be18 6c3de14a ae72d62c
1218	     62be58e4 e6d9e6fc 8ae407b6 2496d19a
1219	     f080e0d2 1659062e 9cbd0198 b82bd002
1220	     05f097be 13a99190 8f149008 373f400a
1221	     78f9f024 6b5061b4 e444f1bc d37bb1b6
1222	     4931be42 2261dff6 c6252e4a 155e9ffc
1223	     31ea0e1b 138bd1ed d5aeffa7 c0f0605b
1224	     fd3a37a1 eeb1e64c 3b1f19eb fbef79b0
1225	     a28cd0ae 4c3d36e2 77222f09 8ccd56b9
1226	     043d86ca 4800b028 3f229f21 b3efc998
1227	     ede0c0a7 a5e0708f 9ac2efae 292d2636

1229	   AES-internal expanded salting key || 555...:
1230	     32f2870d 55555555 55555555 55555555
1231	     cf0e7bf1 9a5b2ea4 cf0e7bf1 9a5b2ea4
1232	     f43f3249 6e641ced a16a671c 3b3149b8
1233	     37045eab 59604246 f80a255a c33b6ce2
1234	     dd54c685 843484c3 7c3ea199 bf05cd7b
1235	     a6e9e78d 22dd634e 5ee3c2d7 e1e60fac
1236	     089f7675 2a42153b 74a1d7ec 9547d840
1237	     e8fe7f5f c2bc6a64 b61dbd88 235a65c8
1238	     d6b39779 140ffd1d a2124095 8148255d
1239	     9f8cdb75 8b832668 299166fd a8d943a0
1240	     9c963bb7 17151ddf 3e847b22 965d3882

1242	   RTP-packet header fields:
1243	     version      = 2
1244	     padding      = 0
1245	     extension    = 0
1246	     CSRC count   = 0
1247	     marker bit   = 0
1248	     payload type = 6e
1249	     sequence no. = 5cba
1250	     timestamp    = 50681de5
1251	     SSRC         = 5c621599

1253	   Data from cryptographic context:
1254	     FLAG             = 0
1255	     Rollover counter = d462564a

1257	   IV:
1258	     d462564a006e5cba50681de55c621599
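
   The IV above can be reproduced from the listed fields with a short
   sketch. The octet layout used here (rollover counter, one zero
   octet, marker bit and payload type, sequence number, timestamp,
   SSRC) is inferred from this test vector alone; Section 4.1 remains
   the authoritative definition, and the function name is an
   assumption.

```python
def build_iv(roc: int, marker: int, payload_type: int,
             seq: int, timestamp: int, ssrc: int) -> bytes:
    """Assemble the 128-bit IV in the layout inferred from the test vector."""
    m_pt = ((marker & 0x1) << 7) | (payload_type & 0x7F)
    return (roc.to_bytes(4, "big")          # rollover counter
            + b"\x00"                       # zero octet (see Section 4.1)
            + bytes([m_pt])                 # marker bit and payload type
            + seq.to_bytes(2, "big")        # sequence number
            + timestamp.to_bytes(4, "big")  # timestamp
            + ssrc.to_bytes(4, "big"))      # SSRC


iv = build_iv(roc=0xd462564a, marker=0, payload_type=0x6e,
              seq=0x5cba, timestamp=0x50681de5, ssrc=0x5c621599)
print(iv.hex())  # d462564a006e5cba50681de55c621599
```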

1260	   IV':
1261	     4fee844eedb458a3e2b0c7ed43888cc1

1263	   Encryption of bits 0 to 127:

1265	   ct: 0
1266	   S(-1)                   : 00000000000000000000000000000000
1267	   S(-1) XOR IV'           : 4fee844eedb458a3e2b0c7ed43888cc1
1268	   S(-1) XOR IV' XOR ct    : 4fee844eedb458a3e2b0c7ed43888cc1
1269	   plain text P[0..127]    : 6e915f07cd6f1c0d44afaab4961c7d31
1270	   final keystream S(0)    : b2d3b3d7e16092de379e33b350582e63
1271	   cipher text C[0..127]   : dc42ecd02c0f8ed373319907c6445352

1273	   Encryption of bits 128 to 255:

1275	   ct: 1
1276	   S(0)                    : b2d3b3d7e16092de379e33b350582e63
1277	   S(0) XOR IV'            : fd3d37990cd4ca7dd52ef45e13d0a2a2
1278	   S(0) XOR IV' XOR ct     : fd3d37990cd4ca7dd52ef45e13d0a2a3
1279	   plain text P[128..255]  : 7b9daad84352a6d4bcdf501a560832a0
1280	   final keystream S(1)    : b1ce287dc53c1975de3d7d0500f780ba
1281	   cipher text C[128..255] : ca5382a5866ebfa162e22d1f56ffb21a
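
   The XOR steps in the two blocks above can be checked mechanically.
   The sketch below does not perform the AES encryption itself: the
   keystream blocks S(0) and S(1) are taken from the vectors as given,
   and only the feedback input (S(ct-1) XOR IV' XOR ct) and the final
   XOR with the plaintext are recomputed. The helper names are
   assumptions, and ct is assumed to be encoded as a 128-bit big-endian
   integer, which matches the values listed.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


IV_PRIME = bytes.fromhex("4fee844eedb458a3e2b0c7ed43888cc1")

# (ct, S(ct-1), plaintext block, S(ct), expected ciphertext block)
VECTORS = [
    (0,
     "00000000000000000000000000000000",
     "6e915f07cd6f1c0d44afaab4961c7d31",
     "b2d3b3d7e16092de379e33b350582e63",
     "dc42ecd02c0f8ed373319907c6445352"),
    (1,
     "b2d3b3d7e16092de379e33b350582e63",
     "7b9daad84352a6d4bcdf501a560832a0",
     "b1ce287dc53c1975de3d7d0500f780ba",
     "ca5382a5866ebfa162e22d1f56ffb21a"),
]


def cipher_input(s_prev: bytes, ct: int) -> bytes:
    """Value fed to the block cipher: S(ct-1) XOR IV' XOR ct."""
    return xor_bytes(xor_bytes(s_prev, IV_PRIME), ct.to_bytes(16, "big"))


results = []
for ct, s_prev, plain, s, expected in VECTORS:
    block_in = cipher_input(bytes.fromhex(s_prev), ct)
    computed = xor_bytes(bytes.fromhex(plain), bytes.fromhex(s))
    assert computed.hex() == expected  # ciphertext = plaintext XOR S(ct)
    results.append((block_in.hex(), computed.hex()))
```

   Each entry of results pairs the block cipher input with the
   recomputed ciphertext, so the first element of every pair can be
   compared against the "XOR IV' XOR ct" line of the corresponding
   block.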

1283	   ------------------------------------------------------------

1285	   This Internet-Draft expires in July 2001.