Network Working Group                                            R. Blom
Internet-Draft                                                  Y. Cheng
Intended status: Standards Track                             F. Lindholm
Expires: May 7, 2009                                         J. Mattsson
                                                              M. Naslund
                                                              K. Norrman
                                                       Ericsson Research
                                                        November 3, 2008

                         SRTP Store and Forward
                draft-mattsson-srtp-store-and-forward-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 7, 2009.

Abstract

The Secure Real-time Transport Protocol (SRTP) was designed to allow simple and efficient protection of RTP. To provide this, encryption and authentication of media and control signaling are tightly coupled to the RTP session, and to the information in the RTP header. Hence, in general it is not possible to perform store and forward of protected media.

This document gives, based on a use case analysis, requirements that SRTP and new SRTP transforms need to satisfy in order to allow secure store-and-forward operation. A first proposal on how to introduce the needed new functionality and transforms in SRTP is also presented.

Table of Contents

   1. Introduction
   2. Terminology
   3. Selected SRTP background facts
   4. Use Cases
      4.1. Trust Model and Assumptions
      4.2. Media Distribution Use Cases
         4.2.1. Streaming Pre-encrypted Media
         4.2.2. Video on Demand
         4.2.3. Caching Protected Media in the Network
         4.2.4. Recording Encrypted Media at Home
      4.3. Answering Machine use case
         4.3.1. Storing/Caching Encrypted Media
         4.3.2. Transport Protection
         4.3.3. Playback of Media Stream
         4.3.4. Multiple Callers
      4.4. Use Case: Centralized Conferencing
   5. Requirements
   6. Solution Proposal
      6.1. Overview
      6.2. SRTP Cryptographic Contexts
      6.3. New Transforms
         6.3.1. Media Protection Transform
         6.3.2. Replay Protection
         6.3.3. Key Derivation
   7. Commented Example Usage
   8. Implications on SRTP
   9. Security Considerations
      9.1. Media protection Transform
      9.2. Replay Protection
   10. Acknowledgements
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   Appendix A. Draft Compound Transform Details
      A.1. Processing
         A.1.1. Sender
         A.1.2. Middlebox
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

The Secure Real-time Transport Protocol (SRTP) [RFC3711] is a profile of the Real-time Transport Protocol (RTP) [RFC3550], and it provides confidentiality, message authentication, and replay protection to both RTP and RTCP (Real-time Transport Control Protocol).

SRTP was designed to protect real-time point-to-point communications and is, as presently defined, not aimed at communication solutions that include non-trusted store-and-forward middleboxes, i.e., middleboxes that should not have access to cleartext media, but still should have access to other data in order to retransmit media according to RTP standard procedures.

Media in need of e2e protection could e.g. be real-time voice and video information/media clips for internal use by personnel in enterprises or authorities. There are also multimedia telephony applications utilizing mail-boxes and other store and forward functions that need e2e protection. E2e protection could also be needed to protect subscribed media like commercial-free radio and television that is distributed over the Internet.

A typical use case is store-and-forward media distribution systems.
Many of those systems require that media is confidentiality protected end-to-end (e2e) between the media source and the media rendering device, in order to prevent illegitimate media intercept or sharing. At the same time the communication should be hop-by-hop (hbh) protected to prevent malicious users from performing denial of service attacks by sending bogus data to store-and-forward middleboxes.

Methods like the Packet-switched Streaming Service (PSS) [3GPP.26.234] exhibit the properties needed for secure store-and-forward operation, but they are part of larger frameworks tailored for very specific use cases. Thus it would be desirable to be able to offer use of SRTP as a general lightweight mechanism to achieve this type of protection.

Trying to use SRTP with store-and-forward middleboxes reveals two main problems. The first problem is due to the fact that the incoming and outgoing RTP streams in general are independent; received RTP packets cannot just be stored and later retransmitted. This in particular implies that SRTP with currently defined transforms cannot be applied. For details see Section 3.

It should be noted that store-and-forward of media in most cases requires that side information is available when retransmitting received media. Such side information is e.g. RTP timestamp information, and the needed side information may come from the RTP header, RTCP messages and session definition data.

The second problem is that to provide both e2e and hbh protection, two independent security contexts with associated protection mechanisms have to coexist; a feature unavailable in SRTP as currently specified.

To resolve these problems, SRTP needs enhancements that in an efficient and coherent way support store-and-forward use cases.
The objective of this document is to explore use cases for an SRTP store-and-forward solution, derive associated requirements, and present and discuss an approach for a solution.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Definitions of terms and notation will, unless otherwise indicated, be as defined in [RFC3711].

The term authentication will be used to denote message authentication and message integrity protection.

By RTP transport protection or simply transport protection, we mean protection (confidentiality, authentication, etc.) of streamed RTP packets. This is provided by SRTP according to [RFC3711].

By media protection, we similarly mean protection of the application payloads carried in RTP. SRTP provides media protection, but only during transport (see above).

3. Selected SRTP background facts

SRTP as currently specified has the features described below, which explain why it cannot be directly used in store-and-forward applications. They also indicate how an SRTP store-and-forward solution could be designed.

o  All current SRTP transforms use the RTP header as input. AES-CTR uses the SSRC and the packet index to calculate the IV (Initialization Vector), AES-f8 uses even more header parameters, and HMAC-SHA1 authenticates the full RTP header. The SSRC is typically determined by the key management protocol and the packet index includes the RTP sequence number, which should be randomly chosen according to RTP [RFC3550]. All this means that there are no standard compliant ways to receive SRTP protected packets in one stream and later just retransmit the packets as they were received.

o  Even if the SRTP relevant RTP parameters like SSRC and the SRTP index could be determined beforehand for the retransmission stream, it would not allow a client to randomly seek in a stream without renegotiating the session, as it would lead to misalignment between the packet index used for streaming and the packet index used by SRTP at the originator. If the user jumps to a different part of the stream, it is impossible to continue increasing the RTP sequence number stepwise while at the same time keeping it equal to the sequence number needed for decryption. Jumping backward (e.g. media rewind) would cause even more problems as the retransmitted packets would be discarded by the SRTP replay protection.

o  The encryption key and the authentication key are both derived from the same master key in SRTP, see Figure 1. This means that a client which is able to derive e.g. the authentication key will also always have access to the encryption key, making it impossible to use, say, the session encr_key for e2e protection and the session auth_key for hbh protection.

                    Packet index -------+
                                        |
                                        v
   +------------+                 +------------+ Session encr_key
   |            | Master key      |            +------------------>
   | External   +---------------->| Key        | Session auth_key
   | Key        |                 | Derivation +------------------>
   | Management +---------------->|            | Session salt_key
   |            | Master salt     |            +------------------>
   +------------+                 +------------+

                     Figure 1: SRTP key derivation

4. Use Cases

The use cases below were chosen to illustrate media streaming scenarios where the current SRTP specification [RFC3711] does not provide sufficient functionality. These use cases provide context and general rationale for the requirements presented in Section 5. Note that the necessary key distribution and media session set-up is out of scope for this document and will thus not be discussed in any detail in the use cases below.

4.1. Trust Model and Assumptions

The trust model assumed in this document includes two parties who wish to communicate securely via one or more honest but curious middleboxes. This means that the communicating parties trust the middlebox to deliver the media as expected, but they do not trust it with cleartext data.

In the use cases below there is no example of multiple (sequential) middleboxes, but it is a natural generalization and it seems warranted to cover this case as well.

4.2. Media Distribution Use Cases

4.2.1. Streaming Pre-encrypted Media

A content creator wants to distribute high value content to clients. The content provider distributes the media via a streaming server which should not have access to cleartext media, typically because it is not trusted by the content creator.

In one scenario the content creator streams the media to the streaming server where the media is stored in a protected format. In another scenario the protected media may be delivered to the streaming server via e.g. file transfer. These use cases correspond to use of pre-encryption in media distribution. In both cases protected media is available in the streaming server for later transmission to different clients.

Even in cases when the streaming server could be trusted with cleartext data there are reasons why one would like to avoid performing encryption in the streaming server itself. One reason to use pre-encryption is to relieve the streaming server of the task of encrypting the media, especially if the same media is used several times, e.g. in video on demand. If the media is pre-encrypted the streaming server only needs to add integrity protection to the encrypted media before streaming it to the clients.

Clients are trusted by the content creator and have access to the encryption key.
When a client receives a packet, the authenticity is checked using a security context shared with the streaming server and the decryption is performed using a security context shared with the content creator.

4.2.2. Video on Demand

Some protected content is offered as video on demand where users can watch selected video clips at any time. The media is unicast and the clients are offered random seek functionality which allows them to quickly jump to any part of the video. Other features offered may be rendering with speed translation as in fast forward and slow motion rendering. These features can be used to skip parts of the video or jump backward to see interesting parts again. The problem here is jumping back and forth and performing rendering speed translations in an e2e protected media stream.

4.2.3. Caching Protected Media in the Network

High value encrypted media (e.g. Internet Protocol Television (IPTV), and radio) is broadcast in a network. Only clients trusted by the content creator have access to the encryption key. A network node is caching the media, but is not trusted by the content creator and has therefore no access to the encryption keys. A client that missed the beginning of a program might stream the media from the network cache instead of listening to the broadcast.

Due to the trust model where the content creator only trusts the clients, the media needs to be e2e protected. But the media also needs to be hbh integrity protected to protect against DoS attacks.

4.2.4. Recording Encrypted Media at Home

High value encrypted media (e.g. IPTV, and radio) is broadcast in a network. Only clients trusted by the content creator have access to the encryption key. A user is recording the media on a HDD (Hard Disk Drive), but does not yet have a license or has a license that does not allow cleartext copying.
The media is therefore stored in protected format on the HDD. There is however a strong need for the HDD to be able to check the integrity of the media before it is stored. Otherwise a DoS attack may fill the HDD with garbage.

4.3. Answering Machine use case

4.3.1. Storing/Caching Encrypted Media

Operators commonly provide an answering machine service to their customers. In this case the communicating parties (the caller and the callee) may not wish to disclose the media to any other party, and hence want to apply encryption between each other. This requires that they are able to establish a shared key; how that is accomplished is out of scope for this document.

The answering machine acts as a store and forward middlebox, which has to store encrypted data and re-transmit it to the callee. The answering machine may act as a streaming server when sending the data to the callee, and will then not use the exact same RTP headers on the outgoing SRTP traffic as were used on the incoming SRTP traffic. SRTP as specified in [RFC3711] will not work in this case, since parts of the RTP header are input to the encryption/authentication transforms.

An alternative way of forwarding the recorded media from the answering machine to the callee could be file transfer, sending the recorded media in e.g. the same format as was used to store it. Such forwarding would not be according to SRTP, but would still yield end-to-end protection of the media. Note however, that decryption and rendering would be similar to part of an enhanced SRTP solution.

4.3.2. Transport Protection

To avoid the answering machine being filled up with bogus data, it is necessary for the answering machine to authenticate the sender of the traffic, and further, to verify the authenticity of the incoming traffic.
This poses a problem for SRTP as of [RFC3711] in that the message authentication requires a session key shared with the answering machine, but the encryption key shall as discussed above not be available to it. This implies that there is a need for two independent security contexts, one end-to-end and one hop-by-hop.

When the callee retrieves the media from the answering machine, message authentication is also beneficial. There are two possibilities. Since the answering machine is trusted not to actively behave maliciously, it may be sufficient to provide message authentication between the answering machine and the callee. Also here it would be necessary to have a separation between the e2e protection and the hbh protection. A second option is that authentication is applied from the caller to the callee. But if the authentication is applied in that way, the answering machine will not be able to verify the integrity of the incoming traffic from the caller.

It is of course also possible that message authentication is desired for any combination of endpoints, i.e. between the caller and the callee, between the caller and the answering machine, and between the answering machine and the callee.

4.3.3. Playback of Media Stream

When a user listens to the messages stored on the answering machine, it is useful to be able to rewind and/or fast forward in the media stream. For SRTP as of [RFC3711] this is not possible. The reason for that is that even if the same payloads can be re-inserted in the stream by the answering machine, the RTP sequence number is steadily increasing on a per packet basis. Since the synchronization of the encryption transforms is based on the RTP sequence number, the decryption will fail. In addition, message authentication will fail since the authentication according to [RFC3711] shall cover the header of the RTP packet.
This implies that the payload and the media have to be protected by a mechanism which is independent of parameters used in the transport protocol.

4.3.4. Multiple Callers

Several messages may be left on the answering machine, received in different sessions and possibly from different callers. The result of this is that different keys were used to encrypt the media. Depending on how the callee retrieves the messages from the answering machine, different options are possible. One option is to retrieve each message as a separate stream, and in this case a separate session is required per message. Another option is to somehow switch security contexts midstream when the next message starts.

4.4. Use Case: Centralized Conferencing

Another use case is a conference bridge that is not to be trusted with the cleartext media. In this case the conference bridge cannot act as a mixer, but in some cases this may be a reasonable assumption. An example is Push-To-Talk solutions, where only one user at a time is allowed to talk. In this setting, the media may be re-packaged by the conferencing server into RTP packets with different headers compared to the incoming traffic. As described in Section 3, this causes authentication and decryption to fail in SRTP.

5. Requirements

The use cases above show that to enable store and forward in an enhanced SRTP, it has to support the following requirements in an efficient way:

o  Transport independent media protection

   It SHALL be possible to have media protection which is independent of RTP parameters. To allow retransmission of received protected media, a transform for protecting the RTP payload that is independent of RTP transport parameters is needed. The media protection MUST cover both message authentication and confidentiality protection.

o  Media source authentication

   It SHALL be possible to provide source authentication of the media stream.
   In a group setting, source authentication is here meant to ensure that the message originated from a member of the group. This requirement is fulfilled if media has authentication protection in a transport independent manner.

o  Support of playback of protected media streams

   A client SHALL be able to do random seek in a protected media stream.

   Note that as playback functions like retransmission and random seek capability are features in the described use cases, replay protection cannot be required for transport independent media protection.

o  Transport protection

   It SHALL be possible to provide transport protection which is independent of the media protection. The transport protection MUST be able to provide confidentiality, authentication, and replay protection for RTP and at least authentication and replay protection for RTCP. This requirement maps well against SRTP as of [RFC3711].

   Transport protection is also a means to provide replay protection of the media on a hop-by-hop basis.

o  Separation of security contexts

   It MUST be possible to have independent security contexts for the transport independent media protection and the transport protection. This means in particular that there have to be two distinct master keys, one for e2e media protection and one for hbh transport protection.

o  Change of transport independent media protection security context

   It MUST be possible to signal to the receiver the current media protection security context to use. It MUST be possible to change this security context midstream. This is needed to allow single stream multiplexing of e.g. protected media "clips" which were generated using different transport independent media protection security contexts.

The requirements imply that the media protection format has to include a Crypto Context Indicator (CCI) field for robust operation. The CCI can be thought of as a generalized MKI and may be defined to also include all the MKI based functionality defined in [RFC3711].

6. Solution Proposal

6.1. Overview

The stated requirements above seem possible to meet by implementing a few minor additions to SRTP. These additions mainly address new SRTP transforms, introduction of media and transport protection crypto context definitions together with key handling and derivation.

A high level description of the proposed new SRTP functionality is as follows:

The first step is to perform a transport independent media protection operation. The coverage of this transform is the RTP payload only. This operation could preferably be an Authenticated Encryption with Associated Data (AEAD) transform, which allows part of the payload to be sent in plain. The media protection will rely on an explicit IV sequence number (IVSN) which is forwarded in the payload.

After the steps making up the transport independent media protection have been performed, the protection processing proceeds as currently defined by [RFC3711], which results in the addition of the required transport protection.

Keying for transport protection, i.e. the SRTP internal key derivation performed, is the same as described in [RFC3711]. The key derivation function operates on a master key and a master salt where the master key is denoted hbh-key.

The keying for the media protection is defined in an equivalent way, producing keying material for the AEAD transform. The e2e keying material is based on another master key, the e2e-key, which is independent of the hbh-key. Also for the e2e context a master salt is defined. The key derivations used to derive the e2e keying material will also use the key derivation function defined in [RFC3711].
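
The layering described above, where the transport protection of [RFC3711] is added on top of an already media-protected payload, can be illustrated with a minimal sketch. It is not part of the proposal: the function names, the fixed 12-byte RTP header, and the example keys are assumptions made for illustration only. The hbh part uses HMAC-SHA1 with an 80-bit tag computed over the packet followed by the rollover counter (ROC), with NULL transport encryption; the e2e-protected payload is treated as opaque bytes, and MKI handling, replay lists and ROC management are omitted.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # 80-bit tag, the default HMAC-SHA1 tag length in RFC 3711


def rtp_header(seq: int, timestamp: int, ssrc: int, payload_type: int = 96) -> bytes:
    """Minimal 12-byte RTP header: version 2, no padding, extension or CSRCs."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def hbh_protect(header: bytes, e2e_payload: bytes, auth_key: bytes, roc: int) -> bytes:
    """Append the hbh tag; the e2e-protected payload is opaque to this step."""
    packet = header + e2e_payload
    mac = hmac.new(auth_key, packet + struct.pack("!I", roc), hashlib.sha1)
    return packet + mac.digest()[:TAG_LEN]


def hbh_verify(srtp_packet: bytes, auth_key: bytes, roc: int) -> bytes:
    """Check the hbh tag and return the (still e2e-protected) payload."""
    packet, tag = srtp_packet[:-TAG_LEN], srtp_packet[-TAG_LEN:]
    mac = hmac.new(auth_key, packet + struct.pack("!I", roc), hashlib.sha1)
    if not hmac.compare_digest(tag, mac.digest()[:TAG_LEN]):
        raise ValueError("hbh authentication failed")
    return packet[12:]  # strip the fixed-size header assumed above


# Hypothetical session auth keys for the two hops (160 bits each).
k_hbh_in = bytes(range(20))
k_hbh_out = bytes(range(20, 40))
e2e_blob = b"opaque e2e-protected payload"

# Sender -> middlebox: the middlebox verifies and stores without any e2e key.
incoming = hbh_protect(rtp_header(100, 160, 0x1111), e2e_blob, k_hbh_in, roc=0)
stored = hbh_verify(incoming, k_hbh_in, roc=0)

# Middlebox -> receiver: new header and new hbh key, same stored payload.
outgoing = hbh_protect(rtp_header(7, 0, 0x2222), stored, k_hbh_out, roc=0)
```

The stored bytes survive a change of sequence number, timestamp and SSRC unchanged, because nothing in the hbh step depends on the content of the e2e-protected payload.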

Note that with the approach taken only the end-points for the media protection will have to implement the new SRTP functionality with a combined media and transport transform including handling of two security contexts. In the following we will denote such a combined transform a Compound Transform. The store and forward middlebox can rely solely on [RFC3711] using already existing functionality for store-and-forward operation, given that the transport transform in the compound transform is equivalent to a transform defined for [RFC3711]. However, there are some practical reasons why also the middlebox needs to have some "knowledge" of the e2e part of the protection, see below.

To summarize: By a compound transform, we mean the combination of a media protection transform according to the suggested AEAD of Section 6.3 (using the e2e key) and one of the defined transforms of [RFC3711] for the hbh part (using the hbh key). The compound transform should be defined in this way to allow an intermediary to re-use a [RFC3711] compliant implementation of SRTP to first receive and then resend the media.

For RTCP the solution principles described for RTP apply. However, the main application for RTCP is to control the traffic over one hop which means that e2e encryption cannot be applied in general. But note that there are RTCP application messages which might benefit from having e2e integrity protection.

6.2. SRTP Cryptographic Contexts

SRTP maintains a cryptographic context, containing master key(s), cryptographic transforms, etc., for the associated SRTP session. Exactly how the parameters in the cryptographic context are agreed is out of scope of SRTP and is a session set-up issue. SRTP assumes that a cryptographic context, or rather the master key therein, is shared only between mutually trusted parties.

The SRTP cryptographic context concept is reusable for the proposed solution.
Conceptually, the originator and the intended end-receiver share an "e2e context" while a "hbh context" is shared by an endpoint and an intermediary or by two intermediaries. To comply with the trust model of the use cases above the master key(s) in the e2e context MUST be cryptographically independent of, and MUST NOT be deducible from, the master key of any hbh context. The key management protocol(s) used MUST therefore be able to negotiate keys satisfying these requirements.

The identification of the hbh security context should be as defined in [RFC3711] while the used e2e media security context either is implicitly identified in the session set-up or its identification relies on the proposed crypto context indicator (CCI).

A sender will use two cryptographic contexts: an e2e context used for payload protection to the end-receiver, and a hbh context used to secure the SRTP transport to the (first) intermediary. Similarly, the end-receiver will use two contexts. An intermediary node, however, will only use one standard SRTP context.

In other words, an e2e context is used to achieve transport independent media protection as required in Section 5, and a hbh context similarly is used to achieve transport protection.

For both e2e and hbh contexts, it is assumed that SRTP cryptographic context parameters, such as master key and salt (if needed) are included. From these, SRTP session keys/salts are derived similarly to [RFC3711] (see Figure 2).
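
As an illustration of deriving separate session keys for the e2e and hbh contexts, the sketch below applies the label-based derivation input of [RFC3711] (x = (label || r) XOR master_salt, with the SRTP labels 0x00, 0x01 and 0x02) to two independent master keys. Note the loud assumption: the pseudo-random function here is a stand-in built from HMAC-SHA256, chosen only to keep the example self-contained; [RFC3711] specifies AES in counter mode for this step, so the output values are not interoperable. All names and example keys are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

# SRTP key derivation labels from RFC 3711.
LABEL_ENCR, LABEL_AUTH, LABEL_SALT = 0x00, 0x01, 0x02


@dataclass(frozen=True)
class CryptoContext:
    name: str
    master_key: bytes   # 128 bits
    master_salt: bytes  # 112 bits


def kdf_input(label: int, r: int, master_salt: bytes) -> bytes:
    """x = (label || r) XOR master_salt, as a 112-bit big-endian value."""
    key_id = (label << 48) | (r & 0xFFFFFFFFFFFF)  # 8-bit label, 48-bit r
    x = key_id ^ int.from_bytes(master_salt, "big")
    return x.to_bytes(14, "big")


def standin_prf(master_key: bytes, x: bytes, n: int) -> bytes:
    """STAND-IN for the AES-CM PRF of RFC 3711 (assumption, not interoperable)."""
    return hmac.new(master_key, x, hashlib.sha256).digest()[:n]


def derive_session_keys(ctx: CryptoContext, r: int = 0) -> dict:
    """Derive encr/auth/salt session keys with the RFC 3711 default lengths."""
    sizes = {"encr_key": (LABEL_ENCR, 16),
             "auth_key": (LABEL_AUTH, 20),
             "salt_key": (LABEL_SALT, 14)}
    return {name: standin_prf(ctx.master_key,
                              kdf_input(label, r, ctx.master_salt), n)
            for name, (label, n) in sizes.items()}


# Two independent contexts with hypothetical example keys.
e2e = CryptoContext("e2e", bytes(range(16)), bytes(range(14)))
hbh = CryptoContext("hbh", bytes(range(16, 32)), bytes(range(14, 28)))

e2e_keys = derive_session_keys(e2e)
hbh_keys = derive_session_keys(hbh)
```

Because each key set is a function of its own master key and salt only, a middlebox holding the hbh context learns nothing that helps it compute the e2e session keys, which is the separation the trust model requires.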

          e2e_context (payload protection)
        <----------------------------------->
   +---+               +---+                 +---+
   | S |               | M |                 | R |
   +---+               +---+                 +---+
     <--------------->       <----------------->
       hbh_context1             hbh_context2
             ^                        ^
             |                        |
             +- transport protection -+

     Figure 2: Context sharing (Sender, Middlebox, Receiver)

If several senders' payloads are multiplexed within the same SRTP stream from a server to a receiver (as discussed in Section 4.3.4) there may be need for the receiver to switch between e2e contexts in "midstream". This can be implemented using a mechanism similar to the SRTP MKI field in the e2e context (what is referred to as CCI above). The hbh context would, however, not need any change but could rely on an MKI field according to the current definition in [RFC3711].

6.3. New Transforms

As indicated above the new transform will be a media protection transform combined with a transport protection transform where the transport protection transform equals a transform as specified in [RFC3711]. Thus, here we will only describe and discuss the media protection part.

We propose that the media protection part of the new transform is defined as an AEAD transform. That is, both confidentiality and authenticity are provided by the same transform which also would allow part of the payload to be forwarded in plain but still be integrity protected. The possibility to have part of the media payload forwarded in plain can be essential to enable simplifications in rendering functionality.

If for some reason only encryption or integrity protection is needed, it is in many cases easy to see how to separate the encryption part from the authentication part and handle them as separate transforms.

6.3.1. Media Protection Transform

The new media protection transform is proposed to be AES-GCM as defined in [GCM] and using AES as blockcipher, which allows reuse of most of the functionality in existing SRTP implementations.

AES-GCM uses a 128-bit IV and here it is proposed that the IV is constructed from a session salt, a nonce (signaled out of band before the streaming begins), and an IV sequence number, the IVSN, included in each SRTP packet. The exact IV forming function f to use is ffs.

The nonce is added to replace the SSRC that guaranteed stream-uniqueness of the IV, and the IVSN replaces the packet index that guaranteed packet uniqueness.

                             Session                            Crypto
                           (encr.) Key                         ctxt ind.
                                |                                  |
                                V                                  |
                            +-------+                              |
                        A   |       |                              |
     RTP Payload      +---->|       |-----------------------+      |
   +---+-----------+  |     |  GCM  |--------------+        |      |
   | A |     P     |--+     |       |-----+        |        |      |
   |   |           |  |     |       |     |        |        |      |
   +---+-----------+  |     |       |     V        V        V      V
                      | P   |       |   +---+-------------+---+--+---+
                      +---->|       |   | A |  Enc( P )   |TAG|IV|CCI|
                            |       |   |   |             |   |SN|   |
                            +-------+   +---+-------------+---+--+---+
                                ^       Protected RTP payload   ^
                                | IV                            |
                                |                               |
                             +-----+                            |
                   Nonce --->|  f  |<- Session Salt             |
                             |     |                            |
                             +-----+                            |
                                ^                               |
                                |                               |
                                +-------------------------------+
                                |
                             +-----+
                             | IV  |
                             | SN  |
                             +-----+

                 Figure 3: Media protection transform

The media payload processing is illustrated in Figure 3. Note that the processing incurs message expansion. The original payload is divided into two parts, the part A which only will be authenticated and the part P which will be authenticated and encrypted. The IVSN is a counter which increases by one for each handled payload and it is concatenated to the protected payload. The authentication tag, TAG, is calculated over the plaintext part A and the encrypted part Enc( P ). The session key used is here assumed to be the Session encr_key derived from the e2e-key (see Section 6.3.3).
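
The protected payload layout of Figure 3 (A, Enc( P ), TAG, IVSN, CCI) can be sketched as a simple pack/parse pair. The field sizes below are assumptions chosen for illustration, since the draft leaves them open: a 16-byte tag, a 4-byte IVSN and a 1-byte CCI. The length of the plaintext part A is assumed to be known from the e2e context. No encryption is performed here; the ciphertext and tag are taken as given.

```python
from dataclasses import dataclass

# Assumed field sizes (hypothetical; not fixed by the draft).
TAG_LEN = 16   # full 128-bit GCM tag
IVSN_LEN = 4   # IV sequence number
CCI_LEN = 1    # crypto context indicator
TRAILER_LEN = TAG_LEN + IVSN_LEN + CCI_LEN


@dataclass(frozen=True)
class ProtectedPayload:
    plain: bytes       # part A: authenticated only
    ciphertext: bytes  # Enc(P): authenticated and encrypted
    tag: bytes         # TAG over A and Enc(P)
    ivsn: int          # per-payload counter used to form the IV
    cci: int           # selects the e2e crypto context


def pack(p: ProtectedPayload) -> bytes:
    """Serialize as A || Enc(P) || TAG || IVSN || CCI."""
    if len(p.tag) != TAG_LEN:
        raise ValueError("unexpected tag length")
    return (p.plain + p.ciphertext + p.tag
            + p.ivsn.to_bytes(IVSN_LEN, "big")
            + p.cci.to_bytes(CCI_LEN, "big"))


def parse(data: bytes, plain_len: int) -> ProtectedPayload:
    """Split a protected payload; plain_len comes from the e2e context."""
    if len(data) < plain_len + TRAILER_LEN:
        raise ValueError("payload too short")
    body, trailer = data[:-TRAILER_LEN], data[-TRAILER_LEN:]
    tag = trailer[:TAG_LEN]
    ivsn = int.from_bytes(trailer[TAG_LEN:TAG_LEN + IVSN_LEN], "big")
    cci = int.from_bytes(trailer[TAG_LEN + IVSN_LEN:], "big")
    return ProtectedPayload(body[:plain_len], body[plain_len:], tag, ivsn, cci)


example = ProtectedPayload(plain=b"\x01\x02", ciphertext=b"\xaa" * 20,
                           tag=b"\x55" * TAG_LEN, ivsn=42, cci=3)
wire = pack(example)
```

Parsing from the end of the payload means a receiver can read the CCI and IVSN before it knows anything else about the payload, which matches their role in selecting the e2e context and forming the IV.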
   The transform may also include a Crypto Context Identifier, CCI,
   used to identify the e2e crypto context in use.  The details of this
   field are ffs, but it may be defined to also include MKI
   functionality.

   On the receiver side, the CCI and the IVSN are extracted from the
   received payload.  They are used to select the e2e crypto context
   and to form the IV, after which the integrity tag is checked and the
   original RTP payload is retrieved.

   In Appendix A, a draft description of the processing steps of a
   compound transform is given.

6.3.2.  Replay Protection

   When the RTP data is hbh transport protected between server and
   receiver, replay protection on the transport level is provided, as
   the hbh protection offers the same security features as [RFC3711].
   As mentioned, it is assumed that the server is trusted not to
   attempt replay of data on the media level unless the user requests
   it; thus, this is in line with the trust model.  The IVSN used in
   the media protection functionality offers the possibility to
   implement replay protection on the application level if an
   application requires it.

6.3.3.  Key Derivation

   Session key derivation (and optional key refresh) for the hbh
   context is performed as in [RFC3711] and is based on the 48-bit SRTP
   index.  Session key derivation (and optional key refresh) for the
   e2e context is also performed as in [RFC3711], but it is still open
   whether the functionality for autonomous rekeying needs to be
   included.  If it is included, it would be based on the IVSN instead
   of the packet index.

7.  Commented Example Usage

   In this example use case it is assumed that a sender S wants to send
   e2e protected media to a receiver R via an intermediary M.  For
   this, M will use SRTP with a compound transform as defined above.

   1.  S defines an e2e crypto context and forwards it to R.  S also
       agrees on a hbh crypto context with M.  Each crypto context
       defines a master key, i.e., k_e2e and k_hbh respectively.  Note
       that for store and forward operation, the e2e crypto context has
       to be decided unilaterally by the sender.  The compound
       transform defines standard HMAC-SHA1 for transport
       authentication and NULL encryption, which corresponds to a
       transform defined for [RFC3711].  The e2e protection is
       configured to use AES-GCM as defined above, giving both
       integrity and confidentiality protection.  For the e2e
       protection, S also indicates whether some part of the payload is
       sent in plain by specifying its length.  How these crypto
       contexts are set up (which key management protocol to use etc.)
       is out of scope.  Still, it can be noted that in principle it
       could be done by having, e.g., two MIKEY [RFC3830] exchanges,
       one between S and M and one between S and R.

   2.  S sets up an SRTP session with M to have data forwarded to R.  S
       offers the compound transform to M.  M, knowing that it will act
       as an intermediary, accepts the offer (even though it does not
       have access to the e2e crypto context).  M records that the
       media received is e2e protected.  M also records the identity of
       the compound transform used.

   3.  To receive the media stream, M initiates SRTP as in [RFC3711]
       using a transform equivalent to the hbh transform in the
       compound transform offered by S.

   4.  S starts to transmit SRTP towards M, in effect using GCM and
       k_e2e for e2e media protection and HMAC-SHA1 with k_hbh for
       transport authentication.

   5.  M receives the packets, verifies the hbh authenticity of each
       SRTP packet, and stores the (protected) payloads together with
       relevant side information to be used when the media is
       forwarded.  Note that M would perform exactly the same
       operations when storing unprotected media for later forwarding.

   6.  At some later time, R sets up a session with M to render the
       stored media.  As R contacts a middlebox, R offers use of a
       compound transform, preferably having the same e2e transform as
       was used by S (the e2e transform may be part of the e2e crypto
       context).  If R offers a compound transform that does not use
       the same e2e transform, or if R offers use of standard SRTP, M
       declines the offer and proposes a compatible compound transform.
       A hbh crypto context, which is independent of the first one, is
       agreed between R and M.

   7.  M, knowing that the stored payloads are e2e protected, initiates
       use of SRTP as in [RFC3711], specifying the transform to be used
       to equal the hbh transform in the compound transform agreed
       between R and M.  M then transmits the authenticated media
       stream to R.

   8.  When receiving the SRTP packets from M, R first verifies the
       transport authentication and then checks the e2e media
       authentication and decrypts the payloads to retrieve the
       plaintext media.

8.  Implications on SRTP

   As the SRTP specification allows new transforms, the new transforms
   can be added with only minor implications.  The handling of dual
   security contexts (in the end-points) is, however, a new feature
   which will have to be introduced in SRTP.  The Key Derivation
   Function defined in [RFC3711] can be reused for both the e2e and the
   hbh security contexts.

9.  Security Considerations

9.1.  Media Protection Transform

   Any fixed key-stream output, generated from the same inputs (i.e.,
   key and IV), MUST only be used to encrypt once.  Reusing such a
   key-stream (commonly called a "two-time pad") would almost certainly
   compromise security.  The new AES-GCM based transform accomplishes
   packet-uniqueness by including the IVSN, and stream-uniqueness by
   including a nonce, in the IV formation.  Thus, the nonce MUST be
   unique between all the RTP streams within the same RTP session that
   share the same e2e master key.
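   The key-stream reuse hazard described above can be illustrated with
   a short sketch.  The key-stream here is an arbitrary stand-in value
   chosen for the example, not real AES-GCM output, and the helper
   xor_bytes() is a hypothetical name.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length octet strings."""
    if len(x) != len(y):
        raise ValueError("length mismatch")
    return bytes(a ^ b for a, b in zip(x, y))


# Stand-in key-stream; in the real transform this would be the
# counter-mode output for one (key, IV) pair.
key_stream = bytes.fromhex("5ca1ab1e0ddba11f")
p1 = b"payload1"
p2 = b"PAYLOAD2"

c1 = xor_bytes(p1, key_stream)
c2 = xor_bytes(p2, key_stream)  # same key-stream reused: the error

# The key-stream cancels, leaking the XOR of the two plaintexts.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)
```

   An attacker who knows or guesses one plaintext can therefore recover
   the other without the key, which is why the (key, IV) pair must not
   repeat across packets or streams.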
   Master keys MAY be shared between streams belonging to the same RTP
   session, but it is RECOMMENDED that each stream have its own master
   key.

   With the above conditions fulfilled, the security level of the
   AES-GCM based transform will equal the level offered by [RFC3711].
   Thus the compound transform as a whole will also have the same
   security level as [RFC3711].

9.2.  Replay Protection

   Replay protection is only provided on a hbh basis.  Note that the
   requirements on random seek in the media stream rule out any general
   replay protection mechanism applied on an e2e basis, and that this
   threat falls outside the assumed trust model.  Still, the IVSN
   offers the possibility to implement application-specific replay
   protection mechanisms.

10.  Acknowledgements

   The authors would like to thank Daniel Catrein, Frank Hartung, and
   Magnus Westerlund for their support and valuable comments.

11.  IANA Considerations

   To signal that the new transforms are used, each relevant key
   management protocol needs to register the new transforms, including
   numbering scheme and syntax, with IANA.

12.  References

12.1.  Normative References

   [GCM]      NIST, "Recommendation for Block Cipher Modes of
              Operation: Galois/Counter Mode (GCM) and GMAC", NIST
              SP 800-38D, November 2007.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol
              (SRTP)", RFC 3711, March 2004.

12.2.  Informative References

   [3GPP.26.234]
              3GPP, "Transparent end-to-end Packet-switched Streaming
              Service (PSS); Protocols and codecs", 3GPP TS 26.234
              7.5.0, March 2008.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

Appendix A.  Draft Compound Transform Details

   This informative appendix proposes a way to define the compound
   transform such that it fits well in the SRTP framework.  We assume
   the transform is defined to provide

   o  Integrity and confidentiality e2e (the media part)

   o  Integrity hbh (the transport part)

   Clearly, other combinations are also possible, in the form of any or
   all of the 15 possible (non-trivial) combinations of the security
   services confidentiality and integrity for the hbh as well as the
   e2e part.  However, we feel that integrity and confidentiality on an
   e2e basis combined with hbh integrity will be sufficient in most
   cases.

   As discussed above, we introduce a compound transform, CT.  The CT
   has two parts:

   o  AES-GCM is used to process the RTP payload, providing
      confidentiality and integrity, and is intended for the e2e
      protection.  Conceptually, we view AES-GCM as an encryption
      transform within the SRTP framework.

   o  HMAC-SHA1 is used to provide integrity protection of the entire
      RTP packet (including the AES-GCM encrypted payload and the
      metadata added by AES-GCM) and is intended for the hbh part.

      +--------+---+--------------+---+--+---+----------+
      | RTP    | A |   Enc( P )   |TAG|IV|CCI|   Auth   |
      | header |   |              |   |SN|   | Tag (hbh)|
      +--------+---+--------------+---+--+---+----------+
      ^        ^_____________________________^
      |           RTP e2e Protected Payload  ^
      |_______ Hbh protected RTP packet _____|

                  Figure 4: SRTP protection scope

   Below, we make the natural (and necessary) assumption that the
   sender is made aware (e.g., by session set-up signaling) that the
   media will be delivered/stored in a middlebox.  Similarly, we assume
   the middlebox is aware that it is acting as a middlebox.

A.1.  Processing

   Recall that standard SRTP processing has the following principal
   form.

   1.  The sender determines keys, transforms, and other parameters
       from the cryptographic context.

   2.  The sender encrypts the payload (optional).

   3.  The sender integrity protects the RTP packet, i.e., the header
       and the payload (optional).

   On the receiver side, the order is reversed: integrity verification
   is performed before decryption.

   In the following we describe the processing taking place in the
   sender, the middlebox, and the ultimate receiver, as triggered by
   the use of the CT transform indicated by the cryptographic contexts
   involved.

A.1.1.  Sender

   1.  The sender determines keys and other parameters in the same way
       as standard SRTP does.  The crypto context states that the CT
       transform shall be used.

   2.  The sender applies the AES-GCM part of CT to the payload.
       Conceptually treating AES-GCM as an encryption transform, this
       agrees with the normal SRTP processing.

   3.  The sender next applies the HMAC part of CT.  Again, this agrees
       with adding standard SRTP integrity protection.

A.1.2.  Middlebox

A.1.2.1.  Message Storage

   1.  The middlebox determines keys and other parameters in the same
       way as standard SRTP does.  The crypto context states that the
       CT transform shall be used.  Since the middlebox is aware of its
       role as a (receiving) middlebox, the middlebox configures itself
       to verify integrity but not to decrypt the payload.  To fit with
       the normal SRTP processing, the middlebox may therefore
       conceptually configure itself to perform HMAC integrity
       verification but use NULL decryption as supported by SRTP.

   2.  The middlebox next applies the HMAC part of CT according to
       standard SRTP integrity verification and replay protection.

   3.  The middlebox extracts the payload (which is the AES-GCM output
       as generated by the sender) and stores it for later retrieval by
       the receiver.

A.1.2.2.  Message Delivery

   1.  The middlebox determines keys and other parameters in the same
       way as standard SRTP does.  The crypto context states that the
       CT transform shall be used.  Since the middlebox is aware of its
       role as a (sending) middlebox, the middlebox configures itself
       to not encrypt the payload but only to add integrity protection.

   2.  The middlebox applies NULL encryption to the payload.

   3.  The middlebox applies HMAC integrity.

A.1.3.  Receiver

   The crypto context tells the receiver to use the CT transform and
   the receiver can process accordingly.

Authors' Addresses

   Rolf Blom
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 585 317 07
   Email: rolf.j.blom@ericsson.com


   Yi Cheng
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 568 674 22
   Email: yi.cheng@ericsson.com


   F. Lindholm
   Ericsson AB
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 585 317 05
   Email: fredrik.lindholm@ericsson.com


   John Mattsson
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 404 35 01
   Email: john.mattsson@ericsson.com


   Mats Naslund
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 585 337 39
   Email: mats.naslund@ericsson.com


   Karl Norrman
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 8 404 45 02
   Email: karl.norrman@ericsson.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.