Network Working Group                                     M. Westerlund
Internet-Draft                                                 Ericsson
Intended status: Standards Track                             C. Perkins
Expires: September 13, 2012                       University of Glasgow
                                                         March 12, 2012


        Multiple RTP Sessions on a Single Lower-Layer Transport
           draft-westerlund-avtcore-transport-multiplexing-02

Abstract

   This document specifies how multiple RTP sessions are to be
   multiplexed on the same lower-layer transport, e.g. a UDP flow.  It
   discusses various requirements that have been raised and their
   feasibility, which results in a solution with a certain
   applicability.  A solution is recommended and that solution is
   provided in more detail, including signalling and examples.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Requirements
     3.1.  Support Use of Multiple RTP Sessions
     3.2.  Same SSRC Value in Multiple RTP Sessions
     3.3.  SRTP
     3.4.  Don't Redefine Used Bits
     3.5.  Firewall Friendly
     3.6.  Monitoring and Reporting
     3.7.  Usable Also Over Multicast
     3.8.  Incremental Deployment
   4.  Possible Solutions
     4.1.  Header Extension
     4.2.  Multiplexing Shim
     4.3.  Single Session
     4.4.  Use the SRTP MKI field
     4.5.  Use an Octet in the Padding
     4.6.  Redefine the SSRC field
   5.  Comparison
     5.1.  Support of Multiple RTP Sessions Over Single Transport
     5.2.  Enable Same SSRC Value in Multiple RTP Sessions
       5.2.1.  Avoid SSRC Translation in Gateways/Translation
       5.2.2.  Support Existing Extensions
     5.3.  Ensure SRTP Functions
     5.4.  Don't Redefine Used Bits
     5.5.  Firewall Friendly
     5.6.  Monitoring and Reporting
     5.7.  Usable over Multicast
     5.8.  Incremental Deployment
     5.9.  Summary and Conclusion
   6.  Specification
     6.1.  Shim Layer
     6.2.  Signalling
     6.3.  SRTP Key Management
       6.3.1.  Security Description
       6.3.2.  DTLS-SRTP
       6.3.3.  MIKEY
     6.4.  Examples
       6.4.1.  RTP Packet with Transport Header
       6.4.2.  SDP Offer/Answer example
   7.  Open Issues
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informational References
   Authors' Addresses

1.  Introduction

   There has been renewed interest in a solution that allows multiple
   RTP sessions [RFC3550] to use a single lower-layer transport, such
   as a bi-directional UDP flow.  The main reason is the cost of doing
   NAT/FW traversal for each individual flow.  ICE and other NAT/FW
   traversal solutions are clearly capable of attempting to open
   multiple flows.
   However, there is both an increased risk of failure and an
   increased cost in creating multiple flows.  The increased cost comes
   as a slightly higher delay in establishing the traversal and a
   larger amount of consumed NAT/FW resources.  The latter might be an
   increasing problem in the IPv4 to IPv6 transition period.

   This document draws up some requirements for consideration on how to
   transport multiple RTP sessions over a single lower-layer transport.
   These requirements will have to be weighed against each other, as no
   known solution fulfills the combined set completely.

   A number of possible solutions are then considered and discussed with
   respect to their properties.  Based on that, the authors recommend a
   shim layer variant as the single solution, which is described in
   more detail, including a signalling solution and examples.

2.  Conventions

2.1.  Terminology

   This section defines terminology used in this document.

   Multiplexing:  Unless specifically noted, all mentions of
      multiplexing in this document refer to the multiplexing of
      multiple RTP Sessions on the same lower layer transport.  It is
      important to make this distinction as RTP does contain a number of
      multiplexing points for various purposes, such as media formats
      (Payload Type), media sources (SSRC), and RTP sessions.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Requirements

   This section lists and discusses a number of potential requirements.
   However, it is not difficult to realize that it is possible to set
   requirements that make the set of feasible solutions an empty set.
   It is thus necessary to consider which requirements are essential to
   fulfill and which can be compromised on to arrive at a solution.

3.1.  Support Use of Multiple RTP Sessions

   This may at first glance appear to be an obvious requirement.
   Although the authors are convinced it is a mandatory requirement for
   a solution, it warrants some discussion around the implications of
   not having multiple RTP sessions and instead using a single RTP
   session.

   The usage of multiple RTP sessions allows separation of media
   streams that have different usages or purposes in an RTP based
   application, for example to separate the video of a presenter or
   most important current talker from those of the listeners that not
   all end-points receive.  It also allows separation for different
   processing based on media type, such as audio and video, in
   end-points and central nodes.  This provides the node with the
   knowledge that any SSRC within the session is supposed to be
   processed in a similar or identical way.

   For simpler cases, where the streams within each media type need the
   same processing, it is clearly possible to find other multiplex
   solutions, for example based on the Payload Type and the differences
   in encoding that the payload type can describe.  This may
   nevertheless be insufficient when you get into more advanced usages
   where you have multiple sources of the same media type, but for
   different usages or as alternatives.  For example, when you have one
   set of video sources that shows session participants and another set
   of video sources that shares an application or slides, you likely
   want to separate those streams for various reasons such as control,
   prioritization, QoS, methods for robustification, etc.  In those
   cases, using the RTP session for separation of properties is a
   powerful tool.
   It is a tool whose properties need to be preserved when providing a
   solution for how to use only a single lower-layer transport.

   For more discussion of the usage of RTP sessions versus other
   multiplexing we recommend RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture].

3.2.  Same SSRC Value in Multiple RTP Sessions

   Two different RTP sessions being multiplexed on the same lower layer
   transport need to be able to use the same SSRC value.  This is a
   strong requirement, for two reasons:

   1.  To avoid mandating SSRC assignment rules that are coordinated
       between the sessions.  If the RTP sessions multiplexed together
       must have unique SSRC values, then additional code that works
       between RTP Sessions is needed in the implementations.  This
       raises the bar for implementing this solution.  In addition, if
       a gateway sits between parts of a system using this multiplexing
       and parts that aren't multiplexing, the part that isn't
       multiplexing must also fulfill the requirements on how SSRCs are
       assigned, or the gateway is forced to translate SSRCs.
       Translating SSRCs is actually hard as it requires one to
       understand the semantics of all current and future RTP and RTCP
       extensions.  Otherwise a barrier for deploying new extensions is
       created.

   2.  There are a few RTP extensions that currently rely on being
       able to use the same SSRC in different RTP sessions:

       *  XOR FEC (RFC5109)

       *  RTP Retransmission in session mode (RFC4588)

       *  Certain Layered Coding

3.3.  SRTP

   SRTP [RFC3711] is one of the most commonly used security solutions
   for RTP.  In addition, it is the only one recommended by IETF that is
   integrated into RTP.  This integration has several aspects that need
   to be considered when designing a solution for multiplexing RTP
   sessions on the same lower layer transport.

   Determining Crypto Context:  SRTP first of all needs to know which
      session context a received or to-be-sent packet relates to.  It
      also normally relies on the lower layer transport to identify the
      session.  It uses the MKI, if present, to determine which key set
      is to be used.  Then the SSRC and sequence number are used by most
      crypto suites, including the most common use of AES Counter Mode,
      to actually generate the correct cipher stream.

   Unencrypted Headers:  SRTP has chosen to leave the RTP headers and
      the first two 32-bit words of the first RTCP header unencrypted,
      to allow both header compression and monitoring to work also in
      the presence of encryption.  As these fields are in clear text,
      they are used in most crypto suites for SRTP to determine how to
      protect or recover the plain text.

   It is important here to contrast SRTP against a set of other
   possible protection mechanisms.  DTLS, TLS, and IPsec all protect
   and encapsulate the entire RTP and RTCP packets.  They don't perform
   any partial operations on the RTP and RTCP packets.  Any change that
   is considered to be part of the RTP and RTCP packet is transparent to
   them, but possibly not to SRTP.  Thus the impact on SRTP operations
   must be considered when defining a mechanism.

3.4.  Don't Redefine Used Bits

   As the core of RTP is in use in many systems and has a very large
   deployment base and numerous implementations, changing any of the
   field definitions is highly problematic.  First of all, the
   implementations need to change to support the new semantics.
   Secondly, you get a large transition issue when you have some session
   participants that support the new semantics and some that don't.
   Combining the two behaviors in the same session can force the
   deployment of costly and less than perfect translation devices.

3.5.  Firewall Friendly

   It is desirable that current firewalls will accept the solutions as
   normal RTP packets.  However, in the authors' opinion we can't let
   the firewall stifle invention and evolution of the protocol.  It is
   also necessary to be aware that a change that makes most
   deep-inspecting firewalls consider the packet as not valid RTP/RTCP
   will have a more difficult deployment story.

3.6.  Monitoring and Reporting

   It is desirable that a third party monitor can still operate on the
   multiplexed RTP Sessions.  It is however likely that such monitors
   will require an update to correctly monitor and report on
   multiplexed RTP Sessions.

   Another type of function to consider is packet sniffers and their
   selector filters.  These may be impacted by a change of the fields.
   An observation is that many such systems are usually quite rapidly
   updated to consider new types of standardized or simply common packet
   formats.

3.7.  Usable Also Over Multicast

   It is desirable that a solution can also be used when RTP and RTCP
   packets are sent over multicast, both Any Source Multicast (ASM) and
   Source-Specific Multicast (SSM).  The reason for this requirement is
   to allow a system using RTP to use the same configuration regardless
   of whether the transport is unicast or multicast.  In addition,
   multicast can't be claimed to have an issue with using multiple
   ports, as each multicast group has a complete port space scoped by
   address.

3.8.  Incremental Deployment

   A good solution has the property that in topologies that contain RTP
   mixers or Translators, a single session participant can enable
   multiplexing without having any impact on any other session
   participants.
   Thus a node should be able to take a multiplexed packet and then
   easily send it out with minimal or no modification on another leg of
   the session, where each RTP session is transported over its own
   lower-layer transport.  It should also be as easy to do the reverse
   forwarding operation.

4.  Possible Solutions

   This section looks at a few possible solutions and discusses their
   feasibility.

4.1.  Header Extension

   One proposal is to define an RTP header extension [RFC5285] that
   explicitly enumerates the session identifier in each packet.  This
   proposal has some merits regarding RTP, since it uses an existing
   extension mechanism; it explicitly enumerates the session, allowing
   third parties to associate the packet with a given RTP session; and
   it works with SRTP as currently defined since a header extension is
   by default not encrypted, and is thus readable by the receiving stack
   without needing to guess which session it belongs to and attempt to
   decrypt it.  This approach does, however, conflict with the
   requirement from [RFC5285] that "header extensions using this
   specification MUST only be used for data that can be safely ignored
   by the recipient", since correct processing of the received packet
   depends on using the header extension to demultiplex it to the
   correct RTP session.

   Using a header extension also results in the session ID being in the
   integrity-protected part of the packet.  Thus a translator between
   multiplexed and non-multiplexed domains has the following options:

   1.  to be part of the security context to verify the field

   2.  to be part of the security context to verify the field and remove
       it before forwarding the packet

   3.  to be outside of the security context and leave the header
       extension in the packet.  However, that requires successful
       negotiation of the header extension, but not of the
       functionality, with the receiving end-points.
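
   As an illustration of the receiver-side demultiplexing step that this
   proposal implies, the following minimal Python sketch reads a session
   ID from a one-byte header extension in the general format of
   [RFC5285].  The extension ID value, the single-octet session ID, and
   the function name are assumptions made for this example; none of them
   is defined by this document.

```python
import struct
from typing import Optional

ONE_BYTE_PROFILE = 0xBEDE  # marker for RFC 5285 one-byte header extensions
SESSION_ID_EXT = 1         # hypothetical extension ID; would be set by signalling


def session_id_from_rtp(packet: bytes, ext_id: int = SESSION_ID_EXT) -> Optional[int]:
    """Return the session ID carried in a one-byte header extension, or None."""
    if len(packet) < 12 or packet[0] >> 6 != 2:
        return None                      # too short, or not RTP version 2
    has_extension = bool(packet[0] & 0x10)
    csrc_count = packet[0] & 0x0F
    offset = 12 + 4 * csrc_count         # fixed header plus CSRC list
    if not has_extension or len(packet) < offset + 4:
        return None
    profile, words = struct.unpack_from("!HH", packet, offset)
    if profile != ONE_BYTE_PROFILE:
        return None
    offset += 4
    end = offset + 4 * words             # extension length is in 32-bit words
    if end > len(packet):
        return None
    while offset < end:
        byte = packet[offset]
        if byte == 0:                    # padding byte between elements
            offset += 1
            continue
        elem_id, length = byte >> 4, (byte & 0x0F) + 1
        if elem_id == 15:                # reserved ID: stop parsing
            break
        if offset + 1 + length > end:
            return None                  # element overruns the extension block
        if elem_id == ext_id:
            return packet[offset + 1]    # assumed: first data octet is the session ID
        offset += 1 + length
    return None
```

   A receiver built along these lines would call the function before any
   SRTP processing, which is possible only because the header extension
   is left unencrypted.  A packet without the extension yields None, and
   the receiver then has no session to deliver it to, which is the
   conflict with the "safely ignored" rule of [RFC5285] noted above.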

   The biggest existing hurdle for this solution is that there exists no
   header extension field in the RTCP packets.  This requires defining a
   solution for RTCP that allows carrying the explicit indicator,
   preferably in a position that isn't encrypted by SRTCP.  However, the
   current SRTCP definition does not offer such a position in the
   packet.

   Modifying the RR or SR packets is possible using profile specific
   extensions.  However, that has issues when it comes to deployability
   and in addition any information placed there would end up in the
   encrypted part.

   Another alternative could be to define another RTCP packet type that
   only contains the common header, using the 5 bits in the first byte
   of the common header to carry a session id.  That would allow SRTCP
   to work correctly as long as it accepts this new packet type being
   the first in the packet.  Allowing a non-SR/RR packet as the first
   packet in a compound RTCP packet is also needed if an implementation
   is to support Reduced Size RTCP packets [RFC5506].  The remaining
   downside with this is that all stack implementations supporting
   multiplexing would need to modify their RTCP compound packet rules
   to include this packet type first.  Thus a translator box between
   supporting nodes and non-supporting nodes needs to be in the crypto
   context.

   This solution's per packet overhead is expected to be 64 bits for
   RTCP.  For RTP it is 64 bits if no header extension was otherwise
   used, and an additional 16 bits (short header), or 24 bits plus (if
   needed) padding to the next 32-bit boundary, if other header
   extensions are used.

4.2.  Multiplexing Shim

   This proposal is to prefix or postfix all RTP and RTCP packets with a
   session ID field.  This field would be outside of the normal RTP and
   RTCP packets, thus having no impact on the RTP and RTCP packets and
   their processing.
   An additional step of demultiplexing processing would be added prior
   to RTP stack processing to determine in which RTP session context the
   packet shall be included.  This also has no impact on SRTP/SRTCP as
   the shim layer would be outside of its protection context.  The shim
   layer's session ID is however implicitly integrity protected, as any
   error in the field will result in the packet being placed in the
   wrong or a non-existing context, thus resulting in an integrity
   failure if processed by SRTP/SRTCP.

   This proposal is quite simple to implement in any gateway or
   translating device that goes from a multiplexed to a non-multiplexed
   domain or vice versa, as only an additional field needs to be added
   to or removed from the packet.

   The main downside of this proposal is that it is very likely to
   trigger a firewall response from any deep packet inspection device.
   If the field is prefixed, the RTP fields will not match the
   heuristics (unless the shim is designed to look like an RTP header,
   in which case the payload length is unlikely to match the expected
   value), which is likely to prevent classification of the packet as
   an RTP packet.  If it is postfixed, the packet is likely classified
   as an RTP packet but may not correctly validate if the content
   validation is such that the payload length is expected to match
   certain values.  It is expected that a postfixed shim will be less
   problematic than a prefixed shim in this regard, but we are lacking
   hard data on this.

   This solution's per packet overhead is 1 byte.

4.3.  Single Session

   Given the difficulty of multiplexing several RTP sessions onto a
   single lower-layer transport, it's tempting to send multiple media
   streams in a single RTP session.
   Doing this avoids the need to demultiplex several sessions on a
   single transport, but at the cost of losing the RTP session as a
   separator for different types of streams.  Lacking different RTP
   sessions to demultiplex incoming packets, a receiver will have to dig
   deeper into the packet before determining what to do with it.  Care
   must be taken in that inspection.  For example, you must be careful
   to ensure that each real media source uses its own SSRC in the
   session and that this SSRC doesn't change media type.

   The loss of the RTP session as a separator for different usages or
   purposes would be a minor issue if the only difference between the
   RTP sessions is the media type.  In this case, the application could
   use the Payload Type field to identify the media type.  The loss of
   the RTP Session functionality is however severe if the application
   uses the RTP Session for separating different treatments, contexts,
   etc.  Then you would need additional signalling to bind the different
   sources to groups which can help make the necessary distinctions.

   However, the loss of the RTP session as separator is not the only
   issue with this approach.  The RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture] discusses a number of
   issues in Section 6.7.  These include RTCP bandwidth differences,
   limitations in the number of payload types, media aware RTP mixers
   and interactions with legacy end-points.

   Additional attention should be paid to the last of these aspects.
   In multi-party situations using central nodes there exist some
   difficulties in having a legacy implementation using multiple RTP
   sessions interworking with an end-point having only a single RTP
   session across the central node.  The main reason is the fact that
   the one using a single session with multiple media types has only
   one SSRC space, while the other end-points have multiple spaces.
   Thus translation may have to occur because there are several RTP
   sessions using the same SSRC value.  This brings limitations,
   processing overhead, and the possibility of becoming a deployment
   obstacle for new RTP/RTCP extensions.

   This approach has been proposed in the RTCWeb context in
   [I-D.lennox-rtcweb-rtp-media-type-mux] and
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  These drafts describe how
   to signal multiple media streams multiplexed into a single RTP
   session, and address some of the issues raised here and in Section
   6.7 of the RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture] draft.

   This method has several limitations that limit its usage as a
   solution for providing multiple RTP sessions on the same lower layer
   transport.  However, we acknowledge that there are some uses for
   which this method may be sufficient and which can accept the method's
   limitations and downsides.  The RTCWEB WG has a working assumption to
   support this method.  For more details of this method, see the
   relevant drafts under development.  We do include this method in the
   comparison to provide a more complete picture of the pros and cons of
   this method.

   This solution has no per packet overhead.  The signalling overhead
   will be a different question.

4.4.  Use the SRTP MKI field

   This proposal is to overload the MKI SRTP/SRTCP identifier to not
   only identify a particular crypto context, but also identify the
   actual RTP Session.  This is clearly a misuse of the MKI field;
   however, it appears to have few negative implications.  SRTP already
   supports handling of multiple crypto contexts.

   There are two major downsides with this proposal.  The first is that
   it requires using SRTP/SRTCP to multiplex multiple sessions on a
   single lower layer transport.
   The second is that the session ID parameter needs to be put into the
   various key-management schemes, which must be made to understand that
   the reason for establishing multiple crypto contexts is that they are
   connected to different RTP Sessions.  Considering that SRTP has at
   least three keying mechanisms in use, DTLS-SRTP [RFC5764], Security
   Descriptions [RFC4568], and MIKEY [RFC3830], this is not an
   insignificant amount of work.

   This solution has a 32-bit per packet overhead, but only if the MKI
   was not already used.

4.5.  Use an Octet in the Padding

   The basis of this proposal is to have each RTP packet, and the last
   RTCP packet in a compound (as required by RFC3550), include at least
   2 bytes of padding: one byte for the padding count (the last byte)
   and, just before the padding count, one byte containing the session
   ID.

   This proposal carries the session ID in bytes that have no defined
   value and are intended to be ignored by the receiver.  From that
   perspective it only causes packet expansion that is supported and
   handled by all existing equipment.  If an implementation fails to
   understand that it is required to interpret this padding byte to
   learn the session ID, it will see a mostly coherent RTP session
   except where SSRCs overlap or where the payload types overlap.
   However, reporting on the individual sources or forwarding the RTCP
   RR is not completely without merit.

   There is one downside of this proposal and that has to do with SRTP.
   To be able to determine the crypto context, it is necessary to access
   the encrypted payload of the packet.  Thus, the only mechanism
   available for a receiver to solve this issue is to try the existing
   crypto contexts for any session on the same lower layer transport and
   then use the one where the packet decrypts and verifies correctly.
   Thus for transport flows with many crypto contexts, an attacker could
   simply generate packets that don't validate to force the receiver to
   try all the crypto contexts it has rather than immediately discarding
   the packet as not matching a context.  A receiver can mitigate this
   somewhat by using heuristics based on the RTP header fields to
   determine which context applies for a received packet, but this is
   not a complete solution.

   This solution has a 16-bit per packet overhead.

4.6.  Redefine the SSRC field

   The Rosenberg et al. Internet draft "Multiplexing of Real-Time
   Transport Protocol (RTP) Traffic for Browser based Real-Time
   Communications (RTC)" [I-D.rosenberg-rtcweb-rtpmux] proposed to
   redefine the SSRC field.  This has the advantage of no packet
   expansion.  It also looks like regular RTP.  However, it has a number
   of implications.  First of all, it prevents any RTP functionality
   that requires the same SSRC in multiple RTP sessions.

   Secondly, its interoperability with end-points using multiple RTP
   sessions is problematic.  Such interoperability requires an SSRC
   translator function in the gatewaying node to ensure that the SSRCs
   fulfill the semantic rules of the different domains.  That translator
   is actually far from easy to build, as it needs to understand the
   semantics of all RTP and RTCP extensions that include SSRC/CSRC.
   This is because it is necessary to know when a particular matching
   32-bit pattern is an SSRC field and when the field is just a
   combination of other fields that create the same matching 32-bit
   pattern.  Thus there is a possibility that such a translator becomes
   an obstacle in deploying future RTP/RTCP extensions.  In addition,
   the translator has significant overhead when SRTP is in use, as
   verification that the packet is authentic, decryption, SSRC
   translation, encryption, and finally generation of authentication
   tags are required.
   In addition, the translator must be part of the security context.

   This solution has no per packet overhead.

5.  Comparison

   This section compares the above potential solutions with the
   requirements.  Motivations are provided in addition to a high-level
   rating of whether each solution successfully meets, partially meets,
   or fails to meet each requirement.  At the end, a summary table
   (Figure 1) of the high-level ratings is provided.

5.1.  Support of Multiple RTP Sessions Over Single Transport

   This one is easy to determine.  Only the single session proposal
   fails this requirement as it is not at all designed to meet it.  The
   rest fully support this requirement.  The main question around this
   requirement is how important it is, as discussed in Section 3.1.

5.2.  Enable Same SSRC Value in Multiple RTP Sessions

   Based on the discussion in Section 3.2 two sub-requirements have been
   derived.

5.2.1.  Avoid SSRC Translation in Gateways/Translation

   This sub-requirement is derived from the desire to avoid having
   gateways or translators perform full SSRC translation, in order to
   minimize complexity, avoid the requirement to have gateways in the
   security context, and avoid hindering long-term evolution.  Two of
   the proposals have issues with this, due to their lack of support for
   multiple 32-bit SSRC spaces and the resulting impossibility of having
   the same SSRC value in multiple RTP sessions.  The proposals that
   have these properties and thus are marked as failing are the Single
   Session and Redefine the SSRC field.  The other proposals are all
   successful in meeting this requirement.

5.2.2.  Support Existing Extensions

   The second sub-requirement is how well the proposals support using
   the existing RTP mechanisms.  Here both Single Session and Redefine
   the SSRC field will have clear issues as they cannot support the same
   full 32-bit SSRC value in two different RTP sessions.
   This is clearly an issue for XOR-based FEC. RTP retransmission and
   scalable encoding are lesser issues, as there exist alternatives to
   those mechanisms that work with the structure of these two
   proposals. Thus we give them a fail. The Header Extension gets a
   partial due to the unclear interaction between inserting a header
   extension and these mechanisms.

5.3. Ensure SRTP Functions

   This requirement is about ensuring both secure and efficient usage
   of SRTP. The Octet in Padding field proposal has the problem that
   the receiving end-point cannot determine the intended RTP session
   prior to decryption of the padding field. Thus a catch-22 arises,
   which can only be resolved by trying all session contexts and seeing
   which one decrypts. This causes a security vulnerability, as an
   attacker can inject a packet that does not match any of the session
   contexts. The receiver will then attempt decryption and
   authentication of it using all its session contexts, increasing the
   amount of wasted resources by a factor equal to the number of
   multiplexed sessions. Thus this proposal gets a fail.

   The proposal of Overloading the SRTP MKI field as session identifier
   gets a partial because it cannot use SRTP's key-management
   mechanisms out of the box. It forces the key-management mechanism
   and the SRTP implementations to maintain the MKI-to-RTP session
   bindings in order to function securely and correctly.

   Redefine the SSRC field gets a partial due to its need to modify the
   key-management mechanisms to correctly identify the partial SSRC
   space to which the parameters apply. Similarly, the SRTP
   implementation also needs to be updated to correctly support this
   security context differentiation.

   The header extension based solution gets a less severe partial than
   Redefine the SSRC field and the SRTP MKI field.
   It will, however, have an issue when being gatewayed to a domain
   that does not multiplex multiple RTP sessions over the same
   transport. The gateway will then need to be in the security context
   to be able to add or remove the header extension, as the extension
   is in the part of the packet that is integrity protected by SRTP.

   The remaining two proposals do not affect SRTP mechanisms and thus
   successfully meet this requirement.

5.4. Don't Redefine Used Bits

   This requirement is that RTP and RTCP header fields that have a
   given definition should not be changed, as doing so can cause
   interoperability problems between modified and non-modified
   implementations. This becomes especially problematic in RTP sessions
   used for multi-party sessions.

   Redefine the SSRC field gets a big fail on this, as it redefines the
   SSRC field, a core field in RTP. It has been identified that such a
   change will cause issues: if a modified end-point is connected to a
   non-modified end-point that randomly assigns its SSRCs, as RFC 3550
   expects, those SSRCs will be distributed over different RTP sessions
   at the modified end-point. Other functions that use the SSRC field
   without understanding its additional semantics are also likely to
   have issues.

   Using the SRTP MKI field to identify a session overloads that field
   with double semantics. This likely has minimal negative impact in
   RTP, since it should be possible to have the SRTP stack use the MKI
   field to look up both the security context and the output RTP
   session to which the processed packet belongs. However, this
   redefinition clearly creates issues with the key-management scheme,
   which will have to be modified both to handle this change and to
   deal with the interoperability issues when negotiating its usage.
   This gets a full fail because it makes the problem someone else's,
   namely the RTP implementors.
   Defining an Octet in the Padding field redefines a field whose
   definition is to have zero value and which, according to the
   original semantics, is expected to be ignored by the receiver. Thus
   this is one of the more benign modifications one can make. However,
   it can still cause issues in implementations that unnecessarily
   check the field values, or in firewalls. This is judged to partially
   meet the requirement.

   The Header Extension proposal does not in fact redefine any
   currently used bits in RTP. The header extension would be a
   correctly identified extension with its own definition. However, it
   does redefine a rule on what header extensions are for. The RTCP
   part of the solution would have a more severe impact, as it would
   need to redefine the standard meaning of an RTCP packet header in
   addition to the default compound packet rules. Due to these issues,
   the proposal fails to meet this requirement.

   The Multiplexing Shim and the Single Session both successfully meet
   this requirement.

5.5. Firewall Friendly

   This requirement is difficult to judge, as firewalls differ greatly
   in implementation, in the scope of what they inspect in packets, and
   in the policies they apply. A reasonable goal is to maximize the
   likelihood that rules and policies intended to let RTP media streams
   pass will also let these streams through when multiplexing RTP
   sessions over a single transport. The analysis below shows that no
   solution is truly firewall friendly, and all are judged as partially
   meeting this goal. However, the reasons why a firewall might react
   to the streams are quite different.

   The Single Session and Redefine the SSRC field are likely the least
   suspect solutions from a firewall perspective.
   However, as their transport flows contain multiple SSRCs with
   payloads that likely indicate multiple different media types, they
   are still likely to make a picky firewall block the transport. This
   is especially true for firewalls that take signalling messages into
   account and therefore expect a particular media type in a given
   context. A non-upgraded firewall might in fact produce two different
   contexts with overlapping transport parameters, where each rule
   receives media streams of the other media type that fall outside
   the allowed rule. However, to be clear, if these proposals do not
   get through, none of the others will either, as they all have this
   behavior.

   The Header Extension proposal is potentially problematic for two
   reasons. The first reason, which other proposals share, is that the
   same SSRC value can exist in two RTP sessions over the same
   underlying flow. Anyone tracking the sequence number and timestamp
   will react badly, as the second media stream with the same SSRC
   causes constant jumps back and forth in these fields compared to
   the first stream when packets are transmitted simultaneously for
   both SSRCs. This issue can likely only be solved by having the
   firewalls that track flows also use the session identifier to
   create context. This is possible, as the header extension will be
   in the clear and at the front of the packet. The second issue is
   that the header extension itself may cause the firewall to react,
   especially very picky ones that expect packets with certain media
   types to have certain packet lengths. Such firewalls are not
   compatible with a header extension.

   The Multiplexing Shim shares the issue of multiple flows for the
   same SSRC. Firewalls and deep packet inspection (DPI) also put the
   placement of the shim in question.
   If it is a prefixed shim, it prevents the packets from looking like
   regular IP/UDP/RTP packets and from being correctly classified in
   firewalls and DPI engines. However, if one puts it last, it is
   unlikely that any firewall or DPI engine will ever be able to take
   the session context into account, as it is at the end of the
   packet. This is because many line-rate processing devices only take
   a certain amount of the headers into account.

   The SRTP MKI field is likely the solution that has the fewest
   firewall and DPI issues, after the single RTP session. There is no
   additional suspect field. The only difference from a single RTP
   session in the transport flow is the fact that multiple MKIs are
   guaranteed to be used. However, that may also occur in single RTP
   session usage. Thus the only issues are the one shared with Single
   Session and the one that several RTP media streams may use the same
   SSRC.

   The Octet in the Padding field has, in addition to the issues the
   SRTP MKI field has, the single issue that it redefines something
   that is supposed to be zero into a value. This can potentially
   cause a deeply inspecting firewall to block the flow for fear of a
   covert channel or non-compliance.

5.6. Monitoring and Reporting

   The monitoring and reporting requirement considers several aspects.
   First, how useful is the monitoring one can get from an existing
   legacy monitor? Second, are there any issues in upgrading such
   monitors to handle the selected solution? Third, concerns related
   to packet selector filters and packet sniffers are considered.

   In general, one can expect the proposals that have only a single
   SSRC space to work better with legacy monitors. Thus with both
   Single Session and Redefine the SSRC field, a legacy monitor can
   most likely gather and report data on media flows. The only
   potential issue is that, due to the different media types and clock
   rates, some failures may occur.
   In particular, a third-party monitor may be targeted at a specific
   media type, such as VoIP. Such a monitor will have problems
   processing video packets correctly and will generate VoIP-specific
   metrics for any SSRC sending video. In general, no legacy
   monitoring solution will be able to correctly create the
   sub-contexts that each RTP session has in the solutions without
   being updated to handle the new semantics. Also, when it comes to
   packet filtering and selector filters, fine-grained control can
   only be accomplished by implementing the new semantics. Therefore
   only the Single Session meets this requirement fully.

   Redefine the SSRC field is close to fully meeting the requirement.
   However, because there exists a session structure that is hidden to
   anyone not upgraded to understand the semantics, it only gets a
   partial.

   The other proposals can all have multiple RTP sessions using the
   same SSRC. This will create significant issues for any legacy
   third-party monitor. Only an updated monitor, or for that matter an
   updated packet selector, can pick out the individual media streams
   and their associated RTCP traffic. Thus all these proposals fail to
   meet the requirement.

5.7. Usable over Multicast

   As discussed earlier, the goal of having the option usable also
   over multicast is to remove the need to produce different media
   streams for transport over unicast and multicast. All of the
   proposals successfully meet the requirement.

5.8. Incremental Deployment

   The ability to incrementally deploy the multiplexing of multiple
   RTP sessions over a single transport, especially in the context of
   multi-party sessions, is a great benefit for any of the proposals.
   With it, not all end-point implementations need to be upgraded
   before one starts enabling multiplexing in the central node and in
   the signalling.
   Consider a centralized multi-party application where some
   participants are using multiple transport flows and one wants to
   enable a particular participant to use a single transport to the
   central node. Here one criterion stands out: the possibility to
   have one RTP session per transport on one leg and to multiplex the
   sessions together on the next leg with minimal complexity and
   packet changes. The proposals differ significantly in this respect.

   The Multiplexing Shim has the least overhead for this, as the
   central node or gateway between deployments only needs to add or
   remove the shim identifier and then forward the packet over the
   corresponding transport: either the joint one on the single
   transport side, or the individual one on the multiple transport
   side.

   The SRTP MKI field proposal is almost as good, as the only main
   difference is the need to coordinate the MKIs used on the
   non-multiplexed legs so that there is no overlap between the RTP
   sessions. If there is an overlap, the MKI can be translated in the
   gateway, as SRTP has no integrity protection over the MKI. Thus
   both the Multiplexing Shim and the SRTP MKI field successfully meet
   this requirement.

   The Header Extension supports multiple full 32-bit SSRC spaces and
   can thus handle all the RTP sessions without any need for SSRC
   translation. However, this proposal runs into the problem that the
   gateway needs to be in the security context to be able to add or
   remove the header extension when SRTP is used. In addition to the
   security implications of that, there is a complexity overhead due
   to the need to redo the authentication tags on all RTP/RTCP
   packets. Thus it gets a partial.

   The Octet in the Padding field shares the issues of the header
   extension but has even higher complexity. The reason is that the
   padding field is also encrypted.
   Thus adding or removing it (although removing it may be
   unnecessary) forces the end-point to encrypt at least that byte
   also, and for ciphers that are not stream ciphers, the whole packet
   needs to be re-encrypted. Thus this proposal gets a very weak
   partial.

   The Single Session and Redefine the SSRC field do not allow several
   vanilla RTP sessions to be connected to these proposals. The reason
   is the single 32-bit SSRC space they have. Single Session only has
   one session, and Redefine the SSRC field uses some of the bits as
   session identifier. This forces the gateway to translate the SSRC
   whenever it does not fulfill the rules or semantics of the
   multiplexed side. For Redefine the SSRC field this becomes almost
   constant, as the session identifier part of the SSRC must be the
   same over all SSRCs from the same session. For Single Session it
   may only be needed when there would otherwise be an SSRC collision
   between the sessions. This further assumes that the non-multiplexed
   side never uses any of the RTP mechanisms that require the same
   SSRC in multiple RTP sessions, as they cannot be gatewayed at all.
   Translating an SSRC first of all carries an overhead, which with
   SRTP includes a complete cycle of authenticating, decrypting,
   encrypting, and creating a new authentication tag. In addition,
   SSRC translation could be a deployment obstacle for new RTP/RTCP
   extensions, which must be understood by the translator to be
   correctly translated. Therefore these two proposals fail to meet
   the requirement.

5.9. Summary and Conclusion

   This section contains a summary table of the high-level outcome
   against the different requirements.
   The ID numbers used in the table map to the requirements as
   follows:

   1:   Support multiple RTP sessions over one transport flow

   2:   Enable same SSRC value in multiple RTP sessions

   2.1: Avoid SSRC translation in gateways/translators

   2.2: Support existing extensions

   3:   Ensure SRTP functions

   4:   Don't Redefine used bits

   5:   Firewall Friendly

   6:   Monitoring and Reporting should still function

   7:   Usable over Multicast

   8:   Incremental deployment

   OH:  Overhead in Bytes. + means variable

   ---------------+---+---+---+---+---+---+---+---+---+----
   Solution       | 1 |2.1|2.2| 3 | 4 | 5 | 6 | 7 | 8 | OH
   ---------------+---+---+---+---+---+---+---+---+---+----
   Header Ext.    | S | S | P | P | F | P | F | S | P | 8+
   Multiplex Shim | S | S | S | S | S | P | F | S | S | 1
   Single Session | F | F | F | S | S | P | S | S | F | 0
   SRTP MKI Field | S | S | S | P | F | P | F | S | S | 4
   Padding Field  | S | S | S | F | P | P | F | S | P | 2
   Redefine SSRC  | S | F | F | P | F | P | P | S | F | 0
   ---------------+---+---+---+---+---+---+---+---+---+----

   Figure 1: Summary Table of Evaluation (Successfully (S), Partially
   (P) or Fails (F) to meet requirement)

   Considering these options, the authors recommend that AVTCORE
   standardize a solution based on a postfixed multiplexing field,
   i.e. a shim approach combined with the appropriate signalling, as
   described in Section 4.2.

6. Specification

   This section contains the specification of the solution based on a
   shim, with the explicit session identifier at the end of the
   encapsulated payload.

6.1. Shim Layer

   This solution is based on a shim layer that is inserted in the
   stack between the regular RTP and RTCP packets and the transport
   layer being used by the RTP sessions.
   Thus the layering looks like the following:

          +---------------------+
          |  RTP / RTCP Packet  |
          +---------------------+
          |  Session ID Layer   |
          +---------------------+
          |  Transport layer    |
          +---------------------+

              Stack View with Session ID SHIM

   The above stack is in fact a layered one, as it allows multiple RTP
   sessions to be multiplexed on top of the Session ID shim layer.
   This enables the example presented in Figure 2, where four
   sessions, S1-S4, are sent over the same transport layer and where
   the Session ID layer combines and encapsulates them with the
   session ID on transmission and separates and decapsulates them on
   reception.

          +-------------------+
          | S1 | S2 | S3 | S4 |
          +-------------------+
          | Session ID Layer  |
          +-------------------+
          | Transport layer   |
          +-------------------+

      Figure 2: Multiple RTP Session On Top of Session ID Layer

   The Session ID layer encapsulates one RTP or RTCP packet from a
   given RTP session and postfixes a one-byte Session ID (SID) field
   to the packet. Each RTP session being multiplexed on top of a given
   transport layer is assigned either a single unique SID or a pair of
   unique SIDs in the range 0-255. The reason for assigning a pair of
   SIDs to a given RTP session is to allow RTP sessions that do not
   support "Multiplexing RTP Data and Control Packets on a Single
   Port" [RFC5761] to still be able to use a single 5-tuple. The
   reason for supporting this extra functionality is that RTP and RTCP
   multiplexing based on the payload type/packet type fields enforces
   certain restrictions on the RTP sessions. These restrictions may
   not be acceptable. As this solution does not have these
   restrictions, performing RTP and RTCP multiplexing in this way has
   benefits.

   Each Session ID value space is scoped by the underlying transport
   protocol.
   Common transport protocols like UDP, DCCP, TCP, and SCTP can all be
   scoped by one or more 5-tuples (transport protocol, source address
   and port, destination address and port). Multiple 5-tuples occur in
   multi-unicast topologies, also called meshed multiparty RTP
   sessions, or when an application needs more than 128 RTP sessions.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |  Session ID   |                                                 |
   | +---------------+                                                 |
   +- Encrypted Portion*                      Authenticated Portion ---+

        Figure 3: SRTP Packet encapsulated by Session ID Layer

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|   RC    |  PT=SR or RR  |            length             | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                        SSRC of sender                         | |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                          sender info                          ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                        report block 1                         ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                        report block 2                         ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                              ...                              ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |V=2|P|   SC    |  PT=SDES=202  |            length             | |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |                          SSRC/CSRC_1                          | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                          SDES items                           ~ |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                              ...                              ~ |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |E|                         SRTCP index                         | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTCP MKI (OPTIONAL)                      ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                      authentication tag                       : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |  Session ID   |                                                 |
   | +---------------+                                                 |
   +-- Encrypted Portion                    Authenticated Portion -----+

        Figure 4: SRTCP packet encapsulated by Session ID layer

   When the Session ID layer is present, the processing in a receiver
   is as follows:

   1. Pick up the packet from the lower layer transport

   2. Inspect the SID field value

   3. Strip the SID field from the packet

   4. Forward it to the (S)RTP Session context identified by the SID
      value

6.2. Signalling

   The use of the Session ID layer needs to be explicitly agreed on
   between the communicating parties. Each RTP session the application
   uses must, in addition to the regular configuration such as payload
   types, RTCP extensions, etc., have both the underlying 5-tuple
   (source address and port, destination address and port, and
   transport protocol) and the Session ID used for the particular RTP
   session. The signalling requirement is to assign unique Session ID
   values to all RTP sessions being sent over the same 5-tuple. The
   same Session ID shall be used for an RTP session independently of
   the traffic direction. Note that nothing prevents a multi-media
   application from using multiple 5-tuples if desired for some
   reason, in which case each 5-tuple has its own session ID value
   space.

   This section defines how to negotiate the use of the Session ID
   layer, using the Session Description Protocol (SDP) Offer/Answer
   mechanism [RFC3264].
   A new media-level SDP attribute, 'session-mux-id', is defined for
   use with the media BUNDLE mechanism defined in
   [I-D.ietf-mmusic-sdp-bundle-negotiation]. The attribute allows the
   media descriptions ("m=" lines) associated with a 'BUNDLE' group to
   form separate RTP sessions.

   The 'session-mux-id' attribute is included in a media description
   in order to indicate the Session ID for that particular media
   description. Every media description that shares a common attribute
   value is assumed to be part of a single RTP session. An SDP Offerer
   MUST include the 'session-mux-id' attribute for every media
   description associated with a 'BUNDLE' group. If the SDP Answer
   does not contain 'session-mux-id' attributes, the SDP Offerer MUST
   NOT assume that separate RTP sessions will be used. If the SDP
   Answer still describes a 'BUNDLE' group, the procedures in
   [I-D.ietf-mmusic-sdp-bundle-negotiation] apply.

   An SDP Answerer MUST NOT include the 'session-mux-id' attribute in
   an SDP Answer unless it was included in the SDP Offer.

   The attribute has the following ABNF [RFC5234] definition.

   Session-mux-id-attr = "a=session-mux-id:" SID *SID-prop
   SID                 = SID-value / SID-pairs
   SID-value           = 1*3DIGIT / "NoN"
   SID-pairs           = SID-value "/" SID-value ; RTP/RTCP SIDs
   SID-prop            = SP (assignment-policy / prop-ext)
   prop-ext            = token "=" value
   assignment-policy   = "policy=" ("tentative" / "fixed")

   The following parameters MUST be configured as specified:

   o  The RTP profile SHOULD be the same, but MUST be compatible, like
      AVP and AVPF.

   o  RTCP bandwidth parameters are the same.

   o  RTP payload type values are not overlapping.

   In declarative SDP usage, there is clearly no method for fallback
   unless some other negotiation protocol is used.
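
   The attribute grammar above and the receiver steps of Section 6.1
   can be illustrated with a short, non-normative Python sketch. It is
   only a sketch under stated assumptions: the names SessionMuxId,
   parse_session_mux_id, encapsulate and decapsulate are hypothetical
   and not defined by this document, properties are assumed to be
   separated by single spaces, and SID values above 255 are rejected
   because the SID field is one byte.

```python
import re
from typing import Dict, NamedTuple, Optional, Tuple


class SessionMuxId(NamedTuple):
    """Parsed 'a=session-mux-id' attribute (hypothetical helper type)."""
    rtp_sid: Optional[int]    # None stands for the "NoN" value
    rtcp_sid: Optional[int]   # same as rtp_sid when no pair is signalled
    props: Dict[str, str]     # e.g. {"policy": "tentative"}


# Assumption: each property is " name=value" with no embedded spaces.
_ATTR = re.compile(
    r"a=session-mux-id:([0-9]{1,3}|NoN)(?:/([0-9]{1,3}|NoN))?"
    r"((?: [^ =]+=[^ ]+)*)"
)


def _sid(text: str) -> Optional[int]:
    """Convert a SID-value to an int, or None for "NoN"."""
    if text == "NoN":
        return None
    value = int(text)
    if value > 255:
        raise ValueError("SID must fit in one byte (0-255)")
    return value


def parse_session_mux_id(line: str) -> SessionMuxId:
    """Parse one attribute line according to the ABNF in Section 6.2."""
    match = _ATTR.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a session-mux-id attribute: %r" % line)
    rtp = _sid(match.group(1))
    rtcp = _sid(match.group(2)) if match.group(2) is not None else rtp
    props = dict(item.split("=", 1) for item in match.group(3).split())
    if props.get("policy", "tentative") not in ("tentative", "fixed"):
        raise ValueError("unknown assignment policy")
    return SessionMuxId(rtp, rtcp, props)


def encapsulate(packet: bytes, sid: int) -> bytes:
    """Sender side: postfix the one-byte SID to an RTP or RTCP packet."""
    if not 0 <= sid <= 255:
        raise ValueError("SID must fit in one byte (0-255)")
    return packet + bytes([sid])


def decapsulate(datagram: bytes) -> Tuple[int, bytes]:
    """Receiver side: inspect the trailing SID and strip it."""
    if len(datagram) < 2:
        raise ValueError("datagram too short to carry a packet and a SID")
    return datagram[-1], datagram[:-1]
```

   A receiver would use the SID returned by decapsulate to look up the
   (S)RTP session context and hand the stripped packet to it. How an
   unknown SID is handled is not specified here; discarding the packet
   is one plausible choice.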
   The SID property "policy" is used in negotiation by an end-point to
   indicate whether the session ID values are merely a tentative
   suggestion or whether they must have these values. This is used
   when negotiating SIDs for multi-party RTP sessions to support
   shared transports, such as multicast or RTP translators, that are
   unable to produce renumbered SIDs on a per end-point basis. The
   normal behavior is that the offer suggests a tentative set of
   values, indicated by "policy=tentative". These SHOULD be accepted
   by the peer unless that peer negotiates session IDs on behalf of a
   centralized policy, in which case it MAY change the value(s) in the
   answer. If the offer represents a policy that does not allow
   changing the session ID values, it can indicate that to the
   answerer by setting the policy to "fixed". This enables the
   answering peer either to accept the value or to indicate that there
   is a conflict over who is performing the assignment by setting the
   SID value to NoN (Not a Number). Offerer and answerer SHOULD always
   include the policy they are operating under. Thus, in the absence
   of centralized behaviors, both offerer and answerer will indicate
   the tentative policy.

6.3. SRTP Key Management

   Key management for SRTP needs discussion, as this solution causes
   multiple SRTP sessions to exist on the same underlying transport
   flow. Thus we need to ensure that the key-management mechanism is
   still properly associated with the SRTP session context it intends
   to key. To ensure that, we look at the three SRTP key-management
   mechanisms that the IETF has specified, one after another.

6.3.1. Security Description

   Session Description Protocol (SDP) Security Descriptions for Media
   Streams [RFC4568], being based on SDP, has no issue with the
   lower-layer RTP session multiplexing specified here.
   The reason is that the actual keying is done using a media-level
   SDP attribute. Thus the attribute is already associated with a
   particular media description. That media description will also have
   an instance of the "a=session-mux-id" attribute carrying the SID
   value or pair used with these particular crypto parameters.

6.3.2. DTLS-SRTP

   Datagram Transport Layer Security (DTLS) Extension to Establish
   Keys for the Secure Real-time Transport Protocol (SRTP) [RFC5764]
   is a keying mechanism that works on the media plane, on the same
   lower layer transport that SRTP/SRTCP will be transported over.
   Thus each DTLS message must be associated with the SRTP and/or
   SRTCP flow it is keying.

   The most direct solution is to apply the shim and the SID context
   identifier to DTLS packets as well. That is, the same SID that is
   used with RTP and/or RTCP is also used for the DTLS messages
   intended to key that particular SRTP and/or SRTCP flow.

6.3.3. MIKEY

   MIKEY: Multimedia Internet KEYing [RFC3830] is a key-management
   protocol that has several transports. In some cases it is used
   directly on a transport protocol such as UDP, but there is also a
   specification for how MIKEY is used with SDP, "Key Management
   Extensions for Session Description Protocol (SDP) and Real Time
   Streaming Protocol (RTSP)" [RFC4567].

   Let us start with the latter, i.e. the SDP transport, which shares
   the properties of Security Description in that it can be associated
   with a particular media description in an SDP. As long as one
   avoids using the session-level attribute, one can be certain to
   correctly associate the key exchange with a given SRTP/SRTCP
   context.

   It does appear that MIKEY directly over a lower layer transport
   protocol will have issues similar to those of DTLS.

6.4. Examples

6.4.1. RTP Packet with Transport Header

   The figure below contains an RTP packet with the SID field,
   encapsulated in a UDP packet (added UDP header).

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Source Port          |       Destination Port        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |            Length             |           Checksum            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |  Session ID   |                                                 |
   | +---------------+                                                 |
   +- Encrypted Portion*                      Authenticated Portion ---+

             SRTP Packet Encapsulated by Session ID Layer

6.4.2. SDP Offer/Answer example

   This section contains SDP offer/answer examples: first one example
   of successful BUNDLEing, and then two where fallback occurs.

   In the SDP offer below, one audio and one video stream are being
   offered.
   The audio is using SID 0, and the video is using SID 1, to indicate
   that they are different RTP sessions despite being offered over the
   same 5-tuple.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:BUNDLE foo bar
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that supports this BUNDLEing:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:BUNDLE foo bar
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      m=video 20000 RTP/AVP 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that does not support this
   BUNDLEing or the general signalling of
   [I-D.ietf-mmusic-sdp-bundle-negotiation]:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=rtpmap:0 PCMU/8000
      m=video 30000 RTP/AVP 32
      b=AS:1000
      a=rtpmap:32 MPV/90000

   The SDP answer of a client supporting
   [I-D.ietf-mmusic-sdp-bundle-negotiation] but not this BUNDLEing would
   look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:BUNDLE foo bar
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      m=video 20000 RTP/AVP 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:32 MPV/90000

   In this last case, the result is a single RTP session with both media
   types being established.  If that isn't supported or desired, the
   offerer will have to either re-invite without the BUNDLE grouping to
   force different 5-tuples, or simply terminate the session.

7.  Open Issues

   This work is still in the early phase of specification.  This section
   contains a list of open issues where the authors desire some input.

   1.  Should RTP and RTCP multiplexing without RFC 5761 support be
       included?

   2.  Section 6.2 discusses which parameters must be configured.  The
       scope of these rules, and whether they make sense, needs
       additional discussion.

   3.  Can we provide better control, so that applications that do not
       desire fallback to a single RTP session when the multiplexing
       shim is not supported but BUNDLE is supported end up with a
       better alternative?

8.  IANA Considerations

   This document requests the registration of one SDP attribute.
   Details of the registration are to be filled in.

9.  Security Considerations

   The security properties of the Session ID layer depend on what
   mechanism is used to protect the RTP and RTCP packets of a given RTP
   session.
   If IPsec or transport layer security solutions such as DTLS or TLS
   are being used, then both the encapsulated RTP/RTCP packets and the
   session ID layer will be protected by that security mechanism, thus
   potentially providing confidentiality, integrity and source
   authentication.  If SRTP is used, the session ID layer will not be
   directly protected by SRTP.  However, it will be implicitly integrity
   protected (assuming the RTP/RTCP packet is integrity protected), as
   the only function of the field is to identify the session context.
   Thus any modification of the SID field will cause the receiver to
   attempt to retrieve the wrong SRTP crypto context.  If that retrieval
   fails, the packet will be discarded anyway.  If it succeeds, the
   wrong context will not lead to successful verification of the packet.

10.  Acknowledgements

   This document is based on the input from various people, especially
   in the context of the RTCWEB discussion of how to use only a single
   lower-layer transport.  The RTP and RTCP packet figures are borrowed
   from RFC 3711.  The SDP example is extended from the one present in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  The authors would like to
   thank Christer Holmberg for assistance in utilizing the BUNDLE
   grouping mechanism.

   The proposal in Section 4.5 was originally suggested by Colin
   Perkins.  The idea in Section 4.6 is from an Internet-Draft
   [I-D.rosenberg-rtcweb-rtpmux] written by Jonathan Rosenberg et al.
   The proposal in Section 4.3 is a result of discussion by a group of
   people at IETF meeting #81 in Quebec.

11.  References

11.1.  Normative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
              Using Session Description Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-00 (work in
              progress), February 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informational References

   [I-D.lennox-rtcweb-rtp-media-type-mux]
              Rosenberg, J. and J. Lennox, "Multiplexing Multiple Media
              Types In a Single Real-Time Transport Protocol (RTP)
              Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work
              in progress), October 2011.

   [I-D.rosenberg-rtcweb-rtpmux]
              Rosenberg, J., Jennings, C., Peterson, J., Kaufman, M.,
              Rescorla, E., and T. Terriberry, "Multiplexing of
              Real-Time Transport Protocol (RTP) Traffic for Browser
              based Real-Time Communications (RTC)",
              draft-rosenberg-rtcweb-rtpmux-00 (work in progress),
              July 2011.

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Burman, B., and C. Perkins, "RTP
              Multiplexing Architecture",
              draft-westerlund-avtcore-multiplex-architecture-01 (work
              in progress), March 2012.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4567]  Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E.
              Carrara, "Key Management Extensions for Session
              Description Protocol (SDP) and Real Time Streaming
              Protocol (RTSP)", RFC 4567, July 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
              Real-Time Transport Control Protocol (RTCP): Opportunities
              and Consequences", RFC 5506, April 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for the Secure
              Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Colin Perkins
   University of Glasgow
   School of Computing Science
   Glasgow G12 8QQ
   United Kingdom

   Email: csp@csperkins.org