Network Working Group                                      M. Westerlund
Internet-Draft                                                  Ericsson
Intended status: Standards Track                              C. Perkins
Expires: April 25, 2013                            University of Glasgow
                                                        October 22, 2012

        Multiple RTP Sessions on a Single Lower-Layer Transport
           draft-westerlund-avtcore-transport-multiplexing-04

Abstract

   This document specifies how multiple RTP sessions are to be
   multiplexed on the same lower-layer transport, e.g. a UDP flow.  It
   discusses various requirements that have been raised and their
   feasibility, which results in a solution with a certain
   applicability.  A solution is recommended and that solution is
   provided in more detail, including signalling and examples.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 25, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Motivations
     3.1.  NAT and Firewalls
     3.2.  No Transport Level QoS
     3.3.  Multiple RTP sessions
     3.4.  Usage of RTP Extensions
     3.5.  Incremental Deployment
     3.6.  Summary
   4.  Requirements
     4.1.  Support Use of Multiple RTP Sessions
     4.2.  Same SSRC Value in Multiple RTP Sessions
     4.3.  SRTP
     4.4.  Don't Redefine Used Bits
     4.5.  Firewall Friendly
     4.6.  Monitoring and Reporting
     4.7.  Usable Also Over Multicast
     4.8.  Incremental Deployment
   5.  Design Considerations
     5.1.  Location of SHIM
     5.2.  ICE and DTLS-SRTP Integration
     5.3.  Signalling Fallback
   6.  Specification
     6.1.  Shim Layer
     6.2.  Signalling
     6.3.  SRTP Key Management
       6.3.1.  Security Description
       6.3.2.  DTLS-SRTP
       6.3.3.  MIKEY
     6.4.  Examples
       6.4.1.  RTP Packet with Transport Header
       6.4.2.  SDP Offer/Answer example
   7.  Open Issues
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informational References
   Appendix A.  Possible Solutions
     A.1.  Header Extension
     A.2.  Multiplexing Shim
     A.3.  Single Session
     A.4.  Use the SRTP MKI field
     A.5.  Use an Octet in the Padding
     A.6.  Redefine the SSRC field
   Appendix B.  Comparison
     B.1.  Support of Multiple RTP Sessions Over Single Transport
     B.2.  Enable Same SSRC Value in Multiple RTP Sessions
       B.2.1.  Avoid SSRC Translation in Gateways/Translation
       B.2.2.  Support Existing Extensions
     B.3.  Ensure SRTP Functions
     B.4.  Don't Redefine Used Bits
     B.5.  Firewall Friendly
     B.6.  Monitoring and Reporting
     B.7.  Usable over Multicast
     B.8.  Incremental Deployment
     B.9.  Summary and Conclusion
   Authors' Addresses

1.  Introduction

   There has been renewed interest in having a solution that allows
   multiple RTP sessions [RFC3550] to use a single lower-layer
   transport, such as a bi-directional UDP flow.  The main reason is the
   cost of doing NAT/FW traversal for each individual flow.  ICE and
   other NAT/FW traversal solutions are clearly capable of attempting to
   open multiple flows.  However, creating multiple flows carries both
   an increased risk of failure and an increased cost.  The increased
   cost comes as a slightly higher delay in establishing the traversal
   and a larger amount of consumed NAT/FW resources.  The latter might
   be an increasing problem in the IPv4 to IPv6 transition period.

   There is ongoing work on specifying how and when one RTP session may
   contain multiple media types
   [I-D.ietf-avtcore-multi-media-rtp-session].  That work addresses
   certain use cases, while this proposal addresses a different set of
   use cases and motivations.  This is discussed further in Section 3
   (Motivations).  The classical method of running one RTP session over
   its own transport flow remains well motivated for a number of use
   cases, especially when flow-based QoS is to be used for some media
   streams.

   This document draws up a set of requirements to consider when
   transporting multiple RTP sessions over a single lower-layer
   transport.
   These requirements had to be weighed against each other, because no
   known solution fulfills the combined set of requirements completely.

   A number of possible solutions were considered and discussed with
   respect to their properties.  Based on that, the authors recommend a
   shim layer variant as the single solution, which is specified in
   detail, including a signalling solution and examples.  The other
   proposals considered, and the comparison between them, are available
   as appendices.

2.  Conventions

2.1.  Terminology

   This document uses the following terminology.

   Multiplexing:  Unless specifically noted, all mentions of
      multiplexing in this document refer to the multiplexing of
      multiple RTP Sessions on the same lower-layer transport.  It is
      important to make this distinction as RTP does contain a number of
      multiplexing points for various purposes, such as media formats
      (Payload Type), media sources (SSRC), and RTP sessions.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Motivations

   This section looks at why an additional solution is needed, given
   that two methods already exist: the classical method of having one
   RTP session per transport flow, as defined by the RTP specification
   [RFC3550], and the method of having multiple media types within one
   RTP session [I-D.ietf-avtcore-multi-media-rtp-session].

3.1.  NAT and Firewalls

   The existence of NATs and Firewalls at almost all points of Internet
   access has had implications for protocols like RTP that were designed
   to use multiple transport flows.  First of all, the NAT/FW traversal
   solution one uses needs to ensure that all these transport flows are
   established.  This has three different impacts:

   1.  Increased delay to perform the transport flow establishment.

   2.  The more transport flows, the more state and the more resource
       consumption in the NATs and Firewalls.  When the resource
       consumption in NAT/FWs reaches its limits, unexpected behaviors
       usually occur, commonly resulting in service disruptions.

   3.  More transport flows means a higher risk that some transport flow
       fails to be established, thus preventing the application from
       communicating.

   Using fewer transport flows reduces the risk of communication
   failure, improves establishment behavior, and puts less load on NATs
   and Firewalls.

3.2.  No Transport Level QoS

   Many RTP-using applications don't utilize any network level Quality
   of Service functions.  Nor do they expect or desire any separation in
   the network treatment of their media packets, independent of whether
   they are audio, video or text.  When an application has no such
   desire, it doesn't need to provide a transport flow structure that
   simplifies flow-based QoS.

3.3.  Multiple RTP sessions

   The usage of multiple RTP sessions allows separation of media streams
   that have different usages or purposes in an RTP-based application,
   for example to separate the video of a presenter or the most
   important current talker from the video of the listeners, which not
   all end-points receive.  It also allows separation for different
   processing based on media types, such as audio and video, in
   end-points and central nodes.  This provides the node with the
   knowledge that any SSRC within the session is supposed to be
   processed in a similar or identical way.

   For simpler cases, where the streams within each media type need the
   same processing, it is clearly possible to find other multiplex
   solutions, for example based on the Payload Type and the differences
   in encoding that the payload type can describe.
   This may nevertheless be insufficient in more advanced usages where
   you have multiple sources of the same media type, but for different
   usages or as alternatives.  For example, when you have one set of
   video sources that shows session participants and another set of
   video sources that shares an application or presentation slides, you
   likely want to separate those streams for various reasons such as
   control, prioritization, QoS, methods for robustification, etc.  In
   those cases, using the RTP session for separation of properties is a
   powerful tool, and its properties need to be preserved when providing
   a solution for how to use only a single lower-layer transport.

   For more discussion of the usage of RTP sessions versus other
   multiplexing points, we recommend RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture].

3.4.  Usage of RTP Extensions

   Applications use different sets of RTP extensions.  The solution for
   multiple media types in one RTP session
   [I-D.ietf-avtcore-multi-media-rtp-session] is known to have
   limitations that prevent the usage of the following RTP mechanisms
   and extensions:

   o  XOR FEC (RFC5109)

   o  RTP Retransmission in session mode (RFC4588)

   o  Certain Layered Coding

   A developed solution should minimize the number of RTP/RTCP
   extensions and mechanisms that can't be used.

3.5.  Incremental Deployment

   In various multi-party communication scenarios, deployment can become
   an issue if all session participants are required to have the
   functionality before its usage can be enabled.  This is especially
   difficult in communication scenarios where not all possible
   participants and their capabilities are known ahead of establishing
   the communication session with some sub-set of the participants.
   At least for centralized communication sessions, it is desirable that
   the solution can be used on a single leg without affecting any other
   leg and without requiring advanced translation functionality in any
   central node.

3.6.  Summary

   The center of the motivation is to ensure that the RTP session is an
   available and usable tool also for applications that have no need for
   network level separation of their media streams and want to reduce
   their exposure to any NAT or Firewall inconsistencies and minimize
   the resource consumption.  As a benefit, a well designed solution
   will enable incremental deployment and place minimal limitations on
   which existing RTP mechanisms or extensions can be used by the RTP-
   using application.

4.  Requirements

   This section lists and discusses a number of potential requirements.
   However, it is not difficult to realize that it is in fact possible
   to state requirements that make the set of feasible solutions an
   empty set.  It is thus necessary to consider which requirements are
   essential to fulfill and which can be compromised on to arrive at a
   solution.

4.1.  Support Use of Multiple RTP Sessions

   Section 3.3 discusses a number of reasons why an application may like
   to have multiple RTP sessions.  Considering the motivations for this
   work, this must be an absolute requirement.  We are also of the
   opinion that the session provided by the solution must fulfill the
   definition in the RTP [RFC3550] specification:

      "The distinguishing feature of an RTP session is that each
      maintains a full, separate space of SSRC identifiers (defined
      next).  The set of participants included in one RTP session
      consists of those that can receive an SSRC identifier transmitted
      by any one of the participants either in RTP as the SSRC or a CSRC
      (also defined below) or in RTCP."

4.2.  Same SSRC Value in Multiple RTP Sessions

   Two different RTP sessions being multiplexed on the same lower-layer
   transport need to be able to use the same SSRC value.  This is an
   absolute requirement, for two reasons:

   1.  To avoid mandating SSRC assignment rules that are coordinated
       between the sessions.  If the RTP sessions multiplexed together
       must have unique SSRC values, then additional code that works
       between RTP Sessions is needed in the implementations, thus
       raising the bar for implementing this solution.  In addition, if
       one builds a gateway between parts of a system that use this
       multiplexing and parts that don't, the part that isn't
       multiplexing must also fulfill the requirements on how SSRCs are
       assigned, or else force the gateway to translate SSRCs.
       Translating SSRCs is actually hard, as it requires one to
       understand the semantics of all current and future RTP and RTCP
       extensions.  Otherwise a barrier for deploying new extensions is
       created.

   2.  There are a few RTP extensions that currently rely on being able
       to use the same SSRC in different RTP sessions:

       *  XOR FEC (RFC5109)

       *  RTP Retransmission in session mode (RFC4588)

       *  Certain Layered Coding

4.3.  SRTP

   SRTP [RFC3711] is one of the most commonly used security solutions
   for RTP.  In addition, it is the only one defined by IETF that is
   integrated into RTP.  This integration has several aspects that need
   to be considered when designing a solution for multiplexing RTP
   sessions on the same lower-layer transport.

   Determining Crypto Context:  SRTP first of all needs to know which
      session context a received or to-be-sent packet relates to.  It
      normally relies on the lower-layer transport to identify the
      session.  It uses the Master Key Identifier (MKI), if present, to
      determine which key set is to be used.
      Then the SSRC and sequence
      number are used by most crypto suites, including the most common
      use of AES Counter Mode, to actually generate the correct cipher
      stream.

   Unencrypted Headers:  SRTP has chosen to leave the RTP headers and
      the first two 32-bit words of the first RTCP header unencrypted,
      to allow both header compression and monitoring to work also in
      the presence of encryption.  As these fields are in clear text,
      they are used in most crypto suites for SRTP to determine how to
      protect or recover the plain text.

   It is important here to contrast SRTP with a set of other possible
   protection mechanisms.  DTLS, TLS, and IPsec all protect and
   encapsulate the entire RTP and RTCP packets.  They don't perform any
   partial operations on the RTP and RTCP packets.  Any change that is
   considered to be part of the RTP and RTCP packet is transparent to
   them, but possibly not to SRTP.  Thus the impact on SRTP operations
   must be considered when defining a mechanism.

4.4.  Don't Redefine Used Bits

   As the core of RTP is in use in many systems, is very widely
   deployed, and has numerous implementations, changing any of the field
   definitions is highly problematic.  First of all, the implementations
   need to change to support the new semantics.  Secondly, you get a
   large transition issue when some session participants support the new
   semantics and some don't.  Combining the two behaviors in the same
   session can force the deployment of costly and less than perfect
   translation devices.

4.5.  Firewall Friendly

   It is desirable that current Firewalls will accept the solution's
   packets as normal RTP packets.  However, in the authors' opinion we
   can't let the firewall stifle invention and evolution of the
   protocol.
   At the same time, one needs to be aware that a change that makes most
   deep-inspecting firewalls consider the packets to be invalid RTP/RTCP
   will be more difficult to deploy.

4.6.  Monitoring and Reporting

   It is desirable that a third party monitor can still operate on the
   multiplexed RTP Sessions.  It is however likely that such monitors
   will require an update to correctly monitor and report on multiplexed
   RTP Sessions.

   Another type of function to consider is packet sniffers and their
   selector filters.  These may be impacted by a change of the fields.
   An observation is that many such systems are usually quite rapidly
   updated to consider new types of standardized or simply common packet
   formats.

4.7.  Usable Also Over Multicast

   It is desirable that a solution can also be used when RTP and RTCP
   packets are sent over multicast, both Any Source Multicast (ASM) and
   Source-Specific Multicast (SSM).  The reason for this requirement is
   to allow a system using RTP to use the same configuration regardless
   of whether the transport is unicast or multicast.  In addition,
   multicast can't be claimed to have an issue with using multiple
   ports, as each multicast group has a complete port space scoped by
   address.

4.8.  Incremental Deployment

   A good solution has the property that, in topologies that contain RTP
   Mixers or Translators, a single session participant can enable
   multiplexing without having any impact on any other session
   participants.  Thus a node should be able to take a multiplexed
   packet and then easily send it out, with minimal or no modification,
   on another leg of the session where each RTP session is transported
   over its own lower-layer transport.  It should be equally easy to do
   the reverse forwarding operation.

5.  Design Considerations

   When defining a SHIM solution for identifying RTP sessions over a
   single transport layer, there are some special considerations, which
   are discussed in this section.

5.1.  Location of SHIM

   A major question affecting the SHIM is the location of the SHIM
   header providing the Identifier of the session the packet relates to.
   This section discusses in detail the impact of the different choices.

   Identified aspects to consider are:

   Possibility to Process:  A prefixed shim header, i.e. one placed
      between the transport protocol and the RTP/RTCP packet header, has
      the advantage that any node on the network that likes to include
      the header in any per-packet processing can reach it.  Reasons for
      per-packet processing are:

      A.  Quality of Service classification

      B.  SHIM ingress or egress

      C.  Monitoring

      Many routers or similar devices can only read and process the
      first N bytes of the whole packet, where N is commonly on the
      order of 64-128 bytes.  Any other type of processing means putting
      the packet on the slow path.  Thus a prefixed solution enables
      this processing, while a postfixed solution will most likely
      forever prevent this type of device from processing it.

   Legacy Processing:  Packets, or at least flows, of the type
      IP/UDP/RTP can in many cases be identified by Deep Packet
      Inspection, Firewalls or other network entities that concern
      themselves with trying to determine what traffic a particular
      packet flow carries.  These nodes can clearly be updated, but
      until they have been, they may hinder deployment.  Thus a postfix
      likely gives the least resistance for initial deployment.
      However, also for the postfix location the deployment can be
      hindered in cases where multiple RTP sessions use the same SSRC
      values, due to irregular behavior of the fields for what the third
      party believes is one media stream rather than multiple ones.
      The prefixed location will however
      maintain the long-term capabilities of such devices, assuming they
      can be updated to include the SHIM header as part of the
      classification.

   Header Compression:  The different header compression techniques that
      have been developed compress IP/UDP/RTP as a complete combination.
      If one instead has IP/UDP/SHIM/RTP, then the compression for the
      full set may not work, or may work poorly.  Instead only IP/UDP
      header compression is likely to be applied.  Thus a prefix will
      lose some compression efficiency until compression profiles for
      IP/UDP/SHIM/RTP have been developed, implemented and deployed.
      Postfix doesn't have that issue, but nor can it ever gain anything
      from header compression, which a prefixed solution could once an
      updated profile is deployed.  Postfix will also have reduced
      efficiency when compressing sessions where the same SSRC is used
      in two different RTP sessions, as the RTP header fields like
      sequence number etc. will not behave as expected and will need
      frequent explicit updates.

   The question of a prefixed or a postfixed header comes down to a
   trade-off between long term usability and deployment issues:

   Prefixed:  Long term, there is a good possibility to adapt any
      network function that needs to take the SHIM header into account.
      At the same time, any function that tries to analyze packets and
      because of that may block the packets will hinder deployment.

   Postfixed:  This solution will likely have the best possibilities to
      deploy successfully in the short term.  However, long term this
      choice will likely prevent many network nodes that would like to
      separate the RTP sessions being multiplexed together from
      successfully doing that.

   After discussion in the WG it has been determined that prefixed is
   the preferred solution.

5.2.  ICE and DTLS-SRTP Integration

   When using ICE [RFC5245] or DTLS-SRTP [RFC5764] or both with RTP,
   there exists the issue that RTP, STUN [RFC5389] and DTLS-SRTP are
   simultaneously in use over the same lower-layer transport flow, like
   UDP.  This multiplexing is based on the value of the first byte of
   the lower-layer transport payload, as discussed in Section 5.1.2 of
   DTLS-SRTP [RFC5764].

   When a single RTP session is replaced with multiple RTP sessions
   identified by a SHIM, the result must not be misidentified as either
   STUN or DTLS-SRTP or any other protocol intending to take the
   available free code-points in the range 193-255 (Decimal).  Thus a
   prefixed SHIM must have the first two bits of its first byte set to
   10 (Binary).  Having the SHIM share the identity of RTP is not an
   issue, as one must have mutual agreement that the SHIM is used
   instead of RTP.

   Note: This limits a single byte SHIM to only allow a maximum of 64
   RTP sessions over a single transport flow.

5.3.  Signalling Fallback

   There exists an important aspect in how the SDP signalling functions,
   especially Offer/Answer [RFC3264].  The initial idea for the
   signalling was to build on top of bundle
   [I-D.ietf-mmusic-sdp-bundle-negotiation], which in its default
   function negotiates multiple media types over one RTP session
   [I-D.ietf-avtcore-multi-media-rtp-session].  A consequence is that
   when the signalling for this solution, whose main purpose is to
   enable multiple RTP sessions, meets a peer that doesn't support this
   specification, the communicating peers can end up in a single RTP
   session if the peer supports that.

   We consider it important in the signalling design that the
   application developer can decide what type of fallback will occur.
   It is also important to consider that one has to be able to signal
   SHIM based multiplexing of RTP sessions that are themselves of the
   type with multiple media types.
   Thus the signalling for SHIM must be able to
   describe multiple different scenarios:

   1.  Multiple RTP sessions multiplexed together using SHIM over one
       transport

   2.  Like 1, but where at least one RTP session contains multiple
       media types

   3.  Like 1, but where the peer doesn't support SHIM and the initiator
       wants to fall back to independent transports

   4.  Like 2, but where the peer doesn't support SHIM and the initiator
       wants to fall back to multiple BUNDLED sessions over independent
       transports.

   In addition it must be possible to have multiple different transports
   where each is a SHIM multiplex.  This is to support decomposed end-
   points or cases where certain media traffic is required to go to a
   central processing node while other traffic goes directly to a peer.

   To enable all of these scenarios we propose a solution where each
   SHIM multiplex is indicated as its own grouping attribute across all
   media blocks that are included in some form in the multiplex.  As a
   result, these media blocks fall under a form of BUNDLE super set.
   This super set will also have some of bundle's restrictions on the
   transport layer, but not on higher layers.  Which Session ID pair a
   particular media block is associated with is signalled using an SDP
   attribute (a=session-mux-id) in each media block.  When multiple
   media blocks are assigned the same session ID pair, they form an RTP
   session with multiple media types and have the full restrictions of
   bundle between them.

   The method of fallback is indicated by providing explicit BUNDLE
   grouping in addition to the SHIM when the fallback from SHIM is to
   BUNDLE.

   Note: Signalling solution is awaiting resolution of design path for
   bundle and will then consider that solution and issues raised.

6.  Specification

   This section contains the specification of the RTP session
   multiplexing SHIM, using an explicit session identifier of the
   encapsulated payload.

6.1.  Shim Layer

   This solution is based on a shim layer that is inserted in the stack
   between the regular RTP and RTCP packets and the transport layer
   being used by the RTP sessions.  Thus the layering looks like the
   following:

            +---------------------+
            |  RTP / RTCP Packet  |
            +---------------------+
            |  Session ID Layer   |
            +---------------------+
            |  Transport layer    |
            +---------------------+

                  Stack View with Session ID SHIM

   The above stack is in fact a layered one, as it allows multiple RTP
   Sessions to be multiplexed on top of the Session ID shim layer.  This
   enables the example presented in Figure 1, where four sessions, S1-S4,
   are sent over the same Transport layer and where the Session ID layer
   combines and encapsulates them with the session ID on transmission
   and separates and decapsulates them on reception.

            +-------------------+
            | S1 | S2 | S3 | S4 |
            +-------------------+
            | Session ID Layer  |
            +-------------------+
            | Transport layer   |
            +-------------------+

       Figure 1: Multiple RTP Session On Top of Session ID Layer

   The Session ID layer encapsulates one RTP or RTCP packet from a given
   RTP session and prefixes the 2-byte Session ID header to the packet.
   The Session ID header is depicted below (Figure 2) and consists of 2
   fixed bit values (10b) followed by a 14-bit unsigned integer field
   with the Session ID (SID) value.

             0                   1
             0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |1 0|     Session ID (SID)      |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 2: Session ID layer

   Each RTP session being multiplexed on top of a given transport layer
   is assigned either a single unique SID or a pair of unique SIDs in
   the range 0-16383.
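
   The header layout in Figure 2 can be illustrated with a minimal
   Python sketch.  It is not part of the specification: the function
   names add_shim and strip_shim are hypothetical, and network byte
   order is assumed from the bit numbering in the figure.

```python
import struct
from typing import Tuple

SHIM_MARKER = 0b10 << 14   # the two fixed leading bits "10"
SID_MAX = 0x3FFF           # 14-bit Session ID, i.e. 0-16383


def add_shim(sid: int, packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with the 2-byte Session ID header."""
    if not 0 <= sid <= SID_MAX:
        raise ValueError("SID must be in the range 0-16383")
    # "!H" = one unsigned 16-bit integer in network byte order (assumed).
    return struct.pack("!H", SHIM_MARKER | sid) + packet


def strip_shim(datagram: bytes) -> Tuple[int, bytes]:
    """Split a transport payload into (SID, encapsulated RTP/RTCP packet)."""
    if len(datagram) < 2:
        raise ValueError("payload too short to hold the Session ID header")
    (word,) = struct.unpack("!H", datagram[:2])
    if word >> 14 != 0b10:
        raise ValueError("first two bits are not 10")
    return word & SID_MAX, datagram[2:]
```

   For example, add_shim(5, rtp_packet) yields the bytes 0x80 0x05
   followed by the unmodified packet, so the first byte of the
   transport payload always falls in the range 128-191.
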
   The reason for allowing a pair of SIDs to be assigned to a given RTP
   session is so that RTP sessions that do not support "Multiplexing
   RTP Data and Control Packets on a Single Port" [RFC5761] can still
   use a single 5-tuple.  The reason for supporting this extra
   functionality is that RTP and RTCP multiplexing based on the payload
   type/packet type fields enforces certain restrictions on the RTP
   sessions.  These restrictions may not be acceptable.  As this
   solution does not have these restrictions, performing RTP and RTCP
   multiplexing in this way has benefits.

   Each Session ID value space is scoped by the underlying transport
   protocol.  Common transport protocols like UDP [RFC0768], DCCP
   [RFC4340], TCP [RFC0793], and SCTP [RFC4960] can all be scoped by one
   or more 5-tuples (transport protocol, source address and port,
   destination address and port).  Multiple 5-tuples occur in
   multi-unicast topologies, also called meshed multiparty RTP
   sessions, or if an application needs more than 8192 RTP sessions.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-------------------------------+
     |1 0|         Session ID        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                      SRTP MKI (OPTIONAL)                      ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :               authentication tag (RECOMMENDED)                : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   +- Encrypted Portion*                      Authenticated Portion ---+

         Figure 3: SRTP Packet encapsulated by Session ID Layer

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-------------------------------+
     |1 0|         Session ID        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|    RC   |  PT=SR or RR  |             length            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                         SSRC of sender                        | |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                          sender info                          ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                         report block 1                        ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                         report block 2                        ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                              ...                              ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |V=2|P|    SC   |  PT=SDES=202  |             length            | |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |                          SSRC/CSRC_1                          | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                           SDES items                          ~ |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                              ...                              ~ |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |E|                         SRTCP index                         | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTCP MKI (OPTIONAL)                      ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                      authentication tag                       : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   +-- Encrypted Portion                    Authenticated Portion -----+

         Figure 4: SRTCP packet encapsulated by Session ID layer

   When the Session ID layer is present, a receiver processes each
   packet as follows:

   1.  Pick up the packet from the lower layer transport.

   2.  Inspect the SID field value.

   3.  Strip the SID field from the packet.

   4.  Forward the packet to the (S)RTP session context identified by
       the SID value.

6.2.  Signalling

   Note: This section may need updating once the direction of the
   solution for BUNDLE has settled and the impact of the raised issues
   has been analyzed.

   The use of the Session ID layer needs to be explicitly agreed on
   between the communicating parties.  Each RTP session the application
   uses must, in addition to the regular configuration such as payload
   types, RTCP extensions, etc., have both the underlying 5-tuple
   (source address and port, destination address and port, and
   transport protocol) and the Session ID used for the particular RTP
   session.  The signalling requirement is to assign unique Session ID
   values to all RTP sessions being sent over the same 5-tuple.  The
   same Session ID shall be used for an RTP session independently of
   the traffic direction.  Note that nothing prevents a multi-media
   application from using multiple 5-tuples if desired for some reason,
   in which case each 5-tuple has its own Session ID value space.

   This section defines how to negotiate the use of the Session ID
   layer, using the Session Description Protocol (SDP) Offer/Answer
   mechanism [RFC3264].
   A new SDP grouping semantics, "SHIM", and a new media-level SDP
   attribute, 'session-mux-id', are defined.  The attribute identifies
   the RTP session to which each media description ("m=" line)
   associated with a 'SHIM' group belongs.

   The 'session-mux-id' attribute is included in a media description in
   order to indicate the Session ID for that particular media
   description.  All media descriptions that share a common attribute
   value are assumed to be part of a single RTP session.  An SDP
   Offerer MUST include the 'session-mux-id' attribute for every media
   description associated with a 'SHIM' group.  If the SDP Answer does
   not contain the SHIM group, the SDP Offerer MUST NOT use SHIM based
   layering.  Whether the result is then separate RTP sessions or
   BUNDLE is determined by what was present in the offer and answer,
   and depends on what the offering party would like to happen.  If the
   Offerer wants a failure to negotiate SHIM to result in one or more
   BUNDLE groups instead, then the BUNDLE grouping is also included in
   the offer.  If the SDP Answer still describes a 'BUNDLE' group, the
   procedures in [I-D.ietf-mmusic-sdp-bundle-negotiation] apply.  If
   not, independent transports and sessions are used.

   An SDP Answerer MUST NOT include the 'SHIM' group and
   'session-mux-id' attribute in an SDP Answer unless they were
   included in the SDP Offer.

   The attribute has the following ABNF [RFC5234] definition.

   Session-mux-id-attr = "a=session-mux-id:" SID *SID-prop
   SID                 = SID-value / SID-pairs
   SID-value           = 1*5DIGIT / "NoN" ; 0-16383
   SID-pairs           = SID-value "/" SID-value ; RTP/RTCP SIDs
   SID-prop            = SP (assignment-policy / prop-ext)
   prop-ext            = token "=" value
   assignment-policy   = "policy=" ("tentative" / "fixed")

   The SHIM group SHALL contain all media descriptions that are intended
   to be sent over the same transport flow, independent of Session ID.
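   A parser for the attribute could look roughly like the following
   Python sketch.  It is not normative: the names SessionMuxId and
   parse_session_mux_id are hypothetical, the 'token' and 'value'
   productions are approximated as runs of non-space characters, and
   SID values are checked against the 0-16383 range given in
   Section 6.1.

```python
import re
from typing import Dict, NamedTuple, Optional, Tuple

SID_MAX = 16383  # largest value of the 14-bit Session ID

_SID = r"(?:\d{1,5}|NoN)"
_ATTR = re.compile(
    rf"^a=session-mux-id:(?P<rtp>{_SID})(?:/(?P<rtcp>{_SID}))?"
    r"(?P<props>(?: [^ =]+=[^ ]+)*)$"
)


class SessionMuxId(NamedTuple):
    # One SID, or an RTP/RTCP pair; None stands for "NoN".
    sids: Tuple[Optional[int], ...]
    props: Dict[str, str]


def _sid(text: str) -> Optional[int]:
    if text == "NoN":
        return None
    value = int(text)
    if value > SID_MAX:
        raise ValueError(f"SID {value} exceeds {SID_MAX}")
    return value


def parse_session_mux_id(line: str) -> SessionMuxId:
    """Parse one 'a=session-mux-id:' line from a media description."""
    match = _ATTR.match(line.strip())
    if match is None:
        raise ValueError(f"not a session-mux-id attribute: {line!r}")
    sids = tuple(
        _sid(s) for s in (match["rtp"], match["rtcp"]) if s is not None
    )
    props = dict(p.split("=", 1) for p in match["props"].split())
    if "policy" in props and props["policy"] not in ("tentative", "fixed"):
        raise ValueError("policy must be 'tentative' or 'fixed'")
    return SessionMuxId(sids, props)
```

   Media descriptions whose parsed SID values are equal would then be
   treated as belonging to the same RTP session.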
   For all media descriptions that are part of the same SHIM group, the
   transport parameters, i.e. ports, ICE candidates, etc., MUST be the
   same and handled as described by BUNDLE.  Note that the parameters
   related to the RTP session do not need to be the same.

   Media descriptions that have the same Session ID value SHALL be
   treated the same way as if they were part of a BUNDLE group,
   independently of whether that is indicated in the SDP or not.

   The SID property "policy" is used in negotiation by an end-point to
   indicate whether the Session ID values are merely a tentative
   suggestion or whether they must have these values.  This is used
   when negotiating SIDs for multi-party RTP sessions to support shared
   transports, such as multicast or RTP translators, that are unable to
   produce renumbered SIDs on a per end-point basis.  The normal
   behavior is that the offer suggests a tentative set of values,
   indicated by "policy=tentative".  These SHOULD be accepted by the
   peer unless that peer negotiates Session IDs on behalf of a
   centralized policy, in which case it MAY change the value(s) in the
   answer.  If the offer represents a policy that does not allow
   changing the Session ID values, it can indicate that to the answerer
   by setting the policy to "fixed".  This enables the answering peer
   to either accept the value or indicate that there is a conflict over
   who is performing the assignment, by setting the SID value to NoN
   (Not a Number).  Offerer and answerer SHOULD always include the
   policy they are operating under.  Thus, in the absence of
   centralized behaviors, both offerer and answerer will indicate the
   tentative policy.

6.3.  SRTP Key Management

   Key management for SRTP needs discussion, as this solution causes
   multiple SRTP sessions to exist on the same underlying transport
   flow.
   Thus we need to ensure that each key management mechanism is still
   properly associated with the SRTP session context it intends to key.
   To ensure that, we look at the three SRTP key management mechanisms
   that the IETF has specified, one after another.

6.3.1.  Security Description

   Session Description Protocol (SDP) Security Descriptions for Media
   Streams [RFC4568], being based on SDP, has no issue with the RTP
   session multiplexing on a lower layer specified here.  The reason is
   that the actual keying is done using a media level SDP attribute.
   Thus the attribute is already associated with a particular media
   description.  That media description will also have an instance of
   the "a=session-mux-id" attribute carrying the SID value/pair used
   with these particular crypto parameters.

6.3.2.  DTLS-SRTP

   Datagram Transport Layer Security (DTLS) Extension to Establish Keys
   for the Secure Real-time Transport Protocol (SRTP) [RFC5764] is a
   keying mechanism that works on the media plane, on the same lower
   layer transport that SRTP/SRTCP will be transported over.

   The most direct solution would be to apply the SHIM and the SID
   context identifier to DTLS packets as well, thus using the same SID
   that is used with RTP and/or RTCP also for the DTLS messages
   intended to key that particular SRTP and/or SRTCP flow(s).  This of
   course requires independent usage of DTLS-SRTP for each RTP session.
   In addition, it requires changing the layering for DTLS-SRTP as well
   as RTP.  Thus this behavior gains nothing in regards to key
   management when using SHIM, and has some costs.

   Instead we propose that a DTLS-SRTP key-derivation change is
   introduced.  By including the Session ID value in the derivation of
   the keying material, a single DTLS-SRTP key-management operation
   could apply keys and parameters for all the RTP sessions in the same
   transport flow.
   Thus the keying cost is significantly reduced, especially in regards
   to network communication, delay impact and vulnerability to packet
   loss.

   Details to be written up.

6.3.3.  MIKEY

   MIKEY: Multimedia Internet KEYing [RFC3830] is a key management
   protocol that has several transports.  In some cases it is used
   directly on a transport protocol such as UDP, but there is also a
   specification for how MIKEY is used with SDP, "Key Management
   Extensions for Session Description Protocol (SDP) and Real Time
   Streaming Protocol (RTSP)" [RFC4567].

   Let us start with the latter, i.e. the SDP transport, which shares
   the properties of Security Descriptions in that it can be associated
   with a particular media description in an SDP.  As long as one
   avoids using the session level attribute, one can be certain to
   correctly associate the key exchange with a given SRTP/SRTCP
   context.

   It does appear that MIKEY directly over a lower layer transport
   protocol will have issues similar to those of DTLS.

6.4.  Examples

6.4.1.  RTP Packet with Transport Header

   The figure below contains an RTP packet with the SID field,
   encapsulated in a UDP packet (added UDP header).

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Source Port          |       Destination Port        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |            Length             |           Checksum            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |1 0|         Session ID        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                      SRTP MKI (OPTIONAL)                      ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :               authentication tag (RECOMMENDED)                : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   +- Encrypted Portion*                      Authenticated Portion ---+

              SRTP Packet Encapsulated by Session ID Layer

6.4.2.  SDP Offer/Answer example

6.4.2.1.  Basic Example

   This section contains SDP offer/answer examples: first an example of
   successful SHIMing, and then two where fallback occurs.  The
   fallback option here is to fall back to individual transports, thus
   there is no BUNDLE group.

   In the SDP offer below, one audio and one video stream are being
   offered.
   The audio is using SID 0 and the video is using SID 1, to indicate
   that they are different RTP sessions despite being offered over the
   same 5-tuple.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that supports this SHIMing:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      m=video 20000 RTP/AVP 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that does not support this SHIMing:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=rtpmap:0 PCMU/8000
      m=video 30000 RTP/AVP 32
      b=AS:1000
      a=rtpmap:32 MPV/90000

6.4.2.2.  Advanced Example

   In this example we have two BUNDLEd sessions, one with audio and
   video and one with XOR based FEC [RFC5109] for the audio and the
   video.  These two RTP sessions are then SHIMed into a single
   transport flow.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar 1 2
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:0 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 10000 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      a=session-mux-id:1 policy=tentative
      m=video 10000 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=session-mux-id:1 policy=tentative
      a=rtpmap:101 ulpfec/90000

   The SDP answer of a client supporting
   [I-D.ietf-mmusic-sdp-bundle-negotiation] but not this SHIMing would
   look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20002 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20002 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In the above case two different RTP sessions are established, both
   being of a BUNDLE type with multiple media types in each.  The two
   established flows will be Alice:10000<->Bob:20000 and
   Alice:10000<->Bob:20002.
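
   The fallback behaviour shown in these examples can be summarised
   from the Offerer's point of view with a small Python sketch.  It
   inspects only the "a=group" lines of the answer, following the rules
   in Section 6.2.  The function names are hypothetical, and a real
   implementation would also have to compare the answer against what
   was actually offered.

```python
from typing import Dict, List, Tuple


def parse_groups(sdp: str) -> Dict[str, List[List[str]]]:
    """Collect the mids of every 'a=group:' line, keyed by semantics."""
    groups: Dict[str, List[List[str]]] = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=group:"):
            continue
        fields = line[len("a=group:"):].split()
        if fields:
            groups.setdefault(fields[0], []).append(fields[1:])
    return groups


def negotiated_mode(answer_sdp: str) -> Tuple[str, List[List[str]]]:
    """Return ('shim' | 'bundle' | 'independent', groups in that mode)."""
    groups = parse_groups(answer_sdp)
    # SHIM takes precedence; BUNDLE is the fallback if it was answered.
    for semantics, mode in (("SHIM", "shim"), ("BUNDLE", "bundle")):
        if semantics in groups:
            return mode, groups[semantics]
    # Neither grouping was answered: one transport per media description.
    return "independent", []
```

   Applied to the answer above, the sketch reports the 'bundle' mode
   with the two groups "1 2" and "foo bar", matching the two flows
   listed.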

   If the peer supported neither the SHIM nor the BUNDLE extension, the
   answer would look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20002 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20004 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20006 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In this case four different transport flows would be established for
   RTP, each carrying a different RTP session.  The answerer also knows
   the binding between the FEC sessions and their source data thanks to
   the FEC grouping.

7.  Open Issues

   This work is still in the early phase of specification.  This
   section contains a list of open issues where the authors desire some
   input.

   1.  Section 6.2 discusses which parameters must be configured.  The
       scope of these rules, and whether they make sense, needs
       additional discussion.

   2.  Can we provide better control, so that applications that do not
       desire fallback to a single RTP session when the multiplexing
       shim is not supported but BUNDLE is supported end up with a
       better alternative?

   3.  The details of how to do key derivation, preferably in such a
       way that it can be reused by multiple key-management solutions
       like MIKEY and DTLS-SRTP.

   4.  The signalling solution will be revisited when the BUNDLE
       solution discussion has yielded some result.

8.  IANA Considerations

   This document requests the registration of one SDP attribute.
   Details of the registration are to be filled in.

9.  Security Considerations

   The security properties of the Session ID layer depend on what
   mechanism is used to protect the RTP and RTCP packets of a given RTP
   session.  If IPsec or a transport layer security solution such as
   DTLS or TLS is being used, then both the encapsulated RTP/RTCP
   packets and the Session ID layer will be protected by that security
   mechanism, thus potentially providing confidentiality, integrity and
   source authentication.  If SRTP is used, the Session ID layer will
   not be directly protected by SRTP.  However, it will be implicitly
   integrity protected (assuming the RTP/RTCP packet is integrity
   protected), as the only function of the field is to identify the
   session context.  Thus any modification of the SID field will cause
   the receiver to attempt to retrieve the wrong SRTP crypto context.
   If that retrieval fails, the packet will be discarded anyway.  If it
   succeeds, the retrieved context will not lead to successful
   verification of the packet.

10.  Acknowledgements

   This document is based on input from various people, especially in
   the context of the RTCWEB discussion of how to use only a single
   lower layer transport.  The RTP and RTCP packet figures are borrowed
   from RFC3711.  The SDP example is extended from the one present in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  Eric Rescorla contributed
   the basic idea of optimizing the DTLS-SRTP key management by
   modifying the key derivation process.

   The proposal in Appendix A.5 was originally suggested by Colin
   Perkins.  The idea in Appendix A.6 is from an Internet-Draft
   [I-D.rosenberg-rtcweb-rtpmux] written by Jonathan Rosenberg et al.
   The proposal in Appendix A.3 is a result of discussion by a group of
   people at IETF meeting #81 in Quebec.

11.  References

11.1.  Normative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
              Using Session Description Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-01 (work in
              progress), August 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informational References

   [I-D.ietf-avtcore-multi-media-rtp-session]
              Westerlund, M., Perkins, C., and J. Lennox, "Multiple
              Media Types in an RTP Session",
              draft-ietf-avtcore-multi-media-rtp-session-00 (work in
              progress), October 2012.

   [I-D.lennox-rtcweb-rtp-media-type-mux]
              Rosenberg, J. and J. Lennox, "Multiplexing Multiple Media
              Types In a Single Real-Time Transport Protocol (RTP)
              Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work
              in progress), October 2011.

   [I-D.rosenberg-rtcweb-rtpmux]
              Rosenberg, J., Jennings, C., Peterson, J., Kaufman, M.,
              Rescorla, E., and T. Terriberry, "Multiplexing of Real-
              Time Transport Protocol (RTP) Traffic for Browser based
              Real-Time Communications (RTC)",
              draft-rosenberg-rtcweb-rtpmux-00 (work in progress),
              July 2011.

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Burman, B., Perkins, C., and H.
              Alvestrand, "Guidelines for using the Multiplexing
              Features of RTP",
              draft-westerlund-avtcore-multiplex-architecture-02 (work
              in progress), July 2012.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340, March 2006.

   [RFC4567]  Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E.
              Carrara, "Key Management Extensions for Session
              Description Protocol (SDP) and Real Time Streaming
              Protocol (RTSP)", RFC 4567, July 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245,
              April 2010.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
              Real-Time Transport Control Protocol (RTCP): Opportunities
              and Consequences", RFC 5506, April 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for the Secure
              Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

Appendix A.  Possible Solutions

   This appendix documents the solutions that were explored before a
   SHIM based one was selected, and discusses their feasibility.

A.1.  Header Extension

   One proposal is to define an RTP header extension [RFC5285] that
   explicitly enumerates the session identifier in each packet.  This
   proposal has some merits regarding RTP, since it uses an existing
   extension mechanism; it explicitly enumerates the session, allowing
   third parties to associate the packet with a given RTP session; and
   it works with SRTP as currently defined, since a header extension is
   by default not encrypted and is thus readable by the receiving stack
   without needing to guess which session it belongs to and attempt to
   decrypt it.  This approach does, however, conflict with the
   requirement from [RFC5285] that "header extensions using this
   specification MUST only be used for data that can be safely ignored
   by the recipient", since correct processing of the received packet
   depends on using the header extension to demultiplex it to the
   correct RTP session.

   Using a header extension also results in the session ID being in the
   integrity protected part of the packet.  Thus a translator between
   multiplexed and non-multiplexed domains has the following options:

   1.  to be part of the security context to verify the field

   2.  to be part of the security context to verify the field and remove
       it before forwarding the packet

   3.  to be outside of the security context and leave the header
       extension in the packet.  However, that requires successful
       negotiation of the header extension, but not of the
       functionality, with the receiving end-points.
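
   For illustration, a session ID carried in the one-byte header form
   of [RFC5285] could be built as in the Python sketch below.  The
   one-octet session ID, the function name and the extension ID are
   assumptions made for this example only; this document does not
   define such an extension element.

```python
import struct


def session_id_extension(ext_id: int, session_id: int) -> bytes:
    """Build an RFC 5285 one-byte-header extension block (sketch)."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header extension IDs are 1-14")
    if not 0 <= session_id <= 0xFF:
        raise ValueError("this sketch assumes a one-octet session ID")
    data = bytes([session_id])
    # Element: 4-bit ID, 4-bit (length - 1), then the data octets.
    element = bytes([(ext_id << 4) | (len(data) - 1)]) + data
    # Pad with zero octets up to the next 32-bit boundary.
    element += b"\x00" * (-len(element) % 4)
    # 0xBEDE marks the one-byte form; length counts 32-bit words.
    return struct.pack("!HH", 0xBEDE, len(element) // 4) + element
```

   With no other header extension present, the block produced by this
   sketch is 8 octets long.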

   The biggest existing hurdle for this solution is that there exists
   no header extension field in RTCP packets.  This requires defining a
   solution for RTCP that allows carrying the explicit indicator,
   preferably in a position that isn't encrypted by SRTCP.  However,
   the current SRTCP definition does not offer such a position in the
   packet.

   Modifying the RR or SR packets is possible using profile specific
   extensions.  However, that has deployability issues, and in addition
   any information placed there would end up in the encrypted part.

   Another alternative could be to define another RTCP packet type that
   only contains the common header, using the 5 bits in the first byte
   of the common header to carry a session ID.  That would allow SRTCP
   to work correctly as long as it accepts this new packet type being
   the first in the packet.  Allowing a non-SR/RR packet as the first
   packet in a compound RTCP packet is also needed if an implementation
   is to support Reduced-Size RTCP packets [RFC5506].  The remaining
   downside with this is that all stack implementations supporting
   multiplexing would need to modify their RTCP compound packet rules
   to include this packet type first.  Thus a translator box between
   supporting nodes and non-supporting nodes needs to be in the crypto
   context.

   This solution's per packet overhead is expected to be 64 bits for
   RTCP.  For RTP it is 64 bits if no header extension was otherwise
   used.  If other header extensions are used, it is an additional 16
   bits (short header) or 24 bits, plus (if needed) padding to the next
   32-bit boundary.

A.2.  Multiplexing Shim

   This proposal is to prefix or postfix all RTP and RTCP packets with a
   session ID field.  This field would be outside of the normal RTP and
   RTCP packets, thus having no impact on the RTP and RTCP packets and
   their processing.
   An additional demultiplexing step would be added prior to RTP stack
   processing to determine the RTP session context to which the packet
   belongs.  This also has no impact on SRTP/SRTCP, as the shim layer
   would be outside of its protection context.  The shim layer's
   session ID is however implicitly integrity protected, as any error
   in the field will result in the packet being placed in the wrong or
   a non-existing context, thus resulting in an integrity failure if
   processed by SRTP/SRTCP.

   This proposal is quite simple to implement in any gateway or
   translating device that goes from a multiplexed to a non-multiplexed
   domain or vice versa, as only an additional field needs to be added
   to or removed from the packet.

   The main downside of this proposal is that it is very likely to
   trigger a firewall response from any deep packet inspection device.
   If the field is prefixed, the RTP fields will not match the
   heuristics (unless the shim is designed to look like an RTP header,
   in which case the payload length is unlikely to match the expected
   value), which is likely to prevent classification of the packet as
   an RTP packet.  If it is postfixed, the packet is likely to be
   classified as an RTP packet, but may not validate correctly if the
   content validation expects the payload length to match certain
   values.  It is expected that a postfixed shim will be less
   problematic than a prefixed shim in this regard, but we are lacking
   hard data on this.

   This solution's per packet overhead is 1 byte.

A.3.  Single Session

   Given the difficulty of multiplexing several RTP sessions onto a
   single lower-layer transport, it's tempting to send multiple media
   streams in a single RTP session.
   Doing this avoids the need to demultiplex several sessions on a
   single transport, but at the cost of losing the RTP session as a
   separator for different types of streams.  Lacking different RTP
   sessions to demultiplex incoming packets, a receiver will have to
   dig deeper into the packet before determining what to do with it.
   Care must be taken in that inspection.  For example, you must be
   careful to ensure that each real media source uses its own SSRC in
   the session and that this SSRC doesn't change media type.

   The loss of the RTP session as a separator for different usages or
   purposes would be a minor issue if the only difference between the
   RTP sessions is the media type.  In this case, the application could
   use the Payload Type field to identify the media type.  The loss of
   the RTP session functionality is however severe if the application
   uses the RTP session for separating different treatments, contexts,
   etc.  Then you would need additional signalling to bind the
   different sources to groups, which can help make the necessary
   distinctions.

   However, the loss of the RTP session as separator is not the only
   issue with this approach.  The RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture] discusses a number
   of issues in Section 6.7.  These include RTCP bandwidth differences,
   limitations in the number of payload types, media aware RTP mixers
   and interactions with legacy end-points.

   Additional attention should be placed on this last aspect.  In
   multi-party situations using central nodes, there exist some
   difficulties in having a legacy implementation using multiple RTP
   sessions interworking with an end-point having only a single RTP
   session across the central node.
The main reason is that the end-point using a single session with multiple media types has only one SSRC space, while the other end-points have multiple spaces. Thus translation may have to occur because there are several RTP sessions using the same SSRC value. This brings limitations, processing overhead and the possibility of becoming a deployment obstacle for new RTP/RTCP extensions.

This approach has been proposed in the RTCWeb context in [I-D.lennox-rtcweb-rtp-media-type-mux] and [I-D.ietf-mmusic-sdp-bundle-negotiation]. These drafts describe how to signal multiple media streams multiplexed into a single RTP session, and address some of the issues raised here and in Section 6.7 of the RTP Multiplexing Architecture [I-D.westerlund-avtcore-multiplex-architecture] draft.

This method has several limitations that limit its usage as a solution for providing multiple RTP sessions on the same lower-layer transport. However, we acknowledge that there are some uses for which this method may be sufficient and which can accept the method's limitations and downsides. The RTCWEB WG has a working assumption to support this method. For more details of this method, see the relevant drafts under development. We include this method in the comparison to provide a more complete picture of its pros and cons.

This solution has no per packet overhead. The signalling overhead is a separate question.

A.4. Use the SRTP MKI field

This proposal is to overload the MKI SRTP/SRTCP identifier to not only identify a particular crypto context, but also identify the actual RTP session. This is clearly a misuse of the MKI field; however, it appears to have few negative implications. SRTP already supports handling of multiple crypto contexts.
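
As a rough illustration of the overloading described above, the following Python sketch reads the MKI from a received SRTP packet and uses it to select both a crypto context and an RTP session. It assumes the SRTP packet layout in which the MKI, when present, immediately precedes the authentication tag, together with a 4-byte MKI and a 10-byte tag. The MkiBinding type, the binding table and the session labels are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional

RTP_HEADER_LEN = 12  # fixed RTP header, no CSRCs or header extension


@dataclass(frozen=True)
class MkiBinding:
    """Hypothetical binding of one MKI value to a crypto context and session."""
    crypto_context: str  # stand-in for a real SRTP crypto context object
    rtp_session: str     # illustrative session label


def extract_mki(packet: bytes, mki_len: int = 4, tag_len: int = 10) -> bytes:
    """Return the MKI bytes that sit just before the authentication tag."""
    if mki_len <= 0:
        raise ValueError("MKI is not in use")
    if len(packet) < RTP_HEADER_LEN + mki_len + tag_len:
        raise ValueError("packet too short to carry an MKI and a tag")
    end = len(packet) - tag_len
    return packet[end - mki_len:end]


def lookup_binding(packet: bytes,
                   bindings: Dict[bytes, MkiBinding]) -> Optional[MkiBinding]:
    """Map a received packet to its binding, or None for an unknown MKI."""
    return bindings.get(extract_mki(packet))
```

A receiver would populate the binding table from whatever key-management exchange established the contexts, and would discard packets for which lookup_binding returns None.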

The two major downsides of this proposal are as follows. First, it requires using SRTP/SRTCP to multiplex multiple sessions on a single lower-layer transport. Second, the session ID parameter needs to be added to the various key-management schemes, which must be made to understand that multiple crypto contexts are being established because they are bound to different RTP sessions. Considering that SRTP has at least three keying mechanisms in use, DTLS-SRTP [RFC5764], Security Descriptions [RFC4568], and MIKEY [RFC3830], this is not an insignificant amount of work.

This solution has a 32-bit per packet overhead, but only if the MKI was not already used.

A.5. Use an Octet in the Padding

The basis of this proposal is to have the RTP packet and the last (required by RFC3550) RTCP packet in a compound include padding of at least 2 bytes: one byte for the padding count (the last byte) and one byte just before the padding count containing the session ID.

This proposal carries the session ID in bytes that have no defined value and are intended to be ignored by the receiver. From that perspective it only causes packet expansion that is supported and handled by all existing equipment. If an implementation fails to understand that it is required to interpret this padding byte to learn the session ID, it will see a mostly coherent RTP session except where SSRCs overlap or where the payload types overlap. However, reporting on the individual sources or forwarding the RTCP RR are not completely without merit.

There is one downside of this proposal and that has to do with SRTP. To be able to determine the crypto context, it is necessary to access the encrypted payload of the packet.
Thus, the only mechanism available for a receiver to solve this issue is to try the existing crypto contexts for any session on the same lower-layer transport and then use the one where the packet decrypts and verifies correctly. For transport flows with many crypto contexts, an attacker could therefore simply generate packets that don't validate, forcing the receiver to try all the crypto contexts it has rather than immediately discarding each packet as not matching a context. A receiver can mitigate this somewhat by using heuristics based on the RTP header fields to determine which context applies for a received packet, but this is not a complete solution.

This solution has a 16-bit per packet overhead.

A.6. Redefine the SSRC field

The Rosenberg et al. Internet draft "Multiplexing of Real-Time Transport Protocol (RTP) Traffic for Browser based Real-Time Communications (RTC)" [I-D.rosenberg-rtcweb-rtpmux] proposed to redefine the SSRC field. This has the advantage of no packet expansion. It also looks like regular RTP. However, it has a number of implications. First of all, it prevents any RTP functionality that requires the same SSRC in multiple RTP sessions.

Secondly, its interoperability with end-points using multiple RTP sessions is problematic. Such interoperability will require an SSRC translator function in the gatewaying node to ensure that the SSRCs fulfill the semantic rules of the different domains. That translator is actually far from easy to build, as it needs to understand the semantics of all RTP and RTCP extensions that include SSRC/CSRC. This is because it is necessary to know when a particular matching 32-bit pattern is an SSRC field and when the field is just a combination of other fields that create the same matching 32-bit pattern.
Thus there is a possibility that such a translator becomes an obstacle in deploying future RTP/RTCP extensions. In addition, the translator has significant overhead when SRTP is in use, because verification that the packet is authentic, decryption, SSRC translation, encryption and finally generation of authentication tags are all required. The translator must also be part of the security context.

This solution has no per packet overhead.

Appendix B. Comparison

This section compares the above potential solutions with the requirements. Motivations are provided in addition to a high-level rating of whether each solution successfully meets, partially meets, or fails to meet each requirement. At the end, a summary table (Figure 5) of the high-level ratings is provided.

B.1. Support of Multiple RTP Sessions Over Single Transport

This one is easy to determine. Only the single session proposal fails this requirement as it is not at all designed to meet it. The rest fully support this requirement. The main question around this requirement is how important it is to have, as discussed in Section 4.1.

B.2. Enable Same SSRC Value in Multiple RTP Sessions

Based on the discussion in Section 4.2 two sub-requirements have been derived.

B.2.1. Avoid SSRC Translation in Gateways/Translation

This sub-requirement is derived from the desire to avoid having gateways or translators perform full SSRC translation, in order to minimize complexity, to avoid requiring gateways to be in the security context, and to avoid hindering long-term evolution. Two of the proposals have issues with this, due to their lack of support for multiple 32-bit SSRC spaces and the resulting inability to have the same SSRC value in multiple RTP sessions. The proposals that have these properties and thus are marked as failing are the Single Session and Redefine the SSRC field.
The other proposals are all successful in meeting this requirement.

B.2.2. Support Existing Extensions

The second sub-requirement is how well the proposals support using the existing RTP mechanisms. Here both Single Session and Redefine the SSRC field will have clear issues as they cannot support the same full 32-bit SSRC value in two different RTP sessions. This is clearly an issue for the XOR based FEC. RTP retransmission and scalable encoding are minor issues as there exist alternatives to those mechanisms that work with the structure of these two proposals. Thus we give these two proposals a fail. The Header Extension gets a partial due to the unclear interaction between inserting a header extension and these mechanisms.

B.3. Ensure SRTP Functions

This requirement is about ensuring both secure and efficient usage of SRTP. The Octet in Padding field proposal gets a fail as the receiving end-point cannot determine the intended RTP session prior to decryption of the padding field. Thus a catch-22 arises which can only be resolved by trying all session contexts and seeing which one decrypts. This causes a security vulnerability as an attacker can inject a packet which does not match any of the session contexts. The receiver will then attempt decryption and authentication of it using all its session contexts, increasing the amount of wasted resources by a factor equal to the number of multiplexed sessions. Thus this proposal gets a fail.

The proposal of Overloading the SRTP MKI field as session identifier gets a partial due to the fact that it cannot use SRTP's key-management mechanism out of the box. It forces the key-management mechanism and the SRTP implementations to maintain the MKI-to-RTP session bindings to maintain secure and correct function.
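
The resource amplification described above for the Octet in Padding proposal can be illustrated with a small Python sketch. It is a simplification under stated assumptions: authentication is modelled as a truncated HMAC-SHA1 over the whole packet, standing in for full SRTP processing, and the context names and keys are made up for the example. The only point is that a packet matching no context costs one verification attempt per multiplexed session.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

TAG_LEN = 10  # assumed 80-bit authentication tag


def protect(key: bytes, packet: bytes) -> bytes:
    """Append a truncated HMAC-SHA1 tag (a stand-in for SRTP protection)."""
    tag = hmac.new(key, packet, hashlib.sha1).digest()[:TAG_LEN]
    return packet + tag


def find_context(contexts: Dict[str, bytes],
                 protected: bytes) -> Tuple[Optional[str], int]:
    """Try every context in turn; return (matching name or None, attempts)."""
    if len(protected) < TAG_LEN:
        return None, 0
    body, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
    attempts = 0
    for name, key in contexts.items():
        attempts += 1
        expected = hmac.new(key, body, hashlib.sha1).digest()[:TAG_LEN]
        if hmac.compare_digest(expected, tag):
            return name, attempts
    # No context verified: the receiver paid for one attempt per session.
    return None, attempts
```

With three contexts, a forged packet causes three verification attempts before it is discarded, whereas a session identifier readable before decryption would allow a single lookup.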

The Redefine the SSRC field proposal gets a partial due to its need to modify the key-management mechanisms to correctly identify the partial SSRC space the parameters apply to. Similarly, the SRTP implementation also needs to be updated to correctly support this security context differentiation.

The header extension based solution gets a less severe partial than Redefine the SSRC and the MKI. It will however have an issue when being gatewayed to a domain that does not multiplex multiple RTP sessions over the same transport. The gateway will then be required to be in the security context to be able to add or remove the header extension, as it is in the part of the packet that is integrity protected by SRTP.

The remaining two proposals do not affect SRTP mechanisms and thus successfully meet this requirement.

B.4. Don't Redefine Used Bits

This requirement is that RTP and RTCP header fields having a given definition should not be changed, as doing so can cause interoperability problems between modified and non-modified implementations. This becomes especially problematic in RTP sessions used for multi-party sessions.

Redefine the SSRC field gets a big fail on this as it redefines the SSRC field, a core field in RTP. It has been identified that such a change will have issues: if a modified end-point gets connected to a non-modified end-point that randomly assigns the SSRC, as specified by RFC 3550, those SSRCs will be distributed over different RTP sessions at the modified end-point. Other functions that use the SSRC field without understanding its additional semantics are also likely to have issues.

Using the SRTP MKI field to identify a session is overloading that field with double semantics.
This likely has minimal negative impact in RTP since it should be possible to have the SRTP stack use the MKI field to look up both the security context and the output RTP session the processed packet belongs to. However, this redefinition clearly creates issues with the key-management scheme. That will have to be modified both to handle this change and to deal with the interoperability issues when negotiating its usage. This gets a full fail because it makes the problem someone else's, namely the RTP implementors.

Defining an Octet in the Padding field redefines a field whose definition is to have zero value and which is expected to be ignored by the receiver according to the original semantics. Thus this is one of the more benign modifications one can make; however, it can still cause issues in implementations that unnecessarily check the field values, or in firewalls. This is judged to partially meet the requirement.

The Header Extension proposal does not in fact redefine any currently used bits in RTP. The header extension would be a correctly identified extension with its own definition. However, it does redefine a rule on what header extensions are for. The RTCP solution, however, would have a more severe impact as it would need to redefine the standard meaning of an RTCP packet header in addition to the default compound packet rules. Due to these issues the proposal fails to meet this requirement.

The multiplexing shim and the single session both successfully meet this requirement.

B.5. Firewall Friendly

This requirement is clearly difficult to judge as firewalls differ greatly in implementation, in the scope of what they inspect in packets, and in the policies set.
A reasonable goal is to maximize the likelihood that rules and policies intended to let RTP media streams pass will also let these streams through when multiplexing RTP sessions over a single transport. The analysis below shows that no solution is truly firewall friendly and all are judged as partially meeting this goal. However, the reasons why a firewall might react to the streams are quite different.

The Single Session and Redefine the SSRC field are likely the least suspect solutions from a firewall perspective. However, as their transport flows contain multiple SSRCs with payloads that likely indicate multiple different media types, they are still likely to make a picky firewall block the transport. This is especially true for firewalls that take signalling messages into account, where they will expect a particular media type in a given context. A non-upgraded firewall might in fact produce two different contexts with overlapping transport parameters where both rules will receive media streams of the other media type that are outside of the allowed rule. However, to be clear, if these proposals don't get through, none of the others will either as they all will have this behavior.

The header extension proposal is potentially problematic for two reasons. The first reason, which other proposals also share, is that the same SSRC value can exist in two RTP sessions over the same underlying flow. Anyone tracking the sequence number and timestamp will react badly as the second media stream with the same SSRC causes constant jumps back and forth in these fields compared to the first stream, if packets are transmitted simultaneously for both SSRCs. This issue can likely only be solved by having the firewalls that track flows also use the session identifier to create context.
This is possible as the header extension will be in the clear and at the front. The second issue is that the header extension itself may cause the firewall to react, especially very picky ones that expect packets with certain media types to have certain packet lengths. They are not compatible with a header extension.

The Multiplexing Shim shares the issue with multiple flows for the same SSRC. Firewalls and deep packet inspection put the shim placement in question. If it is a prefixed shim, it prevents the packets from looking like regular IP/UDP/RTP packets and being correctly classified in firewalls and DPI engines. However, if one puts it last, it is unlikely that any firewall or DPI engine will ever be able to take the session context into account as it is at the end of the packet. This is because many line rate processing devices only take a certain amount of the headers into account.

The SRTP MKI field is likely the solution that has the fewest firewall and DPI issues, after the single RTP session. There is no additional suspect field. The only difference from a single RTP session in the transport flow is the fact that multiple MKIs are guaranteed to be used. However, that may also occur in a single RTP session usage. Thus the only issues are the one shared with single session and the one that several RTP media streams may use the same SSRC.

The octet in the padding field has, in addition to the issues the SRTP MKI field has, the single issue that it redefines something that is supposed to be zero into a value, thus potentially causing a deeply inspecting firewall to clamp the flow in fear of a covert channel or non-compliance.

B.6. Monitoring and Reporting

The monitoring and reporting requirement considers several aspects.
The first is how useful the monitoring obtained from an existing legacy monitor is, and the second is any issues in upgrading such monitors to handle the selected solution. Thirdly, concerns about packet selector filters and packet sniffers are considered.

In general one can expect the proposals that have only a single SSRC space to work better with legacy monitors. Thus both Single Session and Redefine SSRC space can most likely gather and report data on media flows. The only potential issue is that due to the different media types and clock rates, some failures may occur. In particular a third party monitor may be targeted to a specific media type, like monitoring VoIP. That monitor will have problems processing any video packets correctly and will generate the VoIP specific metrics for any video sending SSRC. In general, no legacy solution for monitoring will be able to correctly create the sub-contexts that each RTP session has in the solutions, without an update to handle the new semantics. Also when it comes to packet filtering and selector filters, fine grained control can only be accomplished by implementing the new semantics. Therefore only the Single Session meets this requirement fully.

Redefine the SSRC field is close to fully meeting the requirement; however, because there exists a session structure that is hidden from anyone not upgraded to understand the semantics, it only gets a partial.

The other proposals can all have multiple RTP sessions using the same SSRC. This will create significant issues for any legacy third party monitor. Only an updated monitor, or for that matter packet selector, can pick out the individual media streams and their associated RTCP traffic. Thus all these proposals fail to meet the requirement.

B.7. Usable over Multicast

As discussed earlier, the goal of having the option usable over multicast as well is to remove the need to produce different media streams for transport over unicast and multicast. All of the proposals successfully meet the requirement.

B.8. Incremental Deployment

The possibility to incrementally deploy the multiplexing of multiple RTP sessions over a single transport, especially in the context of multi-party sessions, is a great benefit for any of the proposals. Then not all end-point implementations need to be upgraded before one starts enabling it in the central node and any signalling.

Consider a centralized multi-party application where some participants are using multiple transport flows and you want to enable one particular participant to use the single transport to the central node. One criterion stands out: the possibility to have one RTP session per transport in one leg, and in the next multiplex them together with minimal complexity and packet changes. Here there are significant differences.

The Multiplexing Shim has the least overhead for this, as the central node or gateway between deployments only needs to either add or remove the shim identifier and then forward the packet over the corresponding transport, either a joint one on the single transport side, or the individual one on the multiple transport side.

The SRTP MKI field proposal is almost as good, as the only main difference is the need to coordinate the used MKIs on the non-multiplexed legs so that there is no overlap between the RTP sessions. And if there is, the MKI can be translated in the gateway as SRTP has no integrity protection over the MKI. Thus both the multiplexing shim and the SRTP MKI field successfully meet this requirement.
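
The gateway operation for the Multiplexing Shim can be sketched in Python as follows. This is only an illustration under assumptions that this section does not fix: the shim is taken to be a single session ID octet appended after the packet (the postfixed variant), the per-session legs are modelled as plain lists, and all function names are hypothetical. Because the shim lies outside the SRTP protection, the gateway does not touch any protected bytes.

```python
from typing import Dict, List, Tuple


def add_shim(packet: bytes, session_id: int) -> bytes:
    """Towards the multiplexed side: append the one-octet session ID."""
    if not 0 <= session_id <= 0xFF:
        raise ValueError("session ID must fit in one octet")
    return packet + bytes([session_id])


def remove_shim(datagram: bytes) -> Tuple[int, bytes]:
    """Towards the non-multiplexed side: split off the session ID."""
    if len(datagram) < 2:
        raise ValueError("datagram too short to carry a packet and a shim")
    return datagram[-1], datagram[:-1]


def demux(datagram: bytes, legs: Dict[int, List[bytes]]) -> bool:
    """Forward a multiplexed datagram to the leg of its session.

    Returns False, forwarding nothing, when the session ID is unknown.
    """
    session_id, packet = remove_shim(datagram)
    if session_id not in legs:
        return False
    legs[session_id].append(packet)
    return True
```

In the other direction the gateway would call add_shim with the session ID associated with the leg on which the packet arrived and send the result on the joint transport.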

The Header Extension supports multiple full 32-bit SSRC spaces and can thus handle all the RTP sessions without need for any SSRC translation; however, this proposal does run into the problem that the gateway needs to be in the security context to be able to add or remove the header extension when SRTP is used. In addition to the security implications of that, there is a complexity overhead due to the need to redo the authentication tags on all RTP/RTCP packets. Thus it gets a partial.

The Octet in the Padding field shares issues with the header extension but has even higher complexity here. The reason is that the padding field is also encrypted. Thus adding or removing it (although removing it may be unnecessary) forces the end-point to encrypt at least that byte also, and for ciphers that are not stream ciphers, the whole packet needs to be re-encrypted. Thus this proposal gets a very weak partial.

The Single Session and Redefine the SSRC field do not allow several vanilla RTP sessions to be connected to these proposals. The reason is the single 32-bit SSRC space they have. Single Session only has one session and Redefine the SSRC field uses some of the bits as session identifier. This forces the gateway to translate the SSRC whenever it does not fulfill the rules or semantics of the multiplexed side. For Redefine SSRC field this becomes almost constant as the session identifier part of the SSRC must be the same over all SSRCs from the same session. For Single Session it may only be needed when there otherwise would be an SSRC collision between the sessions. This further assumes that the non-multiplexed side would never use any of the RTP mechanisms that require the same SSRC in multiple RTP sessions, as they cannot be gatewayed at all.
When translating an SSRC there is first of all an overhead, which with SRTP includes a complete cycle of authenticating, decrypting, encrypting and creating a new authentication tag. In addition, the SSRC translation could potentially be a deployment obstacle for new RTP/RTCP extensions, which must be understood by the translator to be correctly translated. Therefore these two proposals fail to meet the requirement.

B.9. Summary and Conclusion

This section contains a summary table of the high level outcome against the different requirements.

The ID numbers used in the table map to the requirements as follows:

1: Support multiple RTP sessions over one transport flow

2: Enable same SSRC value in multiple RTP sessions

2.1: Avoid SSRC translation in gateways/translators

2.2: Support existing extensions

3: Ensure SRTP functions

4: Don't Redefine used bits

5: Firewall Friendly

6: Monitoring and Reporting should still function

7: Usable over Multicast

8: Incremental deployment

OH: Overhead in Bytes. + means variable

---------------+---+---+---+---+---+---+---+---+---+----
Solution       | 1 |2.1|2.2| 3 | 4 | 5 | 6 | 7 | 8 | OH
---------------+---+---+---+---+---+---+---+---+---+----
Header Ext.    | S | S | P | P | F | P | F | S | P | 8+
Multiplex Shim | S | S | S | S | S | P | F | S | S | 1
Single Session | F | F | F | S | S | P | S | S | F | 0
SRTP MKI Field | S | S | S | P | F | P | F | S | S | 4
Padding Field  | S | S | S | F | P | P | F | S | P | 2
Redefine SSRC  | S | F | F | P | F | P | P | S | S | 0
---------------+---+---+---+---+---+---+---+---+---+----

Figure 5: Summary Table of Evaluation (Successfully (S), Partially (P) or Fails (F) to meet requirement)

Considering these options, the authors recommend that AVTCORE standardize a solution based on a postfixed or prefixed multiplexing field, i.e. a shim approach combined with the appropriate signalling as described in Appendix A.2.

Authors' Addresses

Magnus Westerlund
Ericsson
Farogatan 6
SE-164 80 Kista
Sweden

Phone: +46 10 714 82 87
Email: magnus.westerlund@ericsson.com

Colin Perkins
University of Glasgow
School of Computing Science
Glasgow G12 8QQ
United Kingdom

Email: csp@csperkins.org