Network Working Group                                      M. Westerlund
Internet-Draft                                                  Ericsson
Intended status: Standards Track                           C. S. Perkins
Expires: March 01, 2014                            University of Glasgow
                                                         August 28, 2013

        Multiple RTP Sessions on a Single Lower-Layer Transport
           draft-westerlund-avtcore-transport-multiplexing-06

Abstract

   This memo defines a mechanism to allow multiple RTP sessions to be
   multiplexed onto a single lower-layer transport flow (e.g., onto a
   single UDP 5-tuple).  Requirements for multiplexing RTP sessions are
   discussed, along with the trade-off between the different options.  A
   shim-based multiplexing layer is proposed, along with associated
   signalling.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 01, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  Motivations
     3.1.  NAT and Firewalls
     3.2.  No Transport Level QoS
     3.3.  Multiple RTP sessions
     3.4.  Usage of RTP Extensions
     3.5.  Incremental Deployment
     3.6.  Summary
   4.  Requirements
     4.1.  Support Use of Multiple RTP Sessions
     4.2.  Same SSRC Value in Multiple RTP Sessions
     4.3.  SRTP
     4.4.  Don't Redefine Used Bits
     4.5.  Firewall Friendly
     4.6.  Monitoring and Reporting
     4.7.  Usable Also Over Multicast
     4.8.  Incremental Deployment
   5.  Design Considerations
     5.1.  Location of SHIM
     5.2.  ICE and DTLS-SRTP Integration
     5.3.  Signalling Fall Back
   6.  Specification
     6.1.  Shim Layer
     6.2.  Signalling
     6.3.  SRTP Key Management
       6.3.1.  Security Description
       6.3.2.  DTLS-SRTP
       6.3.3.  MIKEY
     6.4.  Examples
       6.4.1.  RTP Packet with Transport Header
       6.4.2.  SDP Offer/Answer example
   7.  Open Issues
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informational References
   Appendix A.  Possible Solutions
     A.1.  Header Extension
     A.2.  Multiplexing Shim
     A.3.  Single Session
     A.4.  Use the SRTP MKI field
     A.5.  Use an Octet in the Padding
     A.6.  Redefine the SSRC field
   Appendix B.  Comparison
     B.1.  Support of Multiple RTP Sessions Over Single Transport
     B.2.  Enable Same SSRC Value in Multiple RTP Sessions
       B.2.1.  Avoid SSRC Translation in Gateways/Translation
       B.2.2.  Support Existing Extensions
     B.3.  Ensure SRTP Functions
     B.4.  Don't Redefine Used Bits
     B.5.  Firewall Friendly
     B.6.  Monitoring and Reporting
     B.7.  Usable over Multicast
     B.8.  Incremental Deployment
     B.9.  Summary and Conclusion
   Authors' Addresses

1.  Introduction

   With the ongoing development of the WebRTC conferencing and CLUE
   telepresence standards, there is renewed interest in defining a
   mechanism that allows multiple RTP sessions [RFC3550] to share a
   single lower layer transport, such as a bi-directional UDP flow.  The
   main problem driving this is the cost of doing NAT/firewall traversal
   for each individual RTP flow.  ICE and other NAT/firewall traversal
   solutions are clearly capable of attempting to open multiple flows.
   However, there is both an increased risk of failure and an increased
   cost in the creation of multiple flows.  The increased cost comes as
   slightly higher delay in establishing the traversal, and the amount
   of consumed NAT/firewall resources.  The latter might be an
   increasing problem in the IPv4 to IPv6 transition period.

   There is ongoing work on specifying how and when one RTP session can
   contain multiple media types
   [I-D.ietf-avtcore-multi-media-rtp-session].  That addresses certain
   use cases, while this proposal addresses a different set of use cases
   and motivations (discussed further in Section 3).  The classical
   method of having each RTP session run over a specific transport flow
   is still motivated for a number of use cases, especially when flow
   based QoS is to be used for some media streams.

   This document draws up some requirements for consideration on how to
   transport multiple RTP sessions over a single lower-layer transport.
   These requirements have to be weighed carefully, as no known
   solution exists that can fulfil the combined set of requirements
   completely.
   A number of possible solutions were considered and
   discussed with respect to their properties.  Based on that, this memo
   defines a multiplexing shim, along with SDP signalling, and examples.

   The other considered proposals and their comparison are available as
   appendices.

2.  Conventions

   The following terminology is used in this document.

   Multiplexing:  Unless specifically noted, all mentions of
      multiplexing in this document refer to the multiplexing of
      multiple RTP Sessions on the same lower layer transport.  It is
      important to make this distinction as RTP does contain a number of
      multiplexing points for various purposes, such as media formats
      (Payload Type), media sources (SSRC), and RTP sessions.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Motivations

   This section looks at the motivations for why an additional solution
   is needed, assuming that you can do both the classical method of
   having one RTP session per transport flow [RFC3550], and when you
   have multiple media types within one RTP session
   [I-D.ietf-avtcore-multi-media-rtp-session].

3.1.  NAT and Firewalls

   The existence of NATs and Firewalls on almost all Internet access has
   implications for protocols, like RTP, that were designed to use
   multiple transport-layer flows.  The NAT/firewall traversal solution
   has to ensure that all these transport flows are established.
   This has three different impacts:

   1.  Increased delay to perform the transport flow establishment.

   2.  The more transport flows, the more state and the more resource
       consumption in the NAT and Firewalls.  When the resource
       consumption in NAT/firewalls reaches their limits, unexpected
       behaviours usually occur, commonly resulting in service
       disruptions.

   3.  More transport flows means a higher risk that some transport flow
       fails to be established, thus preventing the application from
       communicating.

   Using fewer transport-layer flows reduces the risk of communication
   failure, improves establishment behaviour, and causes less load on
   NATs and firewalls.

3.2.  No Transport Level QoS

   Many RTP-using applications don't utilize any network level Quality
   of Service functions.  Nor do they expect or desire any separation in
   network treatment of their media packets, independent of whether they
   are audio, video or text.  When an application has no such desire, it
   doesn't need to provide a transport flow structure that simplifies
   flow based QoS.

3.3.  Multiple RTP sessions

   The use of multiple RTP sessions allows separation of media streams
   that have different usages or purposes in an RTP based application,
   for example to separate the video of a presenter or most important
   current talker from those of the listeners that not all end-points
   receive.  Separation of flows into different RTP sessions also allows
   different processing based on media types, such as audio and video,
   in end-points and middleboxes.  This can give middleboxes the
   knowledge that any SSRC within the session is supposed to be
   processed in a similar way.

   For simpler cases, where the streams within each media type need the
   same processing, it is clearly possible to find other multiplex
   solutions, for example based on the Payload Type and the differences
   in encoding that the payload type allows to describe.  This can
   however be insufficient when you get into more advanced usages where
   you have multiple sources of the same media type, but for different
   usages or as alternatives.
   For example, when you have one set of
   video sources that shows session participants and another set of
   video sources that shares an application or presentation slides, you
   likely want to separate those streams for various reasons such as
   control, prioritization, QoS, methods for robustness, etc.  In those
   cases, using the RTP session for separation of properties is a
   powerful tool, one with properties that need to be preserved when
   providing a solution for how to use only a single lower-layer
   transport.

   For more discussion of the usage of RTP sessions versus other
   multiplexing we recommend RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture].

3.4.  Usage of RTP Extensions

   Applications use different sets of RTP extensions.  The solution for
   multiple media types in one RTP session
   [I-D.ietf-avtcore-multi-media-rtp-session] is known to have
   limitations that prevent the usage of the following RTP mechanisms
   and extensions:

   o  XOR FEC (RFC5109)

   o  RTP Retransmission in session mode (RFC4588)

   o  Certain Layered Coding

   A developed solution needs to minimize the number of RTP/RTCP
   extensions and mechanisms that can't be used.

3.5.  Incremental Deployment

   In various multi-party communication scenarios deployment can become
   an issue if all session participants need to have the functionality
   before enabling its usage.  This is especially difficult in
   communication scenarios where not all possible participants and their
   capabilities are known ahead of establishing the communication
   session with some sub-set of the participants.  At least for
   centralized communication sessions it is desirable to have a solution
   that can be used on a single leg without affecting any other leg, and
   without requiring advanced translation functionality in any central
   node.

3.6.  Summary

   The centre of the motivation is to ensure that the RTP session is an
   available and usable tool also for applications that have no need for
   network level separation of their media streams and want to reduce
   their exposure to any NAT or Firewall inconsistencies and minimize
   the resource consumption.  As a benefit, a well designed solution
   will enable incremental deployment and impose minimal limitations on
   which existing RTP mechanisms or extensions can be used by the
   RTP-using application.

4.  Requirements

   This section lists and discusses a number of potential requirements.
   However, it is not difficult to realize that it is in fact possible
   to set requirements that make the set of feasible solutions an empty
   set.  It is thus necessary to consider which requirements are
   essential to fulfil and which can be compromised on to arrive at a
   solution.

4.1.  Support Use of Multiple RTP Sessions

   Section 3.3 discusses a number of reasons why an application might
   like to have multiple RTP sessions.  Considering the motivations for
   this work this is an absolute requirement.  We also are of the
   opinion that the session provided by the solution needs to fulfil the
   definition in the RTP [RFC3550] specification:

      "The distinguishing feature of an RTP session is that each
      maintains a full, separate space of SSRC identifiers (defined
      next).  The set of participants included in one RTP session
      consists of those that can receive an SSRC identifier transmitted
      by any one of the participants either in RTP as the SSRC or a CSRC
      (also defined below) or in RTCP."

4.2.  Same SSRC Value in Multiple RTP Sessions

   Two different RTP sessions being multiplexed on the same lower layer
   transport need to be able to use the same SSRC value.  This is an
   absolute requirement, for two reasons:

   1.  To avoid mandating SSRC assignment rules that are coordinated
       between the sessions.
       If the RTP sessions multiplexed together
       need to have unique SSRC values, then additional code that works
       between RTP Sessions is needed in the implementations, thus
       raising the bar for implementing this solution.  In addition, if
       one gateways between parts of a system using this multiplexing
       and parts that aren't multiplexing, the part that isn't
       multiplexing also needs to fulfil the requirements on how SSRC is
       assigned or force the gateway to translate SSRCs.  Translating
       SSRC is actually hard as it requires one to understand the
       semantics of all current and future RTP and RTCP extensions.
       Otherwise a barrier for deploying new extensions is created.

   2.  There are a few RTP extensions that currently rely on being
       able to use the same SSRC in different RTP sessions:

       *  XOR FEC (RFC5109)

       *  RTP Retransmission in session mode (RFC4588)

       *  Certain Layered Coding

4.3.  SRTP

   SRTP [RFC3711] is one of the most commonly used security solutions
   for RTP.  In addition, it is the only one defined by IETF that is
   integrated into RTP.  This integration has several aspects that need
   to be considered when designing a solution for multiplexing RTP
   sessions on the same lower layer transport.

   Determining Crypto Context:  SRTP first of all needs to know which
      session context a received or to-be-sent packet relates to.  It
      also normally relies on the lower layer transport to identify the
      session.  It uses the Master Key Indicator (MKI), if present, to
      determine which key set is to be used.  Then the SSRC and sequence
      number are used by most crypto suites, including the most common
      use of AES Counter Mode, to actually generate the correct cipher
      stream.

   Unencrypted Headers:  SRTP has chosen to leave the RTP headers and
      the first two 32-bit words of the first RTCP header unencrypted,
      to allow for both header compression and monitoring to work also
      in the presence of encryption.  As these fields are in clear text
      they are used in most crypto suites for SRTP to determine how to
      protect or recover the plain text.

   It is here important to contrast SRTP against a set of other possible
   protection mechanisms.  DTLS, TLS, and IPsec all protect and
   encapsulate the entire RTP and RTCP packets.  They don't perform
   any partial operations on the RTP and RTCP packets.  Any change that
   is considered to be part of the RTP and RTCP packet is transparent to
   them, but possibly not to SRTP.  Thus the impact on SRTP operations
   has to be considered when defining a mechanism.

4.4.  Don't Redefine Used Bits

   As the core of RTP is in use in many systems and has a really large
   deployment story and numerous implementations, changing any of the
   field definitions is highly problematic.  First of all, the
   implementations need to change to support the new semantics.
   Secondly, you get a large transition issue when you have some session
   participants that support the new semantics and some that don't.
   Combining the two behaviours in the same session can force the
   deployment of costly and less than perfect translation devices.

4.5.  Firewall Friendly

   It is desirable that current Firewalls will accept the solutions as
   normal RTP packets.  However, in the authors' opinion we can't let
   the firewall stifle invention and evolution of the protocol.  It is
   also necessary to be aware that a change that will make most deep
   inspecting firewalls consider the packet as not valid RTP/RTCP will
   have a more difficult deployment story.

4.6.  Monitoring and Reporting

   It is desirable that a third party monitor can still operate on the
   multiplexed RTP Sessions.  It is however likely that such monitors
   will require an update to correctly monitor and report on multiplexed
   RTP Sessions.

   Another type of function to consider is packet sniffers and their
   selector filters.  These can be impacted by a change of the fields.
   An observation is that many such systems are usually quite rapidly
   updated to consider new types of standardized or simply common packet
   formats.

4.7.  Usable Also Over Multicast

   It is desirable that a solution can be used if RTP and RTCP packets
   are sent over multicast, both Any Source Multicast (ASM) and
   Source-Specific Multicast (SSM).  The reason for this requirement is
   to allow a system using RTP to use the same configuration regardless
   of the transport being done over unicast or multicast.  In addition,
   multicast can't be claimed to have an issue with using multiple
   ports, as each multicast group has a complete port space scoped by
   address.

4.8.  Incremental Deployment

   A good solution has the property that in topologies that contain RTP
   mixers or Translators, a single session participant can enable
   multiplexing without having any impact on any other session
   participants.  Thus a node ought to be able to take a multiplexed
   packet and then easily send it out with minimal or no modification on
   another leg of the session, where each RTP session is transported
   over its own lower-layer transport.  It also needs to be as easy to
   do the reverse forwarding operation.

5.  Design Considerations

   When defining a SHIM solution for identifying RTP sessions over a
   single transport layer there are some special considerations, which
   are discussed in this section.

5.1.  Location of SHIM

   A major question affecting the SHIM is the location of the SHIM
   header providing the Identifier of the session the packet relates to.
   This section discusses in detail the impact of making the
   different choices.

   Identified aspects to consider are:

   Possibility to Process:  A prefixed shim header, i.e. between the
      transport protocol and the RTP/RTCP packet header, has the
      advantage that any node on the network that likes to include the
      header in any per-packet processing can reach it.  Reasons for
      per-packet processing are:

      a.  Quality of Service classification

      b.  SHIM ingress or egress

      c.  Monitoring

      Many routers or similar devices can only read and process the
      first N bytes of the whole packet, where N is commonly on the
      order of 64-128 bytes.  Any other type of processing means putting
      the packet on the slow path.  Thus a prefixed solution enables
      this processing while a postfixed solution will most likely
      forever prevent this type of device from processing it.

   Legacy Processing:  Packets or at least flows of the type IP/UDP/RTP
      can in many cases be identified in Deep Packet Inspection,
      Firewalls or other network entities that concern themselves with
      trying to determine what traffic a particular packet carries.
      These nodes can clearly be updated, but until they are, they can
      create a barrier to deployment.  Thus a postfix likely gives the
      least resistance for initial deployment.  However, also for the
      postfix location the deployment can be hindered in cases where
      multiple RTP sessions use the same SSRC values, due to irregular
      behaviour of the fields for what the third party believes is one
      media stream rather than multiple ones.  The prefixed location
      will however maintain the long-term capabilities of such devices,
      assuming they can be updated to include the SHIM header as part of
      the classification.

   Header Compression:  The different header compression techniques that
      have been developed compress IP/UDP/RTP as a complete combination.
      If one instead has IP/UDP/SHIM/RTP then the compression for the
      full set might not work, or might work poorly.  Instead only
      IP/UDP header compression is likely to be applied.  Thus a prefix
      will lose some compression efficiency until compression profiles
      for IP/UDP/SHIM/RTP have been developed, implemented and deployed.
      Postfix doesn't have that issue, but nor can it ever gain anything
      from header compression, which a prefixed solution could once an
      updated profile is deployed.  Postfix also will have reduced
      efficiency compressing sessions when the same SSRC is used in two
      different RTP sessions, as the RTP header fields like sequence
      number, etc., will not behave as expected and need frequent
      explicit updates.

   The question of a prefixed or a postfixed header comes down to a
   trade-off between long term usability and deployment issues:

   Prefixed:  Long term good possibility to adapt any network function
      that needs to take the SHIM header into account.  At the same time
      any function that tries to analyse packets and because of that
      might block the packets will be a hindrance to deployment.

   Postfixed:  This solution will likely short term have the best
      possibilities to deploy successfully.  However, long term this
      choice will likely prevent many network nodes that like to be
      capable of separating the RTP sessions being multiplexed together
      from successfully doing that.

   After discussion in the working group it has been determined that
   prefixed is the preferred solution.

5.2.  ICE and DTLS-SRTP Integration

   When using ICE [RFC5245] or DTLS-SRTP [RFC5764] or both with RTP
   there exists the issue that RTP, STUN [RFC5389] and DTLS-SRTP are
   simultaneously in use over the same lower layer transport flow, like
   UDP.
   This multiplexing is based on the value of the first byte of
   the lower layer transport payload as discussed in Section 5.1.2 of
   DTLS-SRTP [RFC5764].

   The replacement of a single RTP session with the multiple RTP
   sessions identified by a SHIM ought not be misidentified as either
   STUN or DTLS-SRTP or any other protocol intending to take the
   available free code-points in the range 193-255 (Decimal).  Thus a
   prefixed SHIM needs to have the two first bits of its first byte
   set to 10 (Binary).  Having the SHIM share the identity of RTP is not
   an issue as there has to be mutual agreement that the SHIM is used
   instead of RTP.

   Note: This limits a single byte SHIM to only allow a maximum of 64
   RTP sessions over a single transport flow.

5.3.  Signalling Fall Back

   There is an important aspect to how the SDP signalling functions,
   especially Offer/Answer [RFC3264].  The initial idea for the
   signalling was to build on top of bundle
   [I-D.ietf-mmusic-sdp-bundle-negotiation], which in its default
   function negotiates multiple media types over one RTP session
   [I-D.ietf-avtcore-multi-media-rtp-session].  With that approach, the
   signalling for a solution whose main purpose is to enable multiple
   RTP sessions could, in cases where the peer doesn't support this
   specification, leave the communicating peers in a single RTP session,
   if the peer supports that.

   We consider it important that the signalling design lets the
   application developer decide what type of fall back will occur.  It
   is also important to consider that one has to be able to signal SHIM
   based multiplexing of RTP sessions that are in fact of the type with
   multiple media types.  Thus the signalling for SHIM has to be able to
   describe multiple different scenarios:

   1.  Multiple RTP sessions multiplexed together using SHIM over one
       transport

   2.  Like 1, but where at least one RTP session contains multiple
       media types

   3.  Like 1, but where the peer doesn't support SHIM and the initiator
       wants to fall back to independent transports

   4.  Like 2, but where the peer doesn't support SHIM and wants to fall
       back to multiple BUNDLED sessions over independent transports.

   In addition it needs to be possible to have multiple different
   transports where each is a SHIM multiplex.  This is to support
   decomposed end-points or cases where certain media traffic has to go
   to a central processing node while other traffic goes directly to a
   peer.

   To enable all of these scenarios we propose a solution where each
   SHIM multiplex is indicated as its own grouping attribute
   across all media blocks that are included in some form in the
   multiplex.  This results in these media blocks falling under a
   form of BUNDLE super set.  This super set will also have some of
   bundle's restrictions on the transport layer, but not on higher
   layers.  Which Session ID pair a particular media block is associated
   with is signalled using an SDP attribute (a=session-mux-id) in each
   media block.  When multiple media blocks are assigned the same
   session ID pair, they form an RTP session with multiple media types
   and have the full restriction of bundle between them.

   The method of fall back is indicated by providing explicit BUNDLE
   grouping in addition to the SHIM when the fall back from SHIM is to
   BUNDLE.

   Note: Signalling solution is awaiting resolution of design path for
   bundle and will then consider that solution and issues raised.

6.  Specification

   This section contains the specification of the RTP session
   multiplexing SHIM, using an explicit session identifier of the
   encapsulated payload.

6.1.  Shim Layer

   This solution is based on a shim layer that is inserted in the stack
   between the regular RTP and RTCP packets and the transport layer
   being used by the RTP sessions.  Thus the layering looks like the
   following:

             +---------------------+
             |  RTP / RTCP Packet  |
             +---------------------+
             |  Session ID Layer   |
             +---------------------+
             |  Transport layer    |
             +---------------------+

                  Stack View with Session ID SHIM

   The above stack is in fact a layered one as it does allow multiple
   RTP Sessions to be multiplexed on top of the Session ID shim layer.
   This enables the example presented in Figure 1 where four sessions,
   S1-S4, are sent over the same Transport layer and where the Session
   ID layer will combine and encapsulate them with the session ID on
   transmission and separate and decapsulate them on reception.

             +-------------------+
             | S1 | S2 | S3 | S4 |
             +-------------------+
             | Session ID Layer  |
             +-------------------+
             | Transport layer   |
             +-------------------+

       Figure 1: Multiple RTP Session On Top of Session ID Layer

   The Session ID layer encapsulates one RTP or RTCP packet from a given
   RTP session and prefixes the 2-byte Session ID layer to the packet.
   The Session ID layer is depicted below (Figure 2) and consists of
   2 fixed bit values (10b) followed by a 14-bit unsigned integer
   field with the Session ID (SID) value.

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1 0|      Session ID (SID)     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: Session ID layer

   Each RTP session being multiplexed on top of a given transport layer
   is assigned either a single SID or a pair of unique SIDs in the range
   0-16383.
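
   As a non-normative illustration, the encapsulation and decapsulation
   performed by the Session ID layer of Figure 2 could be sketched in
   Python as shown here.  The function names encapsulate() and
   decapsulate() are hypothetical and are not defined by this memo, and
   the sketch assumes that the 16-bit shim is carried in network byte
   order, as the bit layout in Figure 2 suggests.

```python
import struct
from typing import Tuple

SHIM_PREFIX = 0b10   # fixed value of the two leading bits (Figure 2)
MAX_SID = 0x3FFF     # largest 14-bit Session ID, i.e. 16383


def encapsulate(sid: int, packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with the 2-byte Session ID layer.

    Hypothetical helper: network byte order is an assumption.
    """
    if not 0 <= sid <= MAX_SID:
        raise ValueError("SID must be in the range 0-16383")
    shim = (SHIM_PREFIX << 14) | sid
    return struct.pack("!H", shim) + packet


def decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Split a transport payload into (SID, RTP/RTCP packet)."""
    if len(payload) < 2:
        raise ValueError("payload too short to carry the shim")
    (shim,) = struct.unpack("!H", payload[:2])
    if shim >> 14 != SHIM_PREFIX:
        # Leading bits other than 10b mean this is not a shim packet.
        raise ValueError("leading bits are not 10b")
    return shim & MAX_SID, payload[2:]


# Usage: SID 5 in front of a dummy 12-byte RTP header (V=2, PT=96).
rtp = bytes([0x80, 0x60]) + bytes(10)
wire = encapsulate(5, rtp)
assert wire[:2] == b"\x80\x05"
assert decapsulate(wire) == (5, rtp)
```

   Because the two leading bits are fixed to 10b, the first byte of an
   encapsulated packet always falls in the range 128-191, the same range
   that the first byte of an RTP or RTCP packet occupies.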
   The reason for assigning a pair of SIDs to a given RTP session is to
   allow RTP sessions that don't support "Multiplexing RTP Data and
   Control Packets on a Single Port" [RFC5761] to still use a single
   5-tuple.  The reason for supporting this extra functionality is that
   RTP and RTCP multiplexing based on the payload type/packet type
   fields enforces certain restrictions on the RTP sessions.  These
   restrictions might not be acceptable.  As this solution does not
   have these restrictions, performing RTP and RTCP multiplexing in
   this way has benefits.

   Each Session ID value space is scoped by the underlying transport
   protocol.  Common transport protocols like UDP [RFC0768], DCCP
   [RFC4340], TCP [RFC0793], and SCTP [RFC4960] can all be scoped by one
   or more 5-tuples (transport protocol, source address and port,
   destination address and port).  Multiple 5-tuples occur in
   multi-unicast topologies, also called meshed multiparty RTP
   sessions, or when an application needs more than 8192 RTP sessions
   (the number of SID pairs that fit in the 14-bit SID space).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-------------------------------+
   |1 0|        Session ID         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                           timestamp                           | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |           synchronization source (SSRC) identifier            | |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   |            contributing source (CSRC) identifiers             | |
   |                               ....                            | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                   RTP extension (OPTIONAL)                    | |
 +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | |                          payload  ...                         | |
 | |                               +-------------------------------+ |
 | |                               | RTP padding   | RTP pad count | |
 +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
 | ~                      SRTP MKI (OPTIONAL)                      ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | :               authentication tag (RECOMMENDED)                : |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 +- Encrypted Portion*                      Authenticated Portion ---+

         Figure 3: SRTP Packet encapsulated by Session ID Layer

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-------------------------------+
   |1 0|        Session ID         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   |V=2|P|    RC   |  PT=SR or RR  |             length            | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                         SSRC of sender                        | |
 +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
 | ~                          sender info                          ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | ~                         report block 1                        ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | ~                         report block 2                        ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | ~                              ...                              ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | |V=2|P|    SC   |  PT=SDES=202  |             length            | |
 | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
 | |                          SSRC/CSRC_1                          | |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | ~                           SDES items                          ~ |
 | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
 | ~                              ...                              ~ |
 +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
 | |E|                         SRTCP index                         | |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
 | ~                     SRTCP MKI (OPTIONAL)                      ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | :                      authentication tag                       : |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 +-- Encrypted Portion                    Authenticated Portion -----+

         Figure 4: SRTCP packet encapsulated by Session ID layer

   When the Session ID layer is present, a receiver processes each
   packet as follows:

   1.  Pick up the packet from the lower layer transport

   2.  Inspect the SID field value

   3.  Strip the SID field from the packet

   4.  Forward it to the (S)RTP session context identified by the SID
       value

6.2.  Signalling

   Note: This section might need updating once the direction of the
   solution for BUNDLE has settled and the impact of the raised issues
   has been analysed.

   The use of the Session ID layer needs to be explicitly agreed on
   between the communicating parties.  Each RTP session the application
   uses needs, in addition to the regular configuration such as payload
   types, RTCP extensions, etc., to have both the underlying 5-tuple
   (source address and port, destination address and port, and
   transport protocol) and the Session ID used for the particular RTP
   session.  The signalling requirement is to assign unique Session ID
   values to all RTP sessions being sent over the same 5-tuple.  The
   same Session ID SHALL be used for an RTP session independently of
   the traffic direction.  Note that nothing prevents a multi-media
   application from using multiple 5-tuples if desired for some reason,
   in which case each 5-tuple has its own Session ID value space.

   This section defines how to negotiate the use of the Session ID
   layer, using the Session Description Protocol (SDP) Offer/Answer
   mechanism [RFC3264].
   A new SDP grouping semantics, "SHIM", is defined, along with a new
   media-level SDP attribute, 'session-mux-id'.  The attribute allows
   each media description ("m=" line) associated with a 'SHIM' group to
   identify the RTP session to which it belongs.

   The 'session-mux-id' attribute is included for a media description
   in order to indicate the Session ID for that particular media
   description.  Every media description that shares a common attribute
   value is assumed to be part of a single RTP session.  An SDP Offerer
   MUST include the 'session-mux-id' attribute for every media
   description associated with a 'SHIM' group.  If the SDP Answer does
   not contain the SHIM group, the SDP Offerer MUST NOT use SHIM based
   layering.  Whether the result is then separate RTP sessions or
   BUNDLE is determined by what was present in the offer and answer,
   and depends on what the offering party wants to happen.  If the
   Offerer wants a failure to negotiate SHIM to result in one or more
   BUNDLE groups instead, the BUNDLE grouping is also included in the
   offer.  If the SDP Answer still describes a 'BUNDLE' group, the
   procedures in [I-D.ietf-mmusic-sdp-bundle-negotiation] apply.  If
   not, independent transports and sessions are used.

   An SDP Answerer MUST NOT include the 'SHIM' group and
   'session-mux-id' attribute in an SDP Answer, unless they were
   included in the SDP Offer.

   The attribute has the following ABNF [RFC5234] definition.

   Session-mux-id-attr = "a=session-mux-id:" SID *SID-prop
   SID                 = SID-value / SID-pairs
   SID-value           = 1*5DIGIT / "NoN"         ; 0-16383
   SID-pairs           = SID-value "/" SID-value  ; RTP/RTCP SIDs
   SID-prop            = SP (assignment-policy / prop-ext)
   prop-ext            = token "=" value
   assignment-policy   = "policy=" ("tentative" / "fixed")

   The SHIM group SHALL contain all media descriptions that are intended
   to be sent over the same transport flow, independent of Session ID.
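   As an illustration of the attribute syntax above, the following
   sketch parses one 'session-mux-id' line.  It is not a complete SDP
   parser.  The type and function names are hypothetical, any SID in
   the 0-16383 range is accepted, and a single SID is assumed to be
   shared by RTP and RTCP.

```python
import re
from typing import Dict, NamedTuple, Optional

SID_MAX = 16383  # upper bound of the 14-bit SID space


class SessionMuxId(NamedTuple):
    rtp_sid: Optional[int]   # None represents the "NoN" value
    rtcp_sid: Optional[int]  # assumption: equals rtp_sid for a single SID
    props: Dict[str, str]


def _parse_sid(text: str) -> Optional[int]:
    if text == "NoN":
        return None
    if not re.fullmatch(r"[0-9]+", text) or int(text) > SID_MAX:
        raise ValueError(f"bad SID value: {text!r}")
    return int(text)


def parse_session_mux_id(line: str) -> SessionMuxId:
    """Parse 'a=session-mux-id:<SID>[/<SID>] [name=value ...]'."""
    prefix = "a=session-mux-id:"
    if not line.startswith(prefix):
        raise ValueError("not a session-mux-id attribute")
    sid_part, *prop_parts = line[len(prefix):].split(" ")
    sids = [_parse_sid(s) for s in sid_part.split("/")]
    if len(sids) > 2:
        raise ValueError("at most two SIDs (RTP/RTCP) are allowed")
    props: Dict[str, str] = {}
    for prop in prop_parts:
        name, sep, value = prop.partition("=")
        if not (name and sep and value):
            raise ValueError(f"bad property: {prop!r}")
        props[name] = value
    if props.get("policy", "tentative") not in ("tentative", "fixed"):
        raise ValueError("policy must be 'tentative' or 'fixed'")
    return SessionMuxId(sids[0], sids[-1], props)
```

   For example, parsing "a=session-mux-id:4/5 policy=fixed" yields an
   RTP SID of 4, an RTCP SID of 5, and the property policy=fixed.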
   For all media descriptions that are part of the same SHIM group, the
   transport parameters, i.e., ports, ICE candidates, etc., MUST be the
   same and handled as described by BUNDLE.  Note that the parameters
   related to the RTP session do not need to be the same.

   Media descriptions that have the same Session ID value SHALL be
   treated the same way as if they were part of a BUNDLE group,
   independently of whether that is indicated in the SDP.

   The SID property "policy" is used in negotiation by an end-point to
   indicate whether the Session ID values are merely a tentative
   suggestion or whether they need to have these values.  This is used
   when negotiating SIDs for multi-party RTP sessions to support shared
   transports such as multicast, or RTP translators that are unable to
   produce renumbered SIDs on a per end-point basis.  The normal
   behaviour is that the offer suggests a tentative set of values,
   indicated by "policy=tentative".  These SHOULD be accepted by the
   peer unless that peer negotiates Session IDs on behalf of a
   centralized policy, in which case it MAY change the value(s) in the
   answer.  If the offer represents a policy that does not allow
   changing the Session ID values, it can indicate that to the answerer
   by setting the policy to "fixed".  This enables the answering peer
   to either accept the value or indicate that there is a conflict in
   who is performing the assignment by setting the SID value to NoN
   (Not a Number).  Offerer and answerer SHOULD always include the
   policy they are operating under.  Thus, in the case of no
   centralized behaviours, both offerer and answerer will indicate the
   tentative policy.

6.3.  SRTP Key Management

   Key management for SRTP needs discussion, as we cause multiple SRTP
   sessions to exist on the same underlying transport flow.  Thus we
   need to ensure that the key management mechanism is still properly
   associated with the SRTP session context it intends to key.  To
   ensure that, we look at the three SRTP key management mechanisms
   that the IETF has specified, one after another.

6.3.1.  Security Description

   Session Description Protocol (SDP) Security Descriptions for Media
   Streams [RFC4568], being based on SDP, has no issue with the RTP
   session multiplexing on a lower layer specified here.  The reason is
   that the actual keying is done using a media-level SDP attribute.
   Thus the attribute is already associated with a particular media
   description, which will also have an instance of the
   "a=session-mux-id" attribute carrying the SID value/pair used with
   these particular crypto parameters.

6.3.2.  DTLS-SRTP

   Datagram Transport Layer Security (DTLS) Extension to Establish Keys
   for the Secure Real-time Transport Protocol (SRTP) [RFC5764] is a
   keying mechanism that works on the media plane on the same lower
   layer transport that SRTP/SRTCP will be transported over.

   The most direct solution would be to apply the SHIM and the SID
   context identifier to DTLS packets as well, thus using the same SID
   that is used with RTP and/or RTCP also for the DTLS messages
   intended to key that particular SRTP and/or SRTCP flow(s).  This of
   course requires independent usage of DTLS-SRTP for each RTP session.
   In addition, it requires changing the layering for DTLS-SRTP as well
   as RTP.  Thus this behaviour gains nothing with regard to key
   management when using SHIM, and has some costs.

   Instead we propose that a DTLS-SRTP key-derivation change is
   introduced.  By including the Session ID value in the derivation of
   the keying material, a single DTLS-SRTP key-management operation
   could supply keys and parameters for all the RTP sessions in the
   same transport flow.  Thus the keying cost is significantly reduced,
   especially with regard to network communication, delay impact, and
   vulnerability to packet loss.

   Details to be written up.

6.3.3.  MIKEY

   MIKEY: Multimedia Internet KEYing [RFC3830] is a key management
   protocol that has several transports.  In some cases it is used
   directly on a transport protocol such as UDP, but there is also a
   specification for how MIKEY is used with SDP: "Key Management
   Extensions for Session Description Protocol (SDP) and Real Time
   Streaming Protocol (RTSP)" [RFC4567].

   Let's start with the latter, i.e., the SDP transport, which shares
   the property with Security Descriptions that it can be associated
   with a particular media description in an SDP.  As long as one
   avoids using the session-level attribute, one can be certain to
   correctly associate the key exchange with a given SRTP/SRTCP
   context.

   It does appear that MIKEY directly over a lower layer transport
   protocol will have similar issues to DTLS.

6.4.  Examples

6.4.1.  RTP Packet with Transport Header

   The figure below contains an RTP packet with the SID field,
   encapsulated in a UDP packet (added UDP header).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 0|        Session ID         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                           timestamp                           | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |           synchronization source (SSRC) identifier            | |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   |            contributing source (CSRC) identifiers             | |
   |                               ....                            | |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                   RTP extension (OPTIONAL)                    | |
 +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | |                          payload  ...                         | |
 | |                               +-------------------------------+ |
 | |                               | RTP padding   | RTP pad count | |
 +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
 | ~                      SRTP MKI (OPTIONAL)                      ~ |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 | :               authentication tag (RECOMMENDED)                : |
 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
 +- Encrypted Portion*                      Authenticated Portion ---+

              SRTP Packet Encapsulated by Session ID Layer

6.4.2.  SDP Offer/Answer example

6.4.2.1.  Basic Example

   This section contains SDP offer/answer examples.  First one example
   of a successful SHIM negotiation, and then two where fall back
   occurs.  The fall back option here is to fall back to individual
   transports, thus no BUNDLE group.

   In the SDP offer below, one audio and one video media description
   are offered.  The audio is using SID 0, and the video is using SID 1,
   to indicate that they are different RTP sessions despite being
   offered over the same 5-tuple.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that supports this SHIM:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      m=video 20000 RTP/AVP 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that does not support this SHIM:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=rtpmap:0 PCMU/8000
      m=video 30000 RTP/AVP 32
      b=AS:1000
      a=rtpmap:32 MPV/90000

6.4.2.2.  Advanced Example

   In this example we have two BUNDLED sessions, one with audio and
   video and one with XOR based FEC [RFC5109] for the audio and the
   video.  These two RTP sessions are then SHIMed into a single
   transport flow.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar 1 2
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:0 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 10000 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      a=session-mux-id:1 policy=tentative
      m=video 10000 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=session-mux-id:1 policy=tentative
      a=rtpmap:101 ulpfec/90000

   The SDP answer of a client supporting
   [I-D.ietf-mmusic-sdp-bundle-negotiation] but not this SHIMing would
   look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20002 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20002 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In the above case there are two different RTP sessions, both of a
   BUNDLE type with multiple media types in each.  The two established
   flows will be Alice:10000<->Bob:20000 and Alice:10000<->Bob:20002.
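
   The fall back outcomes illustrated by the answers in this section
   follow from which group semantics appear in both the offer and the
   answer.  A rough sketch of that classification is given below.  The
   function names are hypothetical, and the sketch only inspects
   "a=group:" lines, ignoring every other aspect of offer/answer
   processing.

```python
from typing import Set


def group_semantics(sdp: str) -> Set[str]:
    """Collect the semantics tokens of all a=group: lines in an SDP."""
    found = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=group:"):
            found.add(line[len("a=group:"):].split(" ", 1)[0])
    return found


def negotiated_mode(offer: str, answer: str) -> str:
    """Return 'SHIM', 'BUNDLE', or 'independent' for an offer/answer pair."""
    # A grouping only counts if it was offered and kept in the answer.
    common = group_semantics(offer) & group_semantics(answer)
    if "SHIM" in common:
        return "SHIM"
    if "BUNDLE" in common:
        return "BUNDLE"
    return "independent"
```

   With the offer above, an answer that keeps only the BUNDLE groups is
   classified as 'BUNDLE', and an answer with neither SHIM nor BUNDLE
   groups as 'independent'.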

   If the peer supported neither the SHIM nor the BUNDLE extension, the
   answer would look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20002 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20004 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20006 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In this case four different transport flows would be established for
   RTP, each with a different RTP session over it.  The answerer also
   knows the binding between the sessions with FEC and their source
   data thanks to the FEC grouping.

7.  Open Issues

   This work is still at a relatively early phase.  This section
   contains a list of open issues where the authors desire some input.

   1.  In Section 6.2 there is a discussion of which parameters need to
       be configured.  The scope of these rules, and whether they make
       sense, needs additional discussion.

   2.  Can we provide better control, so that applications that do not
       want to fall back to a single RTP session when the multiplexing
       shim is not supported but BUNDLE is supported end up with a
       better alternative?

   3.  The details of how to do key-derivation, preferably in such a
       way that it can be reused by multiple key-management solutions
       like MIKEY and DTLS-SRTP.

   4.  The signalling solution will be revisited when the BUNDLE
       solution discussion has yielded some result.

8.  IANA Considerations

   (tbd: complete the details of the IANA registration for the SDP
   attribute)

9.  Security Considerations

   The security properties of the Session ID layer depend on what
   mechanism is used to protect the RTP and RTCP packets of a given RTP
   session.  If IPsec or transport layer security solutions such as
   DTLS or TLS are being used, then both the encapsulated RTP/RTCP
   packets and the Session ID layer will be protected by that security
   mechanism, thus potentially providing confidentiality, integrity,
   and source authentication.  If SRTP is used, the Session ID layer
   will not be directly protected by SRTP.  However, it will be
   implicitly integrity protected (assuming the RTP/RTCP packet is
   integrity protected), as the only function of the field is to
   identify the session context.  Thus any modification of the SID
   field will result in an attempt to retrieve the wrong SRTP crypto
   context.  If that retrieval fails, the packet will be discarded
   anyway.  If it is successful, the context will not lead to
   successful verification of the packet.

10.  Acknowledgements

   This document is based on input from various people, especially in
   the context of the RTCWEB discussion of how to use only a single
   lower layer transport.  The RTP and RTCP packet figures are borrowed
   from RFC3711.  The SDP example is extended from the one present in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  Eric Rescorla contributed
   the basic idea of optimizing the DTLS-SRTP key-management by
   modifying the key derivation process.

   The proposal in Appendix A.5 was originally suggested by Colin
   Perkins.  The idea in Appendix A.6 is from an Internet-Draft
   [I-D.rosenberg-rtcweb-rtpmux] written by Jonathan Rosenberg et al.
   The proposal in Appendix A.3 is a result of discussion by a group of
   people at IETF meeting #81 in Quebec.

11.  References

11.1.  Normative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Multiplexing Negotiation Using Session Description
              Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-04 (work in
              progress), June 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informational References

   [I-D.ietf-avtcore-multi-media-rtp-session]
              Westerlund, M., Perkins, C., and J. Lennox, "Sending
              Multiple Types of Media in a Single RTP Session",
              draft-ietf-avtcore-multi-media-rtp-session-03 (work in
              progress), July 2013.

   [I-D.lennox-rtcweb-rtp-media-type-mux]
              Rosenberg, J. and J. Lennox, "Multiplexing Multiple Media
              Types In a Single Real-Time Transport Protocol (RTP)
              Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work
              in progress), October 2011.

   [I-D.rosenberg-rtcweb-rtpmux]
              Rosenberg, J., Jennings, C., Peterson, J., Kaufman, M.,
              Rescorla, E., and T. Terriberry, "Multiplexing of Real-
              Time Transport Protocol (RTP) Traffic for Browser based
              Real-Time Communications (RTC)",
              draft-rosenberg-rtcweb-rtpmux-00 (work in progress), July
              2011.

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Perkins, C., and H. Alvestrand,
              "Guidelines for using the Multiplexing Features of RTP",
              draft-westerlund-avtcore-multiplex-architecture-03 (work
              in progress), February 2013.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340, March 2006.

   [RFC4567]  Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E.
              Carrara, "Key Management Extensions for Session
              Description Protocol (SDP) and Real Time Streaming
              Protocol (RTSP)", RFC 4567, July 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol", RFC
              4960, September 2007.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245, April
              2010.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
              Real-Time Transport Control Protocol (RTCP): Opportunities
              and Consequences", RFC 5506, April 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for the Secure
              Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

Appendix A.  Possible Solutions

   This section documents the solutions explored when selecting a SHIM
   based one and discusses their feasibility.

A.1.  Header Extension

   One proposal is to define an RTP header extension [RFC5285] that
   explicitly enumerates the session identifier in each packet.  This
   proposal has some merits regarding RTP, since it uses an existing
   extension mechanism; it explicitly enumerates the session, allowing
   third parties to associate the packet with a given RTP session; and
   it works with SRTP as currently defined, since a header extension is
   by default not encrypted and is thus readable by the receiving stack
   without needing to guess which session it belongs to and attempt to
   decrypt it.  This approach does, however, conflict with the
   requirement from [RFC5285] that "header extensions using this
   specification MUST only be used for data that can be safely ignored
   by the recipient", since correct processing of the received packet
   depends on using the header extension to demultiplex it to the
   correct RTP session.

   Using a header extension also results in the session ID being in the
   integrity protected part of the packet.  Thus a translator between a
   multiplexed and a non-multiplexed domain has the following options:

   1.  to be part of the security context to verify the field

   2.  to be part of the security context to verify the field and remove
       it before forwarding the packet

   3.  to be outside of the security context and leave the header
       extension in the packet.  However, that requires successful
       negotiation of the header extension, but not of the
       functionality, with the receiving end-points.
1292 The biggest existing hurdle for this solution is that there exist no 1293 header extension field in the RTCP packets. This requires defining a 1294 solution for RTCP that allows carrying the explicit indicator, 1295 preferably in a position that isn't encrypted by SRTCP. However, the 1296 current SRTCP definition does not offer such a position in the 1297 packet. 1299 Modifying the RR or SR packets is possible using profile specific 1300 extensions. However, that has issues when it comes to deployment and 1301 in addition any information placed there would end up in the 1302 encrypted part. 1304 Another alternative could be to define another RTCP packet type that 1305 only contains the common header, using the 5 bits in the first byte 1306 of the common header to carry a session id. That would allow SRTCP 1307 to work correctly as long it accepts this new packet type being the 1308 first in the packet. Allowing a non-SR/RR packet as the first packet 1309 in a compound RTCP packet is also needed if an implementation is to 1310 support Reduced Size RTCP packets [RFC5506]. The remaining downside 1311 with this is that all stack implementations supporting multiplexing 1312 would need to modify its RTCP compound packet rules to include this 1313 packet type first. Thus a translator box between supporting nodes 1314 and non-supporting nodes needs to be in the crypto context. 1316 This solution's per packet overhead is expected to be 64-bits for 1317 RTCP. For RTP it is 64-bits if no header extension was otherwise 1318 used, and an additional 16 bits (short header), or 24 bits plus (if 1319 needed) padding to next 32-bits boundary if other header extensions 1320 are used. 1322 A.2. Multiplexing Shim 1324 This proposal is to prefix or postfix all RTP and RTCP packets with a 1325 session ID field. This field would be outside of the normal RTP and 1326 RTCP packets, thus having no impact on the RTP and RTCP packets and 1327 their processing. 
An additional step of demultiplexing processing 1328 would be added prior to RTP stack processing to determine in which 1329 RTP session context the packet is to be included. This has also no 1330 impact on SRTP/SRTCP as the shim layer would be outside of its 1331 protection context. The shim layer's session ID is however 1332 implicitly integrity protected as any error in the field will result 1333 in the packet being placed in the wrong or non-existing context, thus 1334 resulting in a integrity failure if processed by SRTP/SRTCP. 1336 This proposal is quite simple to implement in any gateway or 1337 translating device that goes from a multiplexed to a non-multiplexed 1338 domain or vice versa, as only an additional field needs to be added 1339 to or removed from the packet. 1341 The main downside of this proposal is that it is very likely to 1342 trigger a firewall response from any deep packet inspection device. 1343 If the field is prefixed, the RTP fields are not matching the 1344 heuristics field (unless the shim is designed to look like an RTP 1345 header, in which case the payload length is unlikely to match the 1346 expected value) and thus are likely preventing classification of the 1347 packet as an RTP packet. If it is postfixed, it is likely classified 1348 as an RTP packet but might not correctly validate if the content 1349 validation is such that the payload length is expected to match 1350 certain values. It is expected that a postfixed shim will be less 1351 problematic than a prefixed shim in this regard, but we are lacking 1352 hard data on this. 1354 This solution's per packet overhead is 1 byte. 1356 A.3. Single Session 1358 Given the difficulty of multiplexing several RTP sessions onto a 1359 single lower-layer transport, it's tempting to send multiple media 1360 streams in a single RTP session. 
Doing this avoids the need to de- 1361 multiplex several sessions on a single transport, but at the cost of 1362 losing the RTP session as a separator for different type of streams. 1363 Lacking different RTP sessions to demultiplex incoming packets, a 1364 receiver will have to dig deeper into the packet before determining 1365 what to do with it. Care has to be taken in that inspection. For 1366 example, it is important to be careful to ensure that each real media 1367 source uses its own SSRC in the session and that this SSRC doesn't 1368 change media type. 1370 The loss of the RTP session as a separator for different usages or 1371 purpose would be an minor issue if the only difference between the 1372 RTP sessions is the media type. In this case, the application could 1373 use the Payload Type field to identify the media type. The loss of 1374 the RTP Session functionality is however severe, if the application 1375 uses the RTP Session for separating different treatments, contexts 1376 etc. Then you would need additional signalling to bind the different 1377 sources to groups which can help make the necessary distinctions. 1379 However, the loss of the RTP session as separator is not the only 1380 issue with this approach. The RTP Multiplexing Architecture 1381 [I-D.westerlund-avtcore-multiplex-architecture] discusses a number of 1382 issues in Section 6.7. These include RTCP bandwidth differences, 1383 limitations in the number of payload types, media aware RTP mixers 1384 and interactions with Legacy end-points. 1386 Additional attention needs to be placed on this important aspect. In 1387 multi-party situations using central nodes there exist some 1388 difficulties in having a legacy implementation using multiple RTP 1389 sessions interworking with an end-point having only a single RTP 1390 session across the central node. 
The main reason is that the end-point using a single session with multiple media types has only one SSRC space, while the other end-points have multiple spaces. Thus translation might have to occur because there are several RTP sessions using the same SSRC value. This brings limitations, processing overhead, and the possibility of becoming a deployment obstacle for new RTP/RTCP extensions.

This approach has been proposed in the RTCWeb context in [I-D.lennox-rtcweb-rtp-media-type-mux] and [I-D.ietf-mmusic-sdp-bundle-negotiation]. These drafts describe how to signal multiple media streams multiplexed into a single RTP session, and address some of the issues raised here and in Section 6.7 of the RTP Multiplexing Architecture [I-D.westerlund-avtcore-multiplex-architecture] draft.

This method has several limitations that limit its usage as a solution for providing multiple RTP sessions on the same lower-layer transport. However, we acknowledge that there are some uses for which this method can be sufficient and which can accept the method's limitations and downsides. The RTCWEB WG has a working assumption to support this method. For more details of this method, see the relevant drafts under development. We include this method in the comparison to provide a more complete picture of its pros and cons.

This solution has no per-packet overhead. The signalling overhead is a different question.

A.4. Use the SRTP MKI field

This proposal is to overload the SRTP/SRTCP MKI identifier so that it identifies not only a particular crypto context, but also the actual RTP session. This clearly is a misuse of the MKI field; however, it appears to have few negative implications. SRTP already supports handling of multiple crypto contexts.
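As an illustration of the overloading described above, the following is a minimal sketch of a receiver-side table in which one MKI value selects both a crypto context and an RTP session. It is not taken from any SRTP library: the names MkiBinding and lookup_mki, the 4-byte MKI length, and the string placeholders for keying state are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MkiBinding:
    """Hypothetical record binding one MKI to a crypto context and an RTP session."""
    crypto_context: str  # placeholder for real SRTP keying state
    rtp_session_id: int  # multiplexed RTP session the packet belongs to


def lookup_mki(bindings: Dict[bytes, MkiBinding], mki: bytes) -> Optional[MkiBinding]:
    """Return the binding for an MKI, or None when the MKI is unknown."""
    return bindings.get(mki)


# Assumed 4-byte MKI values, matching the 32-bit overhead quoted for this proposal.
bindings = {
    b"\x00\x00\x00\x01": MkiBinding("audio-keys", rtp_session_id=0),
    b"\x00\x00\x00\x02": MkiBinding("video-keys", rtp_session_id=1),
}
```

A packet whose MKI is not in the table has neither a crypto context nor a session, so in this sketch the caller would discard it.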
This proposal has two major downsides. The first is that it requires using SRTP/SRTCP to multiplex multiple sessions on a single lower-layer transport. The second is that the session ID parameter needs to be put into the various key-management schemes, which must be made to understand that the reason for establishing multiple crypto contexts is that they are connected to different RTP sessions. Considering that SRTP has at least three keying mechanisms in use, DTLS-SRTP [RFC5764], Security Descriptions [RFC4568], and MIKEY [RFC3830], this is not an insignificant amount of work.

This solution has a 32-bit per-packet overhead, but only if the MKI was not already used.

A.5. Use an Octet in the Padding

The basis of this proposal is to have the RTP packet, and the last RTCP packet in a compound (as mandated by RFC 3550), include at least 2 bytes of padding: one byte for the padding count (the last byte) and one byte just before the padding count containing the session ID.

This proposal carries the session ID in bytes that have no defined value and are intended to be ignored by the receiver. From that perspective, it only causes packet expansion that is supported and handled by all existing equipment. If an implementation fails to understand that it needs to interpret this padding byte to learn the session ID, it will see a mostly coherent RTP session except where SSRCs overlap or where the payload types overlap. However, reporting on the individual sources or forwarding the RTCP RR is not completely without merit.

There is one downside of this proposal, and it has to do with SRTP. To be able to determine the crypto context, it is necessary to access the encrypted payload of the packet.
Thus, the only mechanism available for a receiver to solve this issue is to try the existing crypto contexts for any session on the same lower-layer transport and then use the one where the packet decrypts and verifies correctly. Thus, for transport flows with many crypto contexts, an attacker could simply generate packets that don't validate, forcing the receiver to try all of its crypto contexts rather than immediately discarding the packet as not matching a context. A receiver can mitigate this somewhat by using heuristics based on the RTP header fields to determine which context applies to a received packet, but this is not a complete solution.

This solution has a 16-bit per-packet overhead.

A.6. Redefine the SSRC field

The Internet-Draft by Rosenberg et al., "Multiplexing of Real-Time Transport Protocol (RTP) Traffic for Browser based Real-Time Communications (RTC)" [I-D.rosenberg-rtcweb-rtpmux], proposed to redefine the SSRC field. This has the advantage of no packet expansion. It also looks like regular RTP. However, it has a number of implications. First of all, it prevents any RTP functionality that requires the same SSRC in multiple RTP sessions.

Secondly, its interoperability with end-points using multiple RTP sessions is problematic. Such interoperability will require an SSRC translator function in the gateway node to ensure that the SSRCs fulfil the semantic rules of the different domains. That translator is far from easy to build, as it needs to understand the semantics of all RTP and RTCP extensions that include SSRC/CSRC. This is because it is necessary to know when a particular matching 32-bit pattern is an SSRC field and when it is just a combination of other fields that creates the same matching 32-bit pattern.
Thus there is a possibility that such a translator becomes an obstacle in deploying future RTP/RTCP extensions. In addition, the translator has significant overhead when SRTP is in use. This is because verification that the packet is authentic, decryption, SSRC translation, encryption, and finally generation of authentication tags are all needed. In addition, the translator has to be part of the security context.

This solution has no per-packet overhead.

Appendix B. Comparison

This section compares the above potential solutions with the requirements. Motivations are provided in addition to a high-level metric of successfully, partially, or failing to meet each requirement. At the end, a summary table (Figure 5) of the high-level values is provided.

B.1. Support of Multiple RTP Sessions Over Single Transport

This one is easy to determine. Only the Single Session proposal fails this requirement, as it is not at all designed to meet it. The rest fully support this requirement. The main question around this requirement is how important it is, as discussed in Section 4.1.

B.2. Enable Same SSRC Value in Multiple RTP Sessions

Based on the discussion in Section 4.2, two sub-requirements have been derived.

B.2.1. Avoid SSRC Translation in Gateways/Translators

This sub-requirement is derived from the desire to avoid having gateways or translators perform full SSRC translation, in order to minimize complexity, avoid the requirement to have gateways in the security context, and avoid a hindrance to long-term evolution. Two of the proposals have issues with this, due to their lack of support for multiple 32-bit SSRC spaces and the resulting inability to have the same SSRC value in multiple RTP sessions. The proposals that have these properties and thus are marked as failing are the Single Session and Redefine the SSRC field.
The other proposals are all successful in meeting this requirement.

B.2.2. Support Existing Extensions

The second sub-requirement is how well the proposals support using the existing RTP mechanisms. Here both Single Session and Redefine the SSRC field will have clear issues, as they cannot support the same full 32-bit SSRC value in two different RTP sessions. This is clearly an issue for the XOR-based FEC. RTP retransmission and scalable encoding are minor issues, as there exist alternatives to those mechanisms that work with the structure of these two proposals. Thus we give them a fail. The Header Extension gets a partial due to the unclear interaction between inserting a header extension and these mechanisms.

B.3. Ensure SRTP Functions

This requirement is about ensuring both secure and efficient usage of SRTP. The Octet in Padding field proposal gets a fail, as the receiving end-point cannot determine the intended RTP session prior to decryption of the padding field. Thus a catch-22 arises, which can only be resolved by trying all session contexts and seeing which one decrypts. This causes a security vulnerability, as an attacker can inject a packet which does not match any of the session contexts. The receiver will then attempt decryption and authentication of it using all its session contexts, increasing the amount of wasted resources by a factor equal to the number of multiplexed sessions. Thus this proposal gets a fail.

The proposal of overloading the SRTP MKI field as session identifier gets a partial due to the fact that it cannot use SRTP's key-management mechanisms out of the box. It forces the key-management mechanism and the SRTP implementations to maintain the MKI-to-RTP session bindings in order to function securely and correctly.
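The cost of the trial approach described above for the Octet in Padding proposal can be illustrated with a small sketch. This is not an SRTP implementation: a truncated HMAC-SHA1 over the whole packet stands in for SRTP authentication, and the function names, key values, and 10-byte tag length are assumptions made for this example. The point is only that a packet matching no context costs one verification per multiplexed session.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

TAG_LEN = 10  # assumed tag length in bytes; illustrative only


def protect(key: bytes, packet: bytes) -> bytes:
    """Append a truncated HMAC-SHA1 tag (a stand-in for SRTP protection)."""
    return packet + hmac.new(key, packet, hashlib.sha1).digest()[:TAG_LEN]


def find_session(keys: Dict[int, bytes], protected: bytes) -> Tuple[Optional[int], int]:
    """Try each session's key in turn.

    Returns (matching session ID or None, number of verifications performed).
    """
    body, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
    attempts = 0
    for session_id, key in keys.items():
        attempts += 1
        expected = hmac.new(key, body, hashlib.sha1).digest()[:TAG_LEN]
        if hmac.compare_digest(expected, tag):
            return session_id, attempts
    return None, attempts


# Three hypothetical multiplexed sessions, each with its own key.
keys = {0: b"key-for-session-0", 1: b"key-for-session-1", 2: b"key-for-session-2"}
genuine = protect(keys[2], b"rtp-packet-bytes")
forged = b"rtp-packet-bytes" + bytes(TAG_LEN)  # tag matches no context
```

With these inputs, the forged packet is rejected only after every context has been tried, whereas a receiver that could read a session identifier before verification would need at most one attempt.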
The Redefine the SSRC field proposal gets a partial due to its need to modify the key-management mechanisms to correctly identify the partial SSRC space that the parameters apply to. Similarly, the SRTP implementation also needs to be updated to correctly support this security context differentiation.

The Header Extension based solution gets a less severe partial than Redefine the SSRC field and the SRTP MKI field. It will, however, have an issue when using a gateway to a domain that does not multiplex multiple RTP sessions over the same transport. The gateway will then need to be in the security context to be able to add or remove the header extension, as it is in the part of the packet that is integrity protected by SRTP.

The remaining two proposals do not affect SRTP mechanisms and thus successfully meet this requirement.

B.4. Don't Redefine Used Bits

This requirement is that RTP and RTCP header fields having a given definition ought not to be changed, as this can cause interoperability problems between modified and non-modified implementations. This becomes especially problematic in RTP sessions used for multi-party sessions.

Redefine the SSRC field gets a big fail on this, as it redefines the SSRC field, a core field in RTP. It has been identified that such a change will have issues: if a modified end-point gets connected to a non-modified end-point that randomly assigns the SSRC, as specified by RFC 3550, those SSRCs will be distributed over different RTP sessions at the modified end-point. Also, other functions that use the SSRC field without understanding its additional semantics are likely to have issues.

Using the SRTP MKI field to identify a session is overloading that field with double semantics.
This likely has minimal negative impact in RTP, since it ought to be possible to have the SRTP stack use the MKI field to look up both the security context and the output RTP session that the processed packet belongs to. However, this redefinition clearly creates issues with the key-management scheme, which will have to be modified both to handle this change and to deal with the interoperability issues when negotiating its usage. This gets a full fail because it makes the problem someone else's, namely the RTP implementers'.

Defining an Octet in the Padding field redefines a field whose definition is to have zero value and which is expected to be ignored by the receiver according to the original semantics. Thus this is one of the more benign modifications one can make; however, it can still cause issues in implementations that unnecessarily check the field values, or in firewalls. This is judged to be partially meeting the requirement.

The Header Extension proposal does not in fact redefine any currently used bits in RTP. The header extension would be a correctly identified extension with its own definition. However, it does redefine a rule on what header extensions are for. The RTCP solution, however, would have a more severe impact, as it would need to redefine the standard meaning of an RTCP packet header in addition to the default compound packet rules. Due to these issues, the proposal fails to meet this requirement.

The Multiplexing Shim and the Single Session both successfully meet this requirement.

B.5. Firewall Friendly

This requirement is clearly difficult to judge, as firewalls differ greatly in implementation, in the scope of what they investigate in packets, and in the policies set.
A reasonable goal is to maximize the likelihood that rules and policies intended to let RTP media streams pass will also let these streams through when multiplexing RTP sessions over a single transport. The analysis below shows that no solution is truly firewall friendly, and all are judged as partially meeting this goal. However, the reasons why it is believed that a firewall might react to the streams are quite different.

The Single Session and Redefine the SSRC field are likely the least suspect solutions from a firewall perspective. However, as their transport flows contain multiple SSRCs with payloads that likely indicate multiple different media types, they are still likely to make a picky firewall block the transport. This is especially true for firewalls that take signalling messages into account and expect a particular media type in a given context. A non-upgraded firewall might in fact produce two different contexts with overlapping transport parameters, where both rules will receive media streams of the other media type that are outside of the allowed rule. However, to be clear, if these proposals don't get through, none of the others will either, as they all will have this behaviour.

The Header Extension proposal is potentially problematic for two reasons. The first reason, which other proposals also share, is that the same SSRC value can exist in two RTP sessions over the same underlying flow. Anyone tracking the sequence number and timestamp will react badly, as the second media stream with the same SSRC causes constant jumps back and forth in these fields compared to the first stream, if packets are transmitted simultaneously for both SSRCs. This issue can likely only be solved by having the firewalls that track flows also use the session identifier to create context.
This is possible, as the header extension will be in the clear and at the front. The second issue is that the header extension itself can get the firewall to react, especially very picky firewalls that expect packets with certain media types to have certain packet lengths. They are not compatible with a header extension.

The Multiplexing Shim shares the issue with multiple flows for the same SSRC. Firewalls and deep packet inspection (DPI) put the shim placement in question. If it is a prefixed shim, it prevents the packets from looking like regular IP/UDP/RTP packets and from being correctly classified in firewalls and DPI engines. However, if one puts it last, it is unlikely that any firewall or DPI engine will ever be able to take the session context into account, as it is at the end of the packet. This is because many line-rate processing devices only take a certain amount of the headers into account.

The SRTP MKI field is likely the solution that has the fewest firewall and DPI issues, after the single RTP session. There is no additional suspect field. The only difference from a single RTP session in the transport flow is the fact that multiple MKIs are guaranteed to be used. However, that can also occur in single RTP session usage. Thus the only issues are the one shared with Single Session and the one that several RTP media streams can use the same SSRC.

The Octet in the Padding field has, in addition to the issues the SRTP MKI field has, the single issue that it redefines something that is supposed to be zero into a value. This could potentially cause a deeply inspecting firewall to clamp down on the flow for fear of a covert channel or non-compliance.

B.6. Monitoring and Reporting

The monitoring and reporting requirement considers several aspects.
First, how useful is the monitoring that one can get from an existing legacy monitor? Second, are there any issues in upgrading such monitors to handle the selected solution? Third, concerns regarding packet selector filters and packet sniffers are considered.

In general, one can expect the proposals that have only a single SSRC space to work better with legacy monitors. Thus, with both Single Session and Redefine the SSRC field, a legacy monitor can most likely gather and report data on media flows. The only potential issue is that, due to the different media types and clock rates, some failures can occur. In particular, a third-party monitor can be targeted to a specific media type, like monitoring VoIP. That monitor will have problems processing any video packets correctly and will generate the VoIP-specific metrics for any video-sending SSRC. In general, no legacy solution for monitoring will be able to correctly create the sub-contexts that each RTP session has in the solutions without an update to handle the new semantics. Also, when it comes to packet filtering and selector filters, fine-grained control can only be accomplished by implementing the new semantics. Therefore only the Single Session meets this requirement fully.

Redefine the SSRC field is close to fully meeting the requirement; however, because there exists a session structure that is hidden from anyone not upgraded to understand the semantics, it only gets a partial.

The other proposals can all have multiple RTP sessions using the same SSRC. This will create significant issues for any legacy third-party monitor. Only an updated monitor, or for that matter packet selector, can pick out the individual media streams and their associated RTCP traffic. Thus all these proposals fail to meet the requirement.

B.7. Usable over Multicast

As discussed earlier, the goal of having the option also usable over multicast is to remove the need to produce different media streams for transport over unicast and multicast. All of the proposals successfully meet the requirement.

B.8. Incremental Deployment

The possibility to incrementally deploy the multiplexing of multiple RTP sessions over a single transport, especially in the context of multi-party sessions, is a great benefit for any of the proposals. It means that not all end-point implementations need to be upgraded before one starts enabling it in the central node and any signalling.

Considering a centralized multi-party application where some participants are using multiple transport flows and one wants to enable one particular participant to use a single transport to the central node, one criterion stands out: the possibility to have one RTP session per transport in one leg, and in the next leg multiplex them together with minimal complexity and packet changes. Here there are significant differences.

The Multiplexing Shim has the least overhead for this, as the central node or gateway between deployments only needs to either add or remove the shim identifier and then forward the packet over the corresponding transport: either the joint one on the single transport side, or the individual one on the multiple transport side.

The SRTP MKI field proposal is almost as good, as the only main difference is the need to coordinate the used MKIs on the non-multiplexed legs so that there is no overlap between the RTP sessions. If there is overlap, the MKI can be translated in the gateway, as SRTP has no integrity protection over the MKI. Thus both the Multiplexing Shim and the SRTP MKI field successfully meet this requirement.
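The gateway operation described above for the Multiplexing Shim can be sketched as follows. This is a minimal illustration that assumes a postfixed, one-byte session ID; the placement is an assumption of the example (the shim could equally be prefixed), and the function names add_shim and remove_shim are hypothetical.

```python
from typing import Tuple


def add_shim(packet: bytes, session_id: int) -> bytes:
    """Append a one-byte session ID before sending on the multiplexed leg."""
    if not 0 <= session_id <= 255:
        raise ValueError("session ID must fit in one byte")
    return packet + bytes([session_id])


def remove_shim(muxed: bytes) -> Tuple[bytes, int]:
    """Strip the trailing session ID; return (original packet, session ID)."""
    if len(muxed) < 1:
        raise ValueError("packet too short to carry a shim")
    return muxed[:-1], muxed[-1]
```

The packet bytes in front of the shim are forwarded untouched in both directions, so under these assumptions the gateway needs no access to the SRTP security context.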
The Header Extension supports multiple full 32-bit SSRC spaces and can thus handle all the RTP sessions without need for any SSRC translation. However, this proposal does run into the problem that the gateway needs to be in the security context to be able to add or remove the header extension when SRTP is used. In addition to the security implications of that, there is a complexity overhead due to the need to redo the authentication tags on all RTP/RTCP packets. Thus it gets a partial.

The Octet in the Padding field shares issues with the Header Extension but has even higher complexity. The reason is that the padding field is also encrypted. Thus adding or removing it (although removing it might be unnecessary) forces the end-point to encrypt at least that byte as well, and for ciphers that are not stream ciphers, the whole packet needs to be re-encrypted. Thus this proposal is judged as only very weakly partially meeting the requirement.

The Single Session and Redefine the SSRC field do not allow several vanilla RTP sessions to be connected to these proposals. The reason is the single 32-bit SSRC space they have. Single Session only has one session, and Redefine the SSRC field uses some of the bits as session identifier. This forces the gateway to translate the SSRC whenever it does not fulfil the rules or semantics of the multiplexed side. For Redefine the SSRC field, this becomes almost constant, as the session identifier part of the SSRC has to be the same over all SSRCs from the same session. For Single Session, it might only be needed when there otherwise would be an SSRC collision between the sessions. This further assumes that the non-multiplexed side would never use any of the RTP mechanisms that require the same SSRC in multiple RTP sessions, as they cannot be gatewayed at all.
Translating an SSRC first of all incurs an overhead, which with SRTP includes a complete cycle of authenticating, decrypting, encrypting, and creating a new authentication tag. In addition, the SSRC translation could potentially be a deployment obstacle for new RTP/RTCP extensions, which have to be understood by the translator to be correctly translated. Therefore these two proposals fail to meet the requirement.

B.9. Summary and Conclusion

This section contains a summary table of the high-level outcome against the different requirements.

The requirements map to the ID numbers used in the table as follows:

1: Support multiple RTP sessions over one transport flow

2: Enable same SSRC value in multiple RTP sessions

2.1: Avoid SSRC translation in gateways/translators

2.2: Support existing extensions

3: Ensure SRTP functions

4: Don't Redefine used bits

5: Firewall Friendly

6: Monitoring and Reporting still needs to function

7: Usable over Multicast

8: Incremental deployment

OH: Overhead in Bytes. + means variable

---------------+---+---+---+---+---+---+---+---+---+----
Solution       | 1 |2.1|2.2| 3 | 4 | 5 | 6 | 7 | 8 | OH
---------------+---+---+---+---+---+---+---+---+---+----
Header Ext.    | S | S | P | P | F | P | F | S | P | 8+
Multiplex Shim | S | S | S | S | S | P | F | S | S | 1
Single Session | F | F | F | S | S | P | S | S | F | 0
SRTP MKI Field | S | S | S | P | F | P | F | S | S | 4
Padding Field  | S | S | S | F | P | P | F | S | P | 2
Redefine SSRC  | S | F | F | P | F | P | P | S | F | 0
---------------+---+---+---+---+---+---+---+---+---+----

Figure 5: Summary Table of Evaluation (Successfully (S), Partially (P) or Fails (F) to meet requirement)

Considering these options, the authors would recommend that AVTCORE standardize a solution based on a postfixed or prefixed multiplexing field, i.e., a shim approach combined with the appropriate signalling as described in Appendix A.2.

Authors' Addresses

Magnus Westerlund
Ericsson
Farogatan 6
SE-164 80 Kista
Sweden

Phone: +46 10 714 82 87
Email: magnus.westerlund@ericsson.com

Colin Perkins
University of Glasgow
School of Computing Science
Glasgow G12 8QQ
United Kingdom

Email: csp@csperkins.org