Network Working Group                                      M. Westerlund
Internet-Draft                                                  Ericsson
Intended status: Standards Track                              C. Perkins
Expires: January 14, 2013                          University of Glasgow
                                                           July 13, 2012


        Multiple RTP Sessions on a Single Lower-Layer Transport
           draft-westerlund-avtcore-transport-multiplexing-03

Abstract

   This document specifies how multiple RTP sessions are to be
   multiplexed on the same lower-layer transport, e.g. a UDP flow.  It
   discusses various requirements that have been raised and their
   feasibility, which results in a solution with a certain
   applicability.  A solution is recommended and that solution is
   provided in more detail, including signalling and examples.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 14, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Motivations
     3.1.  NAT and Firewalls
     3.2.  No Transport Level QoS
     3.3.  Multiple RTP sessions
     3.4.  Usage of RTP Extensions
     3.5.  Incremental Deployment
     3.6.  Summary
   4.  Requirements
     4.1.  Support Use of Multiple RTP Sessions
     4.2.  Same SSRC Value in Multiple RTP Sessions
     4.3.  SRTP
     4.4.  Don't Redefine Used Bits
     4.5.  Firewall Friendly
     4.6.  Monitoring and Reporting
     4.7.  Usable Also Over Multicast
     4.8.  Incremental Deployment
   5.  Design Considerations
     5.1.  Location of SHIM
     5.2.  Signalling Fallback
   6.  Specification
     6.1.  Shim Layer
     6.2.  Signalling
     6.3.  SRTP Key Management
       6.3.1.  Security Description
       6.3.2.  DTLS-SRTP
       6.3.3.  MIKEY
     6.4.  Examples
       6.4.1.  RTP Packet with Transport Header
       6.4.2.  SDP Offer/Answer example
   7.  Open Issues
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informational References
   Appendix A.  Possible Solutions
     A.1.  Header Extension
     A.2.  Multiplexing Shim
     A.3.  Single Session
     A.4.  Use the SRTP MKI field
     A.5.  Use an Octet in the Padding
     A.6.  Redefine the SSRC field
   Appendix B.  Comparison
     B.1.  Support of Multiple RTP Sessions Over Single Transport
     B.2.  Enable Same SSRC Value in Multiple RTP Sessions
       B.2.1.  Avoid SSRC Translation in Gateways/Translation
       B.2.2.  Support Existing Extensions
     B.3.  Ensure SRTP Functions
     B.4.  Don't Redefine Used Bits
     B.5.  Firewall Friendly
     B.6.  Monitoring and Reporting
     B.7.  Usable over Multicast
     B.8.  Incremental Deployment
     B.9.  Summary and Conclusion
   Authors' Addresses

1.  Introduction

   There has been renewed interest in a solution that allows multiple
   RTP sessions [RFC3550] to use a single lower-layer transport, such
   as a bi-directional UDP flow.  The main reason is the cost of doing
   NAT/FW traversal for each individual flow.  ICE and other NAT/FW
   traversal solutions are clearly capable of attempting to open
   multiple flows.  However, there is both an increased risk of failure
   and an increased cost in the creation of multiple flows.  The
   increased cost comes as slightly higher delay in establishing the
   traversal, and in the amount of consumed NAT/FW resources.  The
   latter might be an increasing problem in the IPv4 to IPv6 transition
   period.

   There is ongoing work on specifying how and when one RTP session may
   contain multiple media types
   [I-D.westerlund-avtcore-multi-media-rtp-session].  That addresses
   certain use cases, while this proposal addresses a different set of
   use cases and motivations.  This is further discussed in the section
   on Motivations (Section 3).  The classical method of having one RTP
   session over a specific transport flow is still motivated for a
   number of use cases, especially when flow-based QoS is to be used
   for some media streams.

   This document draws up some requirements for consideration on how to
   transport multiple RTP sessions over a single lower-layer transport.
   These requirements will have to be weighed against each other, as no
   known solution fulfills the combined set of requirements completely.

   A number of possible solutions were considered and discussed with
   respect to their properties.
   Based on that, the authors recommend a shim layer variant as the
   single solution, which is described in more detail, including a
   signalling solution and examples.  The proposals and the comparison
   are available as appendices.

2.  Conventions

2.1.  Terminology

   This document uses the following terminology.

   Multiplexing:  Unless specifically noted, all mentions of
      multiplexing in this document refer to the multiplexing of
      multiple RTP Sessions on the same lower-layer transport.  It is
      important to make this distinction as RTP does contain a number
      of multiplexing points for various purposes, such as media
      formats (Payload Type), media sources (SSRC), and RTP sessions.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Motivations

   This section looks at the motivations why an additional solution is
   needed, assuming that you can use both the classical method of
   having one RTP session per transport flow as defined by the RTP
   specification [RFC3550] and multiple media types within one RTP
   session [I-D.westerlund-avtcore-multi-media-rtp-session].

   First we look at the motivations why a single transport flow is of
   sufficient interest, namely NATs and Firewalls.  Then

3.1.  NAT and Firewalls

   The existence of NATs and Firewalls at almost all Internet access
   points has had implications on protocols like RTP that were designed
   to use multiple transport flows.  First of all, the NAT/FW traversal
   solution one uses needs to ensure that all these transport flows are
   established.  This has three different impacts:

   1.  Increased delay to perform the transport flow establishment.

   2.  The more transport flows, the more state and the more resource
       consumption in the NAT and Firewalls.
       When the resource consumption in a NAT/FW reaches its limit,
       unexpected behaviors usually occur.

   3.  More transport flows means a higher risk that some transport
       flow fails to be established, thus preventing the application
       from communicating.

   Using fewer transport flows reduces the risk of communication
   failure, improves establishment behavior and puts less load on NATs
   and Firewalls.

3.2.  No Transport Level QoS

   Many RTP-using applications don't utilize any network-level Quality
   of Service functions.  Nor do they expect or desire any separation
   in the network treatment of their media packets, independent of
   whether they are audio, video or text.  When an application has no
   such desire, it doesn't need to provide a transport flow structure
   that simplifies flow-based QoS.

3.3.  Multiple RTP sessions

   The usage of multiple RTP sessions allows separation of media
   streams that have different usages or purposes in an RTP-based
   application, for example to separate the video of a presenter or the
   most important current talker from the video of the listeners, which
   not all end-points receive.  Sessions can also separate streams that
   need different processing based on media type, such as audio and
   video, in end-points and central nodes.  This provides the node with
   the knowledge that any SSRC within the session is supposed to be
   processed in a similar or the same way.

   For simpler cases, where the streams within each media type need the
   same processing, it is clearly possible to find other multiplex
   solutions, for example based on the Payload Type and the differences
   in encoding that the payload type allows one to describe.  This may
   nevertheless be insufficient when you get into more advanced usages
   where you have multiple sources of the same media type, but for
   different usages or as alternatives.
   For example when you have one set of video sources that shows
   session participants and another set of video sources that shares an
   application or slides, you likely want to separate those streams for
   various reasons such as control, prioritization, QoS, methods for
   robustness, etc.  In those cases, using the RTP session for
   separation of properties is a powerful tool.  It is a tool whose
   properties need to be preserved when providing a solution for how to
   use only a single lower-layer transport.

   For more discussion of the usage of RTP sessions versus other
   multiplexing we recommend RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture].

3.4.  Usage of RTP Extensions

   Applications use different sets of RTP extensions.  The solution for
   multiple media types in one RTP session
   [I-D.westerlund-avtcore-multi-media-rtp-session] is known to have
   limitations that prevent the usage of the following RTP mechanisms
   and extensions:

   o  XOR FEC (RFC5109)

   o  RTP Retransmission in session mode (RFC4588)

   o  Certain Layered Coding

   A developed solution should minimize the number of RTP/RTCP
   extensions and mechanisms that can't be used.

3.5.  Incremental Deployment

   In various multi-party communication scenarios deployment can become
   an issue if all session participants are required to have the
   functionality before enabling its usage.  This is especially
   difficult in communication scenarios where not all possible
   participants and their capabilities are known ahead of establishing
   the communication session with some sub-set of the participants.  At
   least for centralized communication sessions it is desirable to have
   a solution that can be used on a single leg without affecting any
   other leg or requiring advanced functionality in any central node.

3.6.  Summary

   The center of the motivation is to ensure that the RTP session is an
   available and usable tool also for applications that have no need
   for network-level separation of their media streams and want to
   reduce their exposure to any NAT or Firewall inconsistencies and
   minimize resource consumption.  As a benefit, a well-designed
   solution will enable incremental deployment and impose minimal
   limitations on which existing RTP mechanisms or extensions the
   RTP-using application can use.

4.  Requirements

   This section lists and discusses a number of potential requirements.
   However, it is not difficult to realize that it is in fact possible
   to pose requirements that make the set of feasible solutions an
   empty set.  It is thus necessary to consider which requirements are
   essential to fulfill and which can be compromised on to arrive at a
   solution.

4.1.  Support Use of Multiple RTP Sessions

   Section 3.3 discusses a number of reasons why an application may
   like to have multiple RTP sessions.  Considering the motivations for
   this work this must be an absolute requirement.  We are also of the
   opinion that the session provided by the solution must fulfill the
   definition in the RTP specification [RFC3550]:

      "The distinguishing feature of an RTP session is that each
      maintains a full, separate space of SSRC identifiers (defined
      next).  The set of participants included in one RTP session
      consists of those that can receive an SSRC identifier transmitted
      by any one of the participants either in RTP as the SSRC or a
      CSRC (also defined below) or in RTCP."

4.2.  Same SSRC Value in Multiple RTP Sessions

   Two different RTP sessions being multiplexed on the same lower-layer
   transport need to be able to use the same SSRC value.  This is a
   strong requirement, for two reasons:

   1.  To avoid mandating SSRC assignment rules that are coordinated
       between the sessions.
       If the RTP sessions multiplexed together must have unique SSRC
       values, then additional code that works between RTP Sessions is
       needed in the implementations.  This raises the bar for
       implementing this solution.  In addition, if a gateway connects
       parts of a system using this multiplexing with parts that aren't
       multiplexing, the part that isn't multiplexing must also fulfill
       the requirements on how SSRCs are assigned, or the gateway is
       forced to translate SSRCs.  Translating SSRCs is actually hard
       as it requires one to understand the semantics of all current
       and future RTP and RTCP extensions.  Otherwise a barrier for
       deploying new extensions is created.

   2.  There are a few RTP extensions that currently rely on being able
       to use the same SSRC in different RTP sessions:

       *  XOR FEC (RFC5109)

       *  RTP Retransmission in session mode (RFC4588)

       *  Certain Layered Coding

4.3.  SRTP

   SRTP [RFC3711] is one of the most commonly used security solutions
   for RTP.  In addition, it is the only one recommended by IETF that
   is integrated into RTP.  This integration has several aspects that
   need to be considered when designing a solution for multiplexing RTP
   sessions on the same lower-layer transport.

   Determining Crypto Context:  SRTP first of all needs to know which
      session context a received or to-be-sent packet relates to.  It
      also normally relies on the lower-layer transport to identify the
      session.  It uses the MKI, if present, to determine which key set
      is to be used.  Then the SSRC and sequence number are used by
      most crypto suites, including the most common use of AES Counter
      Mode, to actually generate the correct cipher stream.

   Unencrypted Headers:  SRTP has chosen to leave the RTP headers and
      the first two 32-bit words of the first RTCP header unencrypted,
      to allow for both header compression and monitoring to work also
      in the presence of encryption.
      As these fields are in clear text they are used in most crypto
      suites for SRTP to determine how to protect or recover the plain
      text.

   It is important here to contrast SRTP against a set of other
   possible protection mechanisms.  DTLS, TLS, and IPsec all protect
   and encapsulate the entire RTP and RTCP packets.  They don't perform
   any partial operations on the RTP and RTCP packets.  Any change that
   is considered to be part of the RTP and RTCP packet is transparent
   to them, but possibly not to SRTP.  Thus the impact on SRTP
   operations must be considered when defining a mechanism.

4.4.  Don't Redefine Used Bits

   As the core of RTP is in use in many systems and has a very large
   deployed base and numerous implementations, changing any of the
   field definitions is highly problematic.  First of all, the
   implementations need to change to support the new semantics.
   Secondly, you get a large transition issue when you have some
   session participants that support the new semantics and some that
   don't.  Combining the two behaviors in the same session can force
   the deployment of costly and less than perfect translation devices.

4.5.  Firewall Friendly

   It is desirable that current Firewalls will accept the solutions as
   normal RTP packets.  However, in the authors' opinion we can't let
   the firewall stifle invention and evolution of the protocol.  It is
   also necessary to be aware that a change that will make most
   deep-inspecting firewalls consider the packet as not valid RTP/RTCP
   will have a more difficult deployment story.

4.6.  Monitoring and Reporting

   It is desirable that a third party monitor can still operate on the
   multiplexed RTP Sessions.  It is however likely that such monitors
   will require an update to correctly monitor and report on
   multiplexed RTP Sessions.

   Another type of function to consider is packet sniffers and their
   selector filters.
   These may be impacted by a change of the fields.  An observation is
   that many such systems are usually quite rapidly updated to consider
   new types of standardized or simply common packet formats.

4.7.  Usable Also Over Multicast

   It is desirable that a solution can also be used when RTP and RTCP
   packets are sent over multicast, both Any Source Multicast (ASM) and
   Source-Specific Multicast (SSM).  The reason for this requirement is
   to allow a system using RTP to use the same configuration regardless
   of whether the transport is done over unicast or multicast.  In
   addition, multicast can't be claimed to have an issue with using
   multiple ports, as each multicast group has a complete port space
   scoped by address.

4.8.  Incremental Deployment

   A good solution has the property that in topologies that contain RTP
   Mixers or Translators, a single session participant can enable
   multiplexing without having any impact on any other session
   participants.  Thus a node should be able to take a multiplexed
   packet and then easily send it out with minimal or no modification
   on another leg of the session, where each RTP session is transported
   over its own lower-layer transport.  It should also be as easy to do
   the reverse forwarding operation.

5.  Design Considerations

   When defining a SHIM solution for identifying RTP sessions over a
   single transport layer, there are some special considerations, which
   are discussed in this section.

5.1.  Location of SHIM

   A major question affecting the SHIM is the location of the SHIM
   header providing the identifier of the session the packet relates
   to.  This section discusses in detail the impact of the different
   choices.

   Identified aspects to consider are:

   Possibility to Process:  A prefixed shim header, i.e. one placed
      between the transport protocol and the RTP/RTCP packet header,
      has the advantage that any node on the network that likes to
      include the header in any per-packet processing can reach it.
      Reasons for per-packet processing are:

      A.  Quality of Service classification

      B.  SHIM ingress or egress

      C.  Monitoring

      Many routers or similar devices can only read and process the
      first N bytes of the whole packet, where N is commonly on the
      order of 64-128 bytes.  Any other type of processing means
      putting the packet on the slow path.  Thus a prefixed solution
      enables this processing while a postfixed solution will most
      likely forever prevent this type of device from processing it.

   Legacy Processing:  Packets or at least flows of the type IP/UDP/RTP
      can in many cases be identified by Deep Packet Inspection,
      Firewalls or other network entities that concern themselves with
      trying to determine what traffic a particular packet flow
      carries.  These nodes can clearly be updated, but until they have
      been they may hinder deployment.  Thus a postfix likely gives the
      least resistance for initial deployment.  However, also for the
      postfix location the deployment can be hindered in cases where
      multiple RTP sessions use the same SSRC values, due to irregular
      behavior of the fields for what the third party believes is one
      media stream rather than multiple ones.  The prefix will however
      maintain the long-term capabilities of such devices, assuming
      they can be updated to include the SHIM header as part of the
      classification.

   Header Compression:  The different header compression techniques
      that have been developed compress IP/UDP/RTP as a complete
      combination.  If one instead has IP/UDP/SHIM/RTP then the
      compression for the full set will not work.  Instead only IP/UDP
      header compression can be applied.
      Thus a prefix will lose some compression efficiency until
      compression profiles for IP/UDP/SHIM/RTP have been developed,
      implemented and deployed.  A postfix doesn't have that issue, but
      nor can it ever gain anything from header compression, which a
      prefixed solution could once an updated profile is deployed.

   The question of a prefixed or a postfixed header comes down to a
   trade-off between long-term usability and deployment issues:

   Prefixed:  Long term, there is a good possibility to adapt any
      network function that needs to take the SHIM header into account.
      At the same time any function that tries to analyze packets, and
      because of that may block the packets, will be a hindrance to
      deployment.

   Postfixed:  This solution will likely have the best possibilities to
      deploy successfully in the short term.  However, long term this
      choice will likely prevent many network nodes that would like to
      separate the RTP sessions being multiplexed together from
      successfully doing that.

   Open Issue: Which should be chosen?  The specification below uses a
   prefix, but that can easily be changed.  The prefix appears to be
   the best long-term choice without too badly affecting deployability.

5.2.  Signalling Fallback

   There is an important aspect in how the SDP signalling functions,
   especially Offer/Answer [RFC3264].  The initial idea for the
   signalling was to build on top of bundle
   [I-D.ietf-mmusic-sdp-bundle-negotiation], which in its default
   function negotiates multiple media types over one RTP session
   [I-D.westerlund-avtcore-multi-media-rtp-session].  The main purpose
   of this solution is to enable multiple RTP sessions.  If its
   signalling builds on bundle and the peer doesn't support this
   specification, the communicating peers can end up in a single RTP
   session if the peer supports that.

   We consider it important that the signalling design lets the
   application developer decide what type of fallback will occur.  It
   is also important to consider that one has to be able to signal SHIM
   based multiplexing of RTP sessions that are in fact of the type with
   multiple media types.  Thus the signalling for SHIM must be able to
   describe multiple different scenarios:

   1.  Multiple RTP sessions multiplexed together using SHIM over one
       transport

   2.  Like 1, but where at least one RTP session contains multiple
       media types

   3.  Like 1, but where the peer doesn't support SHIM and the
       initiator wants to fall back to independent transports

   4.  Like 2, but where the peer doesn't support SHIM and the
       initiator wants to fall back to multiple BUNDLED sessions over
       independent transports.

   In addition it must be possible to have multiple different
   transports where each is a SHIM multiplex.

   To enable all of these scenarios we propose a solution where each
   SHIM multiplex is indicated by its own grouping attribute across all
   media blocks that are included in some form in the multiplex.  As a
   result, these media blocks fall under a form of BUNDLE super set.
   This super set will also have some of bundle's restrictions on the
   transport layer, but not on higher layers.  Which Session ID pair a
   particular media block is associated with is signalled using an SDP
   attribute (a=session-mux-id) in each media block.  When multiple
   media blocks are assigned the same session ID pair, they form an RTP
   session with multiple media types and have the full restrictions of
   bundle between them.

   The method of fallback is indicated by providing explicit BUNDLE
   grouping in addition to the SHIM when the fallback from SHIM is to
   BUNDLE.

6.  Specification

   This section contains the specification of the solution based on a
   SHIM, with the explicit session identifier of the encapsulated
   payload.

6.1.  Shim Layer

   This solution is based on a shim layer that is inserted in the stack
   between the regular RTP and RTCP packets and the transport layer
   being used by the RTP sessions.  Thus the layering looks like the
   following:

              +---------------------+
              |  RTP / RTCP Packet  |
              +---------------------+
              |  Session ID Layer   |
              +---------------------+
              |  Transport layer    |
              +---------------------+

                  Stack View with Session ID SHIM

   The above stack is in fact a layered one as it does allow multiple
   RTP Sessions to be multiplexed on top of the Session ID shim layer.
   This enables the example presented in Figure 1, where four sessions,
   S1-S4, are sent over the same Transport layer and where the Session
   ID layer will combine and encapsulate them with the session ID on
   transmission and separate and decapsulate them on reception.

              +-------------------+
              | S1 | S2 | S3 | S4 |
              +-------------------+
              | Session ID Layer  |
              +-------------------+
              | Transport layer   |
              +-------------------+

       Figure 1: Multiple RTP Session On Top of Session ID Layer

   The Session ID layer encapsulates one RTP or RTCP packet from a
   given RTP session and prefixes a one byte Session ID (SID) field to
   the packet.  Each RTP session being multiplexed on top of a given
   transport layer is assigned either a single unique SID or a pair of
   unique SIDs in the range 0-255.  The reason for assigning a pair of
   SIDs to a given RTP session is to allow RTP Sessions that don't
   support "Multiplexing RTP Data and Control Packets on a Single Port"
   [RFC5761] to still be able to use a single 5-tuple.
   The reason for supporting SID pairs is that RTP and RTCP
   multiplexing based on the payload type/packet type fields enforces
   certain restrictions on the RTP sessions, and these restrictions may
   not be acceptable.  As this solution does not have these
   restrictions, performing RTP and RTCP multiplexing in this way has
   benefits.

   Each Session ID value space is scoped by the underlying transport
   protocol.  Common transport protocols like UDP, DCCP, TCP, and SCTP
   can all be scoped by one or more 5-tuples (transport protocol, source
   address and port, destination address and port).  Multiple 5-tuples
   occur in multi-unicast topologies, also called meshed multiparty RTP
   sessions, or when an application needs more than 128 RTP sessions.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +---------------+
     |  Session ID   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                                                                   |
   +- Encrypted Portion*                      Authenticated Portion ---+

          Figure 2: SRTP Packet Encapsulated by Session ID Layer

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +---------------+
     |  Session ID   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|    RC   |  PT=SR or RR  |            length             | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                        SSRC of sender                         | |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                          sender info                          ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                        report block 1                         ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                        report block 2                         ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                              ...                              ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |V=2|P|    SC   |  PT=SDES=202  |            length             | |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |                          SSRC/CSRC_1                          | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | ~                          SDES items                           ~ |
   | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | ~                              ...                              ~ |
   +>+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
   | |E|                         SRTCP index                         | |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTCP MKI (OPTIONAL)                      ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                      authentication tag                       : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                                                                   |
   +-- Encrypted Portion                    Authenticated Portion -----+

         Figure 3: SRTCP Packet Encapsulated by Session ID Layer

   When the Session ID layer is present, a receiver processes each
   packet as follows:

   1.  Pick up the packet from the lower-layer transport

   2.  Inspect the SID field value

   3.  Strip the SID field from the packet

   4.  Forward it to the (S)RTP session context identified by the SID
       value

6.2.  Signalling

   The use of the Session ID layer needs to be explicitly agreed on
   between the communicating parties.  For each RTP session the
   application uses, the signalling must provide, in addition to the
   regular configuration such as payload types and RTCP extensions, both
   the underlying 5-tuple (source address and port, destination address
   and port, and transport protocol) and the Session ID used for that
   RTP session.  The signalling requirement is to assign unique Session
   ID values to all RTP sessions sent over the same 5-tuple.  The same
   Session ID shall be used for an RTP session independently of the
   traffic direction.  Note that nothing prevents a multimedia
   application from using multiple 5-tuples if desired, in which case
   each 5-tuple has its own session ID value space.

   This section defines how to negotiate the use of the Session ID
   layer using the Session Description Protocol (SDP) Offer/Answer
   mechanism [RFC3264].  A new SDP grouping semantics, "SHIM", and a new
   media-level SDP attribute, 'session-mux-id', are defined.
   The attribute identifies the RTP session to which each media
   description ("m=" line) associated with a 'SHIM' group belongs.

   The 'session-mux-id' attribute is included in a media description to
   indicate the Session ID for that particular media description.  All
   media descriptions that share a common attribute value are assumed to
   be part of a single RTP session.  An SDP Offerer MUST include the
   'session-mux-id' attribute for every media description associated
   with a 'SHIM' group.  If the SDP Answer does not contain the SHIM
   group, the SDP Offerer MUST NOT use SHIM-based layering.  Whether the
   result is then separate RTP sessions or BUNDLE is determined by what
   was present in the offer and the answer, and thus depends on what the
   offering party wants to happen.  If the Offerer wants a failed SHIM
   negotiation to fall back to one or more BUNDLE groups, it also
   includes the BUNDLE grouping in the offer.  If the SDP Answer then
   still describes a 'BUNDLE' group, the procedures in
   [I-D.ietf-mmusic-sdp-bundle-negotiation] apply.  If not, independent
   transports and sessions are used.

   An SDP Answerer MUST NOT include the 'SHIM' group and
   'session-mux-id' attribute in an SDP Answer unless they were included
   in the SDP Offer.

   The attribute has the following ABNF [RFC5234] definition.

   Session-mux-id-attr = "a=session-mux-id:" SID *SID-prop
   SID                 = SID-value / SID-pairs
   SID-value           = 1*3DIGIT / "NoN"
   SID-pairs           = SID-value "/" SID-value ; RTP/RTCP SIDs
   SID-prop            = SP (assignment-policy / prop-ext)
   prop-ext            = token "=" value
   assignment-policy   = "policy=" ("tentative" / "fixed")

   The SHIM group SHALL contain all media descriptions that are intended
   to be sent over the same transport flow, independent of Session ID.
   For all media descriptions that are part of the same SHIM group, the
   transport parameters, i.e., ports, ICE candidates, etc., MUST be the
   same and handled as described by BUNDLE.  Note that the parameters
   related to the RTP session do not need to be the same.

   Media descriptions that have the same Session ID value SHALL be
   treated as if they were part of a BUNDLE group, independently of
   whether that is indicated in the SDP.

   The SID property "policy" is used in negotiation by an end-point to
   indicate whether the session ID values are merely a tentative
   suggestion or whether they must have these values.  This is used when
   negotiating SIDs for multi-party RTP sessions to support shared
   transports such as multicast, or RTP translators that are unable to
   produce renumbered SIDs on a per-end-point basis.  The normal
   behavior is that the offer suggests a tentative set of values,
   indicated by "policy=tentative".  These SHOULD be accepted by the
   peer unless that peer negotiates session IDs on behalf of a
   centralized policy, in which case it MAY change the value(s) in the
   answer.  If the offer represents a policy that does not allow
   changing the session ID values, it can indicate that to the answerer
   by setting the policy to "fixed".  This enables the answering peer to
   either accept the value or indicate that there is a conflict over who
   is performing the assignment by setting the SID value to NoN (Not a
   Number).  Offerer and answerer SHOULD always include the policy they
   are operating under.  Thus, when there is no centralized behavior,
   both offerer and answerer will indicate the tentative policy.

6.3.  SRTP Key Management

   Key management for SRTP needs discussion, as this solution causes
   multiple SRTP sessions to exist on the same underlying transport
   flow.  We thus need to ensure that each key management exchange is
   still properly associated with the SRTP session context it intends to
   key.
   To ensure that, we look at the three SRTP key management mechanisms
   that the IETF has specified, one after another.

6.3.1.  Security Description

   Session Description Protocol (SDP) Security Descriptions for Media
   Streams [RFC4568], being based on SDP, has no issue with the RTP
   session multiplexing on a lower layer specified here.  The reason is
   that the actual keying is done using a media-level SDP attribute, so
   the attribute is already associated with a particular media
   description.  That media description will also have an instance of
   the "a=session-mux-id" attribute carrying the SID value or pair used
   with these particular crypto parameters.

6.3.2.  DTLS-SRTP

   Datagram Transport Layer Security (DTLS) Extension to Establish Keys
   for the Secure Real-time Transport Protocol (SRTP) [RFC5764] is a
   keying mechanism that works on the media plane, on the same
   lower-layer transport that SRTP/SRTCP will be transported over.  Thus
   each DTLS message must be associated with the SRTP and/or SRTCP flow
   it is keying.

   The most direct solution is to apply the shim and the SID context
   identifier to DTLS packets as well, that is, to use the same SID that
   is used with RTP and/or RTCP also for the DTLS messages intended to
   key that particular SRTP and/or SRTCP flow.  Consequently, using the
   shim brings no gain with regard to key management.

6.3.3.  MIKEY

   MIKEY: Multimedia Internet KEYing [RFC3830] is a key management
   protocol that has several transports.  In some cases it is used
   directly on a transport protocol such as UDP, but there is also a
   specification for how MIKEY is used with SDP, "Key Management
   Extensions for Session Description Protocol (SDP) and Real Time
   Streaming Protocol (RTSP)" [RFC4567].

   Let's start with the latter, i.e., the SDP transport, which shares
   with Security Descriptions the property that it can be associated
   with a particular media description in an SDP.  As long as one avoids
   using the session-level attribute, one can be certain to correctly
   associate the key exchange with a given SRTP/SRTCP context.

   It does appear that MIKEY directly over a lower-layer transport
   protocol will have issues similar to those of DTLS.

6.4.  Examples

6.4.1.  RTP Packet with Transport Header

   The figure below shows an SRTP packet with the SID field,
   encapsulated in a UDP packet (UDP header added).

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Source Port          |       Destination Port        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |            Length             |           Checksum            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  Session ID   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
     |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                           timestamp                           | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |           synchronization source (SSRC) identifier            | |
     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ |
     |            contributing source (CSRC) identifiers             | |
     |                               ....                            | |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
     |                   RTP extension (OPTIONAL)                    | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | |                          payload  ...                         | |
   | |                               +-------------------------------+ |
   | |                               | RTP padding   | RTP pad count | |
   +>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<+
   | ~                     SRTP MKI (OPTIONAL)                       ~ |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   | :                 authentication tag (RECOMMENDED)              : |
   | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
   |                                                                   |
   +- Encrypted Portion*                      Authenticated Portion ---+

               SRTP Packet Encapsulated by Session ID Layer

6.4.2.  SDP Offer/Answer Example

6.4.2.1.  Basic Example

   This section contains SDP offer/answer examples.  First comes an
   example of successful SHIMing, then one where fallback occurs.  The
   fallback here is to individual transports, thus with no BUNDLE group.

   In the SDP offer below, one audio and one video stream are offered.
   The audio uses SID 0 and the video uses SID 1, to indicate that they
   are different RTP sessions despite being offered over the same
   5-tuple.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that supports this SHIMing:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:SHIM foo bar
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      m=video 20000 RTP/AVP 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:1 policy=tentative
      a=rtpmap:32 MPV/90000

   The SDP answer from an end-point that does not support this SHIMing:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      m=audio 20000 RTP/AVP 0
      b=AS:200
      a=rtpmap:0 PCMU/8000
      m=video 30000 RTP/AVP 32
      b=AS:1000
      a=rtpmap:32 MPV/90000

6.4.2.2.  Advanced Example

   In this example we have two BUNDLEd sessions, one with audio and
   video and one with XOR-based FEC [RFC5109] for the audio and the
   video.  These two RTP sessions are then SHIMed into a single
   transport flow.

      v=0
      o=alice 2890844526 2890844526 IN IP4 atlanta.example.com
      s=
      c=IN IP4 atlanta.example.com
      t=0 0
      a=group:SHIM foo bar 1 2
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 10000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=session-mux-id:0 policy=tentative
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 10000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=session-mux-id:0 policy=tentative
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 10000 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      a=session-mux-id:1 policy=tentative
      m=video 10000 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=session-mux-id:1 policy=tentative
      a=rtpmap:101 ulpfec/90000

   The SDP answer of a client supporting
   [I-D.ietf-mmusic-sdp-bundle-negotiation] but not this SHIMing would
   look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:BUNDLE 1 2
      a=group:BUNDLE foo bar
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20000 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20002 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20002 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In the above case, two different RTP sessions are established, both
   of the BUNDLE type with multiple media types in each.  The two
   established flows will be Alice:10000<->Bob:20000 and
   Alice:10000<->Bob:20002.
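
   The 'session-mux-id' attribute lines in the offers above can be
   parsed and used to group media descriptions into RTP sessions.  The
   following is a minimal sketch in Python, not part of the
   specification.  The function names and the returned dictionary
   layout are hypothetical.  The sketch assumes that every SID property
   is preceded by a single space, reports "NoN" as None, rejects values
   above 255, and treats a single SID as covering both RTP and RTCP.

```python
import re
from typing import Dict, List, Optional

# SID or SID pair, followed by zero or more " name=value" properties.
_ATTR = re.compile(
    r"^a=session-mux-id:(\d{1,3}|NoN)(?:/(\d{1,3}|NoN))?((?: [^ =]+=[^ ]+)*)$"
)


def _sid(text: str) -> Optional[int]:
    """Convert a SID-value; "NoN" (Not a Number) becomes None."""
    if text == "NoN":
        return None
    value = int(text)
    if value > 255:
        raise ValueError(f"SID {value} is outside the range 0-255")
    return value


def parse_session_mux_id(line: str) -> Dict[str, object]:
    """Parse one 'a=session-mux-id:' line into SIDs and properties."""
    match = _ATTR.match(line.strip())
    if match is None:
        raise ValueError(f"not a session-mux-id attribute: {line!r}")
    rtp_text, rtcp_text, prop_text = match.groups()
    props = dict(item.split("=", 1) for item in prop_text.split())
    policy = props.get("policy")
    if policy is not None and policy not in ("tentative", "fixed"):
        raise ValueError(f"unknown assignment policy: {policy!r}")
    rtp_sid = _sid(rtp_text)
    # Without a pair, the same SID carries both RTP and RTCP.
    rtcp_sid = rtp_sid if rtcp_text is None else _sid(rtcp_text)
    return {"rtp_sid": rtp_sid, "rtcp_sid": rtcp_sid, "props": props}


def sessions_by_sid(sdp: str) -> Dict[int, List[Optional[str]]]:
    """Group the 'mid' of each media description by its RTP SID."""
    media: List[Dict[str, object]] = []  # one entry per "m=" block
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media.append({"mid": None, "sid": None})
        elif media and line.startswith("a=mid:"):
            media[-1]["mid"] = line[len("a=mid:"):]
        elif media and line.startswith("a=session-mux-id:"):
            media[-1]["sid"] = parse_session_mux_id(line)["rtp_sid"]
    groups: Dict[int, List[Optional[str]]] = {}
    for block in media:
        if block["sid"] is not None:
            groups.setdefault(block["sid"], []).append(block["mid"])
    return groups
```

   Media descriptions that end up under the same key share one RTP
   session, which matches the rule in Section 6.2 that media
   descriptions with a common attribute value are part of a single RTP
   session.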

   If the peer supported neither the SHIM nor the BUNDLE extension, the
   answer would look like this:

      v=0
      o=bob 2808844564 2808844564 IN IP4 biloxi.example.com
      s=
      c=IN IP4 biloxi.example.com
      t=0 0
      a=group:FEC foo 1
      a=group:FEC bar 2
      m=audio 20000 RTP/AVP 0 8 97
      b=AS:200
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      a=rtpmap:8 PCMA/8000
      a=rtpmap:97 iLBC/8000
      m=video 20002 RTP/AVP 31 32
      b=AS:1000
      a=mid:bar
      a=rtpmap:31 H261/90000
      a=rtpmap:32 MPV/90000
      m=audio 20004 RTP/AVP 100
      b=AS:100
      a=rtpmap:100 ulpfec/8000
      a=mid:1
      m=video 20006 RTP/AVP 101
      b=AS:500
      a=mid:2
      a=rtpmap:101 ulpfec/90000

   In this case four different transport flows would be established for
   RTP, each carrying a different RTP session.  The answerer also knows
   the binding between the FEC sessions and their source data thanks to
   the FEC grouping ("a=group:FEC") in the SDP.

7.  Open Issues

   This work is still in an early phase of specification.  This section
   lists open issues on which the authors desire input.

   1.  Section 6.2 discusses which parameters must be configured.  The
       scope of these rules, and whether they make sense, needs
       additional discussion.

   2.  Can we provide better control, so that applications that do not
       want to fall back to a single RTP session when the multiplexing
       shim is not supported but BUNDLE is supported end up with a
       better alternative?

   3.  Are there any issues with using DTLS-SRTP individually per RTP
       session?

   4.  Shall the SHIM header be prefixed or postfixed in relation to the
       RTP/RTCP packets?

8.  IANA Considerations

   This document requests the registration of one SDP attribute.
   Details of the registration are to be filled in.

9.  Security Considerations

   The security properties of the Session ID layer depend on the
   mechanism used to protect the RTP and RTCP packets of a given RTP
   session.  If IPsec or a transport-layer security solution such as
   DTLS or TLS is used, both the encapsulated RTP/RTCP packets and the
   session ID layer will be protected by that security mechanism, thus
   potentially providing confidentiality, integrity, and source
   authentication.  If SRTP is used, the session ID layer will not be
   directly protected by SRTP.  However, it will be implicitly integrity
   protected (assuming the RTP/RTCP packet is integrity protected), as
   the only function of the field is to identify the session context.
   Any modification of the SID field will thus cause the receiver to
   attempt to retrieve the wrong SRTP crypto context.  If that retrieval
   fails, the packet will be discarded anyway.  If it succeeds, the
   retrieved context will not lead to successful verification of the
   packet.

10.  Acknowledgements

   This document is based on input from various people, especially in
   the context of the RTCWEB discussion of how to use only a single
   lower-layer transport.  The RTP and RTCP packet figures are borrowed
   from RFC3711.  The SDP example is extended from the one present in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  The authors would like to
   thank Christer Holmberg for assistance in utilizing the BUNDLE
   grouping mechanism.

   The proposal in Appendix A.5 was originally suggested by Colin
   Perkins.  The idea in Appendix A.6 is from an Internet-Draft
   [I-D.rosenberg-rtcweb-rtpmux] written by Jonathan Rosenberg et al.
   The proposal in Appendix A.3 is a result of discussion by a group of
   people at IETF meeting #81 in Quebec.

11.  References

11.1.  Normative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
              Using Session Description Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-00 (work in
              progress), February 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informational References

   [I-D.lennox-rtcweb-rtp-media-type-mux]
              Rosenberg, J. and J. Lennox, "Multiplexing Multiple Media
              Types In a Single Real-Time Transport Protocol (RTP)
              Session", draft-lennox-rtcweb-rtp-media-type-mux-00 (work
              in progress), October 2011.

   [I-D.rosenberg-rtcweb-rtpmux]
              Rosenberg, J., Jennings, C., Peterson, J., Kaufman, M.,
              Rescorla, E., and T. Terriberry, "Multiplexing of Real-
              Time Transport Protocol (RTP) Traffic for Browser based
              Real-Time Communications (RTC)",
              draft-rosenberg-rtcweb-rtpmux-00 (work in progress),
              July 2011.

   [I-D.westerlund-avtcore-multi-media-rtp-session]
              Westerlund, M., Perkins, C., and J. Lennox, "Multiple
              Media Types in an RTP Session",
              draft-westerlund-avtcore-multi-media-rtp-session-00 (work
              in progress), July 2012.

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Burman, B., and C. Perkins, "RTP
              Multiplexing Architecture",
              draft-westerlund-avtcore-multiplex-architecture-01 (work
              in progress), March 2012.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4567]  Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E.
              Carrara, "Key Management Extensions for Session
              Description Protocol (SDP) and Real Time Streaming
              Protocol (RTSP)", RFC 4567, July 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
              Real-Time Transport Control Protocol (RTCP): Opportunities
              and Consequences", RFC 5506, April 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for the Secure
              Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

Appendix A.  Possible Solutions

   This section looks at a few possible solutions and discusses their
   feasibility.

A.1.  Header Extension

   One proposal is to define an RTP header extension [RFC5285] that
   explicitly enumerates the session identifier in each packet.
   This proposal has some merits regarding RTP: it uses an existing
   extension mechanism; it explicitly enumerates the session, allowing
   third parties to associate the packet with a given RTP session; and
   it works with SRTP as currently defined, since a header extension is
   by default not encrypted and is thus readable by the receiving stack
   without needing to guess which session the packet belongs to and
   attempt to decrypt it.  This approach does, however, conflict with
   the requirement from [RFC5285] that "header extensions using this
   specification MUST only be used for data that can be safely ignored
   by the recipient", since correct processing of the received packet
   depends on using the header extension to demultiplex it to the
   correct RTP session.

   Using a header extension also places the session ID in the
   integrity-protected part of the packet.  Thus a translator between
   multiplexed and non-multiplexed domains has the following options:

   1.  to be part of the security context to verify the field

   2.  to be part of the security context to verify the field and remove
       it before forwarding the packet

   3.  to be outside of the security context and leave the header
       extension in the packet.  However, that requires successful
       negotiation of the header extension, but not of the
       functionality, with the receiving end-points.

   The biggest existing hurdle for this solution is that there is no
   header extension field in RTCP packets.  This requires defining a
   solution for RTCP that allows carrying the explicit indicator,
   preferably in a position that isn't encrypted by SRTCP.  However, the
   current SRTCP definition does not offer such a position in the
   packet.

   Modifying the RR or SR packets is possible using profile-specific
   extensions.
   However, that has deployability issues, and in addition any
   information placed there would end up in the encrypted part.

   Another alternative could be to define another RTCP packet type that
   only contains the common header, using the 5 bits in the first byte
   of the common header to carry a session ID.  That would allow SRTCP
   to work correctly as long as it accepts this new packet type being
   the first in the packet.  Allowing a non-SR/RR packet as the first
   packet in a compound RTCP packet is also needed if an implementation
   is to support Reduced-Size RTCP packets [RFC5506].  The remaining
   downside is that all stack implementations supporting multiplexing
   would need to modify their RTCP compound packet rules to include this
   packet type first.  Thus a translator box between supporting nodes
   and non-supporting nodes needs to be in the crypto context.

   This solution's per-packet overhead is expected to be 64 bits for
   RTCP.  For RTP it is 64 bits if no header extension was otherwise
   used, and an additional 16 bits (short header), or 24 bits plus (if
   needed) padding to the next 32-bit boundary, if other header
   extensions are used.

A.2.  Multiplexing Shim

   This proposal is to prefix or postfix all RTP and RTCP packets with a
   session ID field.  This field would be outside of the normal RTP and
   RTCP packets, thus having no impact on the RTP and RTCP packets and
   their processing.  An additional demultiplexing step would be added
   prior to RTP stack processing to determine in which RTP session
   context the packet shall be included.  This also has no impact on
   SRTP/SRTCP, as the shim layer would be outside of its protection
   context.
   The shim layer's session ID is, however, implicitly integrity
   protected, as any error in the field will result in the packet being
   placed in the wrong or a non-existing context, thus resulting in an
   integrity failure if processed by SRTP/SRTCP.

   This proposal is quite simple to implement in any gateway or
   translating device that goes from a multiplexed to a non-multiplexed
   domain or vice versa, as only an additional field needs to be added
   to or removed from the packet.

   The main downside of this proposal is that it is very likely to
   trigger a firewall response from any deep packet inspection device.
   If the field is prefixed, the RTP fields will not match the
   heuristics (unless the shim is designed to look like an RTP header,
   in which case the payload length is unlikely to match the expected
   value), which likely prevents classification of the packet as an RTP
   packet.  If it is postfixed, the packet is likely classified as an
   RTP packet but may not validate correctly if the content validation
   expects the payload length to match certain values.  It is expected
   that a postfixed shim will be less problematic than a prefixed shim
   in this regard, but we are lacking hard data on this.

   This solution's per-packet overhead is 1 byte.

A.3.  Single Session

   Given the difficulty of multiplexing several RTP sessions onto a
   single lower-layer transport, it's tempting to send multiple media
   streams in a single RTP session.  Doing this avoids the need to
   demultiplex several sessions on a single transport, but at the cost
   of losing the RTP session as a separator for different types of
   streams.  Lacking different RTP sessions to demultiplex incoming
   packets, a receiver will have to dig deeper into the packet before
   determining what to do with it.  Care must be taken in that
   inspection.
   For example, you must be careful to ensure that each real media
   source uses its own SSRC in the session and that this SSRC doesn't
   change media type.

   The loss of the RTP session as a separator for different usages or
   purposes would be a minor issue if the only difference between the
   RTP sessions is the media type.  In this case, the application could
   use the Payload Type field to identify the media type.  The loss of
   the RTP session functionality is, however, severe if the application
   uses the RTP session for separating different treatments, contexts,
   etc.  Then you would need additional signalling to bind the different
   sources to groups that can help make the necessary distinctions.

   However, the loss of the RTP session as separator is not the only
   issue with this approach.  The RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture] discusses a number of
   issues in Section 6.7.  These include RTCP bandwidth differences,
   limitations in the number of payload types, media-aware RTP mixers,
   and interactions with legacy end-points.

   Interworking with legacy end-points deserves additional attention.
   In multi-party situations using central nodes, it is difficult to
   have a legacy implementation that uses multiple RTP sessions
   interwork, across the central node, with an end-point that has only
   a single RTP session.  The main reason is that the end-point using a
   single session with multiple media types has only one SSRC space,
   while the other end-points have multiple spaces.  Translation may
   thus have to occur because several RTP sessions use the same SSRC
   value.  This brings limitations, processing overhead, and the
   possibility of becoming a deployment obstacle for new RTP/RTCP
   extensions.

   This approach has been proposed in the RTCWEB context in
   [I-D.lennox-rtcweb-rtp-media-type-mux] and
   [I-D.ietf-mmusic-sdp-bundle-negotiation].  These drafts describe how
   to signal multiple media streams multiplexed into a single RTP
   session, and address some of the issues raised here and in Section
   6.7 of the RTP Multiplexing Architecture
   [I-D.westerlund-avtcore-multiplex-architecture] draft.

   This method has several limitations that limit its usage as a
   solution for providing multiple RTP sessions on the same lower-layer
   transport.  However, we acknowledge that there are some uses for
   which this method may be sufficient and which can accept the method's
   limitations and downsides.  The RTCWEB WG has a working assumption to
   support this method.  For more details of this method, see the
   relevant drafts under development.  We include this method in the
   comparison to provide a more complete picture of its pros and cons.

   This solution has no per-packet overhead.  The signalling overhead is
   a different question.

A.4.  Use the SRTP MKI field

   This proposal is to overload the MKI SRTP/SRTCP identifier to not
   only identify a particular crypto context, but also identify the
   actual RTP session.  This is clearly a misuse of the MKI field, but
   it appears to have few negative implications.  SRTP already supports
   handling of multiple crypto contexts.

   This proposal has two major downsides.  The first is that it requires
   using SRTP/SRTCP to multiplex multiple sessions on a single
   lower-layer transport.  The second is that the session ID parameter
   needs to be put into the various key-management schemes, which must
   be made to understand that the reason for establishing multiple
   crypto contexts is that they are connected to different RTP sessions.
Considering that SRTP has at least three keying mechanisms in use, DTLS-SRTP [RFC5764], Security Descriptions [RFC4568], and MIKEY [RFC3830], this is not an insignificant amount of work.

This solution has a 32-bit per-packet overhead, but only if the MKI was not already used.

A.5. Use an Octet in the Padding

The basis of this proposal is to have each RTP packet, and the last RTCP packet in a compound (as required by RFC 3550), include at least 2 bytes of padding: one byte for the padding count (the last byte) and, just before it, one byte containing the session ID.

This proposal carries the session ID in bytes that have no defined value and are intended to be ignored by the receiver. From that perspective it only causes packet expansion that is supported and handled by all existing equipment. If an implementation does not understand that it is required to interpret this padding byte to learn the session ID, it will see a mostly coherent RTP session, except where SSRCs or payload types overlap. Even so, reporting on the individual sources or forwarding the RTCP RR is not completely without merit.

The one downside of this proposal has to do with SRTP. To determine the crypto context, the receiver needs access to the encrypted payload of the packet. Thus, the only mechanism available to a receiver is to try the existing crypto contexts for every session on the same lower-layer transport and then use the one with which the packet decrypts and verifies correctly. For transport flows with many crypto contexts, an attacker could therefore simply generate packets that don't validate, forcing the receiver to try all of its crypto contexts rather than immediately discarding each packet as not matching a context.
A receiver can mitigate this somewhat by using heuristics based on the RTP header fields to determine which context applies to a received packet, but this is not a complete solution.

This solution has a 16-bit per-packet overhead.

A.6. Redefine the SSRC field

The Rosenberg et al. Internet-Draft "Multiplexing of Real-Time Transport Protocol (RTP) Traffic for Browser based Real-Time Communications (RTC)" [I-D.rosenberg-rtcweb-rtpmux] proposed to redefine the SSRC field. This has the advantage of no packet expansion. It also looks like regular RTP. However, it has a number of implications. First, it prevents any RTP functionality that requires the same SSRC in multiple RTP sessions.

Secondly, its interoperability with end-points using multiple RTP sessions is problematic. Such interoperability requires an SSRC translator function in the gatewaying node to ensure that the SSRCs fulfill the semantic rules of the different domains. That translator is far from easy to build, as it needs to understand the semantics of all RTP and RTCP extensions that include an SSRC/CSRC. This is because it must know when a particular matching 32-bit pattern is an SSRC field and when it is just a combination of other fields that happens to produce the same 32-bit pattern. Thus there is a possibility that such a translator becomes an obstacle to deploying future RTP/RTCP extensions. In addition, the translator has significant overhead when SRTP is in use, because verification that the packet is authentic, decryption, SSRC translation, encryption, and finally generation of authentication tags are all required. The translator must also be part of the security context.

This solution has no per-packet overhead.

Appendix B. Comparison

This section compares the above potential solutions with the requirements.
Motivations are provided in addition to a high-level rating of whether each proposal successfully, partially, or fails to meet each requirement. At the end, a summary table (Figure 4) of the high-level ratings is provided.

B.1. Support of Multiple RTP Sessions Over Single Transport

This one is easy to determine. Only the Single Session proposal fails this requirement, as it is not designed to meet it at all. The rest fully support this requirement. The main question around this requirement is how important it is, as discussed in Section 4.1.

B.2. Enable Same SSRC Value in Multiple RTP Sessions

Based on the discussion in Section 4.2, two sub-requirements have been derived.

B.2.1. Avoid SSRC Translation in Gateways/Translators

This sub-requirement is derived from the desire to avoid having gateways or translators perform full SSRC translation, in order to minimize complexity, avoid the requirement to have gateways in the security context, and avoid creating a hindrance to long-term evolution. Two of the proposals have issues with this, because they lack support for multiple 32-bit SSRC spaces and cannot have the same SSRC value in multiple RTP sessions. These proposals, which are therefore marked as failing, are Single Session and Redefine the SSRC field. The other proposals all successfully meet this requirement.

B.2.2. Support Existing Extensions

The second sub-requirement is how well the proposals support using the existing RTP mechanisms. Here both Single Session and Redefine the SSRC field have clear issues, as they cannot support the same full 32-bit SSRC value in two different RTP sessions. This is clearly an issue for XOR-based FEC. RTP retransmission and scalable encoding are minor issues, as there exist alternatives to those mechanisms that work with the structure of these two proposals.
Thus we give them a fail. The Header Extension gets a partial due to the unclear interaction between inserting a header extension and these mechanisms.

B.3. Ensure SRTP Functions

This requirement is about ensuring both secure and efficient usage of SRTP. The Octet in the Padding field proposal gets a fail, as the receiving end-point cannot determine the intended RTP session before decrypting the padding field. This creates a catch-22 that can only be resolved by trying all session contexts and seeing which one decrypts. It also causes a security vulnerability, as an attacker can inject a packet that does not match any of the session contexts. The receiver will then attempt decryption and authentication of the packet using all of its session contexts, increasing the amount of wasted resources by a factor equal to the number of multiplexed sessions.

The proposal of overloading the SRTP MKI field as a session identifier gets a partial because it cannot use SRTP's key-management mechanisms out of the box. It forces the key-management mechanism and the SRTP implementations to maintain the MKI-to-RTP session bindings in order to function securely and correctly.

Redefine the SSRC field gets a partial due to its need to modify the key-management mechanisms to correctly identify the partial SSRC space the parameters apply to. Similarly, the SRTP implementation also needs to be updated to correctly support this security context differentiation.

The Header Extension based solution gets a less severe partial than Redefine the SSRC field and the SRTP MKI field. It will, however, have an issue when being gatewayed to a domain that does not multiplex multiple RTP sessions over the same transport.
The gateway will then need to be in the security context to be able to add or remove the header extension, as it is in the part of the packet that is integrity protected by SRTP.

The remaining two proposals do not affect SRTP mechanisms and thus successfully meet this requirement.

B.4. Don't Redefine Used Bits

This requirement is that RTP and RTCP header fields with a given definition should not be changed, as doing so can cause interoperability problems between modified and non-modified implementations. This becomes especially problematic in RTP sessions used for multi-party sessions.

Redefine the SSRC field gets a big fail on this, as it redefines the SSRC field, a core field in RTP. Such a change has been identified as causing issues: if a modified end-point gets connected to a non-modified end-point that assigns SSRCs randomly, as RFC 3550 expects, those SSRCs will be distributed over different RTP sessions at the modified end-point. Other functions that use the SSRC field without understanding its additional semantics are also likely to have issues.

Using the SRTP MKI field to identify a session overloads that field with double semantics. This likely has minimal negative impact on RTP, since it should be possible to have the SRTP stack use the MKI field to look up both the security context and the output RTP session the processed packet belongs to. However, this redefinition clearly creates issues for the key-management scheme, which will have to be modified both to handle this change and to deal with the interoperability issues when negotiating its usage. This gets a full fail because it makes the problem someone else's, namely the RTP implementors.

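The double use of the MKI described above can be sketched as a single lookup that yields both the crypto context and the RTP session. The Python below is a minimal illustration under stated assumptions, not a definitive implementation: the names MkiBinding, extract_mki, and demultiplex are hypothetical, the crypto context is a placeholder string, and the 4-byte MKI and 10-byte authentication tag lengths are assumed defaults that a real deployment would take from its negotiated SRTP parameters. It relies only on the fact that SRTP places the MKI immediately before the authentication tag at the end of the packet.

```python
from typing import Dict, NamedTuple, Optional


class MkiBinding(NamedTuple):
    crypto_context: str  # placeholder for a real SRTP crypto context object
    rtp_session: str     # hypothetical label of the RTP session


def extract_mki(packet: bytes, mki_len: int = 4, tag_len: int = 10) -> bytes:
    """Return the MKI, which sits just before the authentication tag."""
    # 12 bytes is the fixed RTP header; anything shorter cannot be valid.
    if len(packet) < 12 + mki_len + tag_len:
        raise ValueError("packet too short to carry an MKI")
    end = len(packet) - tag_len
    return packet[end - mki_len:end]


def demultiplex(packet: bytes,
                bindings: Dict[bytes, MkiBinding],
                mki_len: int = 4,
                tag_len: int = 10) -> Optional[MkiBinding]:
    """Look up crypto context and RTP session from the MKI in one step.

    Returns None for an unknown MKI, so the packet can be discarded
    without attempting any cryptographic processing.
    """
    return bindings.get(extract_mki(packet, mki_len, tag_len))
```

The table of bindings is exactly the state that the key-management scheme would have to establish and keep consistent, which is where the cost of this proposal lies.
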
Defining an Octet in the Padding field redefines a field whose value is defined to be zero and which the receiver is expected to ignore under the original semantics. This is thus one of the more benign modifications one can make; however, it can still cause issues in implementations that unnecessarily check the field values, or in firewalls. This is judged to partially meet the requirement.

The Header Extension proposal does not in fact redefine any currently used bits in RTP. The header extension would be a correctly identified extension with its own definition. However, it does redefine a rule on what header extensions are for. The RTCP solution would have a more severe impact, as it would need to redefine the standard meaning of an RTCP packet header in addition to the default compound packet rules. Due to these issues the proposal fails to meet this requirement.

The Multiplexing Shim and Single Session both successfully meet this requirement.

B.5. Firewall Friendly

This requirement is clearly difficult to judge, as firewalls differ widely in implementation, in the scope of what they inspect in packets, and in the policies that are set. A reasonable goal is to maximize the likelihood that rules and policies intended to let RTP media streams pass will also let these streams through when RTP sessions are multiplexed over a single transport. The analysis below shows that no solution is truly firewall friendly, and all are judged to partially meet this goal. However, the reasons why a firewall might react to the streams are quite different.

Single Session and Redefine the SSRC field are likely the least suspect solutions from a firewall perspective.
However, as their transport flows contain multiple SSRCs with payloads that likely indicate multiple different media types, they are still likely to make a picky firewall block the transport. This is especially true for firewalls that take signalling messages into account and expect a particular media type in a given context. A non-upgraded firewall might in fact produce two different contexts with overlapping transport parameters, where both rules will receive media streams of the other media type that fall outside the allowed rule. However, to be clear, if these proposals don't get through, none of the others will either, as they will all have this behavior.

The Header Extension proposal is potentially problematic for two reasons. The first, which other proposals share, is that the same SSRC value can exist in two RTP sessions over the same underlying flow. Anyone tracking the sequence number and timestamp will react badly, as the second media stream with the same SSRC causes constant jumps back and forth in these fields compared to the first stream, if packets are transmitted simultaneously for both SSRCs. This issue can likely only be solved by having firewalls that track flows also use the session identifier to create context. This is possible, as the header extension will be in the clear and at the front of the packet. The second issue is that the header extension itself may cause the firewall to react, especially very picky firewalls that expect packets with certain media types to have certain packet lengths. These are not compatible with a header extension.

The Multiplexing Shim shares the issue of multiple flows for the same SSRC. Firewalls and deep packet inspection also call the placement of the shim into question.
If it is a prefixed shim, it prevents the packets from looking like regular IP/UDP/RTP packets and from being correctly classified in firewalls and DPI engines. However, if the shim is placed last, it is unlikely that any firewall or DPI engine will ever be able to take the session context into account, as it is at the end of the packet. This is because many line-rate processing devices only take a certain amount of the headers into account.

The SRTP MKI field is likely the solution with the fewest firewall and DPI issues after the single RTP session. There is no additional suspect field. The only difference from a single RTP session in the transport flow is that multiple MKIs are guaranteed to be used. However, that may also occur in single RTP session usage. Thus the only issues are the one shared with Single Session and the fact that several RTP media streams may use the same SSRC.

The Octet in the Padding field has, in addition to the issues the SRTP MKI field has, the single issue that it redefines something that is supposed to be zero into a value. This could potentially cause a deeply inspecting firewall to clamp down on the flow for fear of a covert channel or non-compliance.

B.6. Monitoring and Reporting

The monitoring and reporting requirement considers several aspects. The first is how useful the monitoring is that one can get from an existing legacy monitor. The second is any issues in upgrading such monitors to handle the selected solution. Thirdly, concerns regarding packet selector filters and packet sniffers are considered.

In general one can expect the proposals that have only a single SSRC space to work better with legacy monitors. Thus with both Single Session and Redefine the SSRC field, a legacy monitor can most likely gather and report data on media flows. The only potential issue is that some failures may occur due to the different media types and clock rates.
In particular, a third-party monitor may be targeted at a specific media type, such as VoIP. That monitor will have problems processing video packets correctly and generating VoIP-specific metrics for any SSRC sending video. In general, no legacy monitoring solution will be able to correctly create the sub-contexts that each RTP session has in these solutions without being updated to handle the new semantics. Likewise, for packet filtering and selector filters, fine-grained control can only be accomplished by implementing the new semantics. Therefore only Single Session meets this requirement fully.

Redefine the SSRC field is close to fully meeting the requirement; however, because there exists a session structure that is hidden from anyone not upgraded to understand the semantics, it only gets a partial.

The other proposals can all have multiple RTP sessions using the same SSRC. This will create significant issues for any legacy third-party monitor. Only an updated monitor, or for that matter packet selector, can pick out the individual media streams and their associated RTCP traffic. Thus all these proposals fail to meet the requirement.

B.7. Usable over Multicast

As discussed earlier, the goal of having the option usable over multicast as well is to remove the need to produce different media streams for transport over unicast and multicast. All of the proposals successfully meet the requirement.

B.8. Incremental Deployment

The possibility to incrementally deploy the multiplexing of multiple RTP sessions over a single transport, especially in the context of multi-party sessions, is a great benefit for any of the proposals. It means that not all end-point implementations need to be upgraded before one starts enabling it in the central node and in any signalling.

Considering a centralized multi-party application where some participants are using multiple transport flows and you want to enable one particular participant to use a single transport to the central node, one criterion stands out: the possibility to have one RTP session per transport on one leg and to multiplex them together on the next with minimal complexity and packet changes. Here there are significant differences.

The Multiplexing Shim has the least overhead for this, as the central node or gateway between deployments only needs to add or remove the shim identifier and then forward the packet over the corresponding transport: the joint one on the single-transport side, or the individual one on the multiple-transport side.

The SRTP MKI field proposal is almost as good, as the only main difference is the need to coordinate the MKIs used on the non-multiplexed legs so that there is no overlap between the RTP sessions. If there is an overlap, the MKI can be translated in the gateway, as SRTP has no integrity protection over the MKI. Thus both the Multiplexing Shim and the SRTP MKI field successfully meet this requirement.

The Header Extension supports multiple full 32-bit SSRC spaces and can thus handle all the RTP sessions without any SSRC translation. However, this proposal runs into the problem that the gateway needs to be in the security context to be able to add or remove the header extension when SRTP is used. In addition to the security implications of that, there is a complexity overhead due to the need to redo the authentication tags on all RTP/RTCP packets. Thus it gets a partial.

The Octet in the Padding field shares these issues with the Header Extension but has even higher complexity. The reason is that the padding field is also encrypted.
Thus adding or removing it (although removing it may be unnecessary) forces the end-point to encrypt at least that byte as well, and for ciphers that are not stream ciphers the whole packet needs to be re-encrypted. Thus this proposal is rated as only very weakly partially meeting the requirement.

Single Session and Redefine the SSRC field do not allow several vanilla RTP sessions to be connected to these proposals. The reason is the single 32-bit SSRC space they have. Single Session has only one session, and Redefine the SSRC field uses some of the bits as a session identifier. This forces the gateway to translate the SSRC whenever it does not fulfill the rules or semantics of the multiplexed side. For Redefine the SSRC field this becomes almost constant, as the session identifier part of the SSRC must be the same over all SSRCs from the same session. For Single Session it may only be needed when there would otherwise be an SSRC collision between the sessions. This further assumes that the non-multiplexed side never uses any of the RTP mechanisms that require the same SSRC in multiple RTP sessions, as they cannot be gatewayed at all. Translating an SSRC first of all incurs an overhead, which with SRTP includes a complete cycle of authentication, decryption, encryption, and creation of a new authentication tag. In addition, the SSRC translation could potentially be a deployment obstacle for new RTP/RTCP extensions, which must be understood by the translator to be correctly translated. Therefore these two proposals fail to meet the requirement.

B.9. Summary and Conclusion

This section contains a summary table of the high-level outcome against the different requirements.

The ID numbers used in the table map to the requirements as follows:

   1: Support multiple RTP sessions over one transport flow

   2: Enable same SSRC value in multiple RTP sessions

   2.1: Avoid SSRC translation in gateways/translators

   2.2: Support existing extensions

   3: Ensure SRTP functions

   4: Don't redefine used bits

   5: Firewall friendly

   6: Monitoring and reporting should still function

   7: Usable over multicast

   8: Incremental deployment

   OH: Overhead in bytes. + means variable

   ---------------+---+---+---+---+---+---+---+---+---+----
   Solution       | 1 |2.1|2.2| 3 | 4 | 5 | 6 | 7 | 8 | OH
   ---------------+---+---+---+---+---+---+---+---+---+----
   Header Ext.    | S | S | P | P | F | P | F | S | P | 8+
   Multiplex Shim | S | S | S | S | S | P | F | S | S | 1
   Single Session | F | F | F | S | S | P | S | S | F | 0
   SRTP MKI Field | S | S | S | P | F | P | F | S | S | 4
   Padding Field  | S | S | S | F | P | P | F | S | P | 2
   Redefine SSRC  | S | F | F | P | F | P | P | S | F | 0
   ---------------+---+---+---+---+---+---+---+---+---+----

   Figure 4: Summary Table of Evaluation (Successfully (S), Partially
   (P) or Fails (F) to meet requirement)

Considering these options, the authors recommend that AVTCORE standardize a solution based on a postfixed or prefixed multiplexing field, i.e. a shim approach, combined with the appropriate signalling as described in Appendix A.2.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Colin Perkins
   University of Glasgow
   School of Computing Science
   Glasgow G12 8QQ
   United Kingdom

   Email: csp@csperkins.org