AVTEXT Working Group                                              J. Xia
Internet-Draft                                                    Huawei
Intended status: Informational                         November 14, 2012
Expires: May 18, 2013


                   Content Splicing for RTP Sessions
                 draft-ietf-avtext-splicing-for-rtp-13

Abstract

   Content splicing is a process that replaces the content of a main
   multimedia stream with other multimedia content, and delivers the
   substitutive multimedia content to the receivers for a period of
   time. Splicing is commonly used for local advertisement insertion by
   cable operators, replacing national advertisement content with a
   local advertisement.

   This memo describes some use cases for content splicing and a set of
   requirements for splicing content delivered by RTP. It provides
   concrete guidelines for how an RTP mixer can be used to handle
   content splicing.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 18, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  System Model and Terminology
   3.  Requirements for RTP Splicing
   4.  Content Splicing for RTP sessions
     4.1.  RTP Processing in RTP Mixer
     4.2.  RTCP Processing in RTP Mixer
     4.3.  Considerations for Handling Media Clipping at the RTP Layer
     4.4.  Congestion Control Considerations
     4.5.  Considerations for Implementing Undetectable Splicing
   5.  Implementation Considerations
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Appendix: Why Mixer Is Chosen
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address

1.  Introduction

   This document outlines how content splicing can be used in RTP
   sessions. Splicing, in general, is a process where part of a
   multimedia content is replaced with other multimedia content, and
   delivered to the receivers for a period of time. The substitutive
   content can be provided, for example, via another stream or via local
   media file storage. One representative use case for splicing is
   local advertisement insertion, allowing content providers to replace
   the national advertising content with their own regional advertising
   content prior to delivering the regional advertising content to the
   receivers. Besides the advertisement insertion use case, there are
   other use cases in which splicing technology can be applied, for
   example, splicing a recorded video into a video conferencing session,
   or implementing a playlist server that stitches pieces of video
   together.

   Content splicing is a well-defined operation in MPEG-based cable TV
   systems.
   Indeed, the Society of Cable Telecommunications Engineers
   (SCTE) has created two standards, [SCTE30] and [SCTE35], to
   standardize the MPEG2-TS splicing procedure. SCTE 30 creates a
   standardized method for communication between an advertisement server
   and a splicer, and SCTE 35 supports splicing of MPEG2 transport
   streams.

   When multimedia splicing is used on the Internet, the media may be
   transported by RTP. In this case the original media content and the
   substitutive media content will use the same time period, but may
   contain different numbers of RTP packets due to different media
   codecs and entropy coding. This mismatch may require some
   adjustments of the RTP header sequence number to maintain
   consistency. [RFC3550] provides the tools to enable seamless
   content splicing in an RTP session, but to date there have been no
   clear guidelines on how to use these tools.

   This memo outlines the requirements for content splicing in RTP
   sessions and describes how an RTP mixer can be used to meet these
   requirements.

2.  System Model and Terminology

   In this document, an intermediary network element, the splicer,
   handles RTP splicing. The splicer can receive main content and
   substitutive content simultaneously, but sends only one of them at
   any point in time.

   When RTP splicing begins, the splicer sends the substitutive content
   to the RTP receiver instead of the main content for a period of time.
   When RTP splicing ends, the splicer switches back to sending the main
   content to the RTP receiver.

   A simplified RTP splicing diagram is depicted in Figure 1, in which
   only one main content flow and one substitutive content flow are
   given. In practice, the splicer can handle splicing for multiple RTP
   sessions simultaneously. RTP splicing may happen more than once, in
   multiple time slots, during the lifetime of the main RTP stream.
   How the splicer learns when to start and end splicing is out of
   scope for this document.

   +---------------+
   |               |  Main Content   +-----------+
   |   Main RTP    |---------------->|           |  Output Content
   |   Content     |                 |  Splicer  |--------------->
   +---------------+     ----------->|           |
                         |           +-----------+
                         |
                         | Substitutive Content
                         |
                         |
            +-----------------------+
            |   Substitutive RTP    |
            |        Content        |
            |          or           |
            |  Local File Storage   |
            +-----------------------+

                Figure 1: RTP Splicing Architecture

   This document uses the following terminology.

   Output RTP Stream

      The RTP stream that the RTP receiver is currently receiving. The
      content of the output RTP stream can be either main content or
      substitutive content.

   Main Content

      The multimedia content that is conveyed in the main RTP stream.
      Main content will be replaced by the substitutive content during
      splicing.

   Main RTP Stream

      The RTP stream that the splicer is receiving. The content of the
      main RTP stream can be replaced by substitutive content for a
      period of time.

   Main RTP Sender

      The sender of RTP packets carrying the main RTP stream.

   Substitutive Content

      The multimedia content that replaces the main content during
      splicing. The substitutive content can, for example, be contained
      in an RTP stream from a media sender or fetched from local media
      file storage.

   Substitutive RTP Stream

      An RTP stream with new content that will replace the content in
      the main RTP stream. The substitutive RTP stream and the main RTP
      stream are two separate streams. If the substitutive content is
      provided via a substitutive RTP stream, the substitutive RTP
      stream must pass through the splicer before the substitutive
      content is delivered to the receiver.

   Substitutive RTP Sender

      The sender of RTP packets carrying the substitutive RTP stream.

   Splicing In Point

      A virtual point in the RTP stream, suitable for substitutive
      content entry, typically at the boundary between two independently
      decodable frames.

   Splicing Out Point

      A virtual point in the RTP stream, suitable for substitutive
      content exit, typically at the boundary between two independently
      decodable frames.

   Splicer

      An intermediary node that inserts substitutive content into the
      main RTP stream. The splicer sends substitutive content to the
      RTP receiver instead of main content during splicing. It is also
      responsible for processing RTCP traffic between the RTP sender and
      the RTP receiver.

3.  Requirements for RTP Splicing

   In order to allow seamless content splicing at the RTP layer, the
   following requirements must be met. Meeting these will also allow,
   but not require, seamless content splicing at layers above RTP.

   REQ-1:

      The splicer should be agnostic about the network and transport
      layer protocols used to deliver the RTP streams.

   REQ-2:

      The splicing operation at the RTP layer must allow splicing at any
      point required by the media content, and must not constrain when
      splicing in or splicing out operations can take place.

   REQ-3:

      Splicing of RTP content must be backward compatible with the RTP/
      RTCP protocol, associated profiles, payload formats, and
      extensions.

   REQ-4:

      The splicer will modify the content of RTP packets, and thus break
      the end-to-end security, at a minimum breaking the data integrity
      and source authentication. If the splicer is designated to insert
      substitutive content, it must be trusted, i.e., be in the security
      context(s) with the main RTP sender, the substitutive RTP sender,
      and the receivers. If encryption is employed, the splicer
      commonly must decrypt the inbound RTP packets and re-encrypt the
      outbound RTP packets after splicing.

   REQ-5:

      The splicer should rewrite as necessary and forward RTCP messages
      (e.g., reports of packet loss and jitter) sent from the downstream
      receiver to the main RTP sender or the substitutive RTP sender,
      and thus allow the main RTP sender or substitutive RTP sender to
      learn the performance of the downstream receiver when its content
      is being passed to the RTP receiver. In addition, the splicer
      should rewrite RTCP messages from the main RTP sender or
      substitutive RTP sender to the receiver.

   REQ-6:

      The splicer must not affect other RTP sessions running between the
      RTP sender and the RTP receiver, and must be transparent for the
      RTP sessions it does not splice.

   REQ-7:

      The RTP receiver should not be able to detect any splicing points
      in the RTP stream produced by the splicer at the RTP protocol
      level. For the advertisement insertion use case, it is important
      to make it difficult for the RTP receiver to detect from the RTP
      packets where an advertisement insertion starts or ends, thus
      preventing the RTP receiver from filtering out the advertisement
      content. This memo only focuses on making the splicing
      undetectable at the RTP layer. The corresponding processing is
      described in Section 4.5.

4.  Content Splicing for RTP sessions

   The RTP specification [RFC3550] defines two types of middlebox: RTP
   translators and RTP mixers. Splicing is best viewed as a mixing
   operation. The splicer generates a new RTP stream that is a mix of
   the main RTP stream and the substitutive RTP stream. An RTP mixer is
   therefore an appropriate model for a content splicer. In the next
   four subsections (Sections 4.1 to 4.4), the document analyzes how the
   mixer handles RTP splicing and how it satisfies the general
   requirements listed in Section 3. Section 4.5 looks at REQ-7, that
   is, how to hide the fact that splicing takes place.

4.1.  RTP Processing in RTP Mixer

   A splicer could be implemented as a mixer that receives the main RTP
   stream and the substitutive content (possibly via a substitutive RTP
   stream), and sends a single output RTP stream to the receiver(s).
   That output RTP stream will contain either the main content or the
   substitutive content. The output RTP stream will come from the
   mixer, and will have the synchronization source (SSRC) of the mixer
   rather than that of the main RTP sender or the substitutive RTP
   sender.

   The mixer uses its own SSRC, sequence number space, and timing model
   when generating the output stream. Moreover, the mixer may insert
   the SSRC of the main RTP stream into the contributing source (CSRC)
   list in the output media stream.

   At the splicing in point, when the substitutive content becomes
   active, the mixer chooses the substitutive RTP stream as its input
   stream and extracts the payload data (i.e., the substitutive
   content). If the substitutive content comes from local media file
   storage, the mixer fetches the substitutive content directly. After
   that, the mixer encapsulates substitutive content instead of main
   content as the payload of the output media stream, and then sends the
   output RTP media stream to the receiver. The mixer may insert the
   SSRC of the substitutive RTP stream into the CSRC list in the output
   media stream. If the substitutive content comes from local media
   file storage, the mixer should leave the CSRC list blank.

   At the splicing out point, when the substitutive content ends, the
   mixer returns to the main RTP stream as its input stream and extracts
   the payload data (i.e., the main content). After that, the mixer
   encapsulates main content instead of substitutive content as the
   payload of the output media stream, and then sends the output media
   stream to the receivers.
   Moreover, the mixer may insert the SSRC of the main RTP stream into
   the CSRC list in the output media stream as before.

   Note that if the content is too large to fit into RTP packets sent to
   the RTP receiver, the mixer needs to transcode or perform
   application-layer fragmentation. Usually the mixer is deployed as
   part of a managed system, and the MTU will be carefully managed by
   this system. This document does not raise any new MTU-related issues
   compared to a standard mixer described in [RFC3550].

   Splicing may occur more than once during the lifetime of the main RTP
   stream. This means the mixer needs to send main content and
   substitutive content in turn with its own SSRC identifier. From the
   receiver's point of view, the only source of the output stream is the
   mixer, regardless of where the content is coming from.

4.2.  RTCP Processing in RTP Mixer

   By monitoring available bandwidth and buffer levels and by computing
   network metrics such as packet loss, network jitter, and delay, the
   RTP receiver can learn the network performance and communicate this
   to the RTP sender via RTCP reception reports.

   According to the description in Section 7.3 of [RFC3550], the mixer
   splits the RTCP flow between sender and receiver into two separate
   RTCP loops, so the RTP sender has no view of the situation at the
   receiver. Splicing, however, is a process in which the mixer selects
   one media stream from multiple streams rather than mixing them, so
   the mixer can leave the SSRC identifier in the RTCP report intact
   (i.e., the SSRC of the downstream receiver). This enables the main
   RTP sender or the substitutive RTP sender to learn the situation at
   the receiver.
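
   To pass reception reports back to the right sender, the mixer has to
   remember which input packet each output packet was built from. The
   Python sketch below is illustrative only and is not part of this
   memo's guidelines. It shows a mixer stamping its own SSRC and
   sequence numbers on outgoing packets, as in Section 4.1, while
   recording that mapping. The names RtpPacket, SplicingMixer, emit,
   and origin are hypothetical, and the caller is assumed to supply
   timestamps from the mixer's own timing model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RtpPacket:
    ssrc: int                 # synchronization source of the packet
    seq: int                  # 16-bit RTP sequence number
    timestamp: int            # 32-bit RTP timestamp
    payload: bytes
    csrc: List[int] = field(default_factory=list)


class SplicingMixer:
    """Hypothetical mixer with its own SSRC and sequence number space."""

    def __init__(self, ssrc: int, first_seq: int = 0) -> None:
        self.ssrc = ssrc
        self.next_seq = first_seq & 0xFFFF
        # Output seq -> (input SSRC, input seq), or None for content
        # fetched from local file storage. Keyed by the 16-bit value for
        # brevity; a real mixer would use extended sequence numbers and
        # prune old entries.
        self.origin: Dict[int, Optional[Tuple[int, int]]] = {}

    def emit(self, payload: bytes, timestamp: int,
             source: Optional[RtpPacket] = None) -> RtpPacket:
        """Wrap payload in an output packet; source is the input packet
        it was taken from, or None for local file content."""
        seq = self.next_seq
        self.next_seq = (seq + 1) & 0xFFFF
        # The memo says the mixer "may" list the input SSRC as a CSRC;
        # this sketch always does so for RTP input and leaves the list
        # empty for local file content.
        csrc = [] if source is None else [source.ssrc]
        self.origin[seq] = None if source is None else (source.ssrc, source.seq)
        return RtpPacket(self.ssrc, seq, timestamp & 0xFFFFFFFF, payload, csrc)
```

   The same emit() call is used whether splicing is active or not; only
   the choice of source changes at the splicing in and out points.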

   If the RTCP report corresponds to a time interval that is entirely
   main content or entirely substitutive content, the number of output
   RTP packets containing substitutive content is equal to the number of
   input substitutive RTP packets (from the substitutive RTP stream)
   during splicing. In the same manner, the number of output RTP
   packets containing main content is equal to the number of input main
   RTP packets (from the main RTP stream) while splicing is not active,
   unless the mixer fragments the input RTP packets. This means that
   the mixer does not need to modify the packet loss fields in reception
   report blocks in RTCP reports. But if the mixer fragments the input
   RTP packets, it may need to modify the packet loss fields to
   compensate for the fragmentation. Whether the input RTP packets are
   fragmented or not, the mixer still needs to change the SSRC field in
   the report block to the SSRC identifier of the main RTP sender or the
   substitutive RTP sender, and rewrite the extended highest sequence
   number field to the corresponding original extended highest sequence
   number, before forwarding the RTCP report to the main RTP sender or
   the substitutive RTP sender.

   If the RTCP report spans the splicing in point or the splicing out
   point, it reflects the characteristics of the combination of main RTP
   packets and substitutive RTP packets. In this case, the mixer needs
   to divide the RTCP report into two separate RTCP reports and send
   each to its original RTP sender. For each RTCP report, the mixer
   also needs to make the corresponding changes to the packet loss
   fields in the report block, in addition to the SSRC field and the
   extended highest sequence number field.

   If the mixer receives an RTCP extended report (XR) block, it should
   rewrite the XR report block in a similar way to the reception report
   block in the RTCP report.
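
   The report block rewriting described above can be sketched in Python
   as follows. This is a hedged illustration, not a definitive
   implementation: ReportBlock, Origin, and split_report are
   hypothetical names, the list "sent" is assumed to hold one record for
   each output packet sent since the previous report, and only the SSRC
   and extended highest sequence number fields are handled. The packet
   loss fields are left out because this memo does not specify how they
   are to be apportioned between the two senders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ReportBlock:
    ssrc: int             # SSRC of the source being reported on
    ext_highest_seq: int  # extended highest sequence number received


# One record per output packet:
# (output extended seq, original sender SSRC, original extended seq)
Origin = Tuple[int, int, int]


def split_report(block: ReportBlock, sent: List[Origin]) -> Dict[int, ReportBlock]:
    """Map a receiver's report block on the mixer's output stream back
    into one report block per original sender."""
    per_sender: Dict[int, ReportBlock] = {}
    for out_seq, src_ssrc, src_seq in sent:
        if out_seq > block.ext_highest_seq:
            continue  # the receiver has not reported on this packet yet
        prev = per_sender.get(src_ssrc)
        if prev is None or src_seq > prev.ext_highest_seq:
            # Report on the original sender, in its own sequence space.
            per_sender[src_ssrc] = ReportBlock(src_ssrc, src_seq)
    return per_sender
```

   When the interval is entirely main content or entirely substitutive
   content, the result holds a single block; when it spans a splicing
   point, it holds two, one for each sender.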

   Besides forwarding the RTCP reports sent from the RTP receiver, the
   mixer can also generate its own RTCP reports to inform the main RTP
   sender or the substitutive RTP sender of the reception quality of the
   content that reaches the mixer when the content is not sent to the
   RTP receiver. These RTCP reports use the SSRC of the mixer. If the
   substitutive content comes from local media file storage, the mixer
   does not need to generate RTCP reports for the substitutive stream.

   With the RTCP mechanism described above, the RTP sender whose content
   is being passed to the receiver will see the reception quality of its
   stream as received by the mixer, and the reception quality of the
   spliced stream as received by the receiver. The RTP sender whose
   content is not being passed to the receiver will only see the
   reception quality of its stream as received by the mixer.

   The mixer must forward RTCP SDES and BYE packets from the receiver to
   the sender, and may forward them in the inverse direction as defined
   in Section 7.3 of [RFC3550].

   When the mixer receives an RTP/AVPF [RFC4585] transport layer
   feedback packet, it must handle it carefully, as the feedback packet
   may contain information about content that comes from different RTP
   senders. In this case the mixer needs to divide the feedback packet
   into two separate feedback packets and process the information in the
   feedback control information (FCI) in the two feedback packets, just
   as in the RTCP report process described above.

   If the substitutive content comes from local media file storage
   (i.e., the mixer can be regarded as the substitutive RTP sender), any
   RTCP packets received from downstream that relate to the substitutive
   content must be terminated on the mixer without any further
   processing.

4.3.  Considerations for Handling Media Clipping at the RTP Layer

   This section provides informative guidelines on how to handle media
   substitution at the RTP layer to minimize media impact. Dealing
   with the media substitution well at the RTP layer is necessary for
   quality implementations. Perfectly erasing any media impact needs
   more consideration at the higher layers. How the media substitution
   is hidden at the higher layers is outside the scope of this memo.

   If the duration of any substitutive content does not match, i.e., is
   shorter or longer than, the duration of the main content to be
   replaced, then media degradation may occur at the splicing point and
   thus impact the user's experience.

   If the substitutive content has a shorter duration than the main
   content, then there could be a gap in the output RTP stream. The RTP
   sequence number will be contiguous across this gap, but there will be
   an unexpected jump in the RTP timestamp. Such a gap would cause the
   receiver to have nothing to play. This may be unavoidable, unless
   the mixer can adjust the splice in or splice out point to
   compensate. This assumes the splicing mixer can send more of the
   main RTP stream in place of the shorter substitutive stream, or vary
   the length of the substitutive content. It is the responsibility of
   the higher layer protocols and the media providers to ensure that the
   substitutive content is of very similar duration to the main content
   to be replaced.

   If the substitutive content has a longer duration than the reserved
   gap duration, there will be an overlap between the substitutive RTP
   stream and the main RTP stream at the splicing out point. A
   straightforward approach is for the mixer to act ungracefully,
   terminating the splicing and switching back to the main RTP stream
   even if this may cause media stuttering at the receiver.
   Alternatively, the mixer may transcode the substitutive content to
   play at a faster rate than normal, to adjust it to the length of the
   gap in the main content, and generate a new RTP stream for the
   transcoded content. This is a complex operation, and very specific
   to the content and media codec used. Additional approaches exist.
   These types of issues should be taken into account by both mixer
   implementors and media generators to enable smooth substitutions.

4.4.  Congestion Control Considerations

   If the substitutive content has somewhat different characteristics
   from the main content it replaces, or if the substitutive content is
   encoded with a different codec or has a different encoding bitrate,
   it might overload the network and might cause network congestion on
   the path between the mixer and the RTP receiver(s) that would not
   have been caused by the main content.

   To be robust to network congestion and packet loss, a mixer that is
   performing splicing must continuously monitor the status of the
   downstream network by monitoring any of the following RTCP reports
   that are used:

   1.  RTCP receiver reports indicating packet loss [RFC3550].

   2.  RTCP NACKs for lost packet recovery [RFC4585].

   3.  RTCP ECN feedback information [RFC6679].

   Once the mixer detects congestion on its downstream link, it will
   treat these reports as follows:

   1.  If the mixer receives RTCP receiver reports with a packet loss
       indication, it will forward the reports to the substitutive RTP
       sender or the main RTP sender as described in Section 4.2.

   2.  If the mixer receives RTCP NACK packets defined in [RFC4585] from
       the RTP receiver for packet loss recovery, it first identifies
       the content category of the lost packets to which the NACK
       corresponds.
       Then, the mixer will generate a new RTCP NACK for the lost
       packets with its own SSRC, and make corresponding changes to
       their sequence numbers to match the original, pre-spliced
       packets. If the lost substitutive content comes from local media
       file storage, the mixer acting as substitutive RTP sender will
       directly fetch the lost substitutive content and retransmit it to
       the RTP receiver. The mixer may buffer the sent RTP packets and
       do the retransmission.

       A more complex case arises when the lost packets requested in a
       single RTCP NACK message contain both main content and
       substitutive content. To address this, the mixer must divide the
       RTCP NACK packet into two separate RTCP NACK packets: one
       requesting the lost main content, and another requesting the lost
       substitutive content.

   3.  If an ECN-aware mixer receives RTCP ECN feedback (RTCP ECN
       feedback packets or RTCP XR summary reports) defined in [RFC6679]
       from the RTP receiver, it must process it in a similar way to the
       RTP/AVPF feedback packet or RTCP XR process described in
       Section 4.2 of this memo.

   These three methods require the mixer to run a congestion control
   loop and bitrate adaptation between itself and the RTP receiver. The
   mixer can thin or transcode the main RTP stream or the substitutive
   RTP stream, but such operations are very inefficient and difficult,
   and introduce undesirable delay. Fortunately, in this memo the mixer
   acting as splicer can rewrite the RTCP packets sent from the RTP
   receiver and forward them to the RTP sender, thus letting the RTP
   sender know that congestion is being experienced on the path between
   the mixer and the RTP receiver. Then, the RTP sender applies its
   congestion control algorithm and reduces the media bitrate to a value
   that is in compliance with congestion control principles for the
   slowest link.
   The congestion control algorithm may be a TCP-friendly
   bitrate adaptation algorithm as specified in [RFC5348], or a DCCP
   congestion control algorithm as described in [RFC5762].

   If the substitutive content comes from local media file storage, the
   mixer must directly reduce the bitrate as if it were the substitutive
   RTP sender.

   Following the analysis above, to reduce the risk of congestion and
   keep bandwidth consumption stable over time, it is recommended that
   the substitutive RTP stream be encoded at an appropriate bitrate to
   match that of the main RTP stream. If the substitutive RTP stream
   comes from the substitutive RTP sender, this sender should have some
   knowledge of the media encoding bitrate of the main content in
   advance. How it obtains that knowledge is out of scope for this
   draft.

4.5.  Considerations for Implementing Undetectable Splicing

   If it is desirable to prevent receivers from detecting that splicing
   is occurring at the RTP layer, the mixer must not include a CSRC list
   in outgoing RTP packets, and must not forward RTCP messages from the
   main RTP sender or from the substitutive RTP sender. Due to the
   absence of a CSRC list in the output RTP stream, the RTP receiver
   only initiates SDES, BYE, and APP packets to the mixer, without any
   knowledge of the main RTP sender and the substitutive RTP sender.

   The CSRC list identifies the contributing sources. The SSRC
   identifiers of contributing sources are kept globally unique for each
   RTP session. The uniqueness of the SSRC identifier is used to
   resolve collisions and detect RTP-level forwarding loops as defined
   in Section 8.2 of [RFC3550]. The absence of a CSRC list in this case
   creates a danger that loops involving those contributing sources
   could not be detected.
   The loops could occur if either the mixer is
   misconfigured to form a loop, or a second mixer/translator is added,
   causing packets to loop back to upstream of the original mixer. An
   undetected RTP packet loop is a serious denial-of-service threat,
   which can consume all available bandwidth or mixer processing
   resources until the looped packets are dropped as a result of
   congestion. So non-RTP means must be used to detect and resolve
   loops if the mixer does not add a CSRC list.

5.  Implementation Considerations

   When the mixer is used to handle RTP splicing, the RTP receiver does
   not need any RTP/RTCP extension for splicing. As a trade-off,
   additional overhead could be induced on the mixer, which uses its own
   sequence number space and timing model. The mixer will therefore
   rewrite the RTP sequence number and timestamp whether splicing is
   active or not, and generate RTCP flows for both sides. If the mixer
   serves multiple main RTP streams simultaneously, this may lead to
   more overhead on the mixer.

   If undetectable splicing is required, the CSRC list is not included
   in outgoing RTP packets. This brings a potential issue with loop
   detection, as briefly described in Section 4.5.

6.  Security Considerations

   The splicing application is subject to the general security
   considerations of the RTP specification [RFC3550].

   The mixer acting as splicer replaces some content with other content
   in RTP packets, thus breaking any RTP-level end-to-end security, such
   as integrity protection and source authentication. Thus any RTP-
   level or outside security mechanism, such as IPsec or DTLS, will use
   a security association between the splicer and the receiver. When
   using SRTP [RFC3711], the splicer could be provisioned with the same
   security association as the main RTP sender.
   Using a limitation in the SRTP
   security services regarding source authentication, the splicer can
   modify and re-protect the RTP packets without enabling the receiver
   to detect whether the data comes from the original source or from the
   splicer.

   The security goal of having source authentication all the way from
   the main RTP sender to the receiver through the splicer cannot be met
   with splicing and any existing solution. A new solution could
   theoretically be developed that enables identifying the participating
   entities and what each provides, i.e., the different media sources,
   main and substitutive, and the splicer providing the RTP-level
   integration of the media payloads in a common timeline and
   synchronization context. Such a solution would not meet REQ-7 and
   would be detectable at the RTP level.

   The nature of this RTP service offered by a network operator
   employing a content splicer is that the RTP layer security
   relationships are between the receiver and the splicer, and between
   the senders and the splicer, and are not end-to-end. This appears to
   invalidate the undetectability goal, but in the common case the
   receiver will consider the splicer as the main media source.

   Some RTP deployments use RTP payload security mechanisms (e.g.,
   ISMACryp [ISMACryp]). If any payload internal security mechanisms
   are used, only the RTP sender and the RTP receiver establish that
   security context, in which case any middlebox (e.g., the splicer)
   between the RTP sender and the RTP receiver will not get such keying
   material. This may impact the splicer's ability to perform splicing
   if it depends on RTP payload-level hints for finding the splice in
   and out points. However, other potential solutions exist to specify
   or mark where the splicing points exist in the media streams.
   When RTP payload security mechanisms are used, SRTP or another
   security mechanism at the RTP layer or at lower layers can still be
   used to provide integrity and source authentication between the
   splicer and the RTP receiver.

7.  IANA Considerations

   No IANA actions are required.

8.  Acknowledgments

   The following individuals have reviewed earlier versions of this
   specification and provided very valuable comments: Colin Perkins,
   Magnus Westerlund, Roni Even, Tom Van Caenegem, Joerg Ott, David R
   Oran, Cullen Jennings, Ali C Begen, Charles Eckel and Ning Zong.

9.  Appendix - Why Mixer Is Chosen

   Both a translator and a mixer can realize splicing by changing a set
   of RTP parameters.

   A translator has no SSRC of its own; hence, it is transparent to the
   RTP sender and receiver.  The RTP sender therefore sees the full
   path to the receiver while the translator is passing its content.
   When the translator inserts the substitutive content, the RTP sender
   can only get a report on the path up to the translator itself.
   Additionally, as long as splicing is not occurring, the translator
   does not need to rewrite the RTP header, so that overhead on the
   translator is avoided.

   If a mixer is used to do splicing, it can also allow the RTP sender
   to learn the status of its content at the receiver or at the mixer,
   just as a translator does; this is specified in Section 4.2.
   Compared to a translator, the main benefit of a mixer is that
   handling RTCP messages is straightforward, for example, for bit-rate
   adaptation to varying network conditions.  A translator needs more
   consideration here, and its implementation is more complex.

   From the above analysis, the translator and the mixer each have
   their own advantage: less overhead for the translator, and less
   complexity in handling RTCP for the mixer.  After long discussion,
   the AVTEXT WG members preferred less complexity over less overhead
   and chose the mixer to do splicing.
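
   The per-packet overhead weighed in this appendix comes from the
   header rewriting described in Section 5: the mixer maps every
   forwarded packet into its own SSRC, sequence number space, and
   timeline, whether or not a splice is active.  The following Python
   sketch illustrates that mapping only.  It is not part of this memo's
   guidelines, and it rests on simplifying assumptions: the names
   RtpHeader and HeaderRewriter are hypothetical, input packets are
   assumed to arrive in order and without loss, every packet is assumed
   to advance the timeline by a fixed frame_ticks value, and RTCP
   generation, CSRC handling, and SRTP are left out.

```python
from dataclasses import dataclass

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide (RFC 3550)
TS_MOD = 1 << 32   # RTP timestamps are 32 bits wide (RFC 3550)


@dataclass
class RtpHeader:
    """Hypothetical minimal view of the header fields the mixer rewrites."""
    ssrc: int
    seq: int
    timestamp: int


class HeaderRewriter:
    """Hypothetical helper mapping input packets (main or substitutive)
    into the mixer's own SSRC, sequence number space, and timeline.

    Assumes in-order, loss-free input and a fixed timestamp increment
    (frame_ticks) per packet; a real mixer needs more than this.
    """

    def __init__(self, mixer_ssrc, frame_ticks, first_seq=0, first_ts=0):
        self.mixer_ssrc = mixer_ssrc
        self.frame_ticks = frame_ticks
        self.next_seq = first_seq % SEQ_MOD
        self.next_ts = first_ts % TS_MOD
        self.active_ssrc = None  # SSRC of the input currently forwarded
        self.ts_offset = 0

    def rewrite(self, pkt):
        if pkt.ssrc != self.active_ssrc:
            # Input changed (splice in or splice out): re-anchor the new
            # input's timestamps so the output timeline continues.
            self.active_ssrc = pkt.ssrc
            self.ts_offset = (self.next_ts - pkt.timestamp) % TS_MOD
        out_ts = (pkt.timestamp + self.ts_offset) % TS_MOD
        out = RtpHeader(self.mixer_ssrc, self.next_seq, out_ts)
        # The rewrite happens for every packet, spliced or not.
        self.next_seq = (self.next_seq + 1) % SEQ_MOD
        self.next_ts = (out_ts + self.frame_ticks) % TS_MOD
        return out
```

   In this sketch the receiver only ever sees the mixer's SSRC with
   contiguous sequence numbers and a continuous timeline across splice
   points, while the mixer pays the rewriting cost on every packet of
   every stream it serves.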

   If a mixer is chosen as the splicer, the overhead on the mixer must
   be taken into account even while splicing is not occurring.

10.  References

10.1.  Normative References

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              July 2006.

   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
              and K. Carlberg, "Explicit Congestion Notification (ECN)
              for RTP over UDP", RFC 6679, August 2012.

10.2.  Informative References

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [RFC5762]  Perkins, C., "RTP and the Datagram Congestion Control
              Protocol (DCCP)", RFC 5762, April 2010.

   [SCTE30]   Society of Cable Telecommunications Engineers (SCTE),
              "Digital Program Insertion Splicing API", 2009.

   [SCTE35]   Society of Cable Telecommunications Engineers (SCTE),
              "Digital Program Insertion Cueing Message for Cable",
              2011.

   [ISMACryp] Internet Streaming Media Alliance (ISMA), "ISMA Encryption
              and Authentication Specification 2.0", November 2007.

Author's Address

   Jinwei Xia
   Huawei
   Software No.101
   Nanjing, Yuhuatai District  210012
   China

   Phone: +86-025-86622310
   Email: xiajinwei@huawei.com