idnits 2.17.1 draft-ietf-avtext-splicing-for-rtp-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (February 7, 2012) is 4456 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC2250' is defined on line 666, but no explicit reference was found in the text == Unused Reference: 'RFC3551' is defined on line 674, but no explicit reference was found in the text == Unused Reference: 'RFC5117' is defined on line 691, but no explicit reference was found in the text == Unused Reference: 'RFC5760' is defined on line 708, but no explicit reference was found in the text ** Obsolete normative reference: RFC 5117 (Obsoleted by RFC 7667) == Outdated reference: A later version (-08) exists of draft-ietf-avtcore-ecn-for-rtp-05 Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 AVTEXT Working Group J. Xia 3 Internet-Draft Huawei 4 Intended status: Informational February 7, 2012 5 Expires: August 10, 2012 7 Content Splicing for RTP Sessions 8 draft-ietf-avtext-splicing-for-rtp-05 10 Abstract 12 This memo outlines RTP splicing. Splicing is a process that replaces 13 the content of the main multimedia stream with other multimedia 14 content, and delivers the substitutive multimedia content to receiver 15 for a period of time. This memo provides some RTP splicing use 16 cases, then we enumerate a set of requirements and analyze whether an 17 existing RTP level middlebox can meet these requirements, at last we 18 provide concrete guidelines for how the chosen middlebox works to 19 handle RTP splicing. 21 Status of this Memo 23 This Internet-Draft is submitted to IETF in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/. 31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on August 10, 2012. 38 Copyright Notice 40 Copyright (c) 2012 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. 
Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 56 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 57 3. RTP Splicing Discussion and Requirements . . . . . . . . . . . 5 58 4. Recommended Solution for RTP Splicing . . . . . . . . . . . . 7 59 4.1. RTP Processing in RTP Mixer . . . . . . . . . . . . . . . 7 60 4.2. RTCP Processing in RTP Mixer . . . . . . . . . . . . . . . 9 61 4.3. Media Clipping Considerations . . . . . . . . . . . . . . 10 62 4.4. Congestion Control Considerations . . . . . . . . . . . . 11 63 4.5. Processing Splicing in User Invisibility Case . . . . . . 13 64 5. Implementation Considerations . . . . . . . . . . . . . . . . 13 65 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14 66 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 67 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14 68 9. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 14 69 9.1. draft-xia-avtext-splicing-for-rtp-01 . . . . . . . . . . . 14 70 9.2. draft-xia-avtext-splicing-for-rtp-00 . . . . . . . . . . . 15 71 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 72 10.1. Normative References . . . . . . . . . . . . . . . . . . . 15 73 10.2. Informative References . . . . . . . . . . . . . . . . . . 16 74 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 17 76 1. Introduction 78 This document outlines how splicing can be used for RTP sessions. 79 Splicing is a process that replaces the content of the main RTP 80 stream with other multimedia content, and delivers the substitutive 81 content to receiver for a period of time. 
The substitutive content 82 can be provided for example via another RTP stream or local media 83 file storage. 85 One representative use case for splicing is advertisements insertion, 86 which allows operators to replace the national advertising content 87 with its own regional advertising content prior to delivering the 88 regional advertising content to receiver. 90 Besides the advertisement insertion use case, there are other use 91 cases to which RTP splicing technology can apply. For example, 92 splicing a recorded video into a video conferencing session, and 93 implementing a playlist server that stitches pieces of video together 94 and so forth. 96 So far [SCTE30] and [SCTE35] have standardized MPEG2-TS splicing 97 running over cable. The introduction of multimedia splicing into 98 internet requires changes to transport layer, but to date there is no 99 guideline for how to handle content splicing for RTP sessions 100 [RFC3550]. 102 In this document, we first describe a set of requirements of RTP 103 splicing. Then we provide a method about how an intermediary node 104 can be used to process RTP splicing to meet these requirements from 105 the aspects of feasibility, implementation complexity and backward 106 compatibility. 108 2. Terminology 110 The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 111 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 112 document are to be interpreted as described in [RFC2119]. 114 Current RTP Stream 116 The RTP stream that the RTP receiver is currently receiving. The 117 content of current RTP stream can be either main content or 118 substitutive content. 120 Main Content 122 The multimedia content that are conveyed in main RTP stream. Main 123 content will be replaced by the substitutive content during 124 splicing. 126 Main RTP Stream 128 The RTP stream that the Splicer is receiving. The content of main 129 RTP stream can be replaced by substitutive content for a period of 130 time. 
132 Substitutive Content 134 The multimedia content that replaces the main content during 135 splicing. The substitutive content can for example be contained 136 in an RTP stream from a media sender or fetched from local media 137 file storage. 139 Substitutive RTP Stream 141 A RTP stream that may provide substitutive content. Substitutive 142 RTP stream and main RTP stream are two separate streams. If the 143 substitutive content is provided via substitutive RTP stream, the 144 substitutive RTP Stream must pass through Splicer before the 145 substitutive content is delivered to receiver. 147 Splicing In Point 149 A virtual point in the RTP stream, suitable for substitutive 150 content entry, that exists in the boundary of two independently 151 decodable frames. 153 Splicing Out Point 155 A virtual point in the RTP stream, suitable for substitutive 156 content exit, that exists in the boundary of two independently 157 decodable frames. 159 Splicer 161 An intermediary node that inserts substitutive content into main 162 RTP stream. Splicer sends substitutive content to RTP receiver 163 instead of main content during splicing. It is also responsible 164 for processing RTCP traffic between media source and RTP receiver. 166 3. RTP Splicing Discussion and Requirements 168 In this document, we assume an intermediary network element, which is 169 referred to as Splicer, to play the key role to handle RTP splicing. 170 A simplified RTP splicing diagram is depicted in Figure 1, in which 171 only one main content flow and one substitutive content flow are 172 given. 
174 +---------------+ 175 | | Main Content +-----------+ 176 |Main RTP Sender|------------->| | Current Content 177 | | | Splicer |----------> 178 +---------------+ ---------->| | 179 | +-----------+ 180 | 181 | Substitutive Content 182 | 183 | 184 +-----------------------+ 185 |Substitutive RTP Sender| 186 | or | 187 | Local File Storage | 188 +-----------------------+ 190 Figure 1: RTP Splicing Architecture 192 When RTP splicing begins, Splicer stops delivering the main content, 193 instead delivering the substitutive content to RTP receiver for a 194 period of time, and then resumes the main content when splicing ends. 195 The methods how Splicer learns when to start and end the splicing is 196 out of scope for this document. The RTP splicing may happen more 197 than once in case that substitutive content will be dispersedly 198 inserted in multiple time slots during the lifetime of the main RTP 199 stream. 201 When realizing splicing technology on RTP layer, there are a set of 202 requirements that must be satisfied to at least some degree on 203 Splicer: 205 REQ-1: 207 Splicer MUST operate in either unicast or multicast session 208 environment. 210 REQ-2: 212 Splicer SHOULD NOT cause perceptible media clipping at the 213 splicing point and adverse impact on the quality of user 214 experience. 216 REQ-3: 218 Splicer MUST be backward compatible with RTP/RTCP protocols, and 219 its associated profiles and extensions to those protocols. For 220 example, Splicer MUST be robust to packet loss, network congestion 221 etc. 223 REQ-4: 225 Splicer MUST be trusted by media source and receiver, and has the 226 valid security context with media source and RTP receiver 227 respectively. 229 REQ-5: 231 Splicer SHOULD allow the media source to learn the performance of 232 the downstream receiver when its content is being passed to RTP 233 receiver. 235 In a number of deployment scenarios, especially advertisement 236 insertion, there may be one specific requirement. 
Given that it is 237 unacceptable to advertisers for their advertising content not to be 238 delivered to the user, RTP splicing may need to operate 239 within the following constraint: 241 REQ-6: 243 If Splicer intends to prevent RTP receiver from identifying and 244 filtering the substitutive content, it SHOULD eliminate the 245 visibility of the splicing process at the RTP level from the RTP receiver's 246 point of view. 248 However, substitutive content and main content may be encoded by 249 different encoders and have different parameter sets. In such a 250 case, full media transcoding must be done on Splicer to make 251 splicing completely invisible to RTP receiver, but this may be 252 prohibitively expensive and complex. As a trade-off, it is 253 RECOMMENDED to minimize the splicing visibility on RTP receiver, 254 i.e., keeping RTP header parameters consistent but leaving the 255 RTP payload untranscoded. If complete 256 invisibility is required, the cost of transcoding must be taken into account. 258 Henceforth, we refer to the minimum and complete invisibility 259 requirements collectively as the User Invisibility Requirement. 261 To improve interoperability with existing implementations, 262 it is RECOMMENDED to use existing tools in the RTP/RTCP 263 protocol family to realize RTP splicing without any protocol 264 extension unless the existing tools are inadequate for splicing. 266 4. Recommended Solution for RTP Splicing 268 Given that Splicer is an intermediary node that sits between the main 269 media source and the RTP receiver, and that splicing is not a very 270 complicated process, there is some chance that an existing 271 RTP-level middlebox already has the capability to meet the 272 requirements described in the previous section. 274 Since Splicer needs to select either substitutive content or main content as 275 the input content at any given point in time, an RTP mixer seems to be 276 able to do this under its own SSRC.
Moreover, mixer may 277 include the CSRC list in outgoing packets to indicate the source(s) 278 of content in some use cases like conferencing, this facilitates the 279 system debugging and loop detection. From this point of view, an RTP 280 mixer may have some chance to be Splicer. In next four subsections 281 (from subsection 4.1 to subsection 4.4), we start analyzing how an 282 RTP mixer handles RTP splicing and how it satisfies the general 283 requirements listed in section 3. 285 In subsection 4.5, we specially consider the special requirement 6 286 (i.e., User Invisibility Requirement) since it needs to mask any RTP 287 splicing clue on receiver (e.g, CSRC list must not be included in 288 outgoing packets to prevent receiver from identifying the difference 289 between main RTP stream and substitutive RTP stream) when mixer is 290 used. 292 4.1. RTP Processing in RTP Mixer 294 Once mixer has learnt when to do splicing, it must get ready for the 295 coming splicing in advance, e.g., fetches the substitutive content 296 either from local media file storage or via substitutive RTP stream 297 earlier than splicing in point. If the substitutive content comes 298 from local media file storage, mixer SHOULD leave the CSRC list blank 299 in the output stream. 301 Even if splicing does not begin, mixer still needs to receive the 302 main RTP stream, and generate a media stream as defined in RFC3550. 303 Using main content, mixer generates the current media stream with its 304 own SSRC, sequence number space and timing model. Moreover, mixer 305 may insert the SSRC of main RTP stream into CSRC list in the current 306 media stream. 308 When splicing begins, mixer chooses the substitutive RTP stream as 309 input stream at splicing in point, and extracts the payload data 310 (i.e., substitutive content). 
After that, mixer encapsulates 311 substitutive content instead of main content as the payload of the 312 current media stream, and then outputs the current media stream to 313 receiver. Moreover, mixer may insert the SSRC of substitutive RTP 314 stream into CSRC list in the current media stream. 316 When splicing ends, mixer retrieves the main RTP stream as input 317 stream at splicing out point, and extracts the payload data (i.e., 318 main content). After that, mixer encapsulates main content instead 319 of substitutive content as the payload of the current media stream, 320 and then outputs the current media stream to receiver. Moreover, 321 mixer may insert the SSRC of main RTP stream into CSRC list in the 322 current media stream. 324 The whole RTP splicing procedure is perhaps best explained by a 325 pseudo code example: 327 if (splicing begins) { 328 the substitutive RTP stream is terminated on mixer and 329 substitutive content is encapsulated by mixer with its own SSRC 330 identifier; 332 the sequence numbers of the current RTP packets which contain 333 substitutive content are allocated by mixer and maintain 334 consistent with the sequence numbers of previous current RTP 335 packets, until the splicing end; 337 the timestamp of the current RTP packet increments linearly; 339 the CSRC list of the current RTP packet may include SSRC of 340 substitutive RTP stream; 341 } 343 else { 344 the main RTP stream is terminated on mixer and main content is 345 encapsulated by mixer with its own SSRC identifier; 347 the sequence numbers of the current RTP packets which contain main 348 content are allocated by mixer and maintain consistent with the 349 sequence numbers of previous current RTP packets, until the 350 splicing begins; 351 the timestamp of the current RTP packets increments linearly; 353 the CSRC list of the current RTP may include SSRC of main RTP 354 stream; 355 } 357 Splicing may occur more than one time during the lifetime of main RTP 358 stream, 
which means mixer needs to output main content and 359 substitutive content in turn with its own SSRC identifier. From the 360 receiver's point of view, the only source of the current stream is 361 mixer, wherever the content comes from. 363 Note that the substitutive content should be output within the 364 splicing duration. Any gap or overlap between main RTP stream and 365 substitutive RTP stream may induce media clipping at the splicing point. 366 More details about preventing media clipping are given in 367 section 4.3. 369 4.2. RTCP Processing in RTP Mixer 371 By monitoring available bandwidth and buffer levels and by computing 372 network metrics such as packet loss, network jitter, and delay, RTP 373 receiver can learn its own reception conditions and can communicate this 374 information to media source via RTCP reception reports. 376 According to the description in section 7.3 of [RFC3550], mixer 377 divides the RTCP flow between media source and receiver into two separate 378 RTCP loops, so media source probably has no idea about the situation on 379 receiver. Hence, mixer can use some mechanisms that allow media 380 source to have, at least to some degree, knowledge of the 381 situation on receiver when its content is being passed to receiver. 383 Because splicing is a process in which mixer selects one media stream 384 from multiple streams but neither mixes nor transcodes them, upon 385 receiving an RTCP receiver report from downstream receiver, mixer can 386 forward it to original media source with its SSRC identifier intact 387 (i.e., the SSRC of downstream receiver). During splicing, the number of 388 output RTP packets containing substitutive content is equal to the 389 number of input substitutive RTP packets (from substitutive RTP 390 stream).
In the same manner, outside splicing the number of output 391 RTP packets containing main content is equal to the number of input 392 main RTP packets (from main RTP stream). Mixer therefore 393 does not need to modify the packet loss fields in Receiver Report Blocks 394 unless the reporting interval spans the splicing point. However, mixer 395 needs to change the SSRC field in the report block to the SSRC identifier 396 of original media source and rewrite the extended highest sequence 397 number field to the corresponding original extended highest sequence 398 number before forwarding the RTCP receiver report to original media 399 source. 401 When an RTCP receiver report spans the splicing point, it reflects the 402 characteristics of the combination of main RTP packets and 403 substitutive RTP packets. In that case, mixer needs to divide the 404 receiver report into two separate receiver reports and send each to 405 its original media source. For each separate 406 receiver report, mixer also needs to make the corresponding changes 407 to the packet loss fields in the report block, in addition to the SSRC field and 408 the extended highest sequence number field. 410 The mixer can also inform the media source of the quality with which the 411 content reaches the mixer. This is done by the mixer generating RTCP 412 reports for the RTP stream, which it sends upstream towards the media 413 source. These RTCP reports use the SSRC of the mixer. 415 Based on the above RTCP operating mechanism, the media source whose 416 content is being passed to receiver will see the reception quality 417 of its stream as received on mixer, and the reception quality of the spliced 418 stream as received on receiver. The media source whose content is not 419 being passed to receiver will only see the reception quality of its 420 stream as received on mixer.
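The report handling above can be illustrated with a minimal Python sketch. It assumes the mixer records, for every output packet, which input stream the packet came from and what its original extended sequence number was. The names `SpliceMap`, `record` and `split_report` are hypothetical and are not defined by this memo or by [RFC3550]. The sketch only splits the reported sequence range and recovers the original extended highest sequence number per source; adjusting the packet loss fields is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SpliceMap:
    """Hypothetical bookkeeping: output packet -> originating input packet."""
    # output extended seq -> (input SSRC, input extended seq)
    origin: Dict[int, Tuple[int, int]] = field(default_factory=dict)

    def record(self, out_seq: int, in_ssrc: int, in_seq: int) -> None:
        """Remember where an output packet came from when it is sent."""
        self.origin[out_seq] = (in_ssrc, in_seq)

    def split_report(self, prev_highest: int, highest: int) -> List[dict]:
        """Split the output range (prev_highest, highest] by input source.

        Returns one entry per input source seen in the range, carrying the
        source SSRC, the original extended highest sequence number, and the
        number of output packets that came from that source.
        """
        per_source: Dict[int, dict] = {}
        for out_seq in range(prev_highest + 1, highest + 1):
            if out_seq not in self.origin:
                # Assumption: packets served from local file storage are
                # not recorded, so their reports terminate on the mixer.
                continue
            in_ssrc, in_seq = self.origin[out_seq]
            entry = per_source.setdefault(
                in_ssrc, {"ssrc": in_ssrc, "highest_seq": in_seq, "packets": 0})
            entry["highest_seq"] = max(entry["highest_seq"], in_seq)
            entry["packets"] += 1
        return list(per_source.values())
```

A report whose interval lies entirely inside or outside a splice yields a single entry, which corresponds to forwarding one rewritten report. A report that spans the splicing point yields two entries, one per original media source.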
422 If the substitutive content comes from local media file storage ( 423 i.e., mixer can be regarded as the substitutive media source), the 424 reception reports received from downstream that relate to the substitutive 425 content should be terminated on mixer without any further processing. 427 4.3. Media Clipping Considerations 429 This section provides informative guidelines on how media clipping 430 may arise and how mixer deals with it. 432 If the time slot for substitutive RTP stream does not match (is shorter or 433 longer than) the duration of the main RTP stream reserved for 434 replacement, media clipping may occur at the splicing point, which 435 usually is the boundary between two independently decodable frames. 437 At the splicing in point, mixer can fill up receiver's buffer with 438 substitutive content several seconds earlier than the presentation 439 time of substitutive content so that smooth playback can be achieved 440 without pauses or stuttering on RTP receiver. 442 Compared to the buffering method used at the splicing in point, things become 443 somewhat more complex at the splicing out point. If the insertion 444 duration is shorter than the reserved gap time, this may cause a little 445 playback latency of main RTP stream on RTP receiver, but does not 446 adversely impact the quality of user experience. One alternative 447 approach is that mixer may pad some blank content (e.g., an all black 448 sequence) to fill up the gap. Another alternative approach is that 449 main media source may send filler content (e.g., a static channel 450 identifier) during splicing, so that mixer can switch back to it early when 451 it runs out of substitutive content. 453 However, if the insertion duration is longer than the reserved 454 gap duration, there is an overlap of the substitutive RTP stream 455 and the main RTP stream at the splicing out point.
One straightforward 456 approach is that mixer takes an ungraceful action, terminating the 457 splicing and switching back to main RTP stream even if this may cause 458 media stuttering on receiver. An alternative approach, which 459 is milder but somewhat more complex, is that mixer buffers main content for a 460 while until substitutive content is finished, and then transmits 461 buffered main content to receiver at an accelerated bitrate (as 462 compared to the nominal bitrate of main RTP stream) until its buffer 463 level returns to normal. From that point on, mixer transmits main 464 content to receiver at the nominal bitrate of main RTP stream. Note 465 that mixer should take into account a variety of parameters, such as 466 available bandwidth between mixer and receiver, mixer buffer level 467 and receiver buffer level, to calculate the accelerated bitrate value. 469 Another cause of media clipping is synchronization delay at the 470 splicing point if RTP receiver needs to synchronize multiple current 471 streams for playback. How to address this issue is discussed in 472 detail in [RFC6051], which provides three feasible approaches to 473 reduce synchronization delay. 475 4.4. Congestion Control Considerations 477 If the substitutive content has somewhat different 478 characteristics from the main content it replaces (e.g., the more 479 dynamic the content, the higher the bandwidth occupation), or if substitutive 480 content is encoded with a different codec and has a different 481 encoding bitrate, this poses a challenge to network capacity and 482 receiver buffer size. More dynamic content or a higher encoding 483 bitrate stream might overload the network and possibly exceed the 484 receiver's media consumption rate, which might flood receiver's 485 buffer and eventually result in a buffer overflow. Either network 486 overload or buffer overflow would induce network congestion and 487 congestion-caused packet loss.
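The buffered catch-up approach of section 4.3 interacts with the capacity concern just described, since the accelerated bitrate is bounded by the bandwidth available between mixer and receiver. The following Python sketch shows one possible way to reason about it. It assumes that main content keeps arriving at its nominal bitrate while the backlog is drained, so the backlog shrinks at the difference between the accelerated and nominal bitrates. The function name `catch_up_plan` and the `max_speedup` cap are hypothetical, and receiver buffer level is deliberately not modelled.

```python
from typing import Optional, Tuple


def catch_up_plan(overlap_s: float, nominal_bps: float,
                  available_bps: float, max_speedup: float = 1.5
                  ) -> Optional[Tuple[float, float]]:
    """Return (accelerated_bps, drain_seconds), or None if no headroom.

    overlap_s     -- how long substitutive content overran the reserved gap
    nominal_bps   -- nominal bitrate of main RTP stream
    available_bps -- estimated bandwidth between mixer and receiver
    max_speedup   -- assumed cap on acceleration relative to nominal
    """
    accelerated = min(available_bps, nominal_bps * max_speedup)
    if accelerated <= nominal_bps:
        # No spare capacity: the backlog would never drain.
        return None
    backlog_bits = overlap_s * nominal_bps
    drain_s = backlog_bits / (accelerated - nominal_bps)
    return accelerated, drain_s
```

For example, a 2 second overlap on a 4 Mbit/s main stream with 5 Mbit/s available gives an accelerated bitrate of 5 Mbit/s and a drain time of 8 seconds. When no headroom exists, the function returns None, in which case the ungraceful switch-back may be the only option.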
489 To be robust to network congestion and packet loss, mixer must 490 continuously monitor the network situation by several 491 means: 493 1. RTCP receiver reports that indicate packet loss [RFC3550]. 495 2. RTCP NACKs for lost packet recovery [RFC4585]. 497 3. RTCP ECN feedback information [I-D.ietf-avtcore-ecn-for-rtp]. 499 Upon receiving any of the above three types of RTCP reports during splicing, 500 mixer will treat each type differently, as follows: 502 1. If mixer receives RTCP receiver reports with a packet loss 503 indication, it will process them as described in 504 section 7.3 of [RFC3550]. 506 2. If mixer receives the RTCP NACK packets defined in [RFC4585] from 507 RTP receiver for packet loss recovery, it first identifies the 508 content category of the lost packets to which the NACK corresponds. 509 Then, mixer will generate a new RTCP NACK for the lost packets with 510 its own SSRC, and make corresponding changes to their sequence 511 numbers to match the original, pre-spliced packets. If the lost 512 substitutive content comes from local media file storage, mixer 513 acting as substitutive media source will directly fetch the lost 514 substitutive content and retransmit it to RTP receiver. 516 The situation is somewhat more complex when the lost packets requested in a 517 single RTCP NACK message contain not only main content but 518 also substitutive content. To address this, mixer must 519 divide the RTCP NACK packet into two separate RTCP NACK packets: 520 one requesting the lost main content, and another requesting 521 the lost substitutive content. 523 3. In [I-D.ietf-avtcore-ecn-for-rtp], two RTCP extensions are 524 defined for ECN feedback: the RTP/AVPF transport layer ECN feedback 525 packet for urgent ECN information, and the RTCP XR ECN summary report 526 block for regular reporting of the ECN marking information.
528 If an ECN-aware mixer receives any RTCP ECN feedback (i.e., RTCP 529 ECN feedback packets or RTCP XR summary reports) from RTP 530 receiver, it must operate as described in section 8.4 of 531 [I-D.ietf-avtcore-ecn-for-rtp], terminating the RTCP ECN feedback 532 packets from downstream receivers, and driving the congestion control 533 loop and bitrate adaptation between itself and downstream 534 receiver as if it were the media source. In addition, an 535 ECN-aware RTP mixer must generate RTCP ECN feedback relating to the 536 input RTP streams it terminates, and drive the congestion control 537 loop and bitrate adaptation between itself and upstream sender as 538 if it were the RTP sender. 540 Once mixer learns that congestion is being experienced on its 541 downstream link by means of the above three detection mechanisms, it 542 should adapt the bitrate of the output stream in response to network 543 congestion. The bitrate adaptation may be determined by the 544 TCP-friendly bitrate adaptation algorithm specified in [RFC5348], or by a 545 DCCP congestion control algorithm defined in [RFC5762]. 547 In practice, during splicing, the real cause of congestion 548 usually is that substitutive RTP stream has different characteristics 549 (more dynamic content or higher encoding bitrate) from main RTP 550 stream, and stream transcoding or thinning on mixer is a very 551 inefficient and difficult operation. Therefore, a means that enables 552 substitutive media source to limit the media bitrate it is currently 553 generating, even in the absence of congestion on the path between 554 itself and mixer, is desirable. The Temporary Maximum Media Stream Bit Rate Request (TMMBR) message defined in 555 [RFC5104] provides an effective method. When mixer detects 556 congestion on its downstream link during splicing, it uses TMMBR to 557 request substitutive media source to reduce the media bitrate to a 558 value that is in compliance with congestion control principles for 559 the slowest link.
Upon reception of TMMBR, substitutive media source 560 applies its congestion control algorithm and responds with a Temporary 561 Maximum Media Stream Bit Rate Notification (TMMBN) to mixer. 563 If the substitutive content comes from local media file storage, 564 mixer, acting as the substitutive media source, must directly reduce the substitutive media bitrate 565 when it detects any congestion on its 566 downstream link during splicing. 568 From the above analysis, to reduce the risk of congestion and keep the 569 bandwidth consumption stable over time, it is recommended that the substitutive RTP stream 570 be encoded at an appropriate bitrate to match that 571 of main RTP stream. If the substitutive RTP stream comes from 572 substitutive media source, the source should have some knowledge 573 of the media encoding bitrate of main content in advance. How it 574 obtains that knowledge is out of scope for this draft. 576 4.5. Processing Splicing in User Invisibility Case 578 Mixer will not include the CSRC list in outgoing RTP packets, to prevent 579 the user from detecting that splicing occurred at the RTP level. Due to the 580 absence of the CSRC list in current RTP stream, RTP receiver only 581 initiates SDES, BYE and APP packets to mixer without any knowledge of 582 main media source and substitutive media source. This creates a 583 danger that loops involving those sources could not be detected. 585 5. Implementation Considerations 587 When mixer is used to handle RTP splicing, RTP receiver does not need 588 any RTP/RTCP extension for splicing. As a trade-off, additional 589 overhead could be induced on mixer, which uses its own sequence number 590 space and timing model. Mixer will therefore rewrite the RTP sequence number 591 and timestamp whether splicing is active or not, and generate RTCP 592 flows for both sides. If mixer serves multiple main RTP streams 593 simultaneously, this may lead to more overhead on mixer.
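As an illustration of the per-packet bookkeeping this overhead implies, the NACK division described in section 4.4 can be sketched in Python as follows. The sketch assumes mixer keeps a table from each output sequence number to the originating source and original sequence number. The names `split_nack` and `to_pid_blp` are hypothetical, `None` is used here as an assumed marker for content served from local media file storage, and 16-bit sequence number wrap-around is ignored for brevity. The (PID, BLP) pairs follow the Generic NACK layout of [RFC4585], in which the least significant BLP bit refers to packet PID+1.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# output seq -> (source SSRC or None for local storage, original seq)
Origin = Dict[int, Tuple[Optional[int], int]]


def split_nack(lost_out_seqs: Iterable[int],
               origin: Origin) -> Dict[Optional[int], List[int]]:
    """Group lost output packets by source, translated to original seqs."""
    per_source: Dict[Optional[int], List[int]] = {}
    for out_seq in lost_out_seqs:
        src, in_seq = origin[out_seq]
        per_source.setdefault(src, []).append(in_seq)
    return per_source


def to_pid_blp(seqs: Iterable[int]) -> List[Tuple[int, int]]:
    """Pack sequence numbers into Generic NACK (PID, BLP) pairs.

    Simplification: does not handle 16-bit wrap-around.
    """
    pairs: List[Tuple[int, int]] = []
    for s in sorted(set(seqs)):
        if pairs and 0 < s - pairs[-1][0] <= 16:
            pid, blp = pairs[-1]
            # Bit (s - pid - 1) of BLP marks packet pid + (s - pid) as lost.
            pairs[-1] = (pid, blp | (1 << (s - pid - 1)))
        else:
            pairs.append((s, 0))
    return pairs
```

Each key of the result of `split_nack` that is an SSRC leads to one upstream NACK built from `to_pid_blp`. Entries under the `None` key would instead be retransmitted by mixer itself from local storage.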
595 In addition, there is a potential issue with loop detection, which 596 would be problematic if User Invisibility Requirement is required. 598 6. Security Considerations 600 If any payload internal security mechanisms (e.g., ISMACryp 601 [ISMACryp]) are used, only media source and RTP receiver can learn 602 the security keying material generated by such internal security 603 mechanism, any middlebox (e.g., mixer) between media source and RTP 604 receiver can't get such keying material. Only when regular transport 605 security mechanisms (e.g., SRTP, IPSec, etc) are used, mixer will 606 process the packets passing through it. 608 The security considerations of the RTP specification [RFC3550], the 609 Extended RTP profile for RTCP-Based Feedback [RFC4585], and the 610 Secure Real-time Transport Protocol [RFC3711] apply. Mixer must be 611 trusted by main media source and insertion media source, and must be 612 included in the security context. 614 7. IANA Considerations 616 No IANA actions are required. 618 8. Acknowledgments 620 The following individuals have reviewed the earlier versions of this 621 specification and provided very valuable comments: Colin Perkins, 622 Magnus Westerlund, Roni Even, Tom Van Caenegem, Joerg Ott, David R 623 Oran, Cullen Jennings, Ali C Begen, Charles Eckel and Ning Zong. 625 9. Change Log 627 9.1. draft-xia-avtext-splicing-for-rtp-01 629 The following are the major changes compared to previous version 00: 631 o Use mixer to handle both user visible and invisible splicing. 633 o Add one subsection to describe media clipping considerations. 635 o Add one subsection to describe congestion control considerations. 637 9.2. draft-xia-avtext-splicing-for-rtp-00 639 The following are the major changes compared to previous AVT I-D 640 version 00: 642 o Change primary RTP stream to main RTP stream, add current RTP 643 stream as the streaming received by RTP receiver. 
645 o Eliminate the ambiguity of inserted content with substitutive 646 content which replaces the main content rather than pause it. 648 o Clarify the signaling requirements. 650 o Delete the description on Mixer and MCU in section 4, mainly focus 651 on the direction whether a Translator can act as a Splicer. 653 o Add section 5 to describe the exact guidance on how an RTP 654 Translator is used to handle splicing. 656 o Modify the security considerations section and add acknowledges 657 section. 659 10. References 661 10.1. Normative References 663 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 664 Requirement Levels", BCP 14, RFC 2119, March 1997. 666 [RFC2250] Hoffman, D., Fernando, G., Goyal, V., and M. Civanlar, 667 "RTP Payload Format for MPEG1/MPEG2 Video", RFC 2250, 668 January 1998. 670 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 671 Jacobson, "RTP: A Transport Protocol for Real-Time 672 Applications", STD 64, RFC 3550, July 2003. 674 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 675 Video Conferences with Minimal Control", STD 65, RFC 3551, 676 July 2003. 678 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 679 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 680 RFC 3711, March 2004. 682 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 683 "Extended RTP Profile for Real-time Transport Control 684 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, 685 July 2006. 687 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, 688 "Codec Control Messages in the RTP Audio-Visual Profile 689 with Feedback (AVPF)", RFC 5104, February 2008. 691 [RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117, 692 January 2008. 694 [RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP 695 Flows", RFC 6051, November 2010. 
697 [I-D.ietf-avtcore-ecn-for-rtp] 698 Westerlund, M., "Explicit Congestion Notification (ECN) 699 for RTP over UDP", draft-ietf-avtcore-ecn-for-rtp-05 (work 700 in progress), October 2011. 702 10.2. Informative References 704 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 705 Friendly Rate Control (TFRC): Protocol Specification", 706 RFC 5348, September 2008. 708 [RFC5760] Ott, J., Chesterfield, J., and E. Schooler, "RTP Control 709 Protocol (RTCP) Extensions for Single-Source Multicast 710 Sessions with Unicast Feedback", RFC 5760, February 2010. 712 [RFC5762] Perkins, C., "RTP and the Datagram Congestion Control 713 Protocol (DCCP)", RFC 5762, April 2010. 715 [SCTE30] Society of Cable Telecommunications Engineers (SCTE), 716 "Digital Program Insertion Splicing API", 2001. 718 [SCTE35] Society of Cable Telecommunications Engineers (SCTE), 719 "Digital Program Insertion Cueing Message for Cable", 720 2004. 722 [ISMACryp] 723 Internet Streaming Media Alliance (ISMA), "ISMA Encryption 724 and Authentication Specification 2.0", November 2007. 726 Author's Address 728 Jinwei Xia 729 Huawei 730 Software No.101 731 Nanjing, Yuhuatai District 210012 732 China 734 Phone: +86-025-86622310 735 Email: xiajinwei@huawei.com