Network Working Group                                         B. Adamson
Internet-Draft                                 Naval Research Laboratory
Obsoletes: 3941 (if approved)                                 C. Bormann
Intended status: Standards Track                 Universitaet Bremen TZI
Expires: March 13, 2009                                       M. Handley
                                               University College London
                                                               J. Macker
                                               Naval Research Laboratory
                                                       September 9, 2008

        Multicast Negative-Acknowledgment (NACK) Building Blocks
                   draft-ietf-rmt-bb-norm-revised-07

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 13, 2009.

Abstract

   This document discusses the creation of reliable multicast protocols
   utilizing negative-acknowledgment (NACK) feedback. The rationale for
   protocol design goals and assumptions is presented. Technical
   challenges for NACK-based (and in some cases general) reliable
   multicast protocol operation are identified. These goals and
   challenges are resolved into a set of functional "building blocks"
   that address different aspects of reliable multicast protocol
   operation. It is anticipated that these building blocks will be
   useful in generating different instantiations of reliable multicast
   protocols. This document obsoletes RFC 3941.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Introduction
   2. Rationale
     2.1. Delivery Service Model
     2.2. Group Membership Dynamics
     2.3. Sender/Receiver Relationships
     2.4. Group Size Scalability
     2.5. Data Delivery Performance
     2.6. Network Environments
     2.7. Intermediate System Assistance
   3. Functionality
     3.1. Multicast Sender Transmission
     3.2. NACK Repair Process
     3.3. Multicast Receiver Join Policies and Procedures
     3.4. Reliable Multicast Member Identification
     3.5. Data Content Identification
     3.6. Forward Error Correction (FEC)
     3.7. Round-trip Timing Collection
     3.8. Group Size Determination/Estimation
     3.9. Congestion Control Operation
     3.10. Intermediate System Assistance
   4. NACK-based Reliable Multicast Applicability
   5. Security Considerations
   6. IANA Considerations
   7. Changes from RFC3941
   8. Acknowledgements
   9. References
     9.1. Normative References
     9.2. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

   Reliable multicast transport is a desirable technology for efficient
   and reliable distribution of data to a group on the Internet. The
   complexities of group communication paradigms necessitate different
   protocol types and instantiations to meet the range of performance
   and scalability requirements of different potential reliable
   multicast applications and users (see [RFC2357]). This document
   addresses the creation of reliable multicast protocols utilizing
   negative-acknowledgment (NACK) feedback. NACK-based protocols
   generally entail less frequent feedback messaging than reliability
   protocols based on positive acknowledgment (ACK).
   The less frequent feedback messaging helps simplify the problem of
   feedback implosion as group size grows large. While different
   protocol instantiations may be required to meet specific application
   and network architecture demands [ArchConsiderations], there are a
   number of fundamental components that may be common to these
   different instantiations. This document describes the framework and
   common "building block" components relevant to multicast protocols
   based primarily on NACK operation for reliable transport. While this
   document discusses a large set of reliable multicast components and
   issues relevant to NACK-based reliable multicast protocol design, it
   specifically addresses in detail the following building blocks which
   are not addressed in other IETF documents:

   1. NACK-based multicast sender transmission strategies,

   2. NACK repair process with timer-based feedback suppression, and

   3. Round-trip timing for adapting NACK and other timers.

   NACK-based reliable multicast implementations SHOULD make use of
   Forward Error Correction (FEC) erasure coding techniques as described
   in the FEC Building Block [RFC5052] document. Packet-level erasure
   coding allows missing packets from a given FEC block to be recovered
   using the parity packets instead of classical, individualized
   retransmission of original source data content. For this reason, this
   document refers to the protocol mechanisms for reliability as a
   "repair process." Note that NACK-based protocols can reactively
   provide the parity packets in response to receiver requests for
   repair rather than just proactively sending added FEC parity content
   as part of the original transmission. Hybrid proactive/reactive use
   of FEC content is also possible with the mechanisms described in this
   document.
   Some classes of FEC coding such as Maximum Distance Separable (MDS)
   codes allow senders to dynamically implement deterministic, highly
   efficient receiver group repair strategies as part of a NACK-based,
   selective automated repeat-request (ARQ) scheme. The potential
   relationships to other reliable multicast transport building blocks
   (e.g., FEC, congestion control) and general issues with NACK-based
   reliable multicast protocols are also discussed. This document
   follows the guidelines provided in [RFC3269].

   *Statement of Intent*

   This memo contains descriptions of building blocks that can be
   applied in the design of Reliable Multicast protocols utilizing
   Negative-Acknowledgement (NACK) feedback. [RFC3941] contained a
   previous description of this specification. RFC3941 was published in
   the "Experimental" category. It was the stated intent of the RMT
   working group to re-submit this specification as an IETF Proposed
   Standard in due course.

   This Proposed Standard specification is thus based on [RFC3941] and
   has been updated according to accumulated experience and growing
   protocol maturity since the publication of RFC3941. Said experience
   applies both to this specification itself and to congestion control
   strategies related to the use of this specification.

   The differences between [RFC3941] and this document are listed in
   Section 7.

2. Rationale

   Each potential protocol instantiation using the building blocks
   presented here (and in other applicable building block documents)
   will have specific criteria that may influence individual protocol
   design. To support the development of applicable building blocks, it
   is useful to identify and summarize driving general protocol design
   goals and assumptions. These are areas that each protocol
   instantiation will need to address in detail.
   Each building block description in this document will include a
   discussion of the impact of these design criteria. The categories of
   design criteria considered here include:

   1. Delivery Service Model,

   2. Group Membership Dynamics,

   3. Sender/Receiver Relationships,

   4. Group Size Scalability,

   5. Data Delivery Performance, and

   6. Network Environments.

   All of these areas are at least briefly discussed. Additionally,
   other reliable multicast transport building block documents such as
   [RFC5052] have been created to address areas outside of the scope of
   this document. NACK-based reliable multicast protocol instantiations
   may depend upon these other building blocks as well as the ones
   presented here. This document focuses on areas that are unique to
   NACK-based reliable multicast but may be used in concert with the
   other building block areas. In some cases, a building block may be
   able to address a wide range of assumptions, while in other cases
   there will be trade-offs required to meet different application needs
   or operating environments. Where necessary, building block features
   are designed to be parametric to meet different requirements. Of
   course, an underlying goal will be to minimize design complexity and
   to at least recommend default values for any such parameters that
   meet a general purpose "bulk data transfer" requirement in a typical
   Internet environment. The forms of "bulk data transfer" covered here
   include reliable transport of bulky, but fixed-length, a priori
   static content and also transmission of non-predetermined, perhaps
   streamed content of indefinite length. Section 3.5 discusses these
   different forms of bulk data content in further detail.

2.1. Delivery Service Model

   The implicit goal of a reliable multicast transport protocol is the
   reliable delivery of data among a group of members communicating
   using IP multicast datagram service. However, the specific service
   the application is attempting to provide can impact design decisions.
   The most basic service model for reliable multicast transport is that
   of "bulk transfer", which is a primary focus of this and other
   related RMT working group documents. However, the same principles in
   protocol design may also be applied to other service models, e.g.,
   more interactive exchanges of small messages such as with
   white-boarding or text chat. Within these different models there are
   issues such as the sender's ability to cache transmitted data (or
   state referencing it) for retransmission or repair. The needs for
   ordering and/or causality in the sequence of transmissions and
   receptions among members in the group may be different depending upon
   data content. The group communication paradigm differs significantly
   from the point-to-point model in that, depending upon the data
   content type, some receivers may complete reception of a portion of
   data content and be able to act upon it before other members have
   received the content. This may be acceptable (or even desirable) for
   some applications but not for others. These varying requirements
   drive the need for a number of different protocol instantiation
   designs. A significant challenge in developing generally useful
   building block mechanisms is accommodating even a limited range of
   these capabilities without defining specific application-level
   details.

   Another factor impacting the delivery service model is the potential
   for different receivers in the multicast group to have significantly
   differing quality of network connectivity.
   This may involve receivers with very limited goodput due to
   connection rate or substantial packet loss. NACK-based protocol
   implementations may wish to provide policies by which extremely
   poor-performing receivers are excluded from the main group or
   migrated to a separate delivery group. Note that some application
   models may require that the entire group be constrained to the
   performance of the "weakest member" to satisfy operational
   requirements. In either case, protocol designs should consider this
   aspect of the reliable multicast delivery service model.

2.2. Group Membership Dynamics

   One area where group communication can differ from point-to-point
   communications is that even if the composition of the group changes,
   the "thread" of communication can still exist. This contrasts with
   the point-to-point communication model where, if either of the two
   parties leaves, the communication process (exchange of data) is
   terminated (or at least paused). Depending upon application goals,
   senders and receivers participating in a reliable multicast transport
   "session" may be able to join late, leave, and/or potentially rejoin
   while the ongoing group communication "thread" still remains
   functional and useful. Also note that this can impact protocol
   message content. If "late joiners" are supported, some amount of
   additional information may be placed in message headers to
   accommodate this functionality. Alternatively, the information may
   be sent in its own message (on demand or intermittently) if the
   impact to the overhead of typical message transmissions is deemed too
   great. Group dynamics can also impact other protocol mechanisms such
   as NACK timing, congestion control operation, etc.

2.3. Sender/Receiver Relationships

   The relationship of senders and receivers among group members
   requires consideration.
   In some applications, there may be a single sender multicasting to a
   group of receivers. In other cases, there may be more than one
   sender or the potential for everyone in the group to be a sender
   *and* receiver of data may exist.

2.4. Group Size Scalability

   Native IP multicast [RFC1112] may scale to extremely large group
   sizes. It may be desirable for some applications to scale along with
   the multicast infrastructure's ability to scale. In its simplest
   form, there are limits to the group size to which a NACK-based
   protocol can be applied without the potential for the volume of NACK
   feedback messages to overwhelm network capacity. This is often
   referred to as "feedback implosion". Research suggests that
   NACK-based reliable multicast group sizes on the order of tens of
   thousands of receivers may operate with acceptable levels of feedback
   to the sender using probabilistic, timer-based suppression techniques
   [NormFeedback]. Instead of receivers immediately transmitting
   feedback messages when loss is detected, these techniques specify use
   of purposefully-scaled, random back-off timeouts such that some
   potential NACKing receivers can self-suppress their feedback upon
   hearing messages from other receivers that have selected shorter
   random timeout intervals. However, there may be additional NACK
   suppression heuristics that can be applied to enable these protocols
   to scale to even larger group sizes. In large scale cases, it may be
   prohibitive for members to maintain state on all other members (in
   particular, other receivers) in the group. The impact of group size
   needs to be considered in the development of applicable building
   blocks.

   Intermediate assistance from devices/systems with direct knowledge of
   the underlying network topology may be used to increase the
   performance and scalability of NACK-based reliable multicast
   protocols.
   Feedback aggregation and filtering of sender repair data may be
   possible with NACK-based protocols using FEC-based repair strategies
   as described in this and other reliable multicast transport building
   block documents. However, there will continue to be a number of
   instances where intermediate system assistance is not available or
   practical. Thus, building block components for NACK-based reliable
   multicast should be capable of operating without such assistance.

2.5. Data Delivery Performance

   There is a trade-off between scalability and data delivery latency
   when designing NACK-oriented protocols. If probabilistic,
   timer-based NACK suppression is to be used, there will be some delays
   built into the NACK process to allow suppression to occur and for the
   sender of data to identify appropriate content for efficient repair
   transmission. For example, back-off timeouts can be used to ensure
   efficient NACK suppression and repair transmission, but this comes at
   a cost of increased delivery latency and increased buffering
   requirements for both senders and receivers. The building blocks
   SHOULD allow applications to establish bounds for data delivery
   performance. Note that application designers must be aware of the
   scalability trade-off that is made when such bounds are applied.

2.6. Network Environments

   The Internet Protocol has historically assumed a role of providing
   service across heterogeneous network topologies. It is desirable
   that a reliable multicast protocol be capable of effectively
   operating across a wide range of the networks to which general
   purpose IP service applies. The bandwidth available on the links
   between the members of a single group today may vary between low
   numbers of kbit/s for wireless links and multiple Gbit/s for high
   speed LAN connections, with varying degrees of contention from other
   flows.
   Recently, a number of asymmetric network services including 56K/ADSL
   modems, CATV Internet service, satellite and other wireless
   communication services have begun to proliferate. Many of these are
   inherently broadcast media with potentially large "fan-out" to which
   IP multicast service is highly applicable. Additionally, policy
   and/or technical issues may result in topologies where multicast
   connectivity is limited to a Source-Specific Multicast (SSM) model
   from a specific source [RFC4607]. Receivers in the group may be
   restricted to unicast feedback for NACKs and other messages.
   Consideration must be given, in building block development and
   protocol design, to the nature of the underlying networks.

2.7. Intermediate System Assistance

   While intermediate assistance from devices/systems with direct
   knowledge of the underlying network topology may be used to improve
   the performance and scalability of reliable multicast protocols,
   there will continue to be a number of instances where this is not
   available or practical. Any building block components for
   NACK-oriented reliable multicast SHALL be capable of operating
   without such assistance. However, it is RECOMMENDED that such
   protocols also consider utilizing these features when available.

3. Functionality

   The previous section has presented the role of protocol building
   blocks and some of the criteria that may affect NACK-based reliable
   multicast building block identification/design. This section
   describes different building block areas applicable to NACK-based
   reliable multicast protocols. Some of these areas are specific to
   NACK-based protocols. Detailed descriptions of such areas are
   provided. In other cases, the areas (e.g., node identifiers, forward
   error correction (FEC), etc.) may be applicable to other forms of
   reliable multicast.
   In those cases, the discussion below describes requirements placed on
   those other general building block areas from the standpoint of
   NACK-based reliable multicast. Where applicable, other building
   block documents are referenced for possible contribution to
   NACK-based reliable multicast protocols.

   For each building block, a notional "interface description" is
   provided to illustrate any dependencies of one building block
   component upon another or upon other protocol parameters. A building
   block component may require some form of "input" from another
   building block component or other source to perform its function.
   Any "inputs" required by a building block component and/or any
   resultant "output" provided will be defined and described in each
   building block component's interface description. Note that the set
   of building blocks presented here does not fully satisfy each other's
   "input" and "output" needs. In some cases, "inputs" for the building
   blocks here must come from other building blocks external to this
   document (e.g., congestion control or FEC). In other cases,
   NACK-based reliable multicast building block "inputs" must be
   satisfied by the specific protocol instantiation or implementation
   (e.g., application data and control).

   The following building block components relevant to NACK-based
   reliable multicast are identified:

   1. Multicast Sender Transmission

   2. NACK Repair Process

   3. Multicast Receiver Join Policies

   4. Node (member) Identification

   5. Data Content Identification

   6. Forward Error Correction (FEC)

   7. Round-trip Timing Collection

   8. Group Size Determination/Estimation

   9. Congestion Control Operation

   10. Intermediate System Assistance

   11. Ancillary Protocol Mechanisms

   Figure 1 provides a pictorial overview of these building block areas
   and some of their relationships.
   For example, the content of the data messages that a sender initially
   transmits depends upon the "Node Identification", "Data Content
   Identification", and "FEC" components while the rate of message
   transmission will generally depend upon the "Congestion Control"
   component. Subsequently, the receivers' response to these
   transmissions (e.g., NACKing for repair) will depend upon the data
   message content and inputs from other building block components.
   Finally, the sender's processing of receiver responses will feed back
   into its transmission strategy.

   The components on the left side of this figure are areas that may be
   applicable beyond NACK-based reliable multicast. The most
   significant of these components are discussed in other building block
   documents such as the FEC Building Block [RFC5052]. A brief
   description of these areas and their role in NACK-based reliable
   multicast protocols is given below. The components on the right are
   seen as specific to NACK-based reliable multicast protocols, most
   notably the NACK repair process. These areas are discussed in detail
   below. Some other components (e.g., "Security") impact many aspects
   of the protocol, and others may be more transparent to the core
   protocol processing. The sections below describe the "Multicast
   Sender Transmission", "NACK Repair Process", and "RTT Collection"
   building blocks in detail. The relationships to and among the other
   building block areas are also discussed, focusing on issues
   applicable to NACK-based reliable multicast protocol design. Where
   applicable, specific technical recommendations are made for
   mechanisms that will properly satisfy the goals of NACK-based
   reliable multicast transport for the Internet.

                                    Application Data and Control
                                                  |
                                                  V
   .---------------------.            .-----------------------.
   | Node Identification |----------->|  Sender Transmission  |<----.
   `---------------------'       _.-' `-----------------------'     |
   .---------------------.   _.-' .'           |   .--------------. |
   | Data Identification |--'   .''            |   | Join Policy  | |
   `---------------------'    .' '             V   `--------------' |
   .---------------------.  .'  '     .----------------------.      |
,->| Congestion Control  |-'   '      |     Receiver NACK    |      |
|  `---------------------'   .'       |    Repair Process    |      |
|  .---------------------. .'         | .------------------. |      |
|  |         FEC         |'.          | | NACK Initiation  | |      |
|  `---------------------'` `._       | `------------------' |      |
|  .---------------------. ``. `-._   | .------------------. |      |
`--|   RTT Collection    |._` `    `->| |   NACK Content   | |      |
   `---------------------'`  ``  `    | `------------------' |      |
   .---------------------. `  ``-`._  | .------------------. |      |
   |   Group Size Est.   |---`-`---`->| | NACK Suppression | |      |
   `---------------------'`. `.  `.   | `------------------' |      |
   .---------------------.  \  |   |  `----------------------'      |
   |        Other        |   \ .   .    |     +----------------+    |
   `---------------------'    \ \   \   |     |  Intermediate  |    |
                               \ \   \  |     | System Assist  |    |
                                \ \ |   V     +----------------+    |
                                 `-` >.-------------------------.   |
                                      | Sender NACK Processing  |___/
                                      |   and Repair Response   |
                                      `-------------------------'
                                             ^             ^
                                             |             |
                                    .-----------------------------.
                                    |         (Security)          |
                                    `-----------------------------'

     Figure 1: NACK-based Reliable Multicast Building Block Framework

3.1. Multicast Sender Transmission

   Reliable multicast senders will transmit data content to the
   multicast session. The data content will be application dependent.
   The sender will transmit data content at a rate, and with message
   sizes, determined by application and/or network architecture
   requirements. Any FEC encoding of sender transmissions SHOULD
   conform with the guidelines of the FEC Building Block [RFC5052].
   When congestion control mechanisms are needed (REQUIRED for general
   Internet operation), the sender transmission rate SHALL be controlled
   by the congestion control mechanism. In any case, it is RECOMMENDED
   that all data transmissions from multicast senders be subject to rate
   limitations determined by the application or congestion control
   algorithm. The sender's transmissions SHOULD make good use of the
   available capacity (which may be limited by the application and/or by
   congestion control). As a result, it is expected there will be
   overlap and multiplexing of new data content transmission with repair
   content. Other factors related to application operation may
   determine sender transmission formats and methods. For example, some
   consideration needs to be given to the sender's behavior during
   intermittent idle periods when it has no data to transmit.

   In addition to data content, other sender messages or commands may be
   employed as part of protocol operation. These messages may occur
   outside of the scope of application data transfer. In NACK-based
   reliable multicast protocols, reliability of such protocol messages
   may be attempted by redundant transmission when positive
   acknowledgement is prohibitive due to group size scalability
   concerns. Note that protocol design SHOULD provide mechanisms for
   dealing with cases where such messages are not received by the group.
   As an example, a command message might be redundantly transmitted by
   a sender to indicate that it is temporarily (or permanently) halting
   transmission. At this time, it may be appropriate for receivers to
   respond with NACKs for any outstanding repairs they require following
   the rules of the NACK procedure. For efficiency, the sender should
   allow sufficient time between the redundant transmissions to receive
   any NACK responses from the receivers to this command.
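
   The spacing of redundant command transmissions can be sketched as
   follows. This is a minimal illustration and not part of the
   specification: the function name, the rtt_estimate parameter, and
   the spacing_factor default of 2.0 are assumptions chosen for the
   example. A protocol instantiation would derive the interval from its
   own round-trip estimate and NACK back-off settings.

```python
def redundant_send_times(start, repeats, rtt_estimate, spacing_factor=2.0):
    """Return the times (in seconds) at which a sender command is sent.

    Hypothetical helper: consecutive copies are spaced by
    spacing_factor * rtt_estimate so that NACKs triggered by one copy
    have time to reach the sender before the next copy goes out.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if rtt_estimate <= 0 or spacing_factor <= 0:
        raise ValueError("rtt_estimate and spacing_factor must be positive")
    interval = spacing_factor * rtt_estimate
    return [start + i * interval for i in range(repeats)]


if __name__ == "__main__":
    # Three copies of a "halting transmission" command, assuming a
    # 0.5 second round-trip estimate.
    for t in redundant_send_times(start=0.0, repeats=3, rtt_estimate=0.5):
        print(f"send command at t={t:.1f}s")
```

   A larger spacing_factor leaves more room for receiver back-off
   timeouts at the cost of a slower command cycle.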

   In general, when there is any resultant NACK or other feedback
   operation, the timing of redundant transmission of control messages
   issued by a sender and other NACK-based reliable multicast protocol
   timeouts should be dependent upon the group greatest round-trip
   timing (GRTT) estimate and any expected resultant NACK or other
   feedback operation. The sender GRTT is an estimate of the worst-case
   round-trip timing from a given sender to any receivers in the group.
   It is assumed that the GRTT interval is a conservative estimate of
   the maximum span (with respect to delay) of the multicast group
   across a network topology with respect to a given sender. NACK-based
   reliable multicast instantiations SHOULD be able to dynamically adapt
   to a wide range of multicast network topologies.

   *Inputs:*

   1. Application data and control

   2. Sender node identifier

   3. Data identifiers

   4. Segmentation and FEC parameters

   5. Transmission rate

   6. Application controls

   7. Receiver feedback messages (e.g., NACKs)

   *Outputs:*

   1. Controlled transmission of messages with headers uniquely
      identifying data or repair content within the context of the
      reliable multicast session.

   2. Commands indicating sender's status or other transport control
      actions to be taken.

3.2. NACK Repair Process

   A critical component of NACK-based reliable multicast protocols is
   the NACK repair process. This includes the receiver's role in
   detecting and requesting repair needs, and the sender's response to
   such requests. There are four primary elements of the NACK repair
   process:

   1. Receiver NACK process initiation,

   2. NACK suppression,

   3. NACK message content,

   4. Sender NACK processing and response.

3.2.1. Receiver NACK Process Initiation

   The NACK process (cycle) will be initiated by receivers that detect a
   need for repair transmissions from a specific sender to achieve
   reliable reception. When FEC is applied, a receiver should initiate
   the NACK process only when it knows that its repair requirements
   exceed the amount of pending FEC transmission for a given coding
   block of data content. This can be determined at the end of the
   current transmission block (if it is indicated) or upon the start of
   reception of a subsequent coding block or transmission object. This
   implies the sender data content is marked to identify its FEC block
   number and that the ordinal relationship is preserved in order of
   transmission.

   Alternatively, if the sender's transmission advertises the quantity
   of repair packets it is already planning to send for a block, the
   receiver may be able to initiate the NACK process earlier. Allowing
   receivers to initiate NACK cycles at any time they detect their
   repair needs have exceeded pending repair transmissions may result in
   slightly quicker repair cycles. However, it may be useful to limit
   NACK process initiation to specific events such as at the
   end-of-transmission of an FEC coding block or upon detection of
   subsequent coding blocks. This can allow receivers to aggregate NACK
   content into a smaller number of NACK messages and provide some
   implicit loose synchronization among the receiver set to help
   facilitate effective probabilistic suppression of NACK feedback. The
   receiver MUST maintain a history of data content received from the
   sender to determine its current repair needs. When FEC is employed,
   it is expected that the history will correspond to a record of
   pending or partially-received coding blocks.

   For probabilistic, timer-based suppression of feedback, the NACK
   cycle should begin with receivers observing backoff timeouts.
In 612 conjunction with initiating this backoff timeout, it is important 613 that the receivers record the current position in the sender's 614 transmission sequence at which they initiate the NACK cycle. When 615 the suppression backoff timeout expires, the receivers should only 616 consider their repair needs up to this recorded transmission position 617 in making the decision to transmit or suppress a NACK. Without this 618 restriction, suppression is greatly reduced as additional content is 619 received from the sender during the time a NACK message propagates 620 across the network to the sender and other receivers. 622 *Inputs:* 624 1. Sender data content with sequencing identifiers from sender 625 transmissions. 627 2. History of content received from sender. 629 *Outputs:* 631 1. NACK process initiation decision 633 2. Recorded sender transmission sequence position. 635 3.2.2. NACK Suppression 637 An effective feedback suppression mechanism is the use of random 638 backoff timeouts prior to NACK transmission by receivers requiring 639 repairs[SrmFramework]. Upon expiration of the backoff timeout, a 640 receiver will request repairs unless its pending repair needs have 641 been completely superseded by NACK messages heard from other 642 receivers (when receivers are multicasting NACKs) or from some 643 indicator from the sender. When receivers are unicasting NACK 644 messages, the sender may facilitate NACK suppression by forwarding a 645 representation of NACK content it has received to the group at large 646 or provide some other indicator of the repair information it will be 647 subsequently transmitting. 649 For effective and scalable suppression performance, the backoff 650 timeout periods used by receivers should be independently, randomly 651 picked by receivers with a truncated exponential 652 distribution[McastFeedback]. 
This results in the majority of the 653 receiver set holding off transmission of NACK messages under the 654 assumption that the smaller number of "early NACKers" will supersede 655 the repair needs of the remainder of the group. The mean of the 656 distribution should be determined as a function of the current 657 estimate of sender's GRTT assessment and a group size estimate that 658 is determined by other mechanisms within the protocol or preset by 659 the multicast application. 661 A simple algorithm can be constructed to generate random backoff 662 timeouts with the appropriate distribution. Additionally, the 663 algorithm may be designed to optimize the backoff distribution given 664 the number of receivers ("R") potentially generating feedback. This 665 "optimization" minimizes the number of feedback messages (e.g., NACK) 666 in the worst-case situation where all receivers generate a NACK. The 667 maximum backoff timeout ("T_maxBackoff") can be set to control 668 reliable delivery latency versus volume of feedback traffic. A 669 larger value of "T_maxBackoff" will result in a lower density of 670 feedback traffic for a given repair cycle. A smaller value of 671 "T_maxBackoff" results in shorter latency which also reduces the 672 buffering requirements of senders and receivers for reliable 673 transport. 675 In the functions below, the "log()" function specified refers to the 676 "natural logarithm" and the "exp()" function is similarly based upon 677 the mathematical constant 'e' (a.k.a. Euler's number) where "exp(x)" 678 corresponds to '"e"' raised to the power of '"x"'. Given the 679 receiver group size ("groupSize"), and maximum allowed backoff 680 timeout ("T_maxBackoff"), random backoff timeouts ("t'") with a 681 truncated exponential distribution can be picked with the following 682 algorithm: 684 1. Establish an optimal mean ("L") for the exponential backoff based 685 on the "groupSize": 687 L = log(groupSize) + 1 689 2. 
Pick a random number ("x") from a uniform distribution over a 690 range of:
692                L                           L                   L
693        --------------------  to   --------------------  +  ----------
694       T_maxBackoff*(exp(L)-1)    T_maxBackoff*(exp(L)-1)   T_maxBackoff
696 3. Transform this random variate to generate the desired random 697 backoff time ("t'") with the following equation:
699     t' = T_maxBackoff/L * log(x * (exp(L) - 1) * (T_maxBackoff/L))
701 This "C" language function can be used to generate an appropriate 702 random backoff time interval:
704     double RandomBackoff(double T_maxBackoff, double groupSize)
705     {
706         double lambda = log(groupSize) + 1;
707         double x = UniformRand(lambda/T_maxBackoff) +
708                    lambda / (T_maxBackoff*(exp(lambda)-1));
709         return ((T_maxBackoff/lambda) *
710                 log(x*(exp(lambda)-1)*(T_maxBackoff/lambda)));
711     } // end RandomBackoff()
713 where "UniformRand(double max)" returns random numbers with a uniform 714 distribution from the range of "0..max". For example, based on the 715 POSIX "rand()" function, the following "C" code can be used:
717     double UniformRand(double max)
718     {
719         return (max * ((double)rand()/(double)RAND_MAX));
720     }
722 The number of expected NACK messages generated ("N") within the first 723 round trip time for a single feedback event is approximately:
725     N = exp(1.2 * L / (2*T_maxBackoff/GRTT))
727 Thus the maximum backoff time can be adjusted to trade-off worst-case 728 NACK feedback volume versus latency. This is derived from the 729 equations given in [McastFeedback] and assumes "T_maxBackoff >= 730 GRTT", and "L" is the mean of the distribution optimized for the 731 given group size as shown in the algorithm above. Note that other 732 mechanisms within the protocol may work to reduce redundant NACK 733 generation further.
It is suggested that "T_maxBackoff" be selected 734 as an integer multiple of the sender's current advertised GRTT 735 estimate such that: 736 T_maxBackoff = K * GRTT; where K >= 1 738 For general Internet operation, a default value of "K=4" is 739 RECOMMENDED for operation with multicast (to the group at large) NACK 740 delivery and a value of "K=6" for unicast NACK delivery. Alternate 741 values may be used to achieve desired buffer utilization, reliable 742 delivery latency and group size scalability trade-offs. 744 Given that ("K*GRTT") is the maximum backoff time used by the 745 receivers to initiate NACK transmission, other timeout periods 746 related to the NACK repair process can be scaled accordingly. One of 747 those timeouts is the amount of time a receiver should wait after 748 generating a NACK message before allowing itself to initiate another 749 NACK backoff/transmission cycle ("T_rcvrHoldoff"). This delay should 750 be sufficient for the sender to respond to the received NACK with 751 repair messages. An appropriate value depends upon the amount of 752 time for the NACK to reach the sender and the sender to provide a 753 repair response. This MUST include any amount of sender NACK 754 aggregation period during which possible multiple NACKs are 755 accumulated to determine an efficient repair response. These 756 timeouts are further discussed in the section below on "Sender NACK 757 Processing and Repair Response". 759 There are also secondary measures that can be applied to improve the 760 performance of feedback suppression. For example, the sender's data 761 content transmissions can follow an ordinal sequence of transmission. 762 When repairs for data content occur, the receiver can note that the 763 sender has "rewound" its data content transmission position by 764 observing the data object, FEC block number, and FEC symbol 765 identifiers. 
Receivers SHOULD limit transmission of NACKs to only 766 when the sender's current transmission position exceeds the point to 767 which the receiver has incomplete reception. This reduces premature 768 requests for repair of data the sender may be planning to provide in 769 response to other receiver requests. This mechanism can be very 770 effective for protocol convergence in high loss conditions when 771 transmissions of NACKs from other receivers (or indicators from the 772 sender) are lost. Another mechanism (particularly applicable when 773 FEC is used) is for the sender to embed an indication of impending 774 repair transmissions in current packets sent. For example, the 775 indication may be as simple as an advertisement of the number of FEC 776 packets to be sent for the current applicable coding block. 778 Finally, some consideration might be given to using the NACKing 779 history of receivers to weight their selection of NACK backoff 780 timeout intervals. For example, if a receiver has historically been 781 experiencing the greatest degree of loss, it may promote itself to 782 statistically NACK sooner than other receivers. Note that this requires 783 correlation over successive intervals of time in the loss 784 experienced by a receiver. Such correlation may not always be 785 present in multicast networks. This adjustment of backoff timeout 786 selection may require the creation of an "early NACK" slot for these 787 historical NACKers. This additional slot in the NACK backoff window 788 will result in a longer repair cycle process that may not be 789 desirable for some applications. The resolution of these trade-offs 790 may be dependent upon the protocol's target application set or 791 network. 793 After the random backoff timeout has expired, the receiver will make 794 a decision on whether to generate a NACK repair request or not (i.e., 795 it has been suppressed).
The NACK will be suppressed when any of the 796 following conditions has occurred: 798 1. The accumulated state of NACKs heard from other receivers (or 799 forwarding of this state by the sender) is equal to or supersedes 800 the repair needs of the local receiver. Note that the local 801 receiver should consider its repair needs only up to the sender 802 transmission position recorded at the NACK cycle initiation (when 803 the backoff timer was activated). 805 2. The sender's data content transmission position "rewinds" to a 806 point ordinally less than that of the lowest sequence position of 807 the local receiver's repair needs. (This detection of sender 808 "rewind" indicates the sender has already responded to other 809 receiver repair needs of which the local receiver may not have 810 been aware.) This "rewind" event can occur any time between 1) 811 when the NACK cycle was initiated with the backoff timeout 812 activation and 2) the current moment when the backoff timeout has 813 expired to suppress the NACK. Another NACK cycle must be 814 initiated by the receiver when the sender's transmission sequence 815 position exceeds the receiver's lowest ordinal repair point. 816 Note that it is possible that the local receiver may have had its 817 repair needs satisfied as a result of the sender's response to 818 the repair needs of other receivers and no further NACKing is 819 required. 821 If neither of these conditions has occurred and the receiver still has 822 pending repair needs, a NACK message is generated and transmitted. 823 The NACK should consist of an accumulation of repair needs from the 824 receiver's lowest ordinal repair point up to the current sender 825 transmission sequence position. A single NACK message should be 826 generated and the NACK message content should be truncated if it 827 exceeds the payload size of a single protocol message.
When such NACK 828 payload limits occur, the NACK content SHOULD contain requests for 829 the ordinally lowest repair content needed from the sender. 831 *Inputs:* 832 1. NACK process initiation decision. 834 2. Recorded sender transmission sequence position. 836 3. Sender GRTT. 838 4. Sender group size estimate. 840 5. Application-defined bound on backoff timeout period. 842 6. NACKs from other receivers. 844 7. Pending repair indication from sender (may be forwarded NACKs). 846 8. Current sender transmission sequence position. 848 *Outputs:* 850 1. Yes/no decision to generate NACK message upon backoff timer 851 expiration. 853 3.2.3. NACK Content 855 The content of NACK messages generated by reliable multicast 856 receivers will include information detailing their current repair 857 needs. The specific information depends on the use and type of FEC 858 in the NACK repair process. The identification of repair needs is 859 dependent upon the data content identification (See Section 3.5 860 below). At the highest level the NACK content will identify the 861 sender to which the NACK is addressed and the data transport object 862 (or stream) within the sender's transmission that needs repair. For 863 the indicated transport entity, the NACK content will then identify 864 the specific FEC coding blocks and/or symbols it requires to 865 reconstruct the complete transmitted data. This content may consist 866 of FEC block erasure counts and/or explicit indication of missing 867 blocks or symbols (segments) of data and FEC content. It should also 868 be noted that NACK-based reliable multicast can be effectively 869 instantiated without a requirement for reliable NACK delivery using 870 the techniques discussed here. 872 3.2.3.1. NACK and FEC Repair Strategies 874 Where FEC-based repair is used, the NACK message content will 875 minimally need to identify the coding block(s) for which repair is 876 needed and a count of erasures (missing packets) for the coding 877 block. 
An exact count of erasures implies the FEC algorithm is 878 capable of repairing _any_ loss combination within the coding block. 879 This count may need to be adjusted for some FEC algorithms. 881 Considering that multiple repair rounds may be required to 882 successfully complete repair, an erasure count also implies that the 883 quantity of unique FEC parity packets the sender has available to 884 transmit is essentially unlimited (i.e., the sender will always be 885 able to provide new, unique, previously unsent parity packets in 886 response to any subsequent repair requests for the same coding 887 block). Alternatively, the sender may "round-robin" transmit through 888 its available set of FEC symbols for a given coding block, and 889 eventually effect repair. For the most efficient repair strategy, the 890 NACK content will need to also _explicitly_ identify which symbols 891 (information and/or parity) the receiver requires to successfully 892 reconstruct the content of the coding block. This will be 893 particularly true of small to medium size block FEC codes (e.g., 894 Reed-Solomon) that are capable of providing a limited number of parity 895 symbols per FEC coding block. 897 When FEC is not used as part of the repair process, or the protocol 898 instantiation is required to provide reliability even when the sender 899 has transmitted all available parity for a given coding block (or the 900 sender's ability to buffer transmission history is exceeded by the 901 "(delay*bandwidth*loss)" characteristics of the network topology), 902 the NACK content will need to contain _explicit_ coding block and/or 903 segment loss information so that the sender can provide appropriate 904 repair packets and/or data retransmissions. Explicit loss 905 information in NACK content may also potentially serve other 906 purposes.
For example, it may be useful for decorrelating loss 907 characteristics among a group of receivers to help differentiate 908 candidate congestion control bottlenecks among the receiver set. 910 When FEC is used and NACK content is designed to contain explicit 911 repair requests, there is a strategy where the receivers can NACK for 912 specific content that will help facilitate NACK suppression and 913 repair efficiency. One assumption for this strategy is that the sender 914 may potentially exhaust its supply of new, unique parity packets 915 available for a given coding block and be required to explicitly 916 retransmit some data or parity symbols to complete reliable transfer. 917 Another assumption is that an FEC algorithm where any parity packet 918 can fill any erasure within the coding block (e.g., Reed-Solomon) is 919 used. The goal of this strategy is to make maximum use of the 920 available parity and provide the minimal amount of data and repair 921 transmissions during reliable transfer of data content to the group. 923 When systematic FEC codes are used, the sender transmits the data 924 content of the coding block (and optionally some quantity of parity 925 packets) in its initial transmission. Note that a systematic FEC 926 coding block is considered to be logically made up of the contiguous 927 set of source data vectors plus parity vectors for the given FEC 928 algorithm used. For example, a systematic coding scheme that 929 provides for 64 data symbols and 32 parity symbols per coding block 930 would contain FEC symbol identifiers in the range of 0 to 95. 932 Receivers then can construct NACK messages requesting sufficient 933 content to satisfy their repair needs. For example, if the receiver 934 has three erasures in a given received coding block, it will request 935 transmission of the three lowest ordinal parity vectors in the coding 936 block.
In our example coding scheme from the previous paragraph, the 937 receiver would explicitly request parity symbols 64 to 66 to fill its 938 three erasures for the coding block. Note that if the receiver's 939 loss for the coding block exceeds the available parity quantity 940 (i.e., greater than 32 missing symbols in our example), the receiver 941 will be required to construct a NACK requesting all (32) of the 942 available parity symbols plus some additional portions of its missing 943 data symbols in order to reconstruct the block. If this is done 944 consistently across the receiver group, the resulting NACKs will 945 comprise a minimal set of sender transmissions to satisfy their 946 repair needs. 948 In summary, the rule is to request the lower ordinal portion of the 949 parity content for the FEC coding block to satisfy the erasure repair 950 needs on the first NACK cycle. If the available number of parity 951 symbols is insufficient, the receiver will also request the subset of 952 ordinally highest missing data symbols to cover what the parity 953 symbols will not fill. Note this strategy assumes FEC codes such as 954 Reed-Solomon for which a single parity symbol can repair any erased 955 symbol. This strategy would need minor modification to take into 956 account the possibly limited repair capability of other FEC types. 957 On subsequent NACK repair cycles where the receiver may have received 958 some portion of its previously requested repair content, the receiver 959 will use the same strategy, but only NACK for the set of parity 960 and/or data symbols it has not yet received. Optionally, the 961 receivers could also provide a count of erasures as a convenience to 962 the sender. 964 Other types of FEC schemes may require alteration to the NACK and 965 repair strategy described here. 
For example, some of the large block 966 or expandable FEC codes described in [RFC3453] may be less 967 deterministic with respect to defining optimal repair requests by 968 receivers or repair transmission strategies by senders. For these 969 types of codes, it may be sufficient for receivers to NACK with an 970 estimate of the quantity of additional FEC symbols required to 971 complete reliable reception and for the sender to respond 972 accordingly. This apparent disadvantage as compared to codes such as 973 Reed Solomon may be offset by reduced computational requirements 974 and/or ability to support large coding blocks for increased repair 975 efficiency that these codes can offer. 977 After receipt and accumulation of NACK messages during the 978 aggregation period, the sender can begin transmission of fresh 979 (previously untransmitted) parity symbols for the coding block based 980 on the highest receiver erasure count _if_ it has a sufficient 981 quantity of parity symbols that were _not_ previously transmitted. 982 Otherwise, the sender MUST resort to transmitting the explicit set of 983 repair vectors requested. With this approach, the sender needs to 984 maintain very little state on requests it has received from the group 985 without need for synchronization of repair requests from the group. 986 Since all receivers use the same consistent algorithm to express 987 their explicit repair needs, NACK suppression among receivers is 988 simplified over the course of multiple repair cycles. The receivers 989 can simply compare NACKs heard from other receivers against their own 990 calculated repair needs to determine whether they should transmit or 991 suppress their pending NACK messages. 993 3.2.3.2. NACK Content Format 995 The format of NACK content will depend on the protocol's data service 996 model and the format of data content identification the protocol 997 uses. This NACK format also depends upon the type of FEC encoding 998 (if any) used. 
Figure 2 illustrates a logical, hierarchical 999 transmission content identification scheme, denoting that the notion 1000 of objects (or streams) and/or FEC blocking is optional at the 1001 protocol instantiation's discretion. Note that the identification of 1002 objects is with respect to a given sender. It is recommended that 1003 transport data content identification is done within the context of a 1004 sender in a given session. Since the notion of session "streams" and 1005 "blocks" is optional, the framework degenerates to that of typical 1006 transport data segmentation and reassembly in its simplest form. 1008 Session_ 1009 \_ 1010 Sender_ 1011 \_ 1012 [Object/Stream(s)]_ 1013 \_ 1014 [FEC Blocks]_ 1015 \_ 1016 Symbols 1018 Figure 2: Reliable Multicast Data Content Identification Hierarchy 1020 The format of NACK messages should enable the following: 1022 1. Identification of transport data units required to repair the 1023 received content, whether this is an entire missing object/stream 1024 (or range), entire FEC coding block(s), or sets of symbols, 1026 2. Simple processing for NACK aggregation and suppression, 1028 3. Inclusion of NACKs for multiple objects, FEC coding blocks and/or 1029 symbols in a single message, and 1031 4. A reasonably compact format. 1033 If the reliable multicast transport object/stream is identified with 1034 an __ and the FEC symbol being transmitted is identified 1035 with an __, the concatenation of __ comprises a basic transport protocol data unit (TPDU) 1037 identifier for symbols from a given source. NACK content can be 1038 composed of lists and/or ranges of these TPDU identifiers to build up 1039 NACK messages to describe the receivers repair needs. If no 1040 hierarchical object delineation or FEC blocking is used, the TPDU is 1041 a simple linear representation of the data symbols transmitted by the 1042 sender. 
When the TPDU represents a hierarchy for purposes of object/ 1043 stream delineation and/or FEC blocking, the NACK content unit may 1044 require flags to indicate which portion of the TPDU is applicable. 1045 For example, if an entire "object" (or range of objects) is missing 1046 in the received data, the receiver will not necessarily know the 1047 appropriate range of __ or __ 1048 for which to request repair and thus requires some mechanism to 1049 request repair (or retransmission) of the entire unit represented by 1050 an __. The same is true if entire FEC coding blocks 1051 represented by one or a range of __ have been 1052 lost. 1054 *Inputs*: 1056 1. Sender identification. 1058 2. Sender data identification. 1060 3. Sender FEC Object Transmission Information. 1062 4. Recorded sender transmission sequence position. 1064 5. Current sender transmission sequence position. History of repair 1065 needs for this sender. 1067 *Outputs*: 1069 1. NACK message with repair requests. 1071 3.2.4. Sender Repair Response 1073 Upon reception of a repair request from a receiver in the group, the 1074 sender will initiate a repair response procedure. The sender may 1075 wish to delay transmission of repair content until it has had 1076 sufficient time to accumulate potentially multiple NACKs from the 1077 receiver set. This allows the sender to determine the most efficient 1078 repair strategy for a given transport stream/object or FEC coding 1079 block. Depending upon the approach used, some protocols may find it 1080 beneficial for the sender to provide an indicator of pending repair 1081 transmissions as part of its current transmitted message content. 1082 This can aid some NACK suppression mechanisms. The amount of time to 1083 perform this NACK aggregation should be sufficient to allow for the 1084 maximum receiver NACK backoff window (""T_maxBackoff"" from Section 1085 3.2.2) and propagation of NACK messages from the receivers to the 1086 sender. 
Note the maximum transmission delay of a message from a 1087 receiver to the sender may be approximately "(1*GRTT)" in the case of 1088 very asymmetric network topology with respect to transmission delay. 1089 Thus, if the maximum receiver NACK backoff time is "T_maxBackoff = 1090 K*GRTT", the sender NACK aggregation period should be equal to at 1091 least: 1093 T_sndrAggregate = T_maxBackoff + 1*GRTT = (K+1)*GRTT 1095 Immediately after the sender NACK aggregation period, the sender will 1096 begin transmitting repair content determined from the aggregate NACK 1097 state and continue with any new transmission. Also, at this time, 1098 the sender should observe a "hold-off" period where it constrains 1099 itself from initiating a new NACK aggregation period to allow 1100 propagation of the new transmission sequence position due to the 1101 repair response to the receiver group. To allow for worst case 1102 asymmetry, this "hold-off" time should be: 1104 T_sndrHoldoff = 1*GRTT 1106 Recall that the receivers will also employ a "hold-off" timeout after 1107 generating a NACK message to allow time for the sender's response. 1108 Given a sender "T_sndrAggregate" plus "T_sndrHoldoff" time of 1109 "(K+2)*GRTT", the receivers should use hold-off timeouts of: 1111 T_rcvrHoldoff = T_sndrAggregate + T_sndrHoldoff = (K+2)*GRTT 1113 This allows for a worst-case propagation time of the receiver's NACK 1114 to the sender, the sender's aggregation time and propagation of the 1115 sender's response back to the receiver. Additionally, in the case of 1116 unicast feedback from the receiver set, it may be useful for the 1117 sender to forward (via multicast) a representation of its aggregated 1118 NACK content to the group to allow for NACK suppression when there is 1119 not multicast connectivity among the receiver set. 1121 At the expiration of the "T_sndrAggregate" timeout, the sender will 1122 begin transmitting repair messages according to the accumulated 1123 content of NACKs received.
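The timer relationships given by the equations above can be collected into a single calculation. The following "C" sketch is illustrative only: the "NackTimers" type and "ComputeNackTimers" function are hypothetical names, and all values are assumed to be expressed in seconds.

```c
#include <assert.h>

/* Timer values derived from the sender's advertised GRTT and the
 * backoff scaling factor K.  NackTimers is a hypothetical type used
 * only for this illustration. */
typedef struct {
    double maxBackoff;     /* T_maxBackoff    = K * GRTT     */
    double sndrAggregate;  /* T_sndrAggregate = (K+1) * GRTT */
    double sndrHoldoff;    /* T_sndrHoldoff   = 1 * GRTT     */
    double rcvrHoldoff;    /* T_rcvrHoldoff   = (K+2) * GRTT */
} NackTimers;

NackTimers ComputeNackTimers(double K, double GRTT)
{
    NackTimers t;

    /* The text suggests K >= 1; GRTT must be a positive interval. */
    assert(K >= 1.0 && GRTT > 0.0);

    t.maxBackoff    = K * GRTT;
    t.sndrAggregate = t.maxBackoff + GRTT;   /* backoff window plus NACK propagation */
    t.sndrHoldoff   = GRTT;                  /* worst-case asymmetry */
    t.rcvrHoldoff   = t.sndrAggregate + t.sndrHoldoff;
    return t;
}
```

With "K=4" and a GRTT of 0.5 seconds, this gives a 2.0 second maximum backoff, a 2.5 second sender aggregation period, a 0.5 second sender hold-off, and a 3.0 second receiver hold-off.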
There are some guidelines with regards to 1124 FEC-based repair and the ordering of the repair response from the 1125 sender that can improve reliable multicast efficiency: 1127 When FEC is used, it is beneficial that the sender transmit 1128 previously untransmitted parity content as repair messages whenever 1129 possible. This maximizes the receiving nodes' ability to reconstruct 1130 the entire transmitted content from their individual subsets of 1131 received messages. 1133 The transmitted object and/or stream data and repair content should 1134 be indexed with monotonically increasing sequence numbers (within a 1135 reasonably large ordinal space). If the sender observes the 1136 discipline of transmitting repair for the earliest content (e.g., 1137 ordinally lowest FEC blocks) first, the receivers can use a strategy 1138 of withholding repair requests for later content until the sender 1139 once again returns to that point in the object/stream transmission 1140 sequence. This can increase overall message efficiency among the 1141 group and help work to keep repair cycles relatively synchronized 1142 without dependence upon strict time synchronization among the sender 1143 and receivers. This also helps minimize the buffering requirements 1144 of receivers and senders and reduces redundant transmission of data 1145 to the group at large. 1147 *Inputs:* 1149 1. Receiver NACK messages 1151 2. Group timing information 1153 *Outputs:* 1155 1. Repair messages (FEC and/or Data content retransmission) 1157 2. Advertisement of current pending repair transmissions when 1158 unicast receiver feedback is detected. 1160 3.3. Multicast Receiver Join Policies and Procedures 1162 Consideration should be given to the policies and procedures by which 1163 new receivers join a group (perhaps where reliable transmission is 1164 already in progress) and begin requesting repair. 
If receiver joins 1165 are unconstrained, the dynamics of group membership may impede the 1166 application's ability to meet its goals for forward progression of 1167 data transmission. Policies limiting the opportunities when 1168 receivers begin participating in the NACK process may be used to 1169 achieve the desired behavior. For example, it may be beneficial for 1170 receivers to attempt reliable reception from a newly-heard sender 1171 only upon non-repair transmissions of data in the first FEC block of 1172 an object or logical portion of a stream. The sender may also 1173 implement policies limiting the receivers from which it will accept 1174 NACK requests, but this may be prohibitive for scalability reasons in 1175 some situations. Alternatively, it may be desirable to have a looser 1176 transport synchronization policy and rely upon session management 1177 mechanisms to limit group dynamics that can cause poor performance, 1178 in some types of bulk transfer applications (or for potential 1179 interactive reliable multicast applications). 1181 *Inputs:* 1183 1. Current object/stream data/repair content and sequencing 1184 identifiers from sender transmissions. 1186 *Outputs:* 1188 1. Receiver yes/no decision to begin receiving and NACKing for 1189 reliable reception of data 1191 3.4. Reliable Multicast Member Identification 1193 In a NACK-based reliable multicast protocol (or other multicast 1194 protocols) where there is the potential for multiple sources of data, 1195 it is necessary to provide some mechanism to uniquely identify the 1196 sources (and possibly some or all receivers in some cases) within the 1197 group. Receivers that send NACK messages to the group will need to 1198 identify the sender to which the NACK is intended. Identity based on 1199 arriving packet source addresses is insufficient for several reasons. 
These reasons include routing changes for hosts with multiple interfaces that result in different packet source addresses for a given host over time, network address translation (NAT) or firewall devices, or other transport/network bridging approaches. As a result, some type of unique source identifier field SHOULD be present in packets transmitted by reliable multicast session members.

3.5. Data Content Identification

The data and repair content transmitted by a NACK-based reliable multicast sender requires some form of identification in the protocol header fields. This identification is required to facilitate the reliable NACK-oriented repair process. These identifiers will also be used in the NACK messages that receivers generate. This building block document assumes two very general types of data that may comprise bulk transfer session content. One type is static, discrete objects of finite size and the other is continuous non-finite streams. A given application may wish to reliably multicast data content using either one or both of these paradigms. While it may be possible for some applications to further generalize this model and provide mechanisms to encapsulate static objects as content embedded within a stream, there are advantages in many applications to provide distinct support for static bulk objects and messages within the context of a reliable multicast session. These applications may include content caching servers, file transfer, or collaborative tools with bulk content. Applications with requirements for these static object types can then take advantage of transport layer mechanisms (e.g., segmentation/reassembly, caching, integrated forward error correction coding) rather than being required to provide their own mechanisms for these functions at the application layer.

As noted, some applications may alternatively desire to transmit bulk content in the form of one or more streams of non-finite size. Example streams include continuous quasi-real-time message broadcasts (e.g., stock ticker) or some content types that are part of collaborative tools or other applications. And, as indicated above, some applications may wish to encapsulate other bulk content (e.g., files) into one or more streams within a multicast session.

The components described within this building block document are envisioned to be applicable to both of these models with the potential for a mix of both types within a single multicast session. To support this requirement, the normal data content identification should include a field to uniquely identify the object or stream within some reasonable temporal or ordinal interval. Note that it is _not_ expected that this data content identification will be globally unique. It is expected that the object/stream identifier will be unique with respect to a given sender within the reliable multicast session and during the time that sender is supporting a specific transport instance of that object or stream.

Since "bulk" object/stream content usually requires segmentation, some form of segment identification must also be provided. This segment identifier will be relative to any object or stream identifier that has been provided. Thus, in some cases, NACK-based reliable multicast protocol instantiations may be able to receive transmissions and request repair for multiple streams and one or more sets of static objects in parallel.
For protocol instantiations employing FEC, the segment identification portion of the data content identifier may consist of a logical concatenation of a coding block identifier and an identifier for the specific data or parity symbol of the code block. The FEC Basic Schemes building block [I-D.ietf-rmt-bb-fec-basic-schemes-revised] and descriptions of additional FEC schemes that may be documented later provide a standard message format for identifying FEC transmission content. NACK-based reliable multicast protocol instantiations using FEC SHOULD follow such guidelines.

Additionally, flags to determine the usage of the content identifier fields (e.g., stream vs. object) may be applicable. Flags may also serve other purposes in data content identification. It is expected that any flags defined will be dependent upon individual protocol instantiations.

In summary, the following data content identification fields may be required for NACK-based reliable multicast protocol data content messages:

1. Source node identifier

2. Object/Stream identifier, if applicable.

3. FEC Block identifier, if applicable.

4. FEC Symbol identifier

5. Flags to differentiate interpretation of identifier fields or identifier structure that implicitly indicates usage.

6. Additional FEC transmission content fields per FEC Building Block

These fields have been identified because any generated NACK messages will use these identifiers in requesting repair or retransmission of data.

3.6. Forward Error Correction (FEC)

Multiple forward error correction (FEC) approaches using erasure coding techniques have been identified that can provide great performance enhancements to the repair process of NACK-oriented and other reliable multicast protocols [FecBroadcast], [RmFec], [RFC3453].
NACK-based reliable multicast protocols can reap additional benefits since FEC-based repair does not generally require explicit knowledge of repair content within the bounds of its coding block size (in symbols). In NACK-based reliable multicast, parity repair packets generated will generally be transmitted only in response to NACK repair requests from receiving nodes. However, there are benefits in some network environments for transmitting some predetermined quantity of FEC repair packets multiplexed with the regular data symbol transmissions [FecHybrid]. This can reduce the amount of NACK traffic generated with relatively little overhead cost when group sizes are very large or the network connectivity has a large "delay*bandwidth" product with some nominal level of expected packet loss. While the application of FEC is not unique to NACK-based reliable multicast, these sorts of requirements may dictate the types of algorithms and protocol approaches that are applicable.

A specific issue related to the use of FEC with NACK-based reliable multicast is the mechanism used to identify the portion(s) of transmitted data content to which specific FEC packets are applicable. It is expected that FEC algorithms will be based on generating a set of parity repair packets for a corresponding block of transmitted data packets. Since data content packets are uniquely identified by the concatenation of the identifier fields summarized in Section 3.5 during transport, it is expected that FEC packets will be identified in a similar manner. The FEC Building Block document [RFC5052] provides detailed recommendations concerning application of FEC and standard formats for related reliable multicast protocol messages.

3.7. Round-trip Timing Collection

The measurement of packet propagation round-trip time (RTT) among members of the group is required to support timer-based NACK suppression algorithms, timing of sender commands or certain repair functions, and congestion control operation. The nature of the round-trip information collected is dependent upon the type of interaction among the members of the group. In the case of "one-to-many" transmission, it may be that only the sender requires knowledge of the GRTT and/or RTT knowledge of only a portion of the group. Here, the GRTT information might be collected in a reasonably scalable manner. For congestion control operation, it is possible that each receiver in the group may need knowledge of its individual RTT. In this case, an alternative RTT collection scheme may be utilized where receivers collect individual RTT measurements with respect to the sender(s) and advertise them to the group or sender(s). Where it is likely that exchange of reliable multicast data will occur among the group on a "many-to-many" basis, there are alternative measurement techniques that might be employed for increased efficiency [DelayEstimation]. In some cases, there might be absolute time synchronization available among the participating hosts that may simplify RTT measurement. There are trade-offs in multicast congestion control design that require further consideration before a universal recommendation on RTT (or GRTT) measurement can be specified. Regardless of how the RTT information is collected (and more specifically GRTT) with respect to congestion control or other requirements, the sender will need to advertise its current GRTT estimate to the group for various NACK timeouts used by receivers.

3.7.1. One-to-Many Sender GRTT Measurement

The goal of this form of RTT measurement is for the sender to estimate the GRTT among the receivers who are actively participating in NACK-based reliable multicast operation. The set of receivers participating in this process may be the entire group or some subset of the group determined from another mechanism within the protocol instantiation. An approach to collect this GRTT information follows.

The sender periodically polls the group with a message (independent or "piggy-backed" with other transmissions) containing a "sendTime" timestamp relative to an internal clock at the sender. Upon reception of this message, the receivers will record this "sendTime" timestamp and the time (referenced to their own clocks) at which it was received, "recvTime". When the receiver provides feedback to the sender (either explicitly or as part of other feedback messages depending upon protocol instantiation specification), it will construct a "response" using the formula:

    grttResponse = sendTime + (currentTime - recvTime)

where the "sendTime" is the timestamp from the last probe message received from the source and the ("currentTime - recvTime") is the time that elapsed at the receiver between receipt of that probe and generation of the response.

The sender processes each receiver response by calculating a current RTT measurement for the receiver from whom the response was received using the following formula, where "currentTime" is the sender's own clock reading when the response arrives:

    RTT_rcvr = currentTime - grttResponse

During each periodic "GRTT" probing interval, the source keeps the peak round trip timing measurement ("RTT_peak") from the set of responses it has received. A conservative estimate of "GRTT" is kept to maximize the efficiency of redundant NACK suppression and repair aggregation. The update to the source's ongoing estimate of "GRTT" is done observing the following rules:

1. If a receiver's response round trip time ("RTT_rcvr") is greater than the current "GRTT" estimate, the "GRTT" is immediately updated to this new peak value:

    GRTT = RTT_rcvr

2. At the end of the response collection period (i.e., the GRTT probe interval), if the recorded "peak" response ("RTT_peak") is less than the current GRTT estimate, the GRTT is updated to:

    GRTT = MAX(0.9*GRTT, RTT_peak)

3. If no feedback is received, the sender "GRTT" estimate remains unchanged.

4. At the end of the response collection period, the peak tracking value ("RTT_peak") is reset to ZERO for subsequent peak detection.

The GRTT collection period (i.e., period of probe transmission) could be fixed at a value on the order of that expected for group membership and/or network topology dynamics. For robustness, more rapid probing could be used at protocol startup before settling to a less frequent, steady-state interval. Optionally, an algorithm may be developed to adjust the GRTT collection period dynamically in response to the current estimate of GRTT (or variations in it) and to an estimation of packet loss. The overhead of probing messages could then be reduced when the GRTT estimate is stable and unchanging, but be adjusted to track more dynamically during periods of variation with correspondingly shorter GRTT collection periods. GRTT collection MAY also be coupled with collection of other information for congestion control purposes.

In summary, although NACK repair cycle timeouts are based on GRTT, it should be noted that convergent operation of the protocol does not depend upon highly accurate GRTT estimation. The current mechanism has proved sufficient in simulations and in the environments where NACK-based reliable multicast protocols have been deployed to date.
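
As a rough illustration of the receiver response formula and the four sender update rules of Section 3.7.1, the following C sketch may be helpful. It is not part of any protocol specification: the "GrttState" structure, its field names, the "responses" counter used to detect the "no feedback" case, and the three function names are all assumptions made for this example. Times are assumed to be in seconds as double-precision values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sender-side state; names are illustrative only. */
typedef struct {
    double grtt;       /* current GRTT estimate (seconds) */
    double rttPeak;    /* RTT_peak for the current probe interval */
    int    responses;  /* responses seen in the current interval */
} GrttState;

/* Receiver side: grttResponse = sendTime + (currentTime - recvTime).
 * sendTime is on the sender's clock; recvTime and currentTime are
 * both on the receiver's clock, so only their difference is used. */
double MakeGrttResponse(double sendTime, double recvTime,
                        double currentTime)
{
    return sendTime + (currentTime - recvTime);
}

/* Sender side, per response: RTT_rcvr = currentTime - grttResponse,
 * followed by rule 1.  Returns RTT_rcvr. */
double OnGrttResponse(GrttState* s, double currentTime,
                      double grttResponse)
{
    double rttRcvr;
    assert(s != NULL);
    rttRcvr = currentTime - grttResponse;
    s->responses++;
    if (rttRcvr > s->rttPeak)
        s->rttPeak = rttRcvr;     /* track the interval peak */
    if (rttRcvr > s->grtt)
        s->grtt = rttRcvr;        /* rule 1: immediate increase */
    return rttRcvr;
}

/* Sender side, at the end of each probe interval: rules 2 to 4. */
void OnGrttIntervalEnd(GrttState* s)
{
    assert(s != NULL);
    if (s->responses > 0 && s->rttPeak < s->grtt) {
        /* rule 2: GRTT = MAX(0.9*GRTT, RTT_peak) */
        double decayed = 0.9 * s->grtt;
        s->grtt = (decayed > s->rttPeak) ? decayed : s->rttPeak;
    }
    /* rule 3: with no feedback, the estimate is left unchanged */
    s->rttPeak = 0.0;             /* rule 4: reset peak tracking */
    s->responses = 0;
}
```

In this sketch the estimate rises as soon as a larger RTT is observed but falls by at most 10% per probe interval, which is what makes the tracked value follow the peak envelope of the measured round-trip times.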
The estimate provided by the given algorithm tracks the peak envelope of actual GRTT (including operating system effects as well as network delays) even in relatively high loss connectivity. The steady-state probing/update interval may potentially be varied to accommodate different levels of expected network dynamics in different environments.

3.7.2. One-to-Many Receiver RTT Measurement

In this approach, receivers send messages with timestamps to the sender. To control the volume of these receiver-generated messages, a suppression mechanism similar to that described for NACK suppression may be used. The "age" of receivers' RTT measurement should be kept by receivers and used as a metric in competing for feedback opportunities in the suppression scheme. For example, receivers who have not made any RTT measurement or whose RTT measurement has aged most should have precedence over other receivers. In turn, the sender may have limited capacity to provide an "echo" of the receiver timestamps back to the group, and it could use this RTT "age" metric to determine which receivers get precedence. The sender can determine the "GRTT" as described in Section 3.7.1 if it provides sender timestamps to the group. Alternatively, receivers who note their RTT is greater than the sender GRTT can compete in the feedback opportunity/suppression scheme to provide the sender and group with this information.

3.7.3. Many-to-Many RTT Measurement

For reliable multicast sessions that involve multiple senders, it may be useful to have RTT measurements occur on a true "many-to-many" basis rather than have each sender independently tracking RTT.
Some protocol efficiency can be gained when receivers can infer an approximation of their RTT with respect to a sender based on RTT information they have on another sender and that other sender's RTT with respect to the new sender of interest. For example, for receiver "a" and senders "b" and "c", it is likely that:

    RTT(a<->b) <= RTT(a<->c) + RTT(b<->c)

Further refinement of this estimate can be conducted if RTT information is available to a node concerning its own RTT to a small subset of other group members and RTT information among those other group members it learns during protocol operation.

3.7.4. Sender GRTT Advertisement

To facilitate deterministic protocol operation, the sender should robustly advertise its current estimation of "GRTT" to the receiver set. Common, robust knowledge of the sender's current operating GRTT estimate among the group will allow the protocol to progress in its most efficient manner. The sender's GRTT estimate can be robustly advertised to the group by simply embedding the estimate into all pertinent messages transmitted by the sender. The overhead of this can be made quite small by quantizing (compressing) the GRTT estimate to a single byte of information. The following C-language functions allow this to be done over a wide range ("RTT_MIN" through "RTT_MAX") of GRTT values while maintaining greater precision for small values and less precision for large values. Values of 1.0e-06 seconds and 1000 seconds are RECOMMENDED for "RTT_MIN" and "RTT_MAX" respectively. NACK-based reliable multicast applications may wish to place an additional, smaller upper limit on the GRTT advertised by senders to meet application data delivery latency constraints at the expense of greater feedback volume in some network environments.

    #include <math.h>   /* ceil(), log(), exp() */

    /* RECOMMENDED limits (see text), in seconds. */
    #define RTT_MIN 1.0e-06
    #define RTT_MAX 1000.0

    /* Compress a GRTT value (seconds) into a single byte. */
    unsigned char QuantizeGrtt(double grtt)
    {
        /* Clamp to the representable range. */
        if (grtt > RTT_MAX)
            grtt = RTT_MAX;
        else if (grtt < RTT_MIN)
            grtt = RTT_MIN;
        if (grtt < (33*RTT_MIN))
            /* Linear region: values 0..31, steps of RTT_MIN. */
            return ((unsigned char)(grtt / RTT_MIN) - 1);
        else
            /* Logarithmic region, rounded up. */
            return ((unsigned char)(ceil(255.0 -
                                    (13.0 * log(RTT_MAX/grtt)))));
    }

    /* Expand a quantized byte back into a GRTT value (seconds). */
    double UnquantizeRtt(unsigned char qrtt)
    {
        return ((qrtt <= 31) ?
                (((double)(qrtt+1))*(double)RTT_MIN) :
                (RTT_MAX/exp(((double)(255-qrtt))/(double)13.0)));
    }

Note that these functions are useful for quantizing GRTT times in the range of 1 microsecond to 1000 seconds. Of course, NACK-based reliable multicast protocol implementations may wish to further constrain advertised GRTT estimates (e.g., limit the maximum value) for practical reasons.

3.8. Group Size Determination/Estimation

When NACK-based reliable multicast protocol operation includes mechanisms that excite feedback from the group at large (e.g., congestion control), it may be possible to roughly estimate the group size based on the number of feedback messages received with respect to the distribution of the probabilistic suppression mechanism used. Note the timer-based suppression mechanism described in this document does not require a very accurate estimate of group size to perform adequately. Thus, a rough estimate, particularly if conservatively managed, may suffice. Group size may also be determined administratively. In the absence of any group size determination mechanism, a default group size value of 10,000 is RECOMMENDED for reasonable management of feedback given the scalability of expected NACK-based reliable multicast usage.
This conservative estimate (over-estimate) of group size in the algorithms described above will result in some added latency to the NACK repair process if the actual group size is smaller but with a guarantee of feedback implosion protection. The study of the timer-based feedback suppression mechanism described in [McastFeedback] and [NormFeedback] showed that the group size estimate need only be within an order of magnitude to provide effective suppression performance.

3.9. Congestion Control Operation

Congestion control that fairly shares available network capacity with other reliable multicast and TCP instantiations is REQUIRED for general Internet operation. The TCP-Friendly Multicast Congestion Control (TFMCC) [TfmccPaper] or Pragmatic General Multicast Congestion Control (PGMCC) [PgmccPaper] techniques can be applied to NACK-based reliable multicast operation to meet this requirement. The former technique has been further documented in [RFC4654] and has been successfully applied in the NACK-Oriented Reliable Multicast Protocol [RFC3940].

3.10. Intermediate System Assistance

NACK-based multicast protocols may benefit from general purpose intermediate system assistance. In particular, additional NACK suppression where intermediate systems can aggregate NACK content (or filter duplicate NACK content) from receivers as it is relayed toward the sender could enhance NORM group size scalability. For NACK-based reliable multicast protocols using FEC, it is possible that intermediate systems may be able to filter FEC repair messages to provide an intelligent "subcast" of repair content to different legs of the multicast topology depending on the repair needs learned from previous receiver NACKs.
Similarly, intermediate systems could monitor receiver NACKs and provide repair transmissions on-demand in response if sufficient state on the content being transmitted were maintained. This can reduce the latency and volume of repair transmissions when the intermediate system is associated with a network link that is particularly problematic with respect to packet loss. These types of assist functions would require intermediate system interpretation of transport data unit content identifiers and flags. NACK-based protocol designs should consider the potential for intermediate system assistance in the specification of protocol messages and operations. It is likely that intermediate system assistance will be more pragmatic if message parsing requirements are modest and if the amount of state an intermediate system is required to maintain is relatively small.

4. NACK-based Reliable Multicast Applicability

The Multicast NACK building block applies to protocols wishing to employ negative acknowledgement to achieve reliable data transfer. Properly designed NACK-based reliable multicast protocols offer scalability advantages for applications and/or network topologies where, for various reasons, it is prohibitive to construct a higher order delivery infrastructure above the basic Layer 3 IP multicast service (e.g., unicast or hybrid unicast/multicast data distribution trees). Additionally, the multicast scalability property of NACK-based protocols [RmComparison], [RmClasses] is applicable where broad "fan-out" is expected for a single network hop (e.g., cable-TV data delivery, satellite, or other broadcast communication services). Furthermore, the simplicity of a protocol based on "flat" group-wide multicast distribution may offer advantages for a broad range of distributed services or dynamic networks and applications.
NACK-based reliable multicast protocols can make use of reciprocal (among senders and receivers) multicast communication under the Any-Source Multicast (ASM) model defined in RFC 1112 [RFC1112], and are capable of scalable operation in asymmetric topologies such as Source-Specific Multicast (SSM) [RFC4607] where there may only be unicast routing service from the receivers to the sender(s).

NACK-based reliable multicast protocol operation is compatible with transport layer forward error correction coding techniques as described in [RFC3453] and congestion control mechanisms such as those described in [TfmccPaper] and [PgmccPaper]. A principal limitation of NACK-based reliable multicast operation involves group size scalability when network capacity for receiver feedback is very limited. It is possible that, with proper protocol design, the intermediate system assistance techniques mentioned in Section 2.4 and described further in Section 3.10 can allow NACK-based approaches to scale to larger group sizes. NACK-based reliable multicast operation is also governed by implementation buffering constraints. Buffering greater than that required for typical point-to-point reliable transport (e.g., TCP) is recommended to allow for disparity in the receiver group connectivity and to allow for the feedback delays required to attain group size scalability.

Prior experimental work included various protocol instantiations that implemented some of the concepts described in this building block document. This includes the Pragmatic General Multicast (PGM) protocol described in [RFC3208] among others that were documented or deployed outside of IETF activities.
While the PGM protocol specification and some other approaches encompassed many of the goals of bulk data delivery as described here, this NACK-based building block provides a more generalized framework so that different application needs can be met by different protocol instantiation variants. The NACK-based building block approach described here includes compatibility with other protocol mechanisms, including FEC and congestion control, that are described in other IETF reliable multicast building block documents. The NACK repair process described in this document can provide performance advantages as compared to PGM when both are deployed on a pure end-to-end basis without intermediate system assistance. The round-trip timing estimation described here and its use in the NACK repair process allow protocol operation to more automatically adapt to different network environments or operate within environments where connectivity is dynamic. Use of the FEC payload identification techniques described in the FEC building block [RFC5052] and specific FEC instantiations allows protocol instantiations more flexibility as FEC techniques evolve than the specific sequence number data identification scheme described in the PGM specification. Similar flexibility is expected if protocol instantiations are designed to modularly invoke (at design time, if not run-time) the appropriate congestion control building block for different application or deployment purposes.

5. Security Considerations

NACK-based reliable multicast protocols are expected to be subject to the same security vulnerabilities as other IP and IP Multicast protocols. However, unlike point-to-point (unicast) transport protocols, it is possible that one badly-behaving participant can impact the transport service experience of others in the group.
For example, a malicious receiver node could intentionally transmit NACK messages to cause the sender(s) to unnecessarily transmit repairs instead of making forward progress with reliable transfer. Also, group-wise messaging to support congestion control or other aspects of protocol operation may be subject to similar vulnerabilities. Thus, it is highly RECOMMENDED that security techniques such as authentication and data integrity checks be applied for NACK-based reliable multicast deployments. Protocol instantiations using this building block MUST identify approaches to security that can be used to address these and other security considerations.

NACK-based reliable multicast is compatible with IP security (IPsec) authentication mechanisms [RFC4301] that are RECOMMENDED for protection against session intrusion and denial of service attacks. A particular threat for NACK-based protocols is that of NACK replay attacks that could prevent a multicast sender from making forward progress in transmission. Any standard IPsec mechanisms that can provide protection against such replay attacks are RECOMMENDED for use. The IETF Multicast Security (MSEC) Working Group has developed a set of recommendations in its Multicast Extensions to the Internet Protocol Security Architecture [I-D.ietf-msec-ipsec-extensions] that can be applied to appropriately extend IPsec mechanisms to multicast operation. An appendix of that document specifically addresses the NACK-Oriented Reliable Multicast protocol service model. As complete support for IPsec multicast operation may potentially follow reliable multicast deployment, NACK-based reliable multicast protocol instantiations SHOULD consider providing support for their own NACK replay attack protection when network layer mechanisms are not available.
This MAY be necessary when IPsec implementations are used that do not provide multicast replay attack protection when multiple sources are present.

For NACK-based multicast deployments with large receiver groups using IPsec, approaches might be developed that use shared, common keys for receiver-originated protocol messages to maintain a practical number of IPsec Security Associations (SAs). However, such group-based authentication may not be sufficient unless the receiver population can be completely trusted. Additionally, this can make identification of badly-behaving (although authenticated) receiver nodes problematic as such nodes could potentially masquerade as other receivers in the group. In deployments such as this, one SHOULD consider use of Source-Specific Multicast (SSM) instead of Any-Source Multicast (ASM) models of multicast operation. SSM operation can simplify security challenges in a couple of ways:

1. A NACK-based protocol supporting SSM operation can eliminate direct receiver-to-receiver signaling. This dramatically reduces the number of security associations that need to be established.

2. The SSM sender(s) can provide a centralized management point for secure group operation for its respective data flow with the sender alone required to conduct individual host authentication for each receiver when group-based authentication does not suffice or is not pragmatic to deploy.

When individual host authentication is required, then it is possible receivers could use a digital signature on the IPsec Encapsulating Security Payload (ESP) payload as described in [RFC4359]. Either an identity-based signature system or a group-specific public key infrastructure could avoid per-receiver state at the sender(s).
Additionally, implementations MUST support policies to limit the impact of extremely or exceptionally poor-performing (due to bad behavior or otherwise) receivers upon overall group operation if this is acceptable for the relevant application.

As described in Section 3.4, deployment of NACK-based reliable multicast in some network environments may require identification of group members beyond that of IP addressing. If protocol-specific security mechanisms are developed, then it is RECOMMENDED that protocol group member identifiers are used as selectors (as defined in [RFC4301]) for the applicable security associations. When IPsec is used, it is RECOMMENDED that the protocol implementation verify that the source IP address of received packets is valid for the given protocol source identifier in addition to usual IPsec authentication. This would prevent a badly-behaving (although authorized) member spoofing messages from other legitimate members, provided that individual host authentication is supported.

The MSEC Working Group has also developed automated group keying solutions which are applicable to NACK-based reliable multicast security. For example, to support IPsec or other security mechanisms, the Group Secure Association Key Management Protocol [RFC4535] MAY be used for automated group key management. The technique it identifies for "Group Establishment for Receive-Only Members" may be applicable to NACK-based reliable multicast SSM operation.

6. IANA Considerations

This document has no actions for IANA.

7. Changes from RFC3941

This section lists the changes between the Experimental version of this specification, [RFC3941], and this version:

1. Change of title to avoid confusion with NORM Protocol specification,

2. Updated references to related, updated RMT Building Block documents, and

3. More detailed security considerations.

8. Acknowledgements

(and these are not Negative)

The authors would like to thank George Gross, Rick Jones, and Joerg Widmer for their valuable comments on this document. The authors would also like to thank the RMT working group chairs, Roger Kermode and Lorenzo Vicisano, for their support in development of this specification, and Sally Floyd for her early inputs into this document.

9. References

9.1. Normative References

[RFC1112] Deering, S., "Host extensions for IP multicasting", STD 5, RFC 1112, August 1989.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for IP", RFC 4607, August 2006.

9.2. Informative References

[ArchConsiderations] Clark, D. and D. Tennenhouse, "Architectural Considerations for a New Generation of Protocols", In Proc. ACM SIGCOMM pages 201-208, September 1990.

[DelayEstimation] Ozdemir, V., Muthukrishnan, S., and I. Rhee, "Scalable, Low-Overhead Network Delay Estimation", NCSU/AT&T White Paper, February 1999.

[FecBroadcast] Metzner, J., "An Improved Broadcast Retransmission Protocol", IEEE Transactions on Communications Vol. Com-32, No. 6, June 1984.

[FecHybrid] Gossink, D. and J. Macker, "Reliable Multicast and Integrated Parity Retransmission with Channel Estimation", IEEE Globecomm 1998, 1998.

[I-D.ietf-msec-ipsec-extensions] Weis, B., Gross, G., and D. Ignjatic, "Multicast Extensions to the Security Architecture for the Internet Protocol", draft-ietf-msec-ipsec-extensions-09 (work in progress), June 2008.

   [I-D.ietf-rmt-bb-fec-basic-schemes-revised]
              Watson, M., "Basic Forward Error Correction (FEC)
              Schemes", draft-ietf-rmt-bb-fec-basic-schemes-revised-05
              (work in progress), July 2008.

   [McastFeedback]
              Nonnenmacher, J. and E. Biersack, "Optimal Multicast
              Feedback", in IEEE INFOCOM, p. 964, March/April 1998.

   [NormFeedback]
              Adamson, B. and J. Macker, "Quantitative Prediction of
              NACK-Oriented Reliable Multicast (NORM) Feedback", in IEEE
              MILCOM 2002, October 2002.

   [PgmccPaper]
              Rizzo, L., "pgmcc: A TCP-Friendly Single-Rate Multicast
              Congestion Control Scheme", ACM SIGCOMM 2000,
              August 2000.

   [RFC2357]  Mankin, A., Romanov, A., Bradner, S., and V. Paxson, "IETF
              Criteria for Evaluating Reliable Multicast Transport and
              Application Protocols", RFC 2357, June 1998.

   [RFC3208]  Speakman, T., Crowcroft, J., Gemmell, J., Farinacci, D.,
              Lin, S., Leshchiner, D., Luby, M., Montgomery, T., Rizzo,
              L., Tweedly, A., Bhaskar, N., Edmonstone, R.,
              Sumanasekera, R., and L. Vicisano, "PGM Reliable Transport
              Protocol Specification", RFC 3208, December 2001.

   [RFC3269]  Kermode, R. and L. Vicisano, "Author Guidelines for
              Reliable Multicast Transport (RMT) Building Blocks and
              Protocol Instantiation documents", RFC 3269, April 2002.

   [RFC3453]  Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley,
              M., and J. Crowcroft, "The Use of Forward Error Correction
              (FEC) in Reliable Multicast", RFC 3453, December 2002.

   [RFC3940]  Adamson, B., Bormann, C., Handley, M., and J. Macker,
              "Negative-acknowledgment (NACK)-Oriented Reliable
              Multicast (NORM) Protocol", RFC 3940, November 2004.

   [RFC3941]  Adamson, B., Bormann, C., Handley, M., and J. Macker,
              "Negative-Acknowledgment (NACK)-Oriented Reliable
              Multicast (NORM) Building Blocks", RFC 3941,
              November 2004.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4359]  Weis, B., "The Use of RSA/SHA-1 Signatures within
              Encapsulating Security Payload (ESP) and Authentication
              Header (AH)", RFC 4359, January 2006.

   [RFC4535]  Harney, H., Meth, U., Colegrove, A., and G. Gross,
              "GSAKMP: Group Secure Association Key Management
              Protocol", RFC 4535, June 2006.

   [RFC4654]  Widmer, J. and M. Handley, "TCP-Friendly Multicast
              Congestion Control (TFMCC): Protocol Specification",
              RFC 4654, August 2006.

   [RFC5052]  Watson, M., Luby, M., and L. Vicisano, "Forward Error
              Correction (FEC) Building Block", RFC 5052, August 2007.

   [RmClasses]
              Levine, B. and J. Garcia-Luna-Aceves, "A Comparison of
              Known Classes of Reliable Multicast Protocols", Proc.
              International Conference on Network Protocols (ICNP-96),
              Columbus, Ohio, October 1996.

   [RmComparison]
              Pingali, S., Towsley, D., and J. Kurose, "A Comparison of
              Sender-Initiated and Receiver-Initiated Reliable Multicast
              Protocols", Proc. INFOCOM, San Francisco, CA,
              October 1993.

   [RmFec]    Macker, J., "Reliable Multicast Transport and Integrated
              Erasure-based Forward Error Correction", IEEE MILCOM 1997,
              October 1997.

   [SrmFramework]
              Floyd, S., Jacobson, V., McCanne, S., Liu, C., and L.
              Zhang, "A Reliable Multicast Framework for Light-weight
              Sessions and Application Level Framing", Proc. ACM
              SIGCOMM, August 1995.

   [TfmccPaper]
              Widmer, J. and M. Handley, "Extending Equation-Based
              Congestion Control to Multicast Applications", ACM
              SIGCOMM 2001, August 2001.

Authors' Addresses

   Brian Adamson
   Naval Research Laboratory
   Washington, DC  20375

   Email: adamson@itd.nrl.navy.mil


   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   D-28334 Bremen, Germany

   Email: cabo@tzi.org


   Mark Handley
   University College London
   Gower Street
   London, WC1E 6BT
   UK

   Email: M.Handley@cs.ucl.ac.uk


   Joe Macker
   Naval Research Laboratory
   Washington, DC  20375

   Email: macker@itd.nrl.navy.mil

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.