idnits 2.17.1 

draft-ietf-rmt-bb-norm-revised-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 20.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1955.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1966.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1973.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1979.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 9, 2008) is 5706 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-06) exists of
     draft-ietf-rmt-bb-fec-basic-schemes-revised-05

  -- Obsolete informational reference (is this intentional?): RFC 3940
     (Obsoleted by RFC 5740)

  -- Obsolete informational reference (is this intentional?): RFC 3941
     (Obsoleted by RFC 5401)


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         B. Adamson
3	Internet-Draft                                 Naval Research Laboratory
4	Obsoletes: 3941 (if approved)                                 C. Bormann
5	Intended status: Standards Track                 Universitaet Bremen TZI
6	Expires: March 13, 2009                                       M. Handley
7	                                               University College London
8	                                                               J. Macker
9	                                               Naval Research Laboratory
10	                                                       September 9, 2008

12	        Multicast Negative-Acknowledgment (NACK) Building Blocks
13	                   draft-ietf-rmt-bb-norm-revised-07

15	Status of this Memo

17	   By submitting this Internet-Draft, each author represents that any
18	   applicable patent or other IPR claims of which he or she is aware
19	   have been or will be disclosed, and any of which he or she becomes
20	   aware will be disclosed, in accordance with Section 6 of BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF), its areas, and its working groups.  Note that
24	   other groups may also distribute working documents as Internet-
25	   Drafts.

27	   Internet-Drafts are draft documents valid for a maximum of six months
28	   and may be updated, replaced, or obsoleted by other documents at any
29	   time.  It is inappropriate to use Internet-Drafts as reference
30	   material or to cite them other than as "work in progress."

32	   The list of current Internet-Drafts can be accessed at
33	   http://www.ietf.org/ietf/1id-abstracts.txt.

35	   The list of Internet-Draft Shadow Directories can be accessed at
36	   http://www.ietf.org/shadow.html.

38	   This Internet-Draft will expire on March 13, 2009.

40	Abstract

42	   This document discusses the creation of reliable multicast protocols
43	   utilizing negative-acknowledgment (NACK) feedback.  The rationale for
44	   protocol design goals and assumptions is presented.  Technical
45	   challenges for NACK-based (and in some cases general) reliable
46	   multicast protocol operation are identified.  These goals and
47	   challenges are resolved into a set of functional "building blocks"
48	   that address different aspects of reliable multicast protocol
49	   operation.  It is anticipated that these building blocks will be
50	   useful in generating different instantiations of reliable multicast
51	   protocols.  This document obsoletes RFC 3941.

53	Requirements Language

55	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
56	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
57	   document are to be interpreted as described in [RFC2119].

59	Table of Contents

61	   1.  Introduction
62	   2.  Rationale
63	     2.1.  Delivery Service Model
64	     2.2.  Group Membership Dynamics
65	     2.3.  Sender/Receiver Relationships
66	     2.4.  Group Size Scalability
67	     2.5.  Data Delivery Performance
68	     2.6.  Network Environments
69	     2.7.  Intermediate System Assistance
70	   3.  Functionality
71	     3.1.  Multicast Sender Transmission
72	     3.2.  NACK Repair Process
73	     3.3.  Multicast Receiver Join Policies and Procedures
74	     3.4.  Reliable Multicast Member Identification
75	     3.5.  Data Content Identification
76	     3.6.  Forward Error Correction (FEC)
77	     3.7.  Round-trip Timing Collection
78	     3.8.  Group Size Determination/Estimation
79	     3.9.  Congestion Control Operation
80	     3.10. Intermediate System Assistance
81	   4.  NACK-based Reliable Multicast Applicability
82	   5.  Security Considerations
83	   6.  IANA Considerations
84	   7.  Changes from RFC3941
85	   8.  Acknowledgements
86	   9.  References
87	     9.1.  Normative References
88	     9.2.  Informative References
89	   Authors' Addresses
90	   Intellectual Property and Copyright Statements

92	1.  Introduction

94	   Reliable multicast transport is a desirable technology for efficient
95	   and reliable distribution of data to a group on the Internet.  The
96	   complexities of group communication paradigms necessitate different
97	   protocol types and instantiations to meet the range of performance
98	   and scalability requirements of different potential reliable
99	   multicast applications and users (See [RFC2357]).  This document
100	   addresses the creation of reliable multicast protocols utilizing
101	   negative-acknowledgment (NACK) feedback.  NACK-based protocols
102	   generally entail less frequent feedback messaging than reliability
103	   protocols based on positive acknowledgment (ACK).  The less frequent
104	   feedback messaging helps simplify the problem of feedback implosion
105	   as group size grows large.  While different protocol instantiations
106	   may be required to meet specific application and network architecture
107	   demands [ArchConsiderations], there are a number of fundamental
108	   components that may be common to these different instantiations.
109	   This document describes the framework and common "building block"
110	   components relevant to multicast protocols based primarily on NACK
111	   operation for reliable transport.  While this document discusses a
112	   large set of reliable multicast components and issues relevant to
113	   NACK-based reliable multicast protocol design, it specifically
114	   addresses in detail the following building blocks which are not
115	   addressed in other IETF documents:

117	   1.  NACK-based Multicast sender transmission strategies,

119	   2.  NACK repair process with timer-based feedback suppression, and

121	   3.  Round-trip timing for adapting NACK and other timers.

123	   NACK-based reliable multicast implementations SHOULD make use of
124	   Forward Error Correction (FEC) erasure coding techniques as described
125	   in the FEC Building Block [RFC5052] document.  Packet-level erasure
126	   coding allows missing packets from a given FEC block to be recovered
127	   using the parity packets instead of classical, individualized re-
128	   transmission of original source data content.  For this reason, this
129	   document refers to the protocol mechanisms for reliability as a
130	   "repair process."  Note that NACK-based protocols can reactively
131	   provide the parity packets in response to receiver requests for
132	   repair rather than just proactively sending added FEC parity content
133	   as part of the original transmission.  Hybrid proactive/reactive use
134	   of FEC content is also possible with the mechanisms described in this
135	   document.  Some classes of FEC coding such as Maximum Distance
136	   Separable (MDS) codes allow senders to dynamically implement
137	   deterministic, highly efficient receiver group repair strategies as
138	   part of a NACK-based, selective automated repeat-request (ARQ)
139	   scheme.
140	   The potential relationships to other reliable multicast transport
141	   building blocks (e.g., FEC, congestion control) and general issues
142	   with NACK-based reliable multicast protocols are also discussed.
143	   This document follows the guidelines provided in [RFC3269].
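
   As a rough illustration of parity-based repair, the sketch below uses
   the simplest possible erasure code: a single XOR parity packet per
   block.  This is an assumption made for brevity and is not one of the
   FEC schemes defined for use with [RFC5052]; the function names are
   hypothetical.  It shows how one parity packet can repair different
   losses at different receivers.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(block: list[bytes]) -> bytes:
    """Build one parity packet: the XOR of all source packets in the block."""
    return reduce(xor_bytes, block)


def repair(received: dict[int, bytes], parity: bytes, block_len: int) -> dict[int, bytes]:
    """Recover at most one missing source packet using the parity packet.

    `received` maps packet index -> payload for the packets that arrived.
    """
    missing = [i for i in range(block_len) if i not in received]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one erasure")
    if missing:
        received = dict(received)
        # XOR of the parity with every received packet yields the lost one.
        received[missing[0]] = reduce(xor_bytes, received.values(), parity)
    return received


block = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(block)

# Two receivers lost different packets; the same parity packet repairs both.
receiver_a = repair({1: block[1], 2: block[2]}, parity, len(block))
receiver_b = repair({0: block[0], 1: block[1]}, parity, len(block))
```

   With classical retransmission the sender would have to resend two
   distinct source packets here; with parity-based repair a single
   multicast parity packet satisfies both receivers.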

145	   *Statement of Intent*

147	   This memo contains descriptions of building blocks that can be
148	   applied in the design of Reliable Multicast protocols utilizing
149	   Negative-Acknowledgment (NACK) feedback.  [RFC3941] contained a
150	   previous description of this specification.  RFC3941 was published in
151	   the "Experimental" category.  It was the stated intent of the RMT
152	   working group to re-submit this specification as an IETF Proposed
153	   Standard in due course.

155	   This Proposed Standard specification is thus based on [RFC3941] and
156	   has been updated according to accumulated experience and growing
157	   protocol maturity since the publication of RFC3941.  Said experience
158	   applies both to this specification itself and to congestion control
159	   strategies related to the use of this specification.

161	   The differences between [RFC3941] and this document are listed in
162	   Section 7.

164	2.  Rationale

166	   Each potential protocol instantiation using the building blocks
167	   presented here (and in other applicable building block documents)
168	   will have specific criteria that may influence individual protocol
169	   design.  To support the development of applicable building blocks, it
170	   is useful to identify and summarize driving general protocol design
171	   goals and assumptions.  These are areas that each protocol
172	   instantiation will need to address in detail.  Each building block
173	   description in this document will include a discussion of the impact
174	   of these design criteria.  The categories of design criteria
175	   considered here include:

177	   1.  Delivery Service Model,

179	   2.  Group Membership Dynamics,

181	   3.  Sender/receiver relationships,

183	   4.  Group Size Scalability,

185	   5.  Data Delivery Performance, and
186	   6.  Network Environments.

188	   All of these areas are at least briefly discussed.  Additionally,
189	   other reliable multicast transport building block documents such as
190	   [RFC5052] have been created to address areas outside of the scope of
191	   this document.  NACK-based reliable multicast protocol instantiations
192	   may depend upon these other building blocks as well as the ones
193	   presented here.  This document focuses on areas that are unique to
194	   NACK-based reliable multicast but may be used in concert with the
195	   other building block areas.  In some cases, a building block may be
196	   able to address a wide range of assumptions, while in other cases there
197	   will be trade-offs required to meet different application needs or
198	   operating environments.  Where necessary, building block features are
199	   designed to be parametric to meet different requirements.  Of course,
200	   an underlying goal will be to minimize design complexity and to at
201	   least recommend default values for any such parameters that meet a
202	   general purpose "bulk data transfer" requirement in a typical
203	   Internet environment.  The forms of "bulk data transfer" covered here
204	   include reliable transport of bulky, but fixed-length, a priori
205	   static content and also transmission of non-predetermined, perhaps
206	   streamed content of indefinite length.  Section 3.5 discusses these
207	   different forms of bulk data content in further detail.

209	2.1.  Delivery Service Model

211	   The implicit goal of a reliable multicast transport protocol is the
212	   reliable delivery of data among a group of members communicating
213	   using IP multicast datagram service.  However, the specific service
214	   the application is attempting to provide can impact design decisions.
215	   A most basic service model for reliable multicast transport is that
216	   of "bulk transfer" which is a primary focus of this and other related
217	   RMT working group documents.  However, the same principles in
218	   protocol design may also be applied to other service models, e.g.,
219	   more interactive exchanges of small messages such as with white-
220	   boarding or text chat.  Within these different models there are
221	   issues such as the sender's ability to cache transmitted data (or
222	   state referencing it) for retransmission or repair.  The needs for
223	   ordering and/or causality in the sequence of transmissions and
224	   receptions among members in the group may be different depending upon
225	   data content.  The group communication paradigm differs significantly
226	   from the point-to-point model in that, depending upon the data
227	   content type, some receivers may complete reception of a portion of
228	   data content and be able to act upon it before other members have
229	   received the content.  This may be acceptable (or even desirable) for
230	   some applications but not for others.  These varying requirements
231	   drive the need for a number of different protocol instantiation
232	   designs.  A significant challenge in developing generally useful
233	   building block mechanisms is accommodating even a limited range of
234	   these capabilities without defining specific application-level
235	   details.

237	   Another factor impacting the delivery service model is the potential
238	   for different receivers in the multicast group to have significantly
239	   differing quality of network connectivity.  This may involve
240	   receivers with very limited goodput due to connection rate or
241	   substantial packet loss.  NACK-based protocol implementations may
242	   wish to provide policies by which extremely poor-performing receivers
243	   are excluded from the main group or migrated to a separate delivery
244	   group.  Note that some application models may require that the entire
245	   group be constrained to the performance of the "weakest member" to
246	   satisfy operational requirements.  In either case, protocol designs
247	   should consider this aspect of the reliable multicast delivery
248	   service model.

250	2.2.  Group Membership Dynamics

252	   One area where group communication can differ from point-to-point
253	   communications is that even if the composition of the group changes,
254	   the "thread" of communication can still exist.  This contrasts with
255	   the point-to-point communication model where, if either of the two
256	   parties leaves, the communication process (exchange of data) is
257	   terminated (or at least paused).  Depending upon application goals,
258	   senders and receivers participating in a reliable multicast transport
259	   "session" may be able to join late, leave, and/or potentially rejoin
260	   while the ongoing group communication "thread" still remains
261	   functional and useful.  Also note that this can impact protocol
262	   message content.  If "late joiners" are supported, some amount of
263	   additional information may be placed in message headers to
264	   accommodate this functionality.  Alternatively, the information may
265	   be sent in its own message (on demand or intermittently) if the
266	   impact to the overhead of typical message transmissions is deemed too
267	   great.  Group dynamics can also impact other protocol mechanisms such
268	   as NACK timing, congestion control operation, etc.

270	2.3.  Sender/Receiver Relationships

272	   The relationship of senders and receivers among group members
273	   requires consideration.  In some applications, there may be a single
274	   sender multicasting to a group of receivers.  In other cases, there
275	   may be more than one sender, or every member of the group may
276	   potentially be both a sender *and* a receiver of data.

278	2.4.  Group Size Scalability

280	   Native IP multicast [RFC1112] may scale to extremely large group
281	   sizes.  It may be desirable for some applications to scale along with
282	   the multicast infrastructure's ability to scale.  In its simplest
283	   form, there are limits to the group size to which a NACK-based
284	   protocol can be applied without the potential for the volume of NACK
285	   feedback messages to overwhelm network capacity.  This is often
286	   referred to as "feedback implosion".  Research suggests that NACK-
287	   based reliable multicast group sizes on the order of tens of
288	   thousands of receivers may operate with acceptable levels of feedback
289	   to the sender using probabilistic, timer-based suppression techniques
290	   [NormFeedback].  Instead of receivers immediately transmitting
291	   feedback messages when loss is detected, these techniques specify use
292	   of purposefully-scaled, random back-off timeouts such that some
293	   potential NACKing receivers can self-suppress their feedback upon
294	   hearing messages from other receivers that have selected shorter
295	   random timeout intervals.  However, there may be additional NACK
296	   suppression heuristics that can be applied to enable these protocols
297	   to scale to even larger group sizes.  In large scale cases, it may be
298	   prohibitive for members to maintain state on all other members (in
299	   particular, other receivers) in the group.  The impact of group size
300	   needs to be considered in the development of applicable building
301	   blocks.
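
   The timer-based suppression idea can be sketched with a small
   simulation.  The model below is a simplification under stated
   assumptions: every receiver detects the same loss at the same instant,
   back-off timers are drawn from a uniform distribution, and a NACK
   reaches all other receivers after one fixed delay.  The function and
   parameter names are hypothetical.

```python
import random


def simulate_nack_suppression(num_receivers, max_backoff, hear_delay, rng):
    """Return how many receivers actually send a NACK for one loss event.

    Each receiver draws a uniform random back-off in [0, max_backoff]
    (assumed distribution).  A receiver is suppressed if the earliest
    NACK, which arrives hear_delay seconds after it is sent, reaches it
    before its own timer expires.
    """
    timeouts = [rng.uniform(0.0, max_backoff) for _ in range(num_receivers)]
    earliest = min(timeouts)
    # Receivers whose timers fire before they can hear the first NACK
    # still transmit; everyone else self-suppresses.
    return sum(1 for t in timeouts if t <= earliest + hear_delay)


rng = random.Random(42)
for backoff in (0.1, 1.0, 4.0):
    sent = simulate_nack_suppression(10_000, backoff, hear_delay=0.05, rng=rng)
    print(f"max back-off {backoff:>4} s -> {sent} NACKs sent")
```

   In this model, lengthening the back-off window relative to the
   propagation delay reduces the number of NACKs sent, at the cost of a
   longer wait before repair can begin.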

303	   Intermediate assistance from devices/systems with direct knowledge of
304	   the underlying network topology may be used to increase the
305	   performance and scalability of NACK-based reliable multicast
306	   protocols.  Feedback aggregation and filtering of sender repair data
307	   may be possible with NACK-based protocols using FEC-based repair
308	   strategies as described in this and other reliable multicast
309	   transport building block documents.  However, there will continue to
310	   be a number of instances where intermediate system assistance is not
311	   available or practical.  Thus, building block components for
312	   NACK-based reliable multicast should be capable of operating without
313	   such assistance.

315	2.5.  Data Delivery Performance

317	   There is a trade-off between scalability and data delivery latency
318	   when designing NACK-oriented protocols.  If probabilistic, timer-
319	   based NACK suppression is to be used, there will be some delays built
320	   into the NACK process to allow suppression to occur and for the
321	   sender of data to identify appropriate content for efficient repair
322	   transmission.  For example, back-off timeouts can be used to ensure
323	   efficient NACK suppression and repair transmission, but this comes at
324	   a cost of increased delivery latency and increased buffering
325	   requirements for both senders and receivers.  The building blocks
326	   SHOULD allow applications to establish bounds for data delivery
327	   performance.  Note that application designers must be aware of the
328	   scalability trade-off that is made when such bounds are applied.

330	2.6.  Network Environments

332	   The Internet Protocol has historically assumed a role of providing
333	   service across heterogeneous network topologies.  It is desirable
334	   that a reliable multicast protocol be capable of effectively
335	   operating across a wide range of the networks to which general
336	   purpose IP service applies.  The bandwidth available on the links
337	   between the members of a single group today may vary between low
338	   numbers of kbit/s for wireless links and multiple Gbit/s for high
339	   speed LAN connections, with varying degrees of contention from other
340	   flows.  Recently, a number of asymmetric network services including
341	   56K/ADSL modems, CATV Internet service, satellite and other wireless
342	   communication services have begun to proliferate.  Many of these are
343	   inherently broadcast media with potentially large "fan-out" to which
344	   IP multicast service is highly applicable.  Additionally, policy
345	   and/or technical issues may result in topologies where multicast
346	   connectivity is limited to a Source-Specific Multicast (SSM) model
347	   from a specific source [RFC4607].  Receivers in the group may be
348	   restricted to unicast feedback for NACKs and other messages.
349	   Consideration must be given, in building block development and
350	   protocol design, to the nature of the underlying networks.

352	2.7.  Intermediate System Assistance

354	   While intermediate assistance from devices/systems with direct
355	   knowledge of the underlying network topology may be used to improve
356	   the performance and scalability of reliable multicast protocols,
357	   there will continue to be a number of instances where this is not
358	   available or practical.  Any building block components for
359	   NACK-oriented reliable multicast SHALL be capable of operating
360	   without such assistance.  However, it is RECOMMENDED that such
361	   protocols also consider utilizing these features when available.

363	3.  Functionality

365	   The previous section has presented the role of protocol building
366	   blocks and some of the criteria that may affect NACK-based reliable
367	   multicast building block identification/design.  This section
368	   describes different building block areas applicable to NACK-based
369	   reliable multicast protocols.  Some of these areas are specific to
370	   NACK-based protocols.  Detailed descriptions of such areas are
371	   provided.  In other cases, the areas (e.g., node identifiers, forward
372	   error correction (FEC), etc.) may be applicable to other forms of
373	   reliable multicast.  In those cases, the discussion below describes
374	   requirements placed on those other general building block areas from
375	   the standpoint of NACK-based reliable multicast.  Where applicable,
376	   other building block documents are referenced for possible
377	   contribution to NACK-based reliable multicast protocols.

379	   For each building block, a notional "interface description" is
380	   provided to illustrate any dependencies of one building block
381	   component upon another or upon other protocol parameters.  A building
382	   block component may require some form of "input" from another
383	   building block component or other source to perform its function.
384	   Any "inputs" required by a building block component and/or any
385	   resultant "output" provided will be defined and described in each
386	   building block component's interface description.  Note that the set
387	   of building blocks presented here does not fully satisfy each other's
388	   "input" and "output" needs.  In some cases, "inputs" for the building
389	   blocks here must come from other building blocks external to this
390	   document (e.g., congestion control or FEC).  In other cases NACK-
391	   based reliable multicast building block "inputs" must be satisfied by
392	   the specific protocol instantiation or implementation (e.g.,
393	   application data and control).

395	   The following building block components relevant to NACK-based
396	   reliable multicast are identified:

398	   1.  Multicast Sender Transmission

400	   2.  NACK Repair Process

402	   3.  Multicast Receiver Join Policies

404	   4.  Node (member) Identification

406	   5.  Data Content Identification

408	   6.  Forward Error Correction (FEC)

410	   7.  Round-trip Timing Collection

412	   8.  Group Size Determination/Estimation

414	   9.  Congestion Control Operation

416	   10. Intermediate System Assistance

418	   11. Ancillary Protocol Mechanisms

420	   Figure 1 provides a pictorial overview of these building block areas
421	   and some of their relationships.  For example, the content of the
422	   data messages that a sender initially transmits depends upon the
423	   "Node Identification", "Data Content Identification", and "FEC"
424	   components while the rate of message transmission will generally
425	   depend upon the "Congestion Control" component.  Subsequently, the
426	   receivers' response to these transmissions (e.g., NACKing for repair)
427	   will depend upon the data message content and inputs from other
428	   building block components.  Finally, the sender's processing of
429	   receiver responses will feed back into its transmission strategy.

431	   The components on the left side of this figure are areas that may be
432	   applicable beyond NACK-based reliable multicast.  The most
433	   significant of these components are discussed in other building block
434	   documents such as the FEC Building Block [RFC5052].  A brief
435	   description of these areas and their role in NACK-based reliable
436	   multicast protocols is given below.  The components on the right are
437	   seen as specific to NACK-based reliable multicast protocols, most
438	   notably the NACK repair process.  These areas are discussed in detail
439	   below.  Some other components (e.g., "Security") impact many aspects
440	   of the protocol, and others may be more transparent to the core
441	   protocol processing.  The sections below describe the "Multicast
442	   Sender Transmission", "NACK Repair Process", and "RTT Collection"
443	   building blocks in detail.  The relationships to and among the other
444	   building block areas are also discussed, focusing on issues
445	   applicable to NACK-based reliable multicast protocol design.  Where
446	   applicable, specific technical recommendations are made for
447	   mechanisms that will properly satisfy the goals of NACK-based
448	   reliable multicast transport for the Internet.

450	                                        Application Data and Control
451	                                                    |
452	                                                    V
453	      .---------------------.            .-----------------------.
454	      | Node Identification |----------->|  Sender Transmission  |<----.
455	      `---------------------'       _.-' `-----------------------'     |
456	      .---------------------.   _.-' .'           | .--------------.   |
457	      | Data Identification |--'   .''            | |  Join Policy |   |
458	      `---------------------'    .' '             V `--------------'   |
459	      .---------------------.  .'  '     .----------------------.      |
460	   ,->| Congestion Control  |-'   '      | Receiver NACK        |      |
461	   |  `---------------------'   .'       | Repair Process       |      |
462	   |  .---------------------. .'         | .------------------. |      |
463	   |  |        FEC          |'.          | | NACK Initiation  | |      |
464	   |  `---------------------'` `._       | `------------------' |      |
465	   |  .---------------------. ``. `-._   | .------------------. |      |
466	   `--|    RTT Collection   |._` `    `->| | NACK Content     | |      |
467	      `---------------------'` `` `      | `------------------' |      |
468	      .---------------------.  ` ``-`._  | .------------------. |      |
469	      |    Group Size Est.  |---`-`---`->| | NACK Suppression | |      |
470	      `---------------------'`. `. `.    | `------------------' |      |
471	      .---------------------.  \  | |    `----------------------'      |
472	      |       Other         |   \ . .           | +----------------+   |
473	      `---------------------'    \ \ \          | | Intermediate   |   |
474	                                  \ \ \         | | System Assist  |   |
475	                                   \ \ |        V +----------------+   |
476	                                    `-` >.-------------------------.   |
477	                                         | Sender NACK Processing  |___/
478	                                         | and Repair Response     |
479	                                         `-------------------------'
480	                      ^                         ^
481	                      |                         |
482	                    .-----------------------------.
483	                    |         (Security)          |
484	                    `-----------------------------'

486	     Figure 1: NACK-based Reliable Multicast Building Block Framework

488	3.1.  Multicast Sender Transmission

490	   Reliable multicast senders will transmit data content to the
491	   multicast session.  The data content will be application dependent.
492	   The sender will transmit data content at a rate, and with message
493	   sizes, determined by application and/or network architecture
494	   requirements.  Any FEC encoding of sender transmissions SHOULD
495	   conform with the guidelines of the FEC Building Block [RFC5052].
496	   When congestion control mechanisms are needed (REQUIRED for general
497	   Internet operation), the sender transmission rate SHALL be controlled
498	   by the congestion control mechanism.  In any case, it is RECOMMENDED
499	   that all data transmissions from multicast senders be subject to rate
500	   limitations determined by the application or congestion control
501	   algorithm.  The sender's transmissions SHOULD make good use of the
502	   available capacity (which may be limited by the application
503	   and/or by congestion control).  As a result, it is expected there
504	   will be overlap and multiplexing of new data content transmission
505	   with repair content.  Other factors related to application operation
506	   may determine sender transmission formats and methods.  For example,
507	   some consideration needs to be given to the sender's behavior during
508	   intermittent idle periods when it has no data to transmit.

510	   In addition to data content, other sender messages or commands may be
511	   employed as part of protocol operation.  These messages may occur
512	   outside of the scope of application data transfer.  In NACK-based
513	   reliable multicast protocols, reliability of such protocol messages
514	   may be attempted by redundant transmission when positive
515	   acknowledgement is prohibitive due to group size scalability
516	   concerns.  Note that protocol design SHOULD provide mechanisms for
517	   dealing with cases where such messages are not received by the group.
518	   As an example, a command message might be redundantly transmitted by
519	   a sender to indicate that it is temporarily (or permanently) halting
520	   transmission.  At this time, it may be appropriate for receivers to
521	   respond with NACKs for any outstanding repairs they require following
522	   the rules of the NACK procedure.  For efficiency, the sender should
523	   allow sufficient time between the redundant transmissions to receive
524	   any NACK responses from the receivers to this command.

526	   In general, when control messages issued by a sender may trigger
527	   NACKs or other feedback, the timing of their redundant transmission,
528	   and of other NACK-based reliable multicast protocol timeouts,
529	   should depend upon the group greatest round trip timing (GRTT)
530	   estimate and upon the NACK or other feedback operation that is
531	   expected to result.  The sender GRTT is an estimate of the worst-case
532	   round-trip timing from a given sender to any receivers in the group.
533	   It is assumed that the GRTT interval is a conservative estimate of
534	   the maximum span (with respect to delay) of the multicast group
535	   across a network topology with respect to a given sender.  NACK-based
536	   reliable multicast instantiations SHOULD be able to dynamically adapt
537	   to a wide range of multicast network topologies.

539	   *Inputs:*

541	   1.  Application data and control

543	   2.  Sender node identifier

544	   3.  Data identifiers

546	   4.  Segmentation and FEC parameters

548	   5.  Transmission rate

550	   6.  Application controls

552	   7.  Receiver feedback messages (e.g., NACKs)

554	   *Outputs:*

556	   1.  Controlled transmission of messages with headers uniquely
557	       identifying data or repair content within the context of the
558	       reliable multicast session.

560	   2.  Commands indicating sender's status or other transport control
561	       actions to be taken.

563	3.2.  NACK Repair Process

565	   A critical component of NACK-based reliable multicast protocols is
566	   the NACK repair process.  This includes the receiver's role in
567	   detecting and requesting repair needs, and the sender's response to
568	   such requests.  There are four primary elements of the NACK repair
569	   process:

571	   1.  Receiver NACK process initiation,

573	   2.  NACK suppression,

575	   3.  NACK message content,

577	   4.  Sender NACK processing and response.

579	3.2.1.  Receiver NACK Process Initiation

581	   The NACK process (cycle) will be initiated by receivers that detect a
582	   need for repair transmissions from a specific sender to achieve
583	   reliable reception.  When FEC is applied, a receiver should initiate
584	   the NACK process only when it knows its repair requirements exceed
585	   the amount of pending FEC transmission for a given coding block of
586	   data content.  This can be determined at the end of the current
587	   transmission block (if it is indicated) or upon the start of
588	   reception of a subsequent coding block or transmission object.  This
589	   implies the sender data content is marked to identify its FEC block
590	   number and that ordinal relationship is preserved in order of
591	   transmission.

593	   Alternatively, if the sender's transmission advertises the quantity
594	   of repair packets it is already planning to send for a block, the
595	   receiver may be able to initiate the NACK process earlier.  Allowing
596	   receivers to initiate NACK cycles at any time they detect their
597	   repair needs have exceeded pending repair transmissions may result in
598	   slightly quicker repair cycles.  However, it may be useful to limit
599	   NACK process initiation to specific events such as at the end-of-
600	   transmission of an FEC coding block or upon detection of subsequent
601	   coding blocks.  This can allow receivers to aggregate NACK content
602	   into a smaller number of NACK messages and provide some implicit
603	   loose synchronization among the receiver set to help facilitate
604	   effective probabilistic suppression of NACK feedback.  The receiver
605	   MUST maintain a history of data content received from the sender to
606	   determine its current repair needs.  When FEC is employed, it is
607	   expected that the history will correspond to a record of pending or
608	   partially-received coding blocks.
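
   As an illustration only, the initiation rule described above can be
   sketched in C as follows.  The "BlockState" structure, its field
   names, and both function names are hypothetical and are not defined
   by this building block.  The sketch assumes an ideal FEC code in
   which any received parity symbol fills any one erasure, and a sender
   that advertises how much parity it still plans to send for the block.

```c
#include <stdbool.h>

/* Hypothetical per-coding-block receive state (illustrative names). */
typedef struct {
    unsigned sourceSymbols;   /* source symbols in the coding block        */
    unsigned sourceReceived;  /* distinct source symbols received so far   */
    unsigned parityReceived;  /* distinct parity symbols received so far   */
    unsigned parityPending;   /* parity advertised by sender, not yet sent */
} BlockState;

/* Erasures that the symbols already in hand cannot fill. */
unsigned OutstandingErasures(const BlockState *b)
{
    unsigned missing = b->sourceSymbols - b->sourceReceived;
    return (missing > b->parityReceived) ? (missing - b->parityReceived) : 0;
}

/* True when repair needs exceed the pending FEC transmission. */
bool ShouldInitiateNack(const BlockState *b)
{
    return OutstandingErasures(b) > b->parityPending;
}
```

   A protocol that limits initiation to end-of-block events would
   evaluate such a check only at those events, with "parityPending"
   equal to zero once the block's transmission has ended.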

610	   For probabilistic, timer-based suppression of feedback, the NACK
611	   cycle should begin with receivers observing backoff timeouts.  In
612	   conjunction with initiating this backoff timeout, it is important
613	   that the receivers record the current position in the sender's
614	   transmission sequence at which they initiate the NACK cycle.  When
615	   the suppression backoff timeout expires, the receivers should only
616	   consider their repair needs up to this recorded transmission position
617	   in making the decision to transmit or suppress a NACK.  Without this
618	   restriction, suppression is greatly reduced as additional content is
619	   received from the sender during the time a NACK message propagates
620	   across the network to the sender and other receivers.

622	   *Inputs:*

624	   1.  Sender data content with sequencing identifiers from sender
625	       transmissions.

627	   2.  History of content received from sender.

629	   *Outputs:*

631	   1.  NACK process initiation decision

633	   2.  Recorded sender transmission sequence position.

635	3.2.2.  NACK Suppression

637	   An effective feedback suppression mechanism is the use of random
638	   backoff timeouts prior to NACK transmission by receivers requiring
639	   repairs [SrmFramework].  Upon expiration of the backoff timeout, a
640	   receiver will request repairs unless its pending repair needs have
641	   been completely superseded by NACK messages heard from other
642	   receivers (when receivers are multicasting NACKs) or from some
643	   indicator from the sender.  When receivers are unicasting NACK
644	   messages, the sender may facilitate NACK suppression by forwarding a
645	   representation of NACK content it has received to the group at large
646	   or provide some other indicator of the repair information it will be
647	   subsequently transmitting.

649	   For effective and scalable suppression performance, the backoff
650	   timeout periods used by receivers should be independently, randomly
651	   picked by receivers with a truncated exponential
652	   distribution [McastFeedback].  This results in the majority of the
653	   receiver set holding off transmission of NACK messages under the
654	   assumption that the smaller number of "early NACKers" will supersede
655	   the repair needs of the remainder of the group.  The mean of the
656	   distribution should be determined as a function of the current
657	   estimate of sender's GRTT assessment and a group size estimate that
658	   is determined by other mechanisms within the protocol or preset by
659	   the multicast application.

661	   A simple algorithm can be constructed to generate random backoff
662	   timeouts with the appropriate distribution.  Additionally, the
663	   algorithm may be designed to optimize the backoff distribution given
664	   the number of receivers ("R") potentially generating feedback.  This
665	   "optimization" minimizes the number of feedback messages (e.g., NACK)
666	   in the worst-case situation where all receivers generate a NACK.  The
667	   maximum backoff timeout ("T_maxBackoff") can be set to control
668	   reliable delivery latency versus volume of feedback traffic.  A
669	   larger value of "T_maxBackoff" will result in a lower density of
670	   feedback traffic for a given repair cycle.  A smaller value of
671	   "T_maxBackoff" results in shorter latency which also reduces the
672	   buffering requirements of senders and receivers for reliable
673	   transport.

675	   In the functions below, the "log()" function refers to the natural
676	   logarithm, and the "exp()" function is similarly based upon the
677	   mathematical constant "e" (a.k.a. Euler's number), where "exp(x)"
678	   corresponds to "e" raised to the power of "x".  Given the
679	   receiver group size ("groupSize"), and maximum allowed backoff
680	   timeout ("T_maxBackoff"), random backoff timeouts ("t'") with a
681	   truncated exponential distribution can be picked with the following
682	   algorithm:

684	   1.  Establish an optimal mean ("L") for the exponential backoff based
685	       on the "groupSize":

687	                           L = log(groupSize) + 1

689	   2.  Pick a random number ("x") from a uniform distribution over a
690	       range of:

692	                L                          L                   L
693	        --------------------  to --------------------  +  ----------
694	       T_maxBackoff*(exp(L)-1)  T_maxBackoff*(exp(L)-1)  T_maxBackoff

696	   3.  Transform this random variate to generate the desired random
697	       backoff time ("t'") with the following equation:

699	       t' = T_maxBackoff/L * log(x * (exp(L) - 1) * (T_maxBackoff/L))

701	   This "C" language function can be used to generate an appropriate
702	   random backoff time interval:

704	        double RandomBackoff(double T_maxBackoff, double groupSize)
705	        {
706	            double lambda = log(groupSize) + 1;
707	            double x = UniformRand(lambda/T_maxBackoff) +
708	                       lambda / (T_maxBackoff*(exp(lambda)-1));
709	            return ((T_maxBackoff/lambda) *
710	                    log(x*(exp(lambda)-1)*(T_maxBackoff/lambda)));
711	        }  // end RandomBackoff()

713	   where "UniformRand(double max)" returns random numbers with a uniform
714	   distribution from the range of "0..max".  For example, based on the
715	   POSIX "rand()" function, the following "C" code can be used:

717	           double UniformRand(double max)
718	           {
719	               return (max * ((double)rand()/(double)RAND_MAX));
720	           }

722	   The number of expected NACK messages generated ("N") within the first
723	   round trip time for a single feedback event is approximately:

725	                  N = exp(1.2 * L / (2*T_maxBackoff/GRTT))

727	   Thus the maximum backoff time can be adjusted to trade off worst-case
728	   NACK feedback volume versus latency.  This is derived from the
729	   equations given in [McastFeedback] and assumes "T_maxBackoff >=
730	   GRTT", and "L" is the mean of the distribution optimized for the
731	   given group size as shown in the algorithm above.  Note that other
732	   mechanisms within the protocol may work to reduce redundant NACK
733	   generation further.  It is suggested that "T_maxBackoff" be selected
734	   as an integer multiple of the sender's current advertised GRTT
735	   estimate such that:

736	                   T_maxBackoff = K * GRTT; where K >= 1

738	   For general Internet operation, a default value of "K=4" is
739	   RECOMMENDED for operation with multicast (to the group at large) NACK
740	   delivery and a value of "K=6" for unicast NACK delivery.  Alternate
741	   values may be used to achieve desired buffer utilization, reliable
742	   delivery latency and group size scalability trade-offs.

744	   Given that "K*GRTT" is the maximum backoff time used by the
745	   receivers to initiate NACK transmission, other timeout periods
746	   related to the NACK repair process can be scaled accordingly.  One of
747	   those timeouts is the amount of time a receiver should wait after
748	   generating a NACK message before allowing itself to initiate another
749	   NACK backoff/transmission cycle ("T_rcvrHoldoff").  This delay should
750	   be sufficient for the sender to respond to the received NACK with
751	   repair messages.  An appropriate value depends upon the amount of
752	   time for the NACK to reach the sender and the sender to provide a
753	   repair response.  This MUST include the duration of any sender
754	   NACK aggregation period, during which multiple NACKs may be
755	   accumulated to determine an efficient repair response.  These
756	   timeouts are further discussed in the section below on "Sender NACK
757	   Processing and Repair Response".

759	   There are also secondary measures that can be applied to improve the
760	   performance of feedback suppression.  For example, the sender's data
761	   content transmissions can follow an ordinal sequence of transmission.
762	   When repairs for data content occur, the receiver can note that the
763	   sender has "rewound" its data content transmission position by
764	   observing the data object, FEC block number, and FEC symbol
765	   identifiers.  Receivers SHOULD transmit NACKs only when the
766	   sender's current transmission position exceeds the point up to
767	   which the receiver has incomplete reception.  This reduces premature
768	   requests for repair of data the sender may be planning to provide in
769	   response to other receiver requests.  This mechanism can be very
770	   effective for protocol convergence in high loss conditions when
771	   transmissions of NACKs from other receivers (or indicators from the
772	   sender) are lost.  Another mechanism (particularly applicable when
773	   FEC is used) is for the sender to embed an indication of impending
774	   repair transmissions in current packets sent.  For example, the
775	   indication may be as simple as an advertisement of the number of FEC
776	   packets to be sent for the current applicable coding block.

778	   Finally, some consideration might be given to using the NACKing
779	   history of receivers to weight their selection of NACK backoff
780	   timeout intervals.  For example, if a receiver has historically been
781	   experiencing the greatest degree of loss, it may promote itself to
782	   statistically NACK sooner than other receivers.  Note that this
783	   requires correlation over successive intervals of time in the loss
784	   experienced by a receiver.  Such correlation may not always be
785	   present in multicast networks.  This adjustment of backoff timeout
786	   selection may require the creation of an "early NACK" slot for these
787	   historical NACKers.  This additional slot in the NACK backoff window
788	   will result in a longer repair cycle process that may not be
789	   desirable for some applications.  The resolution of these trade-offs
790	   may be dependent upon the protocol's target application set or
791	   network.

793	   After the random backoff timeout has expired, the receiver will
794	   decide whether to generate a NACK repair request or whether the
795	   request has been suppressed.  The NACK will be suppressed when
796	   either of the following conditions has occurred:

798	   1.  The accumulated state of NACKs heard from other receivers (or
799	       forwarding of this state by the sender) is equal to or supersedes
800	       the repair needs of the local receiver.  Note that the local
801	       receiver should consider its repair needs only up to the sender
802	       transmission position recorded at the NACK cycle initiation (when
803	       the backoff timer was activated).

805	   2.  The sender's data content transmission position "rewinds" to a
806	       point ordinally less than that of the lowest sequence position of
807	       the local receiver's repair needs.  (This detection of sender
808	       "rewind" indicates the sender has already responded to other
809	       receiver repair needs of which the local receiver may not have
810	       been aware).  This "rewind" event can occur any time between 1)
811	       when the NACK cycle was initiated with the backoff timeout
812	       activation and 2) the current moment when the backoff timeout has
813	       expired to suppress the NACK.  Another NACK cycle must be
814	       initiated by the receiver when the sender's transmission sequence
815	       position exceeds the receiver's lowest ordinal repair point.
816	       Note it is possible that the local receiver may have had its
817	       repair needs satisfied as a result of the sender's response to
818	       the repair needs of other receivers and no further NACKing is
819	       required.

821	   If these conditions have not occurred and the receiver still has
822	   pending repair needs, a NACK message is generated and transmitted.
823	   The NACK should consist of an accumulation of repair needs from the
824	   receiver's lowest ordinal repair point up to the current sender
825	   transmission sequence position.  A single NACK message should be
826	   generated and the NACK message content should be truncated if it
827	   exceeds the payload size of a single protocol message.  When such
828	   NACK payload limits apply, the NACK content SHOULD contain requests
829	   for the ordinally lowest repair content needed from the sender.
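
   The decision made at backoff expiry can be sketched in C as shown
   below.  This is a simplified illustration under stated assumptions,
   not a normative procedure: sequence positions are modeled as plain
   increasing integers, repair needs within a window of up to 64
   positions are held in a bit mask, and the "NackCycle" structure and
   all of its field names are hypothetical.  The sketch covers only the
   transmit-or-suppress decision and not the construction of the NACK
   content.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state for one NACK cycle.  Bit i of "needs"
 * and "heard" refers to position (lowestRepairPoint + i). */
typedef struct {
    uint32_t recordedPosition;      /* sender position at backoff start  */
    uint32_t lowestRepairPoint;     /* lowest position needing repair    */
    uint32_t minSenderPositionSeen; /* lowest sender position observed
                                       since the backoff timer started   */
    uint64_t needs;                 /* positions this receiver is missing */
    uint64_t heard;                 /* positions NACKed by others, or
                                       indicated as pending by the sender */
} NackCycle;

/* Mask selecting positions lowest..recorded (at most 64 of them). */
static uint64_t WindowMask(uint32_t lowest, uint32_t recorded)
{
    if (recorded < lowest) return 0;
    uint64_t count = (uint64_t)recorded - lowest + 1;
    return (count >= 64) ? ~UINT64_C(0) : ((UINT64_C(1) << count) - 1);
}

bool ShouldSendNack(const NackCycle *c)
{
    /* Condition 2: the sender "rewound" below our lowest repair point. */
    if (c->minSenderPositionSeen < c->lowestRepairPoint) return false;

    /* Only needs up to the recorded position are considered. */
    uint64_t pending =
        c->needs & WindowMask(c->lowestRepairPoint, c->recordedPosition);

    /* Condition 1: feedback already heard equals or supersedes our
     * needs (this also covers the case of no pending needs). */
    return (pending & ~c->heard) != 0;
}
```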

831	   *Inputs:*

832	   1.  NACK process initiation decision.

834	   2.  Recorded sender transmission sequence position.

836	   3.  Sender GRTT.

838	   4.  Sender group size estimate.

840	   5.  Application-defined bound on backoff timeout period.

842	   6.  NACKs from other receivers.

844	   7.  Pending repair indication from sender (may be forwarded NACKs).

846	   8.  Current sender transmission sequence position.

848	   *Outputs:*

850	   1.  Yes/no decision to generate NACK message upon backoff timer
851	       expiration.

853	3.2.3.  NACK Content

855	   The content of NACK messages generated by reliable multicast
856	   receivers will include information detailing their current repair
857	   needs.  The specific information depends on the use and type of FEC
858	   in the NACK repair process.  The identification of repair needs is
859	   dependent upon the data content identification (see Section 3.5
860	   below).  At the highest level, the NACK content will identify the
861	   sender to which the NACK is addressed and the data transport object
862	   (or stream) within the sender's transmission that needs repair.  For
863	   the indicated transport entity, the NACK content will then identify
864	   the specific FEC coding blocks and/or symbols it requires to
865	   reconstruct the complete transmitted data.  This content may consist
866	   of FEC block erasure counts and/or explicit indication of missing
867	   blocks or symbols (segments) of data and FEC content.  It should also
868	   be noted that NACK-based reliable multicast can be effectively
869	   instantiated without a requirement for reliable NACK delivery using
870	   the techniques discussed here.

872	3.2.3.1.  NACK and FEC Repair Strategies

874	   Where FEC-based repair is used, the NACK message content will
875	   minimally need to identify the coding block(s) for which repair is
876	   needed and a count of erasures (missing packets) for the coding
877	   block.  An exact count of erasures implies the FEC algorithm is
878	   capable of repairing _any_ loss combination within the coding block.
879	   This count may need to be adjusted for some FEC algorithms.

881	   Considering that multiple repair rounds may be required to
882	   successfully complete repair, an erasure count also implies that the
883	   quantity of unique FEC parity packets the sender has available to
884	   transmit is essentially unlimited (i.e., the sender will always be
885	   able to provide new, unique, previously unsent parity packets in
886	   response to any subsequent repair requests for the same coding
887	   block).  Alternatively, the sender may "round-robin" transmit through
888	   its available set of FEC symbols for a given coding block, and
889	   eventually effect repair.  For the most efficient repair strategy,
890	   the NACK content will also need to _explicitly_ identify which
891	   symbols (information and/or parity) the receiver requires to
892	   successfully reconstruct the content of the coding block.  This will
893	   be particularly true of small to medium size block FEC codes (e.g.,
894	   Reed-Solomon) that are capable of providing only a limited number of
895	   parity symbols per FEC coding block.

897	   When FEC is not used as part of the repair process, or the protocol
898	   instantiation is required to provide reliability even when the sender
899	   has transmitted all available parity for a given coding block (or the
900	   sender's ability to buffer transmission history is exceeded by the
901	   "(delay*bandwidth*loss)" characteristics of the network topology),
902	   the NACK content will need to contain _explicit_ coding block and/or
903	   segment loss information so that the sender can provide appropriate
904	   repair packets and/or data retransmissions.  Explicit loss
905	   information in NACK content may also potentially serve other
906	   purposes.  For example, it may be useful for decorrelating loss
907	   characteristics among a group of receivers to help differentiate
908	   candidate congestion control bottlenecks among the receiver set.

910	   When FEC is used and NACK content is designed to contain explicit
911	   repair requests, there is a strategy where the receivers can NACK for
912	   specific content that will help facilitate NACK suppression and
913	   repair efficiency.  One assumption for this strategy is that the
914	   sender may exhaust its supply of new, unique parity packets
915	   available for a given coding block and be required to explicitly
916	   retransmit some data or parity symbols to complete reliable transfer.
917	   Another assumption is that an FEC algorithm where any parity packet
918	   can fill any erasure within the coding block (e.g., Reed-Solomon) is
919	   used.  The goal of this strategy is to make maximum use of the
920	   available parity and provide the minimal amount of data and repair
921	   transmissions during reliable transfer of data content to the group.

923	   When systematic FEC codes are used, the sender transmits the data
924	   content of the coding block (and optionally some quantity of parity
925	   packets) in its initial transmission.  Note that a systematic FEC
926	   coding block is considered to be logically made up of the contiguous
927	   set of source data vectors plus parity vectors for the given FEC
928	   algorithm used.  For example, a systematic coding scheme that
929	   provides for 64 data symbols and 32 parity symbols per coding block
930	   would contain FEC symbol identifiers in the range of 0 to 95.

932	   Receivers then can construct NACK messages requesting sufficient
933	   content to satisfy their repair needs.  For example, if the receiver
934	   has three erasures in a given received coding block, it will request
935	   transmission of the three lowest ordinal parity vectors in the coding
936	   block.  In our example coding scheme from the previous paragraph, the
937	   receiver would explicitly request parity symbols 64 to 66 to fill its
938	   three erasures for the coding block.  Note that if the receiver's
939	   loss for the coding block exceeds the available parity quantity
940	   (i.e., greater than 32 missing symbols in our example), the receiver
941	   will be required to construct a NACK requesting all (32) of the
942	   available parity symbols plus some additional portions of its missing
943	   data symbols in order to reconstruct the block.  If this is done
944	   consistently across the receiver group, the resulting NACKs will
945	   request a minimal set of sender transmissions to satisfy their
946	   repair needs.

948	   In summary, the rule is to request the lower ordinal portion of the
949	   parity content for the FEC coding block to satisfy the erasure repair
950	   needs on the first NACK cycle.  If the available number of parity
951	   symbols is insufficient, the receiver will also request the subset of
952	   ordinally highest missing data symbols to cover what the parity
953	   symbols will not fill.  Note this strategy assumes FEC codes such as
954	   Reed-Solomon for which a single parity symbol can repair any erased
955	   symbol.  This strategy would need minor modification to take into
956	   account the possibly limited repair capability of other FEC types.
957	   On subsequent NACK repair cycles where the receiver may have received
958	   some portion of its previously requested repair content, the receiver
959	   will use the same strategy, but only NACK for the set of parity
960	   and/or data symbols it has not yet received.  Optionally, the
961	   receivers could also provide a count of erasures as a convenience to
962	   the sender.
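
   The first-cycle rule summarized above can be sketched in C as
   follows.  The function name "BuildRepairRequest()" and its parameters
   are hypothetical.  The sketch assumes a systematic code in which any
   parity symbol can repair any erasure, that no parity for the block
   has yet been received, that "missingData" is sorted in ascending
   order, and that "request" has room for "numMissing" entries.

```c
#include <stddef.h>

/* Fills "request" with the FEC symbol identifiers to NACK for one
 * coding block and returns how many identifiers were written.
 * Data symbols are numbered 0..numData-1 and parity symbols
 * numData..numData+numParity-1. */
size_t BuildRepairRequest(const unsigned *missingData, size_t numMissing,
                          unsigned numData, unsigned numParity,
                          unsigned *request)
{
    size_t n = 0;
    size_t parityWanted = (numMissing < numParity) ? numMissing : numParity;

    /* Request the lowest ordinal parity symbols first. */
    for (size_t i = 0; i < parityWanted; i++)
        request[n++] = numData + (unsigned)i;

    /* Cover any shortfall with the ordinally highest missing data. */
    size_t shortfall = numMissing - parityWanted;
    for (size_t i = 0; i < shortfall; i++)
        request[n++] = missingData[numMissing - 1 - i];

    return n;
}
```

   With the example scheme of 64 data symbols and 32 parity symbols, a
   receiver with three erasures would obtain a request for symbols 64,
   65, and 66 from this sketch.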

964	   Other types of FEC schemes may require alteration to the NACK and
965	   repair strategy described here.  For example, some of the large block
966	   or expandable FEC codes described in [RFC3453] may be less
967	   deterministic with respect to defining optimal repair requests by
968	   receivers or repair transmission strategies by senders.  For these
969	   types of codes, it may be sufficient for receivers to NACK with an
970	   estimate of the quantity of additional FEC symbols required to
971	   complete reliable reception and for the sender to respond
972	   accordingly.  This apparent disadvantage as compared to codes such as
973	   Reed-Solomon may be offset by reduced computational requirements
974	   and/or ability to support large coding blocks for increased repair
975	   efficiency that these codes can offer.

977	   After receipt and accumulation of NACK messages during the
978	   aggregation period, the sender can begin transmission of fresh
979	   (previously untransmitted) parity symbols for the coding block based
980	   on the highest receiver erasure count _if_ it has a sufficient
981	   quantity of parity symbols that were _not_ previously transmitted.
982	   Otherwise, the sender MUST resort to transmitting the explicit set of
983	   repair vectors requested.  With this approach, the sender needs to
984	   maintain very little state on requests it has received from the
985	   group, and repair requests from the group need not be synchronized.
986	   Since all receivers use the same consistent algorithm to express
987	   their explicit repair needs, NACK suppression among receivers is
988	   simplified over the course of multiple repair cycles.  The receivers
989	   can simply compare NACKs heard from other receivers against their own
990	   calculated repair needs to determine whether they should transmit or
991	   suppress their pending NACK messages.
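
   The sender's choice between these two responses might be sketched in
   C as follows.  The names "RepairMode" and "ChooseRepairMode()" are
   hypothetical, and the sketch assumes that the sender tracks, for each
   coding block, how many of its parity symbols have already been
   transmitted.

```c
/* Hypothetical repair modes for one coding block. */
typedef enum {
    REPAIR_FRESH_PARITY,  /* send previously untransmitted parity       */
    REPAIR_EXPLICIT       /* send the explicit repair vectors requested */
} RepairMode;

/* maxErasureCount: highest erasure count among the aggregated NACKs.
 * paritySent / parityTotal: parity symbols already transmitted and
 * parity symbols available for the block. */
RepairMode ChooseRepairMode(unsigned maxErasureCount,
                            unsigned paritySent, unsigned parityTotal)
{
    unsigned fresh = (parityTotal > paritySent) ? (parityTotal - paritySent)
                                                : 0;
    return (fresh >= maxErasureCount) ? REPAIR_FRESH_PARITY
                                      : REPAIR_EXPLICIT;
}
```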

993	3.2.3.2.  NACK Content Format

995	   The format of NACK content will depend on the protocol's data service
996	   model and the format of data content identification the protocol
997	   uses.  This NACK format also depends upon the type of FEC encoding
998	   (if any) used.  Figure 2 illustrates a logical, hierarchical
999	   transmission content identification scheme, denoting that the notion
1000	   of objects (or streams) and/or FEC blocking is optional at the
1001	   protocol instantiation's discretion.  Note that the identification of
1002	   objects is with respect to a given sender.  It is recommended that
1003	   transport data content identification be done within the context of a
1004	   sender in a given session.  Since the notion of session "streams" and
1005	   "blocks" is optional, the framework degenerates to that of typical
1006	   transport data segmentation and reassembly in its simplest form.

1008	       Session_
1009	               \_
1010	                 Sender_
1011	                        \_
1012	                          [Object/Stream(s)]_
1013	                                             \_
1014	                                               [FEC Blocks]_
1015	                                                            \_
1016	                                                              Symbols

1018	    Figure 2: Reliable Multicast Data Content Identification Hierarchy

1020	   The format of NACK messages should enable the following:

1022	   1.  Identification of transport data units required to repair the
1023	       received content, whether this is an entire missing object/stream
1024	       (or range), entire FEC coding block(s), or sets of symbols,

1026	   2.  Simple processing for NACK aggregation and suppression,

1028	   3.  Inclusion of NACKs for multiple objects, FEC coding blocks and/or
1029	       symbols in a single message, and

1031	   4.  A reasonably compact format.

1033	   If the reliable multicast transport object/stream is identified with
1034	   an _<objectId>_ and the FEC symbol being transmitted is identified
1035	   with an _<fecPayloadId>_, the concatenation of _<objectId::
1036	   fecPayloadId>_ comprises a basic transport protocol data unit (TPDU)
1037	   identifier for symbols from a given source.  NACK content can be
1038	   composed of lists and/or ranges of these TPDU identifiers to build up
1039	   NACK messages to describe the receiver's repair needs.  If no
1040	   hierarchical object delineation or FEC blocking is used, the TPDU is
1041	   a simple linear representation of the data symbols transmitted by the
1042	   sender.  When the TPDU represents a hierarchy for purposes of object/
1043	   stream delineation and/or FEC blocking, the NACK content unit may
1044	   require flags to indicate which portion of the TPDU is applicable.
1045	   For example, if an entire "object" (or range of objects) is missing
1046	   in the received data, the receiver will not necessarily know the
1047	   appropriate range of _<sourceBlockNumbers>_ or _<encodingSymbolIds>_
1048	   for which to request repair and thus requires some mechanism to
1049	   request repair (or retransmission) of the entire unit represented by
1050	   an _<objectId>_.  The same is true if entire FEC coding blocks
1051	   represented by one or a range of _<sourceBlockNumbers>_ have been
1052	   lost.
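
   One possible in-memory representation of such a NACK content unit is
   sketched in C below.  It is not a wire format, and the "NackLevel"
   and "NackItem" types and the "NackItemCovers()" function are
   hypothetical.  The "level" field plays the role of the flags
   mentioned above by indicating which portion of the TPDU identifier is
   applicable.  The comparison function illustrates how a receiver might
   test whether a request it has heard covers one of its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which portion of the identifier applies (stands in for "flags"). */
typedef enum { NACK_OBJECT, NACK_BLOCK, NACK_SYMBOL } NackLevel;

typedef struct {
    NackLevel level;
    uint32_t  objectId;
    uint32_t  sourceBlockNumber;  /* ignored when level is NACK_OBJECT */
    uint32_t  encodingSymbolId;   /* used only when level is NACK_SYMBOL */
} NackItem;

/* True when repair of "a" would also repair "b". */
bool NackItemCovers(const NackItem *a, const NackItem *b)
{
    if (a->objectId != b->objectId) return false;
    if (a->level == NACK_OBJECT) return true;   /* whole object requested */
    if (b->level == NACK_OBJECT) return false;
    if (a->sourceBlockNumber != b->sourceBlockNumber) return false;
    if (a->level == NACK_BLOCK) return true;    /* whole block requested  */
    if (b->level == NACK_BLOCK) return false;
    return a->encodingSymbolId == b->encodingSymbolId;
}
```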

1054	   *Inputs:*

1056	   1.  Sender identification.

1058	   2.  Sender data identification.

1060	   3.  Sender FEC Object Transmission Information.

1062	   4.  Recorded sender transmission sequence position.

1064	   5.  Current sender transmission sequence position.

1065	   6.  History of repair needs for this sender.

1067	   *Outputs:*

1069	   1.  NACK message with repair requests.

1071	3.2.4.  Sender Repair Response

1073	   Upon reception of a repair request from a receiver in the group, the
1074	   sender will initiate a repair response procedure.  The sender may
1075	   wish to delay transmission of repair content until it has had
1076	   sufficient time to accumulate potentially multiple NACKs from the
1077	   receiver set.  This allows the sender to determine the most efficient
1078	   repair strategy for a given transport stream/object or FEC coding
1079	   block.  Depending upon the approach used, some protocols may find it
1080	   beneficial for the sender to provide an indicator of pending repair
1081	   transmissions as part of its current transmitted message content.
1082	   This can aid some NACK suppression mechanisms.  The amount of time to
1083	   perform this NACK aggregation should be sufficient to allow for the
1084	   maximum receiver NACK backoff window ("T_maxBackoff" from Section
1085	   3.2.2) and propagation of NACK messages from the receivers to the
1086	   sender.  Note the maximum transmission delay of a message from a
1087	   receiver to the sender may be approximately "(1*GRTT)" in the case of
1088	   very asymmetric network topology with respect to transmission delay.
1089	   Thus, if the maximum receiver NACK backoff time is "T_maxBackoff =
1090	   K*GRTT", the sender NACK aggregation period should be equal to at
1091	   least:

1093	            T_sndrAggregate = T_maxBackoff + 1*GRTT = (K+1)*GRTT

1095	   Immediately after the sender NACK aggregation period, the sender will
1096	   begin transmitting repair content determined from the aggregate NACK
1097	   state and continue with any new transmission.  Also, at this time,
1098	   the sender should observe a "hold-off" period where it constrains
1099	   itself from initiating a new NACK aggregation period to allow
1100	   propagation of the new transmission sequence position due to the
1101	   repair response to the receiver group.  To allow for worst case
1102	   asymmetry, this "hold-off" time should be:

1104	                           T_sndrHoldoff = 1*GRTT

1106	   Recall that the receivers will also employ a "hold-off" timeout after
1107	   generating a NACK message to allow time for the sender's response.
1108	   Given a sender "<T_sndrAggregate>" plus "<T_sndrHoldoff>" time of
1109	   "(K+2)*GRTT", the receivers should use hold-off timeouts of:

1111	        T_rcvrHoldoff = T_sndrAggregate + T_sndrHoldoff = (K+2)*GRTT

1113	   This allows for a worst-case propagation time of the receiver's NACK
1114	   to the sender, the sender's aggregation time and propagation of the
1115	   sender's response back to the receiver.  Additionally, in the case of
1116	   unicast feedback from the receiver set, it may be useful for the
1117	   sender to forward (via multicast) a representation of its aggregated
1118	   NACK content to the group to allow for NACK suppression when there is
1119	   not multicast connectivity among the receiver set.
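
	   The three timeouts above can be computed directly from "K" and
	   the current GRTT estimate.  The following C sketch is offered as
	   an example only; the function names are assumptions and times are
	   taken to be in seconds.

```c
#include <assert.h>

/* Sender NACK aggregation period: T_sndrAggregate = (K+1)*GRTT */
double sndr_aggregate_time(double k, double grtt)
{
    assert(k >= 0.0 && grtt >= 0.0);
    return (k + 1.0) * grtt;
}

/* Sender hold-off period: T_sndrHoldoff = 1*GRTT */
double sndr_holdoff_time(double grtt)
{
    assert(grtt >= 0.0);
    return grtt;
}

/* Receiver hold-off: T_rcvrHoldoff = T_sndrAggregate + T_sndrHoldoff */
double rcvr_holdoff_time(double k, double grtt)
{
    return sndr_aggregate_time(k, grtt) + sndr_holdoff_time(grtt);
}
```

	   For example, with K = 4 and GRTT = 0.5 seconds, the sender
	   aggregates NACKs for 2.5 seconds, holds off for 0.5 seconds, and
	   receivers hold off for 3.0 seconds.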

1121	   At the expiration of the "<T_sndrAggregate>" timeout, the sender will
1122	   begin transmitting repair messages according to the accumulated
1123	   content of NACKs received.  There are some guidelines with regard to
1124	   FEC-based repair and the ordering of the repair response from the
1125	   sender that can improve reliable multicast efficiency:

1127	   When FEC is used, it is beneficial that the sender transmit
1128	   previously untransmitted parity content as repair messages whenever
1129	   possible.  This maximizes the receiving nodes' ability to reconstruct
1130	   the entire transmitted content from their individual subsets of
1131	   received messages.

1133	   The transmitted object and/or stream data and repair content should
1134	   be indexed with monotonically increasing sequence numbers (within a
1135	   reasonably large ordinal space).  If the sender observes the
1136	   discipline of transmitting repair for the earliest content (e.g.,
1137	   ordinally lowest FEC blocks) first, the receivers can use a strategy
1138	   of withholding repair requests for later content until the sender
1139	   once again returns to that point in the object/stream transmission
1140	   sequence.  This can increase overall message efficiency among the
1141	   group and help work to keep repair cycles relatively synchronized
1142	   without dependence upon strict time synchronization among the sender
1143	   and receivers.  This also helps minimize the buffering requirements
1144	   of receivers and senders and reduces redundant transmission of data
1145	   to the group at large.
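
	   One possible reading of this ordering discipline is sketched
	   below in C.  The sender sorts its pending repairs so that the
	   ordinally lowest block is sent first, and a receiver only
	   requests repair for content at or before the sender's current
	   position.  The function names are hypothetical, and the sketch
	   assumes 64-bit sequence numbers that do not wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort() comparator for ascending uint64_t values. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Sender side: put the ordinally lowest pending FEC block first. */
void order_repairs(uint64_t *blocks, size_t count)
{
    assert(blocks != NULL || count == 0);
    if (count > 1)
        qsort(blocks, count, sizeof blocks[0], cmp_u64);
}

/* Receiver side: withhold requests for content beyond the sender's
 * current transmission sequence position. */
int may_request_repair(uint64_t missing_block, uint64_t sender_position)
{
    return missing_block <= sender_position;
}
```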

1147	   *Inputs:*

1149	   1.  Receiver NACK messages

1151	   2.  Group timing information

1153	   *Outputs:*

1155	   1.  Repair messages (FEC and/or Data content retransmission)

1157	   2.  Advertisement of current pending repair transmissions when
1158	       unicast receiver feedback is detected.

1160	3.3.  Multicast Receiver Join Policies and Procedures

1162	   Consideration should be given to the policies and procedures by which
1163	   new receivers join a group (perhaps where reliable transmission is
1164	   already in progress) and begin requesting repair.  If receiver joins
1165	   are unconstrained, the dynamics of group membership may impede the
1166	   application's ability to meet its goals for forward progression of
1167	   data transmission.  Policies limiting the opportunities when
1168	   receivers begin participating in the NACK process may be used to
1169	   achieve the desired behavior.  For example, it may be beneficial for
1170	   receivers to attempt reliable reception from a newly-heard sender
1171	   only upon non-repair transmissions of data in the first FEC block of
1172	   an object or logical portion of a stream.  The sender may also
1173	   implement policies limiting the receivers from which it will accept
1174	   NACK requests, but this may be prohibitive for scalability reasons in
1175	   some situations.  Alternatively, in some types of bulk transfer
1176	   applications (or for potential interactive reliable multicast
1177	   applications), it may be desirable to have a looser transport
1178	   synchronization policy and rely upon session management mechanisms
1179	   to limit group dynamics that can cause poor performance.

1181	   *Inputs:*

1183	   1.  Current object/stream data/repair content and sequencing
1184	       identifiers from sender transmissions.

1186	   *Outputs:*

1188	   1.  Receiver yes/no decision to begin receiving and NACKing for
1189	       reliable reception of data
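
	   The example policy mentioned above (begin reliable reception only
	   upon a non-repair transmission in the first FEC block) could be
	   expressed as the following C sketch.  The structure and function
	   names are assumptions, as is the convention that the first block
	   of an object is numbered zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a received data message. */
struct data_msg_info {
    int      is_repair;            /* nonzero if sent as repair content */
    uint32_t source_block_number;  /* FEC block index within the object */
};

/* Returns 1 if a receiver should begin reliable reception (and
 * NACKing) from a newly-heard sender upon this message, else 0. */
int should_begin_reception(const struct data_msg_info *msg)
{
    assert(msg != NULL);
    return !msg->is_repair && msg->source_block_number == 0;
}
```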

1191	3.4.  Reliable Multicast Member Identification

1193	   In a NACK-based reliable multicast protocol (or other multicast
1194	   protocols) where there is the potential for multiple sources of data,
1195	   it is necessary to provide some mechanism to uniquely identify the
1196	   sources (and possibly some or all receivers in some cases) within the
1197	   group.  Receivers that send NACK messages to the group will need to
1198	   identify the sender for which the NACK is intended.  Identity based on
1199	   arriving packet source addresses is insufficient for several reasons.
1200	   These reasons include routing changes for hosts with multiple
1201	   interfaces that result in different packet source addresses for a
1202	   given host over time, network address translation (NAT) or firewall
1203	   devices, or other transport/network bridging approaches.  As a
1204	   result, some type of unique source identifier _<sourceId>_ field
1205	   SHOULD be present in packets transmitted by reliable multicast
1206	   session members.

1208	3.5.  Data Content Identification

1210	   The data and repair content transmitted by a NACK-based reliable
1211	   multicast sender requires some form of identification in the protocol
1212	   header fields.  This identification is required to facilitate the
1213	   reliable NACK-oriented repair process.  These identifiers will also
1214	   be used in NACK messages generated.  This building block document
1215	   assumes two very general types of data that may comprise bulk
1216	   transfer session content.  One type is static, discrete objects of
1217	   finite size and the other is continuous non-finite streams.  A given
1218	   application may wish to reliably multicast data content using either
1219	   one or both of these paradigms.  While it may be possible for some
1220	   applications to further generalize this model and provide mechanisms
1221	   to encapsulate static objects as content embedded within a stream,
1222	   there are advantages in many applications to provide distinct support
1223	   for static bulk objects and messages within the context of a reliable
1224	   multicast session.  These applications may include content caching
1225	   servers, file transfer, or collaborative tools with bulk content.
1226	   Applications with requirements for these static object types can then
1227	   take advantage of transport layer mechanisms (i.e., segmentation/
1228	   reassembly, caching, integrated forward error correction coding,
1229	   etc.) rather than being required to provide their own mechanisms for
1230	   these functions at the application layer.

1232	   As noted, some applications may alternatively desire to transmit bulk
1233	   content in the form of one or more streams of non-finite size.
1234	   Example streams include continuous quasi-real-time message broadcasts
1235	   (e.g., stock ticker) or some content types that are part of
1236	   collaborative tools or other applications.  And, as indicated above,
1237	   some applications may wish to encapsulate other bulk content (e.g.,
1238	   files) into one or more streams within a multicast session.

1240	   The components described within this building block document are
1241	   envisioned to be applicable to both of these models with the
1242	   potential for a mix of both types within a single multicast session.
1243	   To support this requirement, the normal data content identification
1244	   should include a field to uniquely identify the object or stream
1245	   (e.g., _<objectId>_) within some reasonable temporal or ordinal
1246	   interval.  Note that it is _not_ expected that this data content
1247	   identification will be globally unique.  It is expected that the
1248	   object/stream identifier will be unique with respect to a given
1249	   sender within the reliable multicast session and during the time that
1250	   sender is supporting a specific transport instance of that object or
1251	   stream.

1253	   Since "bulk" object/stream content usually requires segmentation,
1254	   some form of segment identification must also be provided.  This
1255	   segment identifier will be relative to any object or stream
1256	   identifier that has been provided.  Thus, in some cases, NACK-based
1257	   reliable multicast protocol instantiations may be able to receive
1258	   transmissions and request repair for multiple streams and one or more
1259	   sets of static objects in parallel.  For protocol instantiations
1260	   employing FEC, the segment identification portion of the data content
1261	   identifier may consist of a logical concatenation of a coding block
1262	   identifier _<sourceBlockNumber>_ and an identifier for the specific
1263	   data or parity symbol _<encodingSymbolId>_ of the code block.  The
1264	   FEC Basic Schemes building block
1265	   [I-D.ietf-rmt-bb-fec-basic-schemes-revised] and descriptions of
1266	   additional FEC schemes that may be documented later provide a
1267	   standard message format for identifying FEC transmission content.
1268	   NACK-based reliable multicast protocol instantiations using FEC
1269	   SHOULD follow such guidelines.

1271	   Additionally, flags to determine the usage of the content identifier
1272	   fields (e.g., stream vs. object) may be applicable.  Flags may also
1273	   serve other purposes in data content identification.  It is expected
1274	   that any flags defined will be dependent upon individual protocol
1275	   instantiations.

1277	   In summary, the following data content identification fields may be
1278	   required for NACK-based reliable multicast protocol data content
1279	   messages:

1281	   1.  Source node identifier (_<sourceId>_)

1283	   2.  Object/Stream identifier (_<objectId>_), if applicable.

1285	   3.  FEC Block identifier (_<sourceBlockNumber>_), if applicable.

1287	   4.  FEC Symbol identifier (_<encodingSymbolId>_)

1289	   5.  Flags to differentiate interpretation of identifier fields or
1290	       identifier structure that implicitly indicates usage.

1292	   6.  Additional FEC transmission content fields per FEC Building Block

1294	   These fields have been identified because any generated NACK messages
1295	   will use these identifiers in requesting repair or retransmission of
1296	   data.
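
	   As a non-normative illustration, the fields listed above might be
	   grouped as in the following C sketch, with a comparison function
	   that orders identifiers by their logical concatenation.  The
	   field widths and all names are assumptions; actual formats are
	   left to protocol instantiations and the FEC building blocks.  The
	   comparison ignores sequence-space wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct content_id {
    uint32_t source_id;            /* <sourceId>                         */
    uint16_t object_id;            /* <objectId>, if applicable          */
    uint32_t source_block_number;  /* <sourceBlockNumber>, if applicable */
    uint16_t encoding_symbol_id;   /* <encodingSymbolId>                 */
    uint8_t  flags;                /* how the fields above are used      */
};

/* Three-way helper: -1, 0, or 1. */
static int cmp_u32(uint32_t x, uint32_t y)
{
    return (x > y) - (x < y);
}

/* Lexicographic comparison over
 * sourceId::objectId::sourceBlockNumber::encodingSymbolId.
 * Returns a negative, zero, or positive value. */
int content_id_cmp(const struct content_id *a, const struct content_id *b)
{
    int c;
    assert(a != NULL && b != NULL);
    if ((c = cmp_u32(a->source_id, b->source_id)) != 0)
        return c;
    if ((c = cmp_u32(a->object_id, b->object_id)) != 0)
        return c;
    if ((c = cmp_u32(a->source_block_number, b->source_block_number)) != 0)
        return c;
    return cmp_u32(a->encoding_symbol_id, b->encoding_symbol_id);
}
```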

1298	3.6.  Forward Error Correction (FEC)

1300	   Multiple forward error correction (FEC) approaches using erasure
1301	   coding techniques have been identified that can provide great
1302	   performance enhancements to the repair process of NACK-oriented and
1303	   other reliable multicast protocols [FecBroadcast], [RmFec],
1304	   [RFC3453].  NACK-based reliable multicast protocols can reap
1305	   additional benefits since FEC-based repair does not generally require
1306	   explicit knowledge of repair content within the bounds of its coding
1307	   block size (in symbols).  In NACK-based reliable multicast, parity
1308	   repair packets generated will generally be transmitted only in
1309	   response to NACK repair requests from receiving nodes.  However,
1310	   there are benefits in some network environments for transmitting some
1311	   predetermined quantity of FEC repair packets multiplexed with the
1312	   regular data symbol transmissions [FecHybrid].  This can reduce the
1313	   amount of NACK traffic generated with relatively little overhead cost
1314	   when group sizes are very large or the network connectivity has a
1315	   large "delay*bandwidth" product with some nominal level of expected
1316	   packet loss.  While the application of FEC is not unique to NACK-
1317	   based reliable multicast, these sorts of requirements may dictate the
1318	   types of algorithms and protocol approaches that are applicable.

1320	   A specific issue related to the use of FEC with NACK-based reliable
1321	   multicast is the mechanism used to identify the portion(s) of
1322	   transmitted data content to which specific FEC packets are
1323	   applicable.  It is expected that FEC algorithms will be based on
1324	   generating a set of parity repair packets for a corresponding block
1325	   of transmitted data packets.  Since data content packets are uniquely
1326	   identified by the concatenation of _<sourceId::objectId::
1327	   sourceBlockNumber::encodingSymbolId>_ during transport, it is
1328	   expected that FEC packets will be identified in a similar manner.
1329	   The FEC Building Block document [RFC5052] provides detailed
1330	   recommendations concerning application of FEC and standard formats
1331	   for related reliable multicast protocol messages.

1333	3.7.  Round-trip Timing Collection

1335	   The measurement of packet propagation round-trip time (RTT) among
1336	   members of the group is required to support timer-based NACK
1337	   suppression algorithms, timing of sender commands or certain repair
1338	   functions, and congestion control operation.  The nature of the
1339	   round-trip information collected is dependent upon the type of
1340	   interaction among the members of the group.  In the case of "one-to-
1341	   many" transmission, it may be that only the sender requires
1342	   knowledge of the GRTT and/or RTT knowledge of only a portion of the
1343	   group.  Here, the GRTT information might be collected in a reasonably
1344	   scalable manner.  For congestion control operation, it is possible
1345	   that each receiver in the group may need knowledge of its individual
1346	   RTT.  In this case, an alternative RTT collection scheme may be
1347	   utilized where receivers collect individual RTT measurements with
1348	   respect to the sender(s) and advertise them to the group or
1349	   sender(s).  Where it is likely that exchange of reliable multicast
1350	   data will occur among the group on a "many-to-many" basis, there are
1351	   alternative measurement techniques that might be employed for
1352	   increased efficiency [DelayEstimation].  In some cases, there might be
1353	   absolute time synchronization available among the participating hosts
1354	   that may simplify RTT measurement.  There are trade-offs in multicast
1355	   congestion control design that require further consideration before a
1356	   universal recommendation on RTT (or GRTT) measurement can be
1357	   specified.  Regardless of how the RTT information is collected (and
1358	   more specifically GRTT) with respect to congestion control or other
1359	   requirements, the sender will need to advertise its current GRTT
1360	   estimate to the group for various NACK timeouts used by receivers.

1362	3.7.1.  One-to-Many Sender GRTT Measurement

1364	   The goal of this form of RTT measurement is for the sender to
1365	   estimate the GRTT among the receivers who are actively participating
1366	   in NACK-based reliable multicast operation.  The set of receivers
1367	   participating in this process may be the entire group or some subset
1368	   of the group determined from another mechanism within the protocol
1369	   instantiation.  An approach to collect this GRTT information follows.

1371	   The sender periodically polls the group with a message (independent
1372	   or "piggy-backed" with other transmissions) containing a "<sendTime>"
1373	   timestamp relative to an internal clock at the sender.  Upon
1374	   reception of this message, the receivers will record this
1375	   "<sendTime>" timestamp and the time (referenced to their own clocks)
1376	   at which it was received "<recvTime>".  When the receiver provides
1377	   feedback to the sender (either explicitly or as part of other
1378	   feedback messages depending upon protocol instantiation
1379	   specification), it will construct a "response" using the formula:

1381	             grttResponse = sendTime + (currentTime - recvTime)

1383	   where the "<sendTime>" is the timestamp from the last probe message
1384	   received from the source and the ("<currentTime> - <recvTime>") is
1385	   the amount of time differential since that request was received until
1386	   the receiver generated the response.

1388	   The sender processes each receiver response by calculating a current
1389	   RTT measurement for the receiver from whom the response was received
1390	   using the following formula:

1392	                   RTT_rcvr = currentTime - grttResponse
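
	   The two formulas above can be sketched in C as follows.  The
	   function names are assumptions, and times are in seconds.  Note
	   that the sender and receiver clocks need not be synchronized,
	   since the receiver contributes only a time difference measured
	   on its own clock.

```c
#include <assert.h>

/* Receiver: grttResponse = sendTime + (currentTime - recvTime).
 * recv_time and current_time are read from the receiver's clock;
 * send_time is the sender's timestamp echoed from the probe. */
double grtt_response(double send_time, double recv_time,
                     double current_time)
{
    assert(current_time >= recv_time);
    return send_time + (current_time - recv_time);
}

/* Sender: RTT_rcvr = currentTime - grttResponse.
 * current_time is read from the sender's clock. */
double rtt_from_response(double current_time, double response)
{
    return current_time - response;
}
```

	   For example, if a probe stamped 100.0 is held by a receiver for
	   0.5 seconds and the response reaches the sender at 100.75 on the
	   sender's clock, the measured RTT is 0.25 seconds.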

1394	   During each periodic "GRTT" probing interval, the source keeps
1395	   the peak round trip timing measurement ("RTT_peak") from the set of
1396	   responses it has received.  A conservative estimate of "GRTT" is kept
1397	   to maximize the efficiency of redundant NACK suppression and repair
1398	   aggregation.  The update to the source's ongoing estimate of "GRTT"
1399	   is done observing the following rules:

1401	   1.  If a receiver's response round trip time ("RTT_rcvr") is greater
1402	       than the current "GRTT" estimate, the "GRTT" is immediately
1403	       updated to this new peak value:

1405	                              GRTT = RTT_rcvr

1407	   2.  At the end of the response collection period (i.e., the GRTT
1408	       probe interval), if the recorded "peak" response ("RTT_peak") is
1409	       less than the current GRTT estimate, the GRTT is updated to:

1411	                       GRTT = MAX(0.9*GRTT, RTT_peak)

1413	   3.  If no feedback is received, the sender "GRTT" estimate remains
1414	       unchanged.

1416	   4.  At the end of the response collection period, the peak tracking
1417	       value ("RTT_peak") is reset to ZERO for subsequent peak
1418	       detection.
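
	   The four update rules above could be implemented along the lines
	   of the following C sketch.  The structure and function names are
	   assumptions.  The sketch also assumes that a zero "rtt_peak" at
	   the end of an interval means that no feedback was received.

```c
#include <assert.h>
#include <stddef.h>

struct grtt_state {
    double grtt;      /* current GRTT estimate (seconds)          */
    double rtt_peak;  /* peak RTT seen in this probing interval   */
};

/* Called for each receiver response (rule 1, plus peak tracking). */
void grtt_on_response(struct grtt_state *s, double rtt_rcvr)
{
    assert(s != NULL && rtt_rcvr >= 0.0);
    if (rtt_rcvr > s->rtt_peak)
        s->rtt_peak = rtt_rcvr;
    if (rtt_rcvr > s->grtt)
        s->grtt = rtt_rcvr;           /* rule 1: jump to new peak */
}

/* Called at the end of each GRTT probe interval (rules 2 to 4). */
void grtt_on_interval_end(struct grtt_state *s)
{
    assert(s != NULL);
    /* Rule 3: with no feedback (rtt_peak == 0), leave GRTT unchanged. */
    if (s->rtt_peak > 0.0 && s->rtt_peak < s->grtt) {
        /* Rule 2: GRTT = MAX(0.9*GRTT, RTT_peak) */
        double decayed = 0.9 * s->grtt;
        s->grtt = (decayed > s->rtt_peak) ? decayed : s->rtt_peak;
    }
    s->rtt_peak = 0.0;                /* rule 4: reset peak tracking */
}
```

	   The estimate thus rises immediately to any new peak but decays by
	   at most 10 percent per probe interval, which keeps it
	   conservative.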

1420	   The GRTT collection period (i.e., period of probe transmission) could
1421	   be fixed at a value on the order of that expected for group
1422	   membership and/or network topology dynamics.  For robustness, more
1423	   rapid probing could be used at protocol startup before settling to a
1424	   less frequent, steady-state interval.  Optionally, an algorithm may
1425	   be developed to adjust the GRTT collection period dynamically in
1426	   response to the current estimate of GRTT (or variations in it) and to
1427	   an estimation of packet loss.  The overhead of probing messages could
1428	   then be reduced when the GRTT estimate is stable and unchanging, but
1429	   be adjusted to track more dynamically during periods of variation
1430	   with correspondingly shorter GRTT collection periods.  GRTT
1431	   collection MAY also be coupled with collection of other information
1432	   for congestion control purposes.

1434	   In summary, although NACK repair cycle timeouts are based on GRTT, it
1435	   should be noted that convergent operation of the protocol does not
1436	   depend upon highly accurate GRTT estimation.  The current mechanism
1437	   has proved sufficient in simulations and in the environments where
1438	   NACK-based reliable multicast protocols have been deployed to date.
1439	   The estimate provided by the given algorithm tracks the peak envelope
1440	   of actual GRTT (including operating system effects as well as network
1441	   delays) even in relatively high loss connectivity.  The steady-state
1442	   probing/update interval may potentially be varied to accommodate
1443	   different levels of expected network dynamics in different
1444	   environments.

1446	3.7.2.  One-to-Many Receiver RTT Measurement

1448	   In this approach, receivers send messages with timestamps to the
1449	   sender.  To control the volume of these receiver-generated messages,
1450	   a suppression mechanism similar to that described for NACK
1451	   suppression may be used.  The "age" of receivers' RTT measurement
1452	   should be kept by receivers and used as a metric in competing for
1453	   feedback opportunities in the suppression scheme.  For example,
1454	   receivers who have not made any RTT measurement or whose RTT
1455	   measurement has aged most should have precedence over other
1456	   receivers.  In turn the sender may have limited capacity to provide
1457	   an "echo" of the receiver timestamps back to the group, and it could
1458	   use this RTT "age" metric to determine which receivers get
1459	   precedence.  The sender can determine the "GRTT" as described in
1460	   3.7.1 if it provides sender timestamps to the group.  Alternatively,
1461	   receivers who note their RTT is greater than the sender GRTT can
1462	   compete in the feedback opportunity/suppression scheme to provide the
1463	   sender and group with this information.

1465	3.7.3.  Many-to-Many RTT Measurement

1467	   For reliable multicast sessions that involve multiple senders, it may
1468	   be useful to have RTT measurements occur on a true "many-to-many"
1469	   basis rather than have each sender independently tracking RTT.  Some
1470	   protocol efficiency can be gained when receivers can infer an
1471	   approximation of their RTT with respect to a sender based on RTT
1472	   information they have on another sender and that other sender's RTT
1473	   with respect to the new sender of interest.  For example, for
1474	   receiver "a" and senders "b" and "c", it is likely that:

1476	                    RTT(a<->b) <= RTT(a<->c) + RTT(b<->c)

1478	   Further refinement of this estimate can be conducted if RTT
1479	   information is available to a node concerning its own RTT to a small
1480	   subset of other group members and RTT information among those other
1481	   group members it learns during protocol operation.
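
	   One way to realize such a refinement, shown here as a C sketch
	   only, is to take the smallest bound over all other members for
	   which both RTT values are known.  The function name and array
	   layout are assumptions, and the result is only a likely upper
	   bound because network paths need not obey this inequality.

```c
#include <assert.h>
#include <stddef.h>

/* rtt_a_to_c[i] is RTT(a<->c_i) and rtt_b_to_c[i] is RTT(b<->c_i)
 * for n other members c_i.  Returns the smallest value of
 * RTT(a<->c_i) + RTT(b<->c_i) as an estimate bounding RTT(a<->b). */
double rtt_upper_bound(const double *rtt_a_to_c,
                       const double *rtt_b_to_c, size_t n)
{
    size_t i;
    double best;

    assert(rtt_a_to_c != NULL && rtt_b_to_c != NULL && n > 0);
    best = rtt_a_to_c[0] + rtt_b_to_c[0];
    for (i = 1; i < n; i++) {
        double bound = rtt_a_to_c[i] + rtt_b_to_c[i];
        if (bound < best)
            best = bound;
    }
    return best;
}
```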

1483	3.7.4.  Sender GRTT Advertisement

1485	   To facilitate deterministic protocol operation, the sender should
1486	   robustly advertise its current estimation of "GRTT" to the receiver
1487	   set.  Common, robust knowledge of the sender's current operating GRTT
1488	   estimate among the group will allow the protocol to progress in its
1489	   most efficient manner.  The sender's GRTT estimate can be robustly
1490	   advertised to the group by simply embedding the estimate into all
1491	   pertinent messages transmitted by the sender.  The overhead of this
1492	   can be made quite small by quantizing (compressing) the GRTT estimate
1493	   to a single byte of information.  The following C-language functions
1494	   allow this to be done over a wide range ("RTT_MIN" through
1495	   "RTT_MAX") of GRTT values while maintaining a greater range of
1496	   precision for small values and less precision for large values.
1497	   Values of 1.0e-06 seconds and 1000 seconds are RECOMMENDED for
1498	   "RTT_MIN" and "RTT_MAX" respectively.  NACK-based reliable multicast
1499	   applications may wish to place an additional, smaller upper limit on
1500	   the GRTT advertised by senders to meet application data delivery
1501	   latency constraints at the expense of greater feedback volume in some
1502	   network environments.

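	       #include <math.h>   /* ceil(), log(), exp() */

	       #define RTT_MIN 1.0e-06   /* seconds (RECOMMENDED value) */
	       #define RTT_MAX 1000.0    /* seconds (RECOMMENDED value) */

	       /* Compress a GRTT value (in seconds) into a single byte. */
1504	       unsigned char QuantizeGrtt(double grtt)
1505	       {
	           /* Clamp the input to the range RTT_MIN..RTT_MAX. */
1506	           if (grtt > RTT_MAX)
1507	               grtt = RTT_MAX;
1508	           else if (grtt < RTT_MIN)
1509	               grtt = RTT_MIN;
	           /* Small values are quantized linearly, others logarithmically. */
1510	           if (grtt < (33*RTT_MIN))
1511	               return ((unsigned char)(grtt / RTT_MIN) - 1);
1512	           else
1513	               return ((unsigned char)(ceil(255.0 -
1514	                                       (13.0 * log(RTT_MAX/grtt)))));
1515	       }

	       /* Expand a quantized byte back into a GRTT value (in seconds). */
1517	       double UnquantizeRtt(unsigned char qrtt)
1518	       {
1519	           return ((qrtt <= 31) ?
1520	                   (((double)(qrtt+1))*(double)RTT_MIN) :
1521	                   (RTT_MAX/exp(((double)(255-qrtt))/(double)13.0)));
1522	       }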

1524	   Note that these functions are useful for quantizing GRTT times in the
1525	   range of 1 microsecond to 1000 seconds.  Of course, NACK-based
1526	   reliable multicast protocol implementations may wish to further
1527	   constrain advertised GRTT estimates (e.g., limit the maximum value)
1528	   for practical reasons.

1530	3.8.  Group Size Determination/Estimation

1532	   When NACK-based reliable multicast protocol operation includes
1533	   mechanisms that excite feedback from the group at large (e.g.,
1534	   congestion control), it may be possible to roughly estimate the group
1535	   size based on the number of feedback messages received with respect
1536	   to the distribution of the probabilistic suppression mechanism used.
1537	   Note the timer-based suppression mechanism described in this document
1538	   does not require a very accurate estimate of group size to perform
1539	   adequately.  Thus, a rough estimate, particularly if conservatively
1540	   managed, may suffice.  Group size may also be determined
1541	   administratively.  In the absence of any group size determination
1542	   mechanism, a default group size value of 10,000 is RECOMMENDED for
1543	   reasonable management of feedback given the scalability of expected
1544	   NACK-based reliable multicast usage.  This conservative estimate
1545	   (over-estimate) of group size in the algorithms described above will
1546	   result in some added latency to the NACK repair process if the actual
1547	   group size is smaller but with a guarantee of feedback implosion
1548	   protection.  The study of the timer-based feedback suppression
1549	   mechanism described in [McastFeedback] and [NormFeedback] showed that
1550	   the group size estimate need only be within an order of magnitude to
1551	   provide effective suppression performance.

1553	3.9.  Congestion Control Operation

1555	   Congestion control that fairly shares available network capacity with
1556	   other reliable multicast and TCP instantiations is REQUIRED for
1557	   general Internet operation.  The TCP-Friendly Multicast Congestion
1558	   Control (TFMCC) [TfmccPaper] or Pragmatic General Multicast
1559	   Congestion Control (PGMCC) [PgmccPaper] techniques can be applied to
1560	   NACK-based reliable multicast operation to meet this requirement.
1561	   The former technique has been further documented in [RFC4654] and has
1562	   been successfully applied in the NACK-Oriented Reliable Multicast
1563	   Protocol [RFC3940].

1565	3.10.  Intermediate System Assistance

1567	   NACK-based multicast protocols may benefit from general purpose
1568	   intermediate system assistance.  In particular, additional NACK
1569	   suppression where intermediate systems can aggregate NACK content (or
1570	   filter duplicate NACK content) from receivers as it is relayed toward
1571	   the sender could enhance NORM group size scalability.  For NACK-based
1572	   reliable multicast protocols using FEC, it is possible that
1573	   intermediate systems may be able to filter FEC repair messages to
1574	   provide an intelligent "subcast" of repair content to different legs
1575	   of the multicast topology depending on the repair needs learned from
1576	   previous receiver NACKs.  Similarly, intermediate systems could
1577	   monitor receiver NACKs and provide repair transmissions on-demand in
1578	   response if sufficient state on the content being transmitted was
1579	   being maintained.  This can reduce the latency and volume of repair
1580	   transmissions when the intermediate system is associated with a
1581	   network link that is particularly problematic with respect to packet
1582	   loss.  These types of assist functions would require intermediate
1583	   system interpretation of transport data unit content identifiers and
1584	   flags.  NACK-based protocol designs should consider the potential for
1585	   intermediate system assistance in the specification of protocol
1586	   messages and operations.  It is likely that intermediate system
1587	   assistance will be more practical if message parsing requirements are
1588	   modest and if the amount of state an intermediate system is required
1589	   to maintain is relatively small.

1591	4.  NACK-based Reliable Multicast Applicability

1593	   The Multicast NACK building block applies to protocols wishing to
1594	   employ negative acknowledgement to achieve reliable data transfer.
1595	   Properly designed NACK-based reliable multicast protocols offer
1596	   scalability advantages for applications and/or network topologies
1597	   where, for various reasons, it is prohibitive to construct a higher
1598	   order delivery infrastructure above the basic Layer 3 IP multicast
1599	   service (e.g., unicast or hybrid unicast/multicast data distribution
1600	   trees).  Additionally, the multicast scalability property of NACK-
1601	   based protocols [RmComparison], [RmClasses] is applicable where broad
1602	   "fan-out" is expected for a single network hop (e.g., cable-TV data
1603	   delivery, satellite, or other broadcast communication services).
1604	   Furthermore, the simplicity of a protocol based on "flat" group-wide
1605	   multicast distribution may offer advantages for a broad range of
1606	   distributed services or dynamic networks and applications.  NACK-
1607	   based reliable multicast protocols can make use of reciprocal (among
1608	   senders and receivers) multicast communication under the Any-Source
1609	   Multicast (ASM) model defined in RFC 1112 [RFC1112], and are capable
1610	   of scalable operation in asymmetric topologies such as Source-
1611	   Specific Multicast (SSM) [RFC4607] where there may only be unicast
1612	   routing service from the receivers to the sender(s).

1614	   NACK-based reliable multicast protocol operation is compatible with
1615	   transport layer forward error correction coding techniques as
1616	   described in [RFC3453] and congestion control mechanisms such as those
1617	   described in [TfmccPaper] and [PgmccPaper].  A principal limitation of
1618	   NACK-based reliable multicast operation involves group size
1619	   scalability when network capacity for receiver feedback is very
1620	   limited.  It is possible that, with proper protocol design, the
1621	   intermediate system assistance techniques mentioned in Section 2.4
1622	   and described further in Section 3.10 can allow NACK-based approaches
1623	   to scale to larger group sizes.  NACK-based reliable multicast
1624	   operation is also governed by implementation buffering constraints.
1625	   Buffering greater than that required for typical point-to-point
1626	   reliable transport (e.g., TCP) is recommended to allow for disparity
1627	   in the receiver group connectivity and to allow for the feedback
1628	   delays required to attain group size scalability.

1630	   Prior experimental work included various protocol instantiations that
1631	   implemented some of the concepts described in this building block
1632	   document.  This includes the Pragmatic General Multicast (PGM)
1633	   protocol described in [RFC3208] among others that were documented or
1634	   deployed outside of IETF activities.  While the PGM protocol
1635	   specification and some other approaches encompassed many of the goals
1636	   of bulk data delivery as described here, this NACK-based building
1637	   block provides a more generalized framework so that different
1638	   application needs can be met by different protocol instantiation
1639	   variants.  The NACK-based building block approach described here
1640	   includes compatibility with the other protocol mechanisms including
1641	   FEC and congestion control that are described in other IETF reliable
1642	   multicast building block documents.  The NACK repair process
1643	   described in this document can provide performance advantages as
1644	   compared to PGM when both are deployed on a pure end-to-end basis
1645	   without intermediate system assistance.  The round-trip timing
1646	   estimation described here and its use in the NACK repair process
1647	   allow protocol operation to more automatically adapt to different
1648	   network environments or operate within environments where
1649	   connectivity is dynamic.  Use of the FEC payload identification
1650	   techniques described in the FEC building block [RFC5052] and specific
1651	   FEC instantiations allows protocol instantiations more flexibility as
1652	   FEC techniques evolve than the specific sequence number data
1653	   identification scheme described in the PGM specification.  Similar
1654	   flexibility is expected if protocol instantiations are designed to
1655	   modularly invoke (at design time, if not run-time) the appropriate
1656	   congestion control building block for different application or
1657	   deployment purposes.

1659	5.  Security Considerations

1661	   NACK-based reliable multicast protocols are expected to be subject to
1662	   the same security vulnerabilities as other IP and IP Multicast
1663	   protocols.  However, unlike point-to-point (unicast) transport
1664	   protocols, it is possible that one badly-behaving participant can
1665	   impact the transport service experience of others in the group.  For
1666	   example, a malicious receiver node could intentionally transmit NACK
1667	   messages to cause the sender(s) to unnecessarily transmit repairs
1668	   instead of making forward progress with reliable transfer.  Also,
1669	   group-wise messaging to support congestion control or other aspects
1670	   of protocol operation may be subject to similar vulnerabilities.
1671	   Thus, it is highly RECOMMENDED that security techniques such as
1672	   authentication and data integrity checks be applied for NACK-based
1673	   reliable multicast deployments.  Protocol instantiations using this
1674	   building block MUST identify approaches to security that can be used
1675	   to address these and other security considerations.

1677	   NACK-based reliable multicast is compatible with IP security (IPsec)
1678	   authentication mechanisms [RFC4301] that are RECOMMENDED for
1679	   protection against session intrusion and denial of service attacks.
1680	   A particular threat for NACK-based protocols is that of NACK replay
1681	   attacks that could prevent a multicast sender from making forward
1682	   progress in transmission.  Any standard IPsec mechanisms that can
1683	   provide protection against such replay attacks are RECOMMENDED for
1684	   use.  The IETF Multicast Security (MSEC) Working Group has developed
1685	   a set of recommendations in its Multicast Extensions to the Internet
1686	   Protocol Security Architecture [I-D.ietf-msec-ipsec-extensions] that
1687	   can be applied to appropriately extend IPsec mechanisms to multicast
1688	   operation.  An appendix of that document specifically addresses the
1689	   NACK-Oriented Reliable Multicast protocol service model.  As complete
1690	   support for IPsec multicast operation may potentially follow reliable
1691	   multicast deployment, NACK-based reliable multicast protocol
1692	   instantiations SHOULD consider providing support for their own NACK
1693	   replay attack protection when network layer mechanisms are not
1694	   available.  This MAY be necessary when IPsec implementations are used
1695	   that do not provide multicast replay attack protection when multiple
1696	   sources are present.
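
As an illustration of protocol-level NACK replay protection, the
following is a minimal sketch of a sliding-window check, similar in
spirit to the anti-replay windows used by IPsec.  It assumes that each
NACK carries an authenticated, monotonically increasing sequence
number per originator; that field, the class name ReplayWindow, and
the window size are hypothetical and are not defined by this building
block.  Sequence number wraparound is not handled.

```python
class ReplayWindow:
    """Sliding-window replay check; keep one instance per NACK originator."""

    def __init__(self, size=64):
        self.size = size      # number of recent sequence numbers tracked
        self.highest = -1     # highest sequence number accepted so far
        self.bitmap = 0       # bit i set => (highest - i) was already seen

    def accept(self, seq):
        """Return True if seq is new; False if replayed or too old."""
        if seq > self.highest:
            # Slide the window forward and mark the new highest as seen.
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False      # older than the window: reject
        mask = 1 << offset
        if self.bitmap & mask:
            return False      # already seen: replay
        self.bitmap |= mask
        return True
```

A receiver of NACKs would consult accept() only after the message has
passed authentication, and would discard any NACK for which it returns
False.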

1698	   For NACK-based multicast deployments with large receiver groups using
1699	   IPsec, approaches might be developed that use shared, common keys for
1700	   receiver-originated protocol messages to maintain a practical number
1701	   of IPsec Security Associations (SAs).  However, such group-based
1702	   authentication may not be sufficient unless the receiver population
1703	   can be completely trusted.  Additionally, this can make
1704	   identification of badly-behaving (although authenticated) receiver
1705	   nodes problematic as such nodes could potentially masquerade as other
1706	   receivers in the group.  In deployments such as this, one SHOULD
1707	   consider use of Source-Specific Multicast (SSM) instead of Any-Source
1708	   Multicast (ASM) models of multicast operation.  SSM operation can
1709	   simplify security challenges in a couple of ways:

1711	   1.  A NACK-based protocol supporting SSM operation can eliminate
1712	       direct receiver-to-receiver signaling.  This dramatically reduces
1713	       the number of security associations that need to be established.

1715	   2.  The SSM sender(s) can provide a centralized management point for
1716	       secure group operation for its respective data flow with the
1717	       sender alone required to conduct individual host authentication
1718	       for each receiver when group-based authentication does not
1719	       suffice or is not pragmatic to deploy.

1721	   When individual host authentication is required, then it is possible
1722	   receivers could use a digital signature on the IPsec Encapsulating
1723	   Security Protocol (ESP) payload as described in [RFC4359].  Either an
1724	   identity-based signature system or a group-specific public key
1725	   infrastructure could avoid per-receiver state at the sender(s).
1726	   Additionally, implementations MUST also support policies to limit the
1727	   impact of extremely or exceptionally poor-performing (due to bad
1728	   behavior or otherwise) receivers upon overall group operation if this
1729	   is acceptable for the relevant application.
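
One possible policy of this kind is a per-receiver limit on how many
NACKs the sender will act upon.  The sketch below uses a token bucket
for that purpose.  It is only an assumption about how such a policy
might look: the class name NackBudget and the rate and burst
parameters are hypothetical, and the caller is assumed to supply a
non-decreasing timestamp in seconds.

```python
class NackBudget:
    """Per-receiver token bucket limiting how many NACKs are honored."""

    def __init__(self, rate, burst):
        self.rate = float(rate)     # tokens replenished per second
        self.burst = float(burst)   # maximum tokens a receiver can hold
        self._state = {}            # receiver_id -> (tokens, last_time)

    def allow(self, receiver_id, now):
        """Return True if a NACK from receiver_id should be honored."""
        tokens, last = self._state.get(receiver_id, (self.burst, now))
        # Replenish tokens for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[receiver_id] = (tokens - 1.0, now)
            return True
        self._state[receiver_id] = (tokens, now)
        return False
```

A sender could ignore, or defer service for, NACKs for which allow()
returns False, so that a single receiver cannot monopolize repair
transmissions.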

1731	   As described in Section 3.4, deployment of NACK-based reliable
1732	   multicast in some network environments may require identification of
1733	   group members beyond that of IP addressing.  If protocol-specific
1734	   security mechanisms are developed, then it is RECOMMENDED that
1735	   protocol group member identifiers are used as selectors (as defined
1736	   in [RFC4301]) for the applicable security associations.  When IPsec
1737	   is used, it is RECOMMENDED that the protocol implementation verify
1738	   that the source IP address of each received packet is valid for the
1739	   given protocol source identifier in addition to usual IPsec
1740	   authentication.  This would prevent a badly-behaving (although
1741	   authorized) member from spoofing messages from other legitimate
1742	   members, provided that individual host authentication is supported.
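
The address check recommended above can be sketched as a lookup from
protocol source identifier to the set of IP addresses permitted for
it.  This is an illustrative sketch only: the class name
SourceBindingTable and the way bindings are provisioned (for example,
by group key management) are assumptions, not part of this building
block.

```python
import ipaddress


class SourceBindingTable:
    """Maps protocol source identifiers to their permitted IP addresses."""

    def __init__(self):
        self._allowed = {}  # source_id -> set of ip_address objects

    def bind(self, source_id, address):
        """Record that source_id may send from the given IP address."""
        addr = ipaddress.ip_address(address)
        self._allowed.setdefault(source_id, set()).add(addr)

    def is_valid(self, source_id, packet_src):
        """Return True if packet_src is permitted for source_id."""
        addr = ipaddress.ip_address(packet_src)
        return addr in self._allowed.get(source_id, set())
```

A packet that passes IPsec authentication but fails is_valid() would
be dropped before any NACK or repair processing takes place.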

1744	   The MSEC Working Group has also developed automated group keying
1745	   solutions which are applicable to NACK-based reliable multicast
1746	   security.  For example, to support IPsec or other security
1747	   mechanisms, the Group Secure Association Key Management Protocol
1748	   [RFC4535] MAY be used for automated group key management.  The
1749	   technique it identifies for "Group Establishment for Receive-Only
1750	   Members" may be applicable to NACK-based reliable multicast SSM
1751	   operation.

1753	6.  IANA Considerations

1755	   This document has no actions for IANA.

1757	7.  Changes from RFC3941

1759	   This section lists the changes between the Experimental version of
1760	   this specification, [RFC3941], and this version:

1762	   1.  Change of title to avoid confusion with NORM Protocol
1763	       specification,

1765	   2.  Updated references to related, updated RMT Building Block
1766	       documents, and

1768	   3.  More detailed security considerations.

1770	8.  Acknowledgements

1772	   (and these are not Negative)

1774	   The authors would like to thank George Gross, Rick Jones, and Joerg
1775	   Widmer for their valuable comments on this document.  The authors
1776	   would also like to thank the RMT working group chairs, Roger Kermode
1777	   and Lorenzo Vicisano, for their support in development of this
1778	   specification, and Sally Floyd for her early inputs into this
1779	   document.

1781	9.  References

1783	9.1.  Normative References

1785	   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
1786	              RFC 1112, August 1989.

1788	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1789	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1791	   [RFC4607]  Holbrook, H. and B. Cain, "Source-Specific Multicast for
1792	              IP", RFC 4607, August 2006.

1794	9.2.  Informative References

1796	   [ArchConsiderations]
1797	              Clark, D. and D. Tennenhouse, "Architectural
1798	              Considerations for a New Generation of Protocols", In
1799	              Proc. ACM SIGCOMM pages 201-208, September 1990.

1801	   [DelayEstimation]
1802	              Ozdemir, V., Muthukrishnan, S., and I. Rhee, "Scalable,
1803	              Low-Overhead Network Delay Estimation", NCSU/AT&T White
1804	              Paper, February 1999.

1806	   [FecBroadcast]
1807	              Metzner, J., "An Improved Broadcast Retransmission
1808	              Protocol", IEEE Transactions on Communications Vol.
1809	              Com-32, No. 6, June 1984.

1811	   [FecHybrid]
1812	              Gossink, D. and J. Macker, "Reliable Multicast and
1813	              Integrated Parity Retransmission with Channel Estimation",
1814	              IEEE Globecomm 1998, 1998.

1816	   [I-D.ietf-msec-ipsec-extensions]
1817	              Weis, B., Gross, G., and D. Ignjatic, "Multicast
1818	              Extensions to the Security Architecture for the Internet
1819	              Protocol", draft-ietf-msec-ipsec-extensions-09 (work in
1820	              progress), June 2008.

1822	   [I-D.ietf-rmt-bb-fec-basic-schemes-revised]
1823	              Watson, M., "Basic Forward Error Correction (FEC)
1824	              Schemes", draft-ietf-rmt-bb-fec-basic-schemes-revised-05
1825	              (work in progress), July 2008.

1827	   [McastFeedback]
1828	              Nonnenmacher, J. and E. Biersack, "Optimal Multicast
1829	              Feedback", in IEEE Infocom p. 964, March/April 1998.

1831	   [NormFeedback]
1832	              Adamson, B. and J. Macker, "Quantitative Prediction of
1833	              NACK-Oriented Reliable Multicast (NORM) Feedback", in IEEE
1834	              MILCOM 2002, October 2002.

1836	   [PgmccPaper]
1837	              Rizzo, L., "pgmcc: A TCP-Friendly Single-Rate Multicast
1838	              Congestion Control Scheme", ACM SIGCOMM 2000,
1839	              August 2000.

1841	   [RFC2357]  Mankin, A., Romanov, A., Bradner, S., and V. Paxson, "IETF
1842	              Criteria for Evaluating Reliable Multicast Transport and
1843	              Application Protocols", RFC 2357, June 1998.

1845	   [RFC3208]  Speakman, T., Crowcroft, J., Gemmell, J., Farinacci, D.,
1846	              Lin, S., Leshchiner, D., Luby, M., Montgomery, T., Rizzo,
1847	              L., Tweedly, A., Bhaskar, N., Edmonstone, R.,
1848	              Sumanasekera, R., and L. Vicisano, "PGM Reliable Transport
1849	              Protocol Specification", RFC 3208, December 2001.

1851	   [RFC3269]  Kermode, R. and L. Vicisano, "Author Guidelines for
1852	              Reliable Multicast Transport (RMT) Building Blocks and
1853	              Protocol Instantiation documents", RFC 3269, April 2002.

1855	   [RFC3453]  Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley,
1856	              M., and J. Crowcroft, "The Use of Forward Error Correction
1857	              (FEC) in Reliable Multicast", RFC 3453, December 2002.

1859	   [RFC3940]  Adamson, B., Bormann, C., Handley, M., and J. Macker,
1860	              "Negative-acknowledgment (NACK)-Oriented Reliable
1861	              Multicast (NORM) Protocol", RFC 3940, November 2004.

1863	   [RFC3941]  Adamson, B., Bormann, C., Handley, M., and J. Macker,
1864	              "Negative-Acknowledgment (NACK)-Oriented Reliable
1865	              Multicast (NORM) Building Blocks", RFC 3941,
1866	              November 2004.

1868	   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
1869	              Internet Protocol", RFC 4301, December 2005.

1871	   [RFC4359]  Weis, B., "The Use of RSA/SHA-1 Signatures within
1872	              Encapsulating Security Payload (ESP) and Authentication
1873	              Header (AH)", RFC 4359, January 2006.

1875	   [RFC4535]  Harney, H., Meth, U., Colegrove, A., and G. Gross,
1876	              "GSAKMP: Group Secure Association Key Management
1877	              Protocol", RFC 4535, June 2006.

1879	   [RFC4654]  Widmer, J. and M. Handley, "TCP-Friendly Multicast
1880	              Congestion Control (TFMCC): Protocol Specification",
1881	              RFC 4654, August 2006.

1883	   [RFC5052]  Watson, M., Luby, M., and L. Vicisano, "Forward Error
1884	              Correction (FEC) Building Block", RFC 5052, August 2007.

1886	   [RmClasses]
1887	              Levine, B. and J. Garcia-Luna-Aceves, "A Comparison of
1888	              Known Classes of Reliable Multicast Protocols", Proc.
1889	              International Conference on Network Protocols (ICNP-
1890	              96) Columbus, Ohio, October 1996.

1892	   [RmComparison]
1893	              Pingali, S., Towsley, D., and J. Kurose, "A Comparison of
1894	              Sender-Initiated and Receiver-Initiated Reliable Multicast
1895	              Protocols", Proc. INFOCOMM San Francisco, CA,
1896	              October 1993.

1898	   [RmFec]    Macker, J., "Reliable Multicast Transport and Integrated
1899	              Erasure-based Forward Error Correction", IEEE MILCOM 1997,
1900	              October 1997.

1902	   [SrmFramework]
1903	              Floyd, S., Jacobson, V., McCanne, S., Liu, C., and L.
1904	              Zhang, "A Reliable Multicast Framework for Light-weight
1905	              Sessions and Application Level Framing", Proc. ACM
1906	              SIGCOMM, August 1995.

1908	   [TfmccPaper]
1909	              Widmer, J. and M. Handley, "Extending Equation-Based
1910	              Congestion Control to Multicast Applications", ACM
1911	              SIGCOMM 2001, August 2001.

1913	Authors' Addresses

1915	   Brian Adamson
1916	   Naval Research Laboratory
1917	   Washington, DC  20375

1919	   Email: adamson@itd.nrl.navy.mil

1921	   Carsten Bormann
1922	   Universitaet Bremen TZI
1923	   Postfach 330440
1924	   D-28334 Bremen, Germany

1926	   Email: cabo@tzi.org
1927	   Mark Handley
1928	   University College London
1929	   Gower Street
1930	   London,   WC1E 6BT
1931	   UK

1933	   Email: M.Handley@cs.ucl.ac.uk

1935	   Joe Macker
1936	   Naval Research Laboratory
1937	   Washington, DC  20375

1939	   Email: macker@itd.nrl.navy.mil

1941	Full Copyright Statement

1943	   Copyright (C) The IETF Trust (2008).

1945	   This document is subject to the rights, licenses and restrictions
1946	   contained in BCP 78, and except as set forth therein, the authors
1947	   retain all their rights.

1949	   This document and the information contained herein are provided on an
1950	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1951	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1952	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1953	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1954	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1955	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1957	Intellectual Property

1959	   The IETF takes no position regarding the validity or scope of any
1960	   Intellectual Property Rights or other rights that might be claimed to
1961	   pertain to the implementation or use of the technology described in
1962	   this document or the extent to which any license under such rights
1963	   might or might not be available; nor does it represent that it has
1964	   made any independent effort to identify any such rights.  Information
1965	   on the procedures with respect to rights in RFC documents can be
1966	   found in BCP 78 and BCP 79.

1968	   Copies of IPR disclosures made to the IETF Secretariat and any
1969	   assurances of licenses to be made available, or the result of an
1970	   attempt made to obtain a general license or permission for the use of
1971	   such proprietary rights by implementers or users of this
1972	   specification can be obtained from the IETF on-line IPR repository at
1973	   http://www.ietf.org/ipr.

1975	   The IETF invites any interested party to bring to its attention any
1976	   copyrights, patents or patent applications, or other proprietary
1977	   rights that may cover technology that may be required to implement
1978	   this standard.  Please address the information to the IETF at
1979	   ietf-ipr@ietf.org.