RMT Working Group                                      Brian Whetten
Internet Engineering Task Force                             Talarian
Internet Draft                                         Dah Ming Chiu
Document: draft-ietf-rmt-bb-track-01.txt            Sun Microsystems
2 March, 2001                                        Miriam Kadansky
Expires 2 October, 2001                             Sun Microsystems
                                                      Gursel Taskale
                                                            Talarian

        Reliable Multicast Transport Building Block for TRACK

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document describes the TRACK Building Block. It contains
functions relating to positive acknowledgments and hierarchical tree
construction and maintenance. It is primarily meant to be used as
part of the TRACK Protocol Instantiation. It is also designed to be
useful as part of overlay multicast systems that wish to offer
efficient confirmed delivery of multicast messages.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119.

INTERNET DRAFT draft-ietf-rmt-bb-track-00.txt 1
INTERNET DRAFT draft-ietf-rmt-track-bb-01.txt March 2001

Table of Contents

1. Introduction
2. Design Rationale
3. Applicability Statement
   3.1 Application types
   3.2 Network Infrastructure
4. Message Types
5. Global Configuration Variables, Constants, and Reason Codes
   5.1 Global Configuration Variables
   5.2 Constants
   5.3 Reason Codes
6. External APIs
   6.1 Interfaces to the BB from PI's
       6.1.1 Start(boolean RepairHead, boolean RejoinAllowed,
             Advertisement)
       6.1.2 End
       6.1.3 incomingMessage(Message)
       6.1.4 getStatistics
       6.1.5 MessageSynched(Message)
       6.1.6 RepairHead(boolean)
   6.2 Interfaces from the BB to the PI
       6.2.1 outgoingMessage(Message)
       6.2.2 MessageReceived(Message, boolean Synch)
       6.2.3 SenderLost
       6.2.4 UnrecoverableData
       6.2.5 SessionDone
7. Algorithms
   7.1 Tree Based Session Creation and Maintenance
       7.1.1 Overview of Tree Configuration
       7.1.2 Bind
             7.1.2.1 Input Parameters
             7.1.2.2 Bind Algorithm
       7.1.3 Unbind
       7.1.4 Eject
       7.1.5 Fault Detection
       7.1.6 Fault Notification
       7.1.7 Fault Recovery
   7.2 TRACK Generation
       7.2.1 TRACK Generation with the Rotating TRACK Algorithm
       7.2.2 Local Repair
       7.2.3 Flow Control Window Update
       7.2.4 Reliability Window
       7.2.5 Confirmed Delivery
   7.3 Feedback Aggregation
   7.4 Measuring Round Trip Times
8. Security
9. References
10. Acknowledgements
11. Authors' Addresses

1. Introduction

This document describes the TRACK Building Block. It contains
functions relating to positive acknowledgments and hierarchical tree
construction and maintenance. It is primarily meant to be used as
part of the TRACK Protocol Instantiation. It is also designed to be
useful as part of overlay multicast systems that wish to offer
efficient confirmed delivery of multicast messages.

As pointed out in the building blocks rationale draft [WVKHFL00],
there are two different reliability tasks that can be provided by a
reliable multicast transport: ensuring goodput and confirming
delivery of application level messages. The NACK Protocol
Instantiation and ALC Protocol Instantiation are each primarily
concerned with ensuring goodput.
The TRACK BB and TRACK PI rely on a repair tree to provide goodput
as well as confirmed delivery. If Forward Error Correction, Generic
Router Assist or other mechanisms are used to help provide goodput,
they are assumed to work transparently at a layer below this BB, as
if the IP multicast service had a lower error rate.

The TRACK BB also assumes that there is an Automatic Tree Building
BB [KLCWTCTK01] which provides the list of parents (known as Service
Nodes within the Tree BB) that each node should join. If Receivers
are used that may also serve as Repair Heads, the TRACK BB assumes
the Auto Tree BB is also responsible for selecting the role of each
Receiver as either Receiver or Repair Head. However, the TRACK BB
may specify that a particular node may not operate as a Repair Head.

The TRACK BB also assumes that a separate session advertisement
protocol notifies the receivers as to when to join a session, the
data multicast address for the session, and the control parameters
for the session.

The TRACK BB provides additional information and aggregation
capabilities, which are useful for congestion control.

The TRACK BB provides the following detailed functionality.

@ Hierarchical Session Creation and Maintenance. This set of
  functionality is responsible for creating and maintaining (but not
  configuring) the hierarchical tree of Repair Heads and Receivers.

   o Bind. When a child knows the parent it wishes to join for a
     given data session, it binds to that parent.

   o Unbind. When a child wishes to leave a data session, either
     because the session is over or because the application is
     finished with the session, it initiates an unbind operation
     with its parent.

   o Eject. A parent can also force a child to unbind.
     This happens if the parent needs to leave the session, if the
     child is not behaving correctly, or if the parent wants to move
     the child to another parent as part of tree configuration
     maintenance.

   o Fault Detection. In order to verify liveness, parents and
     children send regular heartbeat messages between themselves.
     The sender also sends regular null data messages to the group,
     if it has no data to send.

   o Fault Recovery. When a child detects that its parent is no
     longer reachable, it may switch to another parent. When a
     parent detects that one of its children is no longer reachable,
     it removes that child from its membership list and reports this
     up the tree to the Sender of the Data Session.

@ TRACK Generation. This set of functionality is responsible for
  periodically generating TRACK messages from all receivers to
  acknowledge receipt of data, report missing messages, advance flow
  control windows, provide roundtrip time measurements and provide
  other group management information. The algorithms include:

   o TRACK Timing. In order to avoid ACK implosion, the Receivers
     and Repair Heads use the rotating TRACK algorithm.

   o Flow Control and Buffer Management. Receivers and Repair Heads
     maintain a set of buffers that are at least as large as the
     Sender's transmission window. The Receivers pass their
     reception status up to the sender as part of their TRACK
     messages. This is used to acknowledge receipt of delivery, to
     advance the buffer windows at each node, and to limit the
     sender's window advancement to the speed of the slowest
     receiver.

   o Application Level Confirmed Delivery. Confirmed Delivery
     provides transport level confirmation of delivery. Senders can
     put a "synch point" request in data messages, asking for
     application level confirmation. Data messages with this flag
     set are only confirmed by the Receivers after the Receiver
     applications confirm receipt.

@ Local Recovery.
  This functionality describes how repair heads maintain state on
  their children and provide repairs in response to requests for
  retransmission contained in TRACK messages. This has overlap with
  the NACK BB, which is unified in the TRACK PI.

@ TRACK Aggregation. In order to provide the highest levels of
  scalability and reliability, interior tree nodes provide
  aggregation of control traffic flowing up the tree. The aggregated
  feedback information includes that used for end-to-end confirmed
  delivery, flow control, congestion control, and group membership
  monitoring and management.

@ Distributed RTT Calculations. One of the primary challenges of
  congestion control is efficient RTT calculations. TRACK provides
  two methods to perform these calculations.

   o Sender Per-Message RTT Calculations. Each message is stamped
     with a timestamp from the sender. As each is passed up the
     tree, the amount of dally time spent waiting at each node is
     accumulated. The lowest measurements are passed up the tree,
     and the dally time is subtracted from the original measurement.

   o Local Per-Level RTT Calculations. Each parent measures the
     local RTT to each of its children as part of the keep-alive
     messages used for failure detection.

2. Design Rationale

Much of the design rationale behind the protocol instantiations and
building blocks being standardized by the RMT working group is laid
out in [WVKHFL00]. In addition, the design rationale for the TRACK
PI is laid out in [WCP00]. This building block conforms with the
design rationales laid out in both of those documents.

TRACK is designed to provide confirmed delivery, receiver-based flow
control, distributed management of group membership (some members
may be dedicated servers in a repair tree), and aggregation of
information up the tree.
It also provides requests for retransmissions as part of TRACK
messages, and local recovery of lost packets.

This TRACK BB is primarily designed to work as part of the TRACK PI,
in conjunction with other BB's including NACK, FEC, and Auto Tree.
In the spirit of modular reuse specified in [WVKHFL00], it is also
designed to be useful as an additional layer of functionality on top
of any of the following services.

1) The functionality (if not the exact message headers) of the NORM
   PI.

2) The functionality (if not the exact message headers) of the ALC
   PI.

3) Running directly on top of an unreliable IP multicast routing
   protocol, but on a carefully provisioned network.

4) On top of an overlay multicast (also known as application layer
   multicast) system. Overlay multicast is a system where servers in
   the network provide multicast (and unicast) routing as well as
   reliable multicast delivery, all on top of a combination of
   unicast (i.e. TCP) and, as available, reliable multicast
   services.

There is a fundamental tradeoff between reliability and real-time
performance in the face of failures. There are two primary types of
single layer reliability that have been proposed to deal with this:
sender reliable and receiver reliable delivery. Sender reliable
delivery is similar to TCP, where the sender knows the identity of
the receivers in a data session, and is notified when any of them
fails to receive all the data messages. Receiver reliable delivery
limits knowledge of group membership and failures to only the actual
receivers. Senders do not have any knowledge of the membership of a
group, and do not require receivers to explicitly join or leave a
data session. Receiver reliable protocols scale better in the face
of networks that have frequent failures, and have very high
isolation of failures between receivers.
This TRACK BB provides sender reliable delivery, potentially on top
of a receiver reliable system.

This BB is specified according to the guidelines in [KV00]. In
addition, it specifies all communication between entities in terms
of messages, rather than packets. A message is an abstract
communication unit, which may be part of, or all of, a given packet.
It does not have a specific format, although it does contain a list
of fields, some of which may be optional, and some of which may have
fixed lengths associated with them. It is up to each protocol
instantiation to combine the set of messages in this BB with those
in other components, and create the actual set of message formats
that will be used.

As mentioned in the introduction, this BB assumes the existence of a
separate Auto Tree Configuration BB. It also assumes that data
sessions are advertised to all receivers as part of an external BB
or other component. It expects to also interact with other BB's
through the TRACK PI, but does not require this.

3. Applicability Statement

It is widely recognized that no single reliable multicast protocol
can meet the needs of all application types over all network types.
Distinguishing factors include functionality and performance.

From a functionality perspective, TRACK and NACK based reliable
multicast protocols present an inherently different reliability
model. TRACK based protocols are able to remove messages from the
retransmission window when all the children have acknowledged them.
NACK based protocols have to rely on other means for determining how
long to keep messages in the retransmission window. A popular method
is a time based scheme [SFCGLTLLBEJMRSV00]. TRACK based protocols
can keep track of the membership of the data session, and provide
confirmed delivery against that membership list. NACK protocols have
anonymous membership.
Since reliability is obtained through control traffic, the
difference in the semantics of the term reliability leads to the
second distinguishing factor: performance. When a persistent failure
occurs among the members of a TRACK based protocol, there is a
possibility that this may slow down other members of the group. NACK
protocols have higher isolation of failures, as well as smaller
amounts of control traffic under many scenarios.

3.1 Application types

The objectives of TRACK are to provide high level reliability, high
scalability, congestion control and flow control for one-to-many
bulk data dissemination. TRACK is not designed for many-to-many
applications. Examples of applications that fit into the one-to-many
data dissemination model are: real time financial news and market
data distribution, electronic software distribution, audio video
streaming, distance learning, software updates and server
replication.

However, not all of these application types have the same
reliability requirements. Historically, financial applications have
had the most stringent reliability requirements, while audio video
streaming has had the least stringent.

For applications that want to have strong confirmation of delivery
guarantees, TRACK may be more applicable than alternatives such as
NORM or ALC. For applications that do not require this level of
reliability, or that demand the lowest levels of latency and the
highest levels of failure isolation, TRACK may be less applicable.
The TRACK BB, in particular, is designed to optionally work on top
of a NORM or ALC PI, to allow applications to select this tradeoff
on a dynamic basis.

3.2 Network Infrastructure

The TRACKs also serve to provide feedback information to the sender.
The sender uses this information for congestion and flow control.
This allows TRACK to be applicable in most networks (i.e.
managed and shared networks, and high congestion networks.)

Asymmetric networks with very low upbound bandwidth and a very low
loss data channel may be better served through NACK based protocols,
particularly if high reliability is not required. Some satellite
networks are a good example.

Networks that have very high loss rates, and regularly experience
partial network partitions, router flapping, or other persistent
faults, may be better served through NACK only protocols.

4. Message Types

The following table summarizes the messages and their fields used by
the TRACK BB. All messages contain the session identifier.

+--------------------------------------------------------------------+
Message         From    To      Mcast?  Fields
+--------------------------------------------------------------------+
BindRequest     Child   Parent  no      Scope, Level, Role,
                                        SubTreeCount
+--------------------------------------------------------------------+
BindConfirm     Parent  Child   no      Level, RepairAddr, SeqNum,
                                        MemberId, CacheInfo
+--------------------------------------------------------------------+
BindReject      Parent  Child   no      Reason
+--------------------------------------------------------------------+
UnbindRequest   Child   Parent  no      Reason
+--------------------------------------------------------------------+
UnbindConfirm   Parent  Child   no
+--------------------------------------------------------------------+
EjectRequest    Parent  Child   either  Reason
+--------------------------------------------------------------------+
EjectConfirm    Child   Parent  no
+--------------------------------------------------------------------+
Heartbeat       Parent  Child   either  Level, ParentTimestamp,
                                        ChildrenList, SeqNum
+--------------------------------------------------------------------+
NullData        Sender  all     yes     SenderTimeStamp, AppSynch,
Data                                    End, Rate, HighestReleased,
                                        SeqNum
+--------------------------------------------------------------------+
Retransmission  Parent  Child   yes     SenderTimeStamp, AppSynch,
                                        End, Rate, HighestReleased,
                                        SeqNum
+--------------------------------------------------------------------+
Track           Child   Parent  no      SeqNum, BitMask, SubTreeCount,
                                        Slowest, FailedChildren,
                                        HighestAllowed,
                                        LocalDallyTime,
                                        ApplicationConfirms,
                                        ParentThere, ParentTimeStamp,
                                        SenderTimeStamp,
                                        SenderDallyTime
+--------------------------------------------------------------------+

The various fields of the messages are described as follows:

- Scope: an integer to indicate how far a repair message travels.
  This is optional.

- Level: an integer that indicates the level in the repair tree.
  This value is used to keep loops in the tree from forming, in
  addition to indicating the distance from the sender. Any changes
  in a node's level are passed down to the Tree BB using the
  treeLevelUpdate interface.

- Role: This indicates if the bind requestor is a receiver or repair
  head.

- SubTreeCount: This is an integer indicating the current number of
  receivers below the node.

- RepairAddr: This field in the BindConfirm message is used to tell
  the receiver which multicast address the repair head will be
  sending retransmissions on. If this field is null, then the
  receiver should expect retransmissions to be sent on the sender's
  data multicast address.

- SeqNum: an integer indicating the sequence number of a data
  message within a given data session. The SeqNum field in the
  BindConfirm message indicates the sequence number starting from
  which the repair head promises to provide repair service.

- MemberId: This is an integer the repair head assigns to a
  particular child. The child receiver uses this value to implement
  the rotating TRACK Generation algorithm.

- CacheInfo: This field contains information about the repair data
  available from this Repair Head.
- Reason: a code indicating the reason for the BindReject,
  UnbindRequest, or EjectRequest message.

- ParentTimestamp: This field is included in Heartbeat messages to
  signal the need to do a local RTT measurement from a parent. It is
  the time when the parent sent the packet.

- ChildrenList: This field contains the identifiers for a list of
  children. As part of the keepalive message, this field together
  with the SeqNum field is used to urge those listed receivers to
  send a TRACK (for the provided SeqNum). The repair head sending
  this must have been missing the regular TRACKs from these children
  for an extended period of time.

- SenderTimestamp: This field is included in Data messages to signal
  the need to do a roundtrip time measurement from the sender,
  through the tree, and back to the sender. It is the time (measured
  by the sender's local clock) when it sent the packet.

- AppSynch: a sequence number signaling a request for confirmed
  delivery by the application.

- End: indicates that this packet is the end of the data for this
  session.

- Rate: This field is used by the sender to tell the receivers its
  sending rate, in packets per second. It is part of the data or
  nulldata messages.

- HighestReleased: This field contains a sequence number,
  corresponding to the trailing edge of the sender's retransmission
  window. It is used (as part of the data, nulldata or
  retransmission headers) to inform the receivers that they should
  no longer attempt to recover those messages with a smaller (or
  same) sequence number.

- HighestAllowed: a sequence number, used for flow control from the
  receivers. It signals the highest sequence number the sender is
  allowed to send that will not overrun the receivers' buffer pools.

- BitMask: an array of 1's and 0's. Together with a sequence number
  it is used to indicate lost data messages.
  If the i'th element is a 1, it indicates the message SeqNum+i is
  lost.

- Slowest: This field contains a value that characterizes the
  slowest receiver in the subtree beneath (and including) the node
  sending the TRACK. This is used to provide information for the
  congestion control BB, and the aggregation methods on this
  information are defined by that BB.

- ParentThere: This field indicates to the parent that the receiver
  sending the TRACK has not been receiving the regular keepalive
  messages from its parent, and is wondering if it needs to find a
  new parent.

- SenderDallyTime: This field is associated with a SenderTimestamp
  field. It contains the sum of the waiting time that should be
  subtracted from the RTT measurement at the sender.

- LocalDallyTime: This is the same as the SenderDallyTime, but is
  associated with a ParentTimestamp instead of a SenderTimestamp.

- ApplicationConfirms: This is the SeqNum value for which delivery
  has been confirmed by all children at or below this parent.

- FailedChildren: This is a list of all children that have recently
  been dropped from the repair tree.

5. Global Configuration Variables, Constants, and Reason Codes

5.1 Global Configuration Variables

These are variables that control the data session and are advertised
to all participants.

@ TimeMaxBindResponse: the time, in seconds, to wait for a response
  to a BindRequest. Initial value is TIMEOUT_PARENT_RESPONSE
  (recommended value is 3). Maximum value is
  MAX_TIMEOUT_PARENT_RESPONSE.

@ MaxChildren: The maximum number of children a repair head is
  allowed to handle. Recommended value: 32.

@ ConstantHeartbeatPeriod: Instead of dynamically calculating the
  HeartbeatPeriod as described in Section 7.1.5, a constant period
  may be used. Recommended value: 3 seconds.

@ MinimumHeartbeatPeriod: The minimum value for the dynamically
  calculated HeartbeatPeriod.
  Recommended value: 1 second.

@ MinHoldTime: The minimum amount of time a repair head holds on to
  data packets.

@ MaxHoldTime: The maximum amount of time a repair head holds on to
  data packets.

@ AckWindow: The number of packets seen before a receiver issues an
  acknowledgement. Recommended value: 32.

5.2 Constants

@ NUM_MAX_PARENT_ATTEMPTS: The number of times to try to bind to a
  repair head before declaring a PARENT_UNREACHABLE error.
  Recommended value is 5.

@ NULL_DATA_PERIOD: The time between transmission of NullData
  Messages. Recommended value is 1.

@ FAILURE_DETECTION_REDUNDANCY: The number of times a message is
  sent without receiving a response before declaring an error.
  Recommended value is 3.

@ MAX_TRACK_TIMEOUT: The maximum value for TRACKTimeout. Recommended
  value is 5 seconds.

5.3 Reason Codes

@ BindReject reason codes

   o LOOP_DETECTED
   o MAX_CHILDREN_EXCEEDED

@ UnbindRequest reason codes

   o SESSION_DONE
   o APPLICATION_REQUEST
   o RECEIVER_TOO_SLOW

@ EjectRequest reason codes

   o PARENT_LEAVING
   o PARENT_FAILURE
   o CHILD_TOO_SLOW
   o PARENT_OVERLOADED

6. External APIs

This section describes external interfaces for the building block.

6.1 Interfaces to the BB from PI's

6.1.1 Start(boolean RepairHead, boolean RejoinAllowed, Advertisement)

Start instructs the BB to initiate operation. RepairHead indicates
whether or not the node may also operate as a repair head. This
parameter is passed along to the tree BB. RejoinAllowed indicates
whether or not the node is allowed to rejoin the session if the only
repair heads available are missing some repair data needed by this
node. This parameter also controls whether or not the node is
allowed to join the session after the first data messages have
become unrecoverable (late join). The BB uses this parameter to
decide whether or not to use a particular repair head (chosen by the
tree BB) based on its available repair data.
The Advertisement parameter passes to the BB all of the parameters
from the session advertisement.

6.1.2 End

End instructs the BB to end its operation. If the node is the
Sender, it indicates to the group that the last data message is the
final one for the session. Once a receiver has received all of the
session's data, it MAY unbind from its parent. However, if the
receiver is also a repair head, it continues to operate as a repair
head until all of its children have finished. Then it MAY unbind
from its own parent.

If End is called at a repair head, it MUST use the multicast Eject
procedure to inform its children for this session that it is leaving
the group. Once the procedure is complete (all children have
acknowledged receipt of the Eject, or the Eject has been sent the
maximum number of times), the repair head MAY unbind from its own
parent.

If End is called at a receiver, it MUST use the Unbind procedure to
inform its parent for this session that it is leaving the group.

6.1.3 incomingMessage(Message)

incomingMessage presents the BB with a message received by the PI.

6.1.4 getStatistics

getStatistics returns current BB statistics to the upper BB or PI.

6.1.5 MessageSynched(Message)

MessageSynched tells the BB that the indicated message has been
synched with the application.

6.1.6 RepairHead(boolean)

RepairHead tells the BB whether or not it is now acting as a Repair
Head.

6.2 Interfaces from the BB to the PI

6.2.1 outgoingMessage(Message)

outgoingMessage instructs the PI to send the message.

6.2.2 MessageReceived(Message, boolean Synch)

MessageReceived passes a data message up to the PI. Synch indicates
whether or not the PI should call MessageSynched once the message
has been consumed by the application.

6.2.3 SenderLost

SenderLost tells the PI that contact with the sender has been lost.
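
As an illustration only, the interplay between incomingMessage
(6.1.3), MessageReceived (6.2.2) and MessageSynched (6.1.5) might be
sketched as follows. The class names, the app_synch flag, and the
pending_synch and app_confirmed sets are hypothetical and are not
defined by this document; a real PI would call MessageSynched only
after the application has actually consumed the message.

```python
from dataclasses import dataclass


@dataclass
class Message:
    seq_num: int
    payload: bytes
    app_synch: bool = False  # hypothetical stand-in for an AppSynch request


class ToyTrackBB:
    """Hypothetical BB side of the interface (not the normative API)."""

    def __init__(self, pi):
        self.pi = pi
        self.pending_synch = set()   # SeqNums awaiting application confirmation
        self.app_confirmed = set()   # SeqNums the application has confirmed

    def incoming_message(self, msg):
        # 6.1.3: the PI hands a received message to the BB.
        if msg.app_synch:
            self.pending_synch.add(msg.seq_num)
        # 6.2.2: the BB passes the data up, flagging whether a synch is wanted.
        self.pi.message_received(msg, synch=msg.app_synch)

    def message_synched(self, msg):
        # 6.1.5: the PI reports that the application has consumed the message.
        if msg.seq_num in self.pending_synch:
            self.pending_synch.remove(msg.seq_num)
            self.app_confirmed.add(msg.seq_num)


class ToyPI:
    """Hypothetical PI that 'consumes' every message immediately."""

    def __init__(self):
        self.bb = ToyTrackBB(self)
        self.delivered = []

    def message_received(self, msg, synch):
        self.delivered.append(msg.seq_num)  # the application consumes the data
        if synch:
            self.bb.message_synched(msg)
```

In this sketch only messages carrying the synch request end up in
app_confirmed; all other messages are delivered without any
application level confirmation.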

6.2.4 UnrecoverableData

UnrecoverableData indicates to the PI that the BB was unable to
recover some session data.

6.2.5 SessionDone

SessionDone indicates to the PI that the sender has completed
sending the data, and the node has left the session.

7. Algorithms

7.1 Tree Based Session Creation and Maintenance

7.1.1 Overview of Tree Configuration

Before a Data Session starts delivering data, the tree for the Data
Session needs to be created. This process binds each Receiver to
either a Repair Head or the Sender, and binds the participating
Repair Heads into a loop-free tree structure with the Sender as the
root of the tree. This process requires tree configuration
knowledge, which can be provided with some combination of manual
and/or automatic configuration. The algorithms for automatic tree
configuration are part of the Automatic Tree Configuration BB. They
return to each node the address of the parent it should bind to, as
well as zero or more backup parents to use if the primary parent
fails.

In addition to receiving the tree configuration information, the
receivers all receive a Session Advertisement message from the
senders, informing them of the Data Multicast Address and other
session configuration information. This advertisement may contain
other relevant session information such as whether or not Repair
Heads should be used, whether manual or automatic tree configuration
should be used, the time at which the session will start, and other
protocol settings. This advertisement is created as part of either
the PI or as part of an external service. In this way, the Sender
enforces a set of uniform Session Configuration Parameters on all
members of the session.

As described in the automatic tree configuration BB, the general
algorithm for a given node in tree creation is as follows.

1) Get advertisement that a session is starting
2) Get list of neighbor candidates using the getSNs Tree BB
   interface, contact them
3) Select best neighbor as parent in a loop free manner
4) Bind to parent
5) Optionally, later rebind to another parent

When a child finishes step 4, it is up to automatic tree
configuration to, if necessary, continue building the tree in order
to connect the node back to the Sender. After the session is
created, children can unbind from their parents and bind again to
new parents. This happens when faults occur, or as part of a tree
optimization process.

Steps 1 through 3 are external to the TRACK BB. Step 4 is performed
as part of session creation. Step 5 is performed as part of session
maintenance in conjunction with automatic tree building, as either
an unbind or eject, combined with another bind operation.

Once steps 1 through 3 are completed, Receivers join the Data
Multicast Address, and attempt to bind to either the Sender or a
local Repair Head. A Receiver will attempt to bind to the first node
in the tree configuration list returned by step 3, and if this
fails, it will move to the next one. A Receiver only binds to a
single Repair Head or Sender at a time for each Data Session.

The automatic tree building BB ensures that the tree is formed
without loops. As part of this, when a Repair Head has a Receiver
attempt to bind to it for a given Data Session, it may not at first
be able to accept the connection, until it is able to join the tree
itself. Because of this, a Receiver will sometimes have to
repeatedly attempt to bind to a given parent before succeeding.

Once the Sender initiates tree building, it is also free to start
sending Data messages on the Data Multicast Address.
Repair Heads and Receivers may start receiving these messages, but
may not request retransmission or deliver data to the application
until they receive confirmation that they have successfully bound to
the tree.

7.1.2 Bind

7.1.2.1 Input Parameters

In order to join a data session and bind to the tree, the following
nodes need the following parameters.

A Repair Head requires the following parameters.

- Session: the unique identifier for the data session to join,
  received from the Session Advertisement algorithm in the PI.

- ParentAddress: the address and port of the parent node to which
  the node should connect

- UDPListenPort: the number of the port on which the node will
  listen for its children's control messages

- RepairAddr: the multicast address, UDP port, and TTL on which this
  node sends control messages to its children.

A Sender requires the above parameters, except for the
ParentAddress.

A Receiver requires the above parameters, except for the
UDPListenPort and RepairAddr.

7.1.2.2 Bind Algorithm

A Bind operation happens when a child wishes to join a parent in the
distribution tree for a given data session. The Receivers initiate
the first Bind protocols to their parents, which then cause
recursive binding by each parent, up to the Sender. Each Receiver
sends a separate BindRequest message for each of the streams that it
would like to join. At the discretion of the PI, multiple
BindRequest messages may be bundled together in a single message.

A node sends a BindRequest message to its automatically selected or
manually configured parent node. The parent node sends either a
BindConfirm message or a BindReject message.
Reception of a BindConfirm message terminates the algorithm successfully, while receipt of a BindReject message causes the node to either retry the same parent or restart the Bind algorithm with its next parent candidate (depending on the BindReject reason code), or if it has none, to declare a REJECTED_BY_PARENT error. Once the node is accepted by a Repair Head, it informs the Tree BB using the setSN interface.

Reliability is achieved through the use of a standard request-response protocol. At the beginning of the algorithm, the child initializes TimeMaxBindResponse to the constant TIMEOUT_PARENT_RESPONSE and initializes NumBindResponseFailures to 0. Every time it sends a BindRequest message, it waits TimeMaxBindResponse for a response from the parent node. If no response is received, the node doubles its value for TimeMaxBindResponse, but limits TimeMaxBindResponse to be no larger than MAX_TIMEOUT_PARENT_RESPONSE. It also increments NumBindResponseFailures, and retransmits the BindRequest message. If NumBindResponseFailures reaches NUM_MAX_PARENT_ATTEMPTS, it reports a PARENT_UNREACHABLE error.

When a parent receives a BindRequest message, it first consults the automatic tree building BB for approval (using the acceptChild Tree BB interface), for instance to ensure that accepting the BindRequest will not cause a loop in the tree. Then the parent checks to be sure that it does not have more than MaxChildren children already bound to it for this session. If it can accept the child, it sends back a BindConfirm message. Otherwise, it sends the node a BindReject message. Then the parent checks to see if it is already a member of this data session. If it is not yet a member of this session, it attempts to join the tree itself.

The BindConfirm message contains the lowest sequence number that the repair head has available.
If this number is 0 or 1, then the repair head has all of the data available from the start of the session. Otherwise, the requesting node is attempting a late join, and can only use this repair head if late join was allowed by the PI. If late join is not allowed, the node may try another repair head, or give up. Similarly, if a failure recovery occurs, when a node tries to bind to a new repair head, it must follow the same rules as for a late join. See section 7.1.7.

7.1.3 Unbind

A child may decide to leave a data session for the following reasons.

1) It detects that the data session is finished.
2) The application requests to leave the data session.
3) It is not able to keep up with the data rate of the data session.

When any of these conditions occurs, it initiates an Unbind process.

An Unbind is, like the Bind function, a simple request-reply protocol. Unlike the Bind function, it only has a single response, UnbindConfirm. With this exception, the Unbind operation uses the same state variables and reliability algorithms as the Bind function.

When a child receives an UnbindConfirm message from its parent, it reports a LEFT_DATA_SESSION_GRACEFULLY event. If it does not receive this message after NUM_MAX_PARENT_ATTEMPTS, then it reports a LEFT_DATA_SESSION_ABNORMALLY event. Unbinds are reported to the Tree BB using the lostSN interface.

7.1.4 Eject

A parent may decide to remove one or more of its children from a data stream for the following reasons.

1) The parent needs to leave the group due to application reasons.
2) The repair head detects an unrecoverable failure with either its parent or the sender.
3) The parent detects that the child is not able to keep up with the speed of the data stream.
4) The parent is not able to handle the load of its children and needs some of them to move to another parent.

In the first two cases, the parent needs to multicast the advertisement of the termination of one or more data sessions to all of its children.
In the last two cases, it needs to send one or more unicast notifications to one or more of its children.

Consequently, an Eject can be done either with a repeated multicast advertisement message to all children, or with a set of unicast request-reply messages to the subset of children that need to be ejected.

For the multicast version of Eject, the parent sends a multicast UnbindRequest message to all of its children for a given Data Session, on its Local Multicast Channel. It is only necessary to provide statistical reliability on this message, since children will detect the parent's failure even if the message is not received. Therefore, the UnbindRequest message is sent FAILURE_DETECTION_REDUNDANCY times.

For the unicast version of Eject, the parent sends a unicast UnbindRequest message to each of the children it needs to eject. Each of them responds with an EjectConfirm. Reliability is ensured through the same request-reply mechanism as the Bind operation.

Ejections are reported to the Tree BB using the removeChild interface.

7.1.5 Fault Detection

There are three cases where fault detection is needed.

1) Detection (by a child) that a parent has failed.
2) Detection (by a parent) that a child has failed.
3) Detection (by either a Repair Head or Receiver) that a Sender has failed.

In order to be scalable and efficient, fault detection is primarily accomplished by periodic keep-alive messages, combined with the existing TRACK messages. Nodes expect to see keep-alive messages every set period of time. If more than a fixed number of periods go by, and no keep-alive messages of a given type are received, the node declares a preliminary failure. The detecting node may then ping the potentially failed node before declaring it failed, or it can just declare it failed.

Failures are detected through three keep-alive messages: Heartbeat, TRACK, and NullData.
The Heartbeat message is multicast periodically from a parent to its children on its local control channel. NullData messages are multicast by a sender on the global multicast address when it has no data to send. TRACK messages are generated periodically, even if no data is being sent to a data session, as described in section 7.2.

Heartbeat messages are multicast every HeartbeatPeriod seconds, from a parent to its children. Every time that a parent sends a Retransmission message or a Heartbeat message (as well as at initialization time), it resets a timer for HeartbeatPeriod seconds. If the timer goes off, a Heartbeat is sent. The HeartbeatPeriod is dynamically computed as follows:

   interval = AckWindow / PacketRate
   HeartbeatPeriod = 2 * interval

Global configuration parameters ConstantHeartbeatPeriod and MinimumHeartbeatPeriod can be used to either set HeartbeatPeriod to a constant, or give HeartbeatPeriod a lower bound, globally.

Similarly, a NullData message is multicast by the sender to all data session members, every NULL_DATA_PERIOD. The NullData timer is set to NULL_DATA_PERIOD, and is reset every time that a Data or NullData message is sent by the Sender.

The key parameter for failure detection is the global tree parameter FAILURE_DETECTION_REDUNDANCY. The higher the value for this parameter, the more keep-alive messages that must be missed before a failure is declared.

A major goal of failure detection is for children to detect parent failures fast enough that there is a high probability they can rejoin the stream at another parent, before flow control has advanced the buffer window to a point where the child cannot recover all lost messages in the stream. In order to attempt to do this, children detect a failure of a parent if FAILURE_DETECTION_REDUNDANCY * HeartbeatPeriod time goes by without any heartbeats.
As part of buffer window advancement, described in section 7.2.4, all parents MAY choose to buffer all messages for a minimum of FAILURE_DETECTION_REDUNDANCY * 2 * HeartbeatPeriod seconds, which gives children a period of time to find a new parent before the buffers are freed. Children report parent failures to the Tree BB using the lostSN interface.

A parent detects a preliminary failure of one of its children if it does not receive any TRACK messages from that child in FAILURE_DETECTION_REDUNDANCY * TRACKTimeout seconds (see the discussion of how TRACKTimeout is computed in 7.2.1). Because a failed child can slow down the group's progress, it is very important that a parent resolve the child's status quickly. Once a parent declares a preliminary failure of a child, it issues a set of up to FAILURE_DETECTION_REDUNDANCY Heartbeat messages that are unicast (or multicast) to the failed receiver(s). These messages are spaced apart by 2*LocalRTT, where LocalRTT is the round trip time that has been measured to the child in question (see 7.4 for a description of how LocalRTT is measured). These Heartbeat messages contain a ChildrenList field that lists the children that are requested to send a TRACK immediately.

Whenever a child receives a Heartbeat message with an ImmediateTRACK field set to 1, it immediately sends a TRACK to its parent. If a parent does not receive a TRACK message from a child after waiting a period of 2*ChildRTT after the last Heartbeat message to that child, it declares the child failed, and removes it from the parent's child membership list. It informs the Tree BB using the removeChild interface.

A child or a repair head detects the failure of a sender if it does not receive a Data or NullData message from a sender in FAILURE_DETECTION_REDUNDANCY * NULL_DATA_PERIOD.
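
As a non-normative illustration, the keep-alive timeouts of this section might be sketched as follows. The class and function names (KeepAliveMonitor, heartbeat_period) are hypothetical and do not appear in this draft; the sketch assumes a single monotonic clock expressed in seconds and treats ConstantHeartbeatPeriod and MinimumHeartbeatPeriod as optional arguments.

```python
from dataclasses import dataclass
from typing import Optional


def heartbeat_period(ack_window: int, packet_rate: float,
                     constant: Optional[float] = None,
                     minimum: Optional[float] = None) -> float:
    """HeartbeatPeriod = 2 * (AckWindow / PacketRate), unless overridden.

    `constant` stands in for ConstantHeartbeatPeriod and `minimum` for
    MinimumHeartbeatPeriod (assumed encoding: None means "not configured").
    """
    if constant is not None:
        return constant
    period = 2.0 * (ack_window / packet_rate)
    if minimum is not None:
        period = max(period, minimum)
    return period


@dataclass
class KeepAliveMonitor:
    """Tracks one keep-alive type (Heartbeat, TRACK, or NullData)."""
    period: float           # HeartbeatPeriod, TRACKTimeout, or NULL_DATA_PERIOD
    redundancy: int         # FAILURE_DETECTION_REDUNDANCY
    last_seen: float = 0.0  # arrival time of the most recent keep-alive

    def on_keepalive(self, now: float) -> None:
        self.last_seen = now

    def deadline(self) -> float:
        return self.last_seen + self.redundancy * self.period

    def has_failed(self, now: float) -> bool:
        # A (preliminary) failure is declared once `redundancy` periods
        # have elapsed with no keep-alive of this type.
        return now > self.deadline()
```

A child would keep one such monitor for its parent's Heartbeats and one for the Sender's Data and NullData messages; a parent would keep one per child for TRACKs, and would follow a preliminary failure with the Heartbeat probes described above before removing the child.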
Note that the more receivers there are in a tree, and the higher the loss rate, the larger FAILURE_DETECTION_REDUNDANCY must be, in order to give the same probability that erroneous failures won't be declared.

7.1.6 Fault Notification

When a parent detects the failure of a child, it adds a failure notification field to the next TRACK messages that it sends up the tree. It sends this notification multiple times because TRACKs are not delivered reliably. A failure notification field includes the failure code, as well as a list of one or more failed nodes. Failure notifications are aggregated up the tree, according to the rules in 7.3.

A failure notification is not a definitive report of a failure, as the child may have moved to a different repair head.

7.1.7 Fault Recovery

The Fault Recovery algorithms require a list of one or more addresses of alternate parents that can be bound to, and that still provide loop free operation.

If a child detects the failure of its parent, it then re-runs the Bind operation to a new parent candidate, in order to rejoin the tree.

As described above in section 7.1.2, a node may perform a late join, i.e. binding with a repair head which cannot provide all the necessary repair data, only if allowed by the PI.

7.2 TRACK Generation

This section describes the algorithms used by the receiver to determine when to send the TRACK messages. TRACK messages are sent from receivers to their parents. TRACK messages may be sent for the following purposes:

- to request retransmission of messages
- to advance the sender's transmission window for flow control purposes
- to deliver end-to-end confirmation of data reception
- to propagate other relevant feedback information up through the session (such as RTT and loss reports, for congestion control)

The TRACK PI also makes use of the NACK BB, which requests retransmission of messages from a parent.
The TRACK request and response algorithms should be highly similar to the NACK algorithms for this specific case.

7.2.1 TRACK Generation with the Rotating TRACK Algorithm

Each receiver sends a TRACK message to its parent once per AckWindow of data messages received. A receiver uses an offset from the boundary of each AckWindow to send its TRACK, in order to reduce burstiness of control traffic at the parents. Each parent has a maximum number of children, MaxChildren. When a child binds to the parent, the parent assigns a locally unique ChildID to that child, between 0 and MaxChildren-1.

Each child in a tree generates a TRACK message at least once every AckWindow of data messages, when the most recent data message's sequence number, modulo AckWindow, is equal to its ChildID. If the message that would have triggered a given TRACK for a given node is missed, the node will generate the TRACK as soon as it learns that it has missed the message, typically through receipt of a higher numbered data message.

Together, AckWindow and MaxChildren determine the maximum ratio of control messages to data messages seen by each parent, given a constant load of data messages.

In each data message, the sender advertises the current PacketRate (measured in messages per second) at which it is sending data. This rate is generated by the congestion control algorithms in use at the sender. At the time a node sends a regular TRACK, it also computes a TRACKTimeout value:

   interval = AckWindow / PacketRate
   TRACKTimeout = 2 * interval

If no TRACKs are sent within the TRACKTimeout interval, a TRACK is generated, and TRACKTimeout is increased by a factor of 2, up to a value of MAX_TRACK_TIMEOUT.

This timer mechanism is used by a receiver to ensure timely repair of lost messages and regular feedback propagation up the tree even when the sender is not sending data continuously. This mechanism complements the AckWindow-based regular TRACK generation mechanism.
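
As a non-normative illustration, the two TRACK triggers of this section might be sketched as follows. The function names are hypothetical, the child's locally assigned identifier is passed as child_id, and sequence number wraparound is ignored for simplicity; a real implementation would need sequence number arithmetic.

```python
def track_triggered(prev_highest: int, new_seq: int,
                    ack_window: int, child_id: int) -> bool:
    """Return True if a TRACK is due after receiving message `new_seq`.

    `prev_highest` is the highest sequence number seen before this
    message. Scanning the whole range (prev_highest, new_seq] also
    covers the case where the triggering message itself was lost and
    the loss is learned from a higher numbered message.
    """
    return any(seq % ack_window == child_id
               for seq in range(prev_highest + 1, new_seq + 1))


def track_timeout(ack_window: int, packet_rate: float) -> float:
    """TRACKTimeout = 2 * (AckWindow / PacketRate), in seconds."""
    return 2.0 * (ack_window / packet_rate)


def back_off(timeout: float, max_track_timeout: float) -> float:
    """Double TRACKTimeout after a timer-driven TRACK, capped at
    MAX_TRACK_TIMEOUT."""
    return min(2.0 * timeout, max_track_timeout)
```

With AckWindow = 8, a child whose identifier is 3 sends a TRACK on receipt of message 3, or on receipt of message 5 if message 3 never arrived. The timer path only matters when the sender slows down or goes idle.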

7.2.2 Local Repair

A repair head maintains the following state for each of its children, for the purpose of providing repair service to the local group:

- HighestConsecutivelyReceived: a sequence number indicating all Data messages up to this number (inclusive) have been received by a given child.
- MissingPackets: a data structure to keep track of the reception status of the Data messages with sequence number higher than HighestConsecutivelyReceived.

In addition, a repair head also maintains other state for purposes of feedback aggregation, described in section 7.3.

The minimum HighestConsecutivelyReceived value of all its children is kept as the variable LocalStable.

A repair head also maintains a retransmission buffer. The size of the retransmission buffer must be greater than the maximum value of a sender's transmission window. The retransmission buffer must keep all the Data messages received by the repair head with sequence number higher than LocalStable, and optionally some messages with sequence number lower than LocalStable if there is room (beyond the maximum value of the sender's transmission window). The latter messages are kept in the retransmission buffer in case a receiver from another group loses its parent and needs to join this group.

As TRACK messages are received, the repair head updates the above state.

To perform local repair, a repair head implements a retransmission queue with memory. Each lost message (reported by a child using the BitMask field) is entered into the retransmission queue in increasing order according to its sequence number. If the same Data message has already been retransmitted recently (recognized due to the queue's memory) it is delayed by the local group RTT (see section 7.4) before retransmission.

The retransmissions are sent using the same PacketRate as that used by the sender.
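
As a non-normative illustration, a retransmission queue with memory might be sketched as follows. The class RetransmissionQueue and its methods are hypothetical. The sketch assumes that "recently" means within one local group RTT of the previous retransmission, and that a delayed message becomes eligible one local group RTT after the loss report arrives; this draft does not fix either interpretation. Pacing at PacketRate and sequence number wraparound are not modeled.

```python
from typing import Dict, Optional


class RetransmissionQueue:
    def __init__(self, local_rtt: float) -> None:
        self.local_rtt = local_rtt
        self._ready_at: Dict[int, float] = {}   # queued seq -> earliest send time
        self._last_sent: Dict[int, float] = {}  # the queue's "memory"

    def report_loss(self, seq: int, now: float) -> None:
        """Queue a message reported missing by a child (e.g. via BitMask)."""
        if seq in self._ready_at:
            return  # already queued; duplicate reports are merged
        last = self._last_sent.get(seq)
        recently_sent = last is not None and now - last < self.local_rtt
        # A recently retransmitted message is held back by one local RTT,
        # since the earlier repair may still be in flight.
        self._ready_at[seq] = now + self.local_rtt if recently_sent else now

    def pop_ready(self, now: float) -> Optional[int]:
        """Return the lowest eligible sequence number, or None."""
        ready = [seq for seq, t in self._ready_at.items() if t <= now]
        if not ready:
            return None
        seq = min(ready)
        del self._ready_at[seq]
        self._last_sent[seq] = now
        return seq
```

The linear scan in pop_ready keeps the sketch short; an implementation with large windows would more likely keep the queue in a sorted structure.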

7.2.3 Flow Control Window Update

When a receiver sends a TRACK to its parent, the HighestAllowed field provides information on the status of the receiver's flow control window. The value of HighestAllowed is computed as follows:

   HighestAllowed = seqnum + ReceiverWindow

where seqnum is the highest sequence number of consecutively received data messages at the receiver. The size of the ReceiverWindow may either be based on a parameter local to the receiver or be a global parameter.

7.2.4 Reliability Window

The sender and each repair head maintain a window of messages for possible retransmission. As messages are acknowledged by all of its children, they are released from the parent's retransmission buffer, as described in 7.2.2. In addition, there are two global parameters that can affect when a parent releases a data message from the retransmission buffer: MinHoldTime and MaxHoldTime.

MinHoldTime specifies a minimum length of time a message must be held for retransmission from when it was received. This parameter is useful to handle scenarios where one or more children have been disconnected from their parent, and have to reconnect to another. If, for example, MinHoldTime is set to FAILURE_DETECTION_REDUNDANCY * 2 * ConstantHeartbeatPeriod, then there is a high likelihood that any child will be able to recover any lost messages after reconnecting to another parent.

The sender continually advertises to the members of the data session both edges of its retransmission window. The higher value is the SeqNum field in each Data or NullData message, which specifies the highest sequence number of any data message sent. The trailing edge of the window is advertised in the HighestReleased field. This specifies the largest sequence number of any message sent that has subsequently been released from the sender's retransmission window.
If both values are the same then the window is presently empty. Zero is not a legitimate value for a data sequence number, so if either field has a value of zero, then no messages have yet reached that state. All sequence number fields use sequence number arithmetic so that a data session can continue after exhausting the sequence number space.

When a member of a data session receives an advertisement of a new HighestReleased value, it stores this, and is no longer allowed to ask for retransmission for any messages up to and including the HighestReleased value. If it has any outstanding missing messages that are less than or equal to HighestReleased, it MAY move forward and continue delivering the next data messages in the stream. It also SHOULD report an error for the messages that are no longer recoverable.

MaxHoldTime specifies the maximum length of time a message may be held for retransmission. This parameter is set at the sender, which uses it to set the HighestReleased field in data message headers. This is particularly useful for real-time, semi-reliable streams such as live video, where retransmissions are only useful for up to a few seconds. When combined with Unordered delivery semantics, and application-level jitter control at the receivers, this provides Time Bounded Reliability. MaxHoldTime must always be larger than MinHoldTime.

7.2.5 Confirmed Delivery

Flow control and the reliability window are concerned with goodput, that is, with delivering data with a high probability that it is delivered to all receivers. However, neither mechanism provides explicit confirmation to the sender as to the list of recipients for each message. Confirmed delivery allows applications to determine the set of applications that have received a set of data messages.
To request this service, a sender fills the AppSynch field of data messages with the sequence number of the highest data message it wishes to confirm delivery of. It continues to do so until it receives confirmation, moves the AppSynch point forward to a higher sequence number, or declares an error.

When a receiver gets a data message with a non-zero AppSynch field, it starts including the highest sequence number that has been acknowledged by the application in the ApplicationConfirms field of each TRACK message that it sends up the tree. In order to provide reliable delivery of this acknowledgement, this continues so long as a receiver gets data messages with non-zero AppSynch fields.

Each receiver is responsible for locally deciding the value of the ApplicationConfirms field. There are two primary issues a receiver must consider in setting this field: the reliability semantics of the data stream, and when a given message is considered confirmed at the receiver.

As this is an application level confirmation, a handshake with the application is required to get this confirmation. One example of how an application can implicitly signal confirmation of delivery is through the freeing of buffers passed to it by the transport. The API could specify that whenever an application has freed up a buffer containing one or more data messages, then these messages are considered acknowledged by the application. Alternatively, the application could be required to explicitly acknowledge each message.

With a given transport-application API for signaling acknowledgement, the transport then keeps track of all contiguous acknowledgements from that application, and reports these up in the ApplicationConfirms field.
If one or more messages cannot be acknowledged, the receiver should pass an error code describing the type of failure that occurred, and the sequence number of the first message that has not yet been delivered.

If MaxHoldTime is not in use for a data stream, so that delivery is fully reliable, then any message that cannot be delivered will be considered a fatal error for that receiver. If MaxHoldTime has a non-zero value, then any messages that could not be delivered, but are less than HighestReleased as advertised by the sender, are not reported as errors.

In addition to the AppSynch field, a sender may also set the ImmediateACK field to 1. When a node gets a data message that has this flag set, it will immediately send a TRACK after processing that message.

7.3 Feedback Aggregation

This section describes how repair heads perform aggregation on feedback information sent up in the fields of the TRACK message, and the purposes for performing such aggregation.

There are many reasons for providing feedback from all the receivers to the sender in an aggregated form. The major ones are listed below:

1) End-to-end delivery confirmation. This confirmation tells the sender that all the receivers (in the entire tree) have received data packets up to a certain sequence number. The field that carries this information is AppSynch.

2) Flow control. The aggregated information is carried in the field HighestAllowed. It tells the sender the highest sequence number that all the receivers (in the entire tree) are prepared to receive.

3) Identifying the slowest receiver. The aggregated information is carried in the field Slowest. The sender can use this value as part of congestion control.

4) Counting current membership in the group. This information is carried in the field SubTreeCount. This lets the sender know the number of receivers currently connected to the repair tree.

5) Measuring the round-trip time from the sender to the "worst" receiver.

A repair head maintains state for each child. Each time a TRACK (from a child) is received, the corresponding state for that child is updated based on the information in the TRACK message. When a repair head sends a TRACK message to its parent, the following fields of its TRACK message are derived from the aggregation of the corresponding state for its children. The following rules describe how the aggregation is performed:

- AppSynch: take the minimum of the AppSynch value from all children
- HighestAllowed: take the minimum of the HighestAllowed value from all children
- Slowest: this is a measure of how slow the slowest member in the whole subtree is; take either the minimum or the maximum of the Slowest value from all children (depending on what the Slowest measure is)
- SubTreeCount: take the sum of the SubTreeCount from all children
- SenderDallyTime: take the minimum value, for all of the children, of the child's reported SenderDallyTime + the child's local dally time

Note that the SendTimeStamp field is left unchanged. The sender will derive the round trip time to the worst receiver by doing its local aggregation for SenderDallyTime and then computing:

   RTT = currentTime - SendTimeStamp - SenderDallyTime

7.4 Measuring Round Trip Times

This TRACK BB provides two algorithms for distributed RTT calculations: LocalRTT measurements and SenderRTT measurements. LocalRTT measurements are only between a parent and its children. SenderRTT measurements are end-to-end RTT measurements, measuring the RTT to the worst receiver as selected by the congestion control algorithms.

The SenderRTT is useful for congestion control. It can be used to set the data rate based on the TCP response function, which is being proposed for the congestion control building block.
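
As a non-normative illustration, the aggregation rules of section 7.3 and the sender's RTT computation might be sketched as follows. The ChildState record and the function names are hypothetical. The sketch assumes that a lower Slowest value means a slower member, so the minimum is taken, and that each child's local dally time is available to the repair head as a per-child value in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ChildState:
    """Per-child state held by a repair head (hypothetical layout)."""
    app_synch: int
    highest_allowed: int
    slowest: float            # assumed: lower value = slower member
    subtree_count: int        # assumed to be 1 for a leaf receiver
    sender_dally_time: float  # as reported in the child's last TRACK
    local_dally_time: float   # the child's local dally time (section 7.3)


def aggregate(children: List[ChildState]) -> Dict[str, Union[int, float]]:
    """Derive the aggregated TRACK fields a repair head sends upward."""
    return {
        "AppSynch": min(c.app_synch for c in children),
        "HighestAllowed": min(c.highest_allowed for c in children),
        "Slowest": min(c.slowest for c in children),
        "SubTreeCount": sum(c.subtree_count for c in children),
        "SenderDallyTime": min(c.sender_dally_time + c.local_dally_time
                               for c in children),
    }


def sender_rtt(current_time: float, send_timestamp: float,
               sender_dally_time: float) -> float:
    """RTT = currentTime - SendTimeStamp - SenderDallyTime."""
    return current_time - send_timestamp - sender_dally_time
```

The sender applies the same SenderDallyTime rule to its own children before calling sender_rtt, since the SendTimeStamp value travels up the tree unmodified.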
The LocalRTT can be used to (a) quickly detect faulty children (as described in 7.1) or (b) avoid sending unnecessary retransmissions (as described in 7.2 in the local repair algorithm).

In the case of LocalRTT measurements, a parent initiates measurement by including a ParentTimestamp field in a Heartbeat message sent to its children. When a child receives a Heartbeat message with this field set, it notes the time of receipt using its local system clock, and stores this with the message as HeartbeatReceiveTime. When the child next generates a TRACK, just before sending it, it measures its system clock again as TRACKSendTime, and calculates the LocalDallyTime:

   LocalDallyTime = TRACKSendTime - HeartbeatReceiveTime

The child includes this value, along with the ParentTimestamp field, as fields in the next TRACK message sent. Every heartbeat message that is multicast to all children SHOULD include a ParentTimestamp field.

The SenderRTT algorithm is similar. A sender initiates the process by including a SenderTimestamp field in a data message. When a receiver gets a message with this field set, it keeps track of the DataReceiveTime for that message, and when it generates the next TRACK message, includes the SenderTimestamp and SenderDallyTime value. These values are aggregated by Repair Heads, as described in section 7.3.

Each node only keeps track of the most recent value for {SenderTimestamp, DataReceiveTime} and {ParentTimestamp, HeartbeatReceiveTime}, replacing any older values any time that a new message is received with these values set. As long as it has non-zero values to report, each node sends up both a {SenderTimestamp, SenderDallyTime} and a {ParentTimestamp, LocalDallyTime} set of fields in each TRACK message generated.

These measurements need to be averaged by the TRACK PI.

8. Security

This BB does not specifically deal with security.
It is the responsibility of the TRACK PI or the Security BB. This issue is covered in the Security Requirements For TRACK draft [HW00].

9. References

[HW00] T. Hardjono, B. Whetten, "Security Requirements For TRACK," Internet Draft, Internet Engineering Task Force, June 2000.

[KLCWTCTK01] M. Kadansky, D. Chiu, B. Whetten, B. Levine, G. Taskale, B. Cain, D. Thaler, S. Koh, "Reliable Multicast Transport Building Block: Tree Auto-Configuration," Internet Draft, Internet Engineering Task Force, March 2001.

[KV00] R. Kermode, L. Vicisano, "Author Guidelines for RMT Building Blocks and Protocol Instantiation Documents," Internet Draft, Internet Engineering Task Force, June 2000.

[SFCGLTLLBEJMRSV00] T. Speakman, D. Farinacci, J. Crowcroft, J. Gemmell, S. Lin, A. Tweedly, D. Leshchiner, M. Luby, N. Bhaskar, R. Edmonstone, K. M. Johnson, T. Montgomery, L. Rizzo, R. Sumanasekera, and L. Vicisano, "PGM Reliable Transport Protocol Specification," Internet Draft, Internet Engineering Task Force, November 2000.

[WCP00] B. Whetten, D. Chiu, S. Paul, M. Kadansky, G. Taskale, "TRACK Architecture, A Scalable Real-Time Reliable Multicast Protocol," Internet Draft, Internet Engineering Task Force, July 2000.

[WVKHFL00] B. Whetten, L. Vicisano, R. Kermode, M. Handley, S. Floyd, and M. Luby, "Reliable Multicast Transport Building Blocks for One-to-Many Bulk-Data Transfer," RFC 3048, January 2001.

10. Acknowledgements

We would like to thank the following people: Sanjoy Paul, Seok Joo Koh, Supratik Bhattacharyya, Joe Wesley, and Joe Provino.

11. Authors' Addresses

Dah Ming Chiu
dahming.chiu@sun.com
Miriam Kadansky
miriam.kadansky@sun.com
Sun Microsystems Laboratories
1 Network Drive
Burlington, MA 01803

Gursel Taskale
gursel@talarian.com
Brian Whetten
whetten@talarian.com
Talarian
333 Distel Circle
Los Altos, CA 94022-1404

Full Copyright Statement

"Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."