Internet Engineering Task Force                                   RMT WG
INTERNET-DRAFT                                   M.Luby/Digital Fountain
draft-ietf-rmt-bb-lct-00.txt                         J.Gemmell/Microsoft
                                                        L.Vicisano/Cisco
                                            L.Rizzo/ACIRI and Univ. Pisa
                                                        M.Handley/ACIRI
                                                        J. Crowcroft/UCL
                                                        17 November 2000
                                                       Expires: May 2001

   Layered Coding Transport: A massively scalable multicast protocol

Status of this Document

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as a "work in progress".

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

To view the list of Internet-Draft Shadow Directories, see http://www.ietf.org/shadow.html.

This document is a product of the IETF RMT WG. Comments should be addressed to the authors, or the WG's mailing list at rmt@lbl.gov.

Abstract

This document describes Layered Coding Transport, a massively scalable multicast protocol, hereafter referred to as LCT. LCT can be used for multi-rate content delivery (for both reliable bulk data transfer and unreliable data streams) to large sets of receivers. In LCT, scalability and congestion control are supported through the use of layered coding techniques. When LCT is used for reliable data transfer, the coding also provides support for reliability.
Congestion control is receiver driven, and is achieved by sending packets in the session to multiple "LCT groups", and having receivers join and leave LCT groups (thus adjusting their reception rate) in reaction to network conditions in a manner that is network friendly.

Table of Contents

   1. Introduction
      1.1. Related Documents
      1.2. Environmental Requirements
   2. General Architecture
      2.1. Delivery service models
      2.2. Congestion Control
   3. Packet Formats
      3.1. Data-Packet format
      3.2. Request-Packet format
      3.3. LCT Packet header fields
      3.4. Transmission Extensions
      3.5. Header-Extension Fields
   4. Procedures
      4.1. Sender Operation
      4.2. Receiver Operation
   5. Security Considerations
   6. IANA Considerations
   7. Intellectual Property Issues
   8. Acknowledgments
   9. Authors' Addresses
   10. Full Copyright Statement

1. Introduction

This document describes a massively scalable protocol, Layered Coding Transport (LCT), for multi-rate content delivery using IP multicast.
LCT supports both reliable and unreliable data transfer, and supports a congestion control mechanism which conforms to RFC2357.

IP multicast [5] is a "best effort" service and does not guarantee packet reception or reception order. It also does not provide any support for flow or congestion control. While the basic service provided by IP multicast is largely scalable, adding features such as congestion control or reliability on top of it might cause severe scalability limitations, especially in the presence of heterogeneous sets of receivers.

Scalability refers to the behaviour of the protocol in relation to the number of receivers and network paths, their heterogeneity, and the ability to accommodate dynamically variable groups. Scalability limitations can come from memory or processing requirements, or from the amount of traffic generated by the protocol. In turn, such limitations derive from the features that a multicast transport protocol is expected to provide.

Congestion control refers to the ability of the protocol to adapt its throughput to the available bandwidth on the path to the receivers, and to share bandwidth fairly with competing flows such as TCP. It is required that protocols implement some form of congestion control so that they do not compete unfairly with existing adaptive protocols such as TCP.

Multi-rate protocols aim at splitting the set of receivers into multiple subsets, according to the available bandwidth each one has to the source. Conversely, single-rate multicast protocols make all receivers in a session experience the same throughput. The partitioning of receivers can be done statically or adaptively.

Layered coding refers to the ability to produce a coded stream that can be split into multiple substreams (transmitted over different multicast groups).
The coded stream can be generated either from a fixed piece of content, or from an ongoing data stream, and has the property that the quality experienced by a receiver (in terms of quality of playout, or overall transfer speed) is proportional to the number of substreams the receiver has joined. Layered congestion control that is compliant with RFC 2357 must be used by receivers to dynamically adjust their reception rate by appropriately joining and/or leaving groups carrying the substreams.

The concept of layered coding was first introduced with reference to audio and video streams. For example, the information associated with a TV broadcast can be thought of as made up of three layers, corresponding to black and white, color, and HDTV quality. Receivers can experience different quality without the need for the sender to replicate information in the different layers.

The concept of layered coding can be naturally extended to reliable bulk data transfer protocols when Forward Error Correction (FEC) techniques are used for coding the data stream [15] [16] [7] [17] [18] [3]. By using FEC, the data stream is transformed in such a way that reconstruction of a data object does not depend on the reception of specific data packets, but only on the number of different packets received. As a result, by increasing the number of groups it is receiving from, a receiver can reduce the transfer time accordingly. More details on the use of FEC for reliable multicast can be found in [11].

Reliable protocols aim at giving guarantees on the reliable delivery of data from the source to the intended recipients. Guarantees vary from simple data integrity to strict ordering and atomic delivery. Several reliable multicast protocols have been built on top of IP multicast, but scalability was not a design goal for many of them.
In some cases, scalability is achieved by introducing changes to routers or other infrastructure [PGM], an approach which has an impact on near term deployment. Two of the key difficulties in scaling reliable multicast are dealing with the amount of data that flows from receivers back to the sender, and the associated response (generally data retransmissions) from the sender. Protocols that avoid any such feedback, and minimize the amount of retransmissions, can be massively scalable. LCT relies on the availability of a layered codec to achieve reliability with little or no feedback.

In this document we present the architecture of LCT, and illustrate its support for multi-rate congestion control.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [2].

1.1. Related Documents

A more in-depth description of the use of FEC in Reliable Multicast Transport (RMT) protocols is given in [11]. Some of the FEC codecs that may be used by LCT for reliable bulk data transfer are specified in [12]. LCT reserves opaque header fields that can be used to transport information related to the payload encoding.

Implementors of LCT MUST also implement congestion control in accordance with RFC2357 [13]. One possible scheme is specified in [1]. LCT reserves opaque header fields that can be used by the congestion control scheme to transport information related to congestion control.

It is recommended that LCT implementors use some authentication scheme to protect the protocol from attacks. Suitable schemes are discussed in [].

1.2. Environmental Requirements

LCT is intended for congestion controlled, multi-rate delivery of objects (both reliable bulk data transfer and unreliable streaming of multimedia information).
LCT is most applicable for delivery of objects of substantial length, i.e., objects that range in length from hundreds of kilobytes to many gigabytes, and whose transfer time is on the order of tens of seconds or more.

LCT is directly applicable to all multicast enabled networks, including asymmetric networks, wireless networks, and satellite networks. Thus, the inherent raw scalability of LCT is unlimited. However, when other specific applications are built on top of LCT, these applications by their very nature may limit scalability. For example, if an application requires receivers to retrieve out of band information in order to join a session, or an application allows receivers to send requests back to the sender in order to extend an ongoing session, then the scalability of the application is limited by the ability to send, receive, and process this additional data.

LCT requires that the underlying network layer can deliver and demultiplex packets for a given LCT session, and supply packet length information to the LCT receiver. In IP networks, this is normally achieved by using UDP, or any protocol that can provide an equivalent service, as the underlying transport protocol.

LCT does not require reverse multicast connectivity, i.e. LCT receivers do not send multicast traffic. As such, LCT works with both the original multicast model introduced in [5], which we call Internet Standard Multicast (ISM) in this document, and with the Source Specific Multicast (SSM) model that is based on [10]. The definition of an LCT group used throughout this document differs slightly between ISM and SSM. When using ISM, packets of an LCT group are sent to a multicast group address G. When using SSM, packets of an LCT group are sent to a channel address (S,G), where S is the IP address of the sender and G is a multicast group address.

SSM is more attractive to LCT than ISM for a few reasons.
First, LCT may use several LCT groups in a session, and the large, local namespace for allocating multicast groups in SSM greatly simplifies the address allocation problem. Second, LCT over SSM performs well even in the presence of very large and dynamically changing receiver sets. Changes in the multicast tree topology with SSM are light weight operations (a new branch from the receiver towards S grows when a receiver joins, and the branch is deleted when the receiver leaves), whereas with ISM changes can be heavier weight (involving transitions from a (*,G)-tree rooted at an RP to the tree rooted at S). Third, LCT over SSM scales well even when receivers span the global Internet, as the light weight mechanisms that SSM uses to cross ISP boundaries (standard BGP+ routing tables) are a distinct advantage over the heavier weight mechanisms used by ISM (the MSDP and BGMP protocols, neither of which is needed by SSM). Finally, a receiver joins an LCT group by joining a channel (S,G) with SSM, and thus the receiver will only receive packets sent from the sender S. With ISM, the receiver joins an LCT group by joining a multicast group G, and all packets sent to G, regardless of their origin sender, will be received by the receiver. Thus, SSM has compelling security advantages over ISM for prevention of denial of service attacks.

LCT also requires receivers to obtain Session Description Information before joining a session, as described in Section 4.1. The session description could be in a form such as SDP [8], or XML metadata, or HTTP/MIME headers [6], and distributed with SAP, HTTP or in other ways.

The particular layered encoder and congestion control protocols used by LCT to provide a complete protocol have an impact on the performance and applicability of LCT.
For example, some layered encoders used for video and audio streams can produce only a very limited number of substreams, thus providing very coarse control over the throughput of a session. When LCT is used for reliable data transfer, some FEC coders are inherently limited in the size of the object they can encode, and for objects larger than this size the reception overhead on the receivers can grow substantially. As another example, some networks are not amenable to some congestion control protocols that could be used with LCT. In particular, for a satellite or wireless network, there may be no mechanism for receivers to effectively reduce their reception rate since there may be a fixed transmission rate allocated to the session.

2. General Architecture

An LCT session comprises all packets sent to one or more LCT groups from a single sender, and pertaining to the transmission of one or more objects that can be of interest for the receivers.

For example, an LCT session could be used to deliver a TV channel on three groups. The first group would allow black and white reception, the first two groups would permit color reception, whereas the set of three groups delivers HDTV quality images. Objects in this example would correspond to individual programs (movies, news, commercials) being transmitted.

As another example, a reliable LCT session could be used to reliably deliver hourly-updated weather maps (objects) using ten LCT groups at different rates, using FEC coding. A receiver may join and concurrently receive packets from subsets of these groups, until it has enough packets in total to recover the object, then leave the session (or remain there listening to control information only) until it is time to receive the next object. In this case, the quality metric is the time required to receive each object.
Before joining a session, the receivers MUST obtain a session description, which MUST include the relevant session parameters needed by a receiver to participate in the session. The session description is determined and agreed upon by the senders, and typically communicated to the receivers out of band. In some cases, part of the session description MAY be included in the header of each packet.

A layered encoder is used to generate the data that is placed in the payload of LCT packets. A suitable decoder is used to extract the original information from the payload.

LCT congestion control is achieved by sending packets associated with a given session to several LCT groups. Individual receivers dynamically join one or more of these groups, according to the network congestion as seen by the receiver. LCT packet headers include an opaque field which MUST be used to convey congestion control information to the receivers. The actual congestion control scheme to use with LCT is negotiated out-of-band. One of the algorithms that can be used to achieve congestion control in LCT is described in [1].

LCT can be used with other congestion control algorithms such as the one described in [17], or with a router-assisted scheme where the selection of which packets to forward is performed by routers. This latter approach potentially allows for finer grain congestion control and a faster reaction to network congestion, but requires changes to the router infrastructure. See [4] for a preliminary design description. We do not discuss this approach further in this document.

Depending on the service model in use, receivers can generate LCT request packets, which have no payload, and are used to request an extension of the duration of the session.

2.1. Delivery service models

LCT can support several different delivery service models. Two examples are briefly described here.
Streaming service model.

   This is the basic service model for the delivery of unreliable streams, such as the TV example of Section 1. In this case the receivers join the session, and dynamically adapt the number of LCT groups they subscribe to (and the reception quality) according to the available bandwidth. Receivers then drop from the session when they are no longer interested in the stream.

   This service model can also be used for reliable data transfer, in the case of objects that need periodic updates such as the weather maps example mentioned in Section 2. In this case, receivers join the session and dynamically adapt the number of LCT groups they subscribe to until they have accumulated a sufficient number of packets to reconstruct the object. Afterwards, they drop from the session (or listen to the lowest LCT group only) and wait for the transmission of the next object to resubscribe or restart bandwidth adaptation according to the congestion control scheme.

   As an example, assume that each object to be transmitted consists of 5000 packets of 1 KB each, and objects are updated every hour. The sender could set the data rate on the lowest LCT group to 5 packets/s, so that receivers using just this LCT group could complete reception in 1000 seconds in the absence of loss, and would still be able to complete reception within the hour in the presence of substantial loss or when they join the session after the start of a transmission. Furthermore, the sender could use a number of LCT groups such that the aggregate data rate when using all LCT groups is 100 packets/s, so that a receiver could complete reception of a single object in as little as 50 seconds (assuming no loss and that the congestion control mechanism immediately converges to the use of all LCT groups).

On demand delivery model.
   This service model is mostly relevant for reliable LCT sessions, where the same object is made available by the sender for a sufficiently long amount of time (typically much longer than the download time for the object) to make it convenient for receivers to enact the download at their discretion. Receivers may join the ongoing object transmission session at their discretion, obtain the necessary encoding symbols to reproduce the object, and then leave the session.

   For an on demand service model, senders typically transmit for some given time period selected to be long enough to allow all the intended receivers to join the session and recover the object. For example, a popular software update might be transmitted using LCT for several days, even though a receiver may be able to complete the download in one hour total of connection time, perhaps spread over several intervals of time.

Other service models.

   There are many other delivery service models that LCT can be used for that are not covered above. The description of the many potential applications, the appropriate delivery service model, and the additional mechanisms to support such functionalities is beyond the scope of this document. This document only attempts to describe the minimal common scalable elements of these diverse applications using LCT as the delivery mechanism.

2.2. Congestion Control

The specific congestion control algorithm to be used for LCT sessions depends on the type of data delivered. While the general behaviour of the congestion control algorithm is to reduce the throughput in the presence of congestion and gradually increase it in the absence of congestion, the actual dynamic behaviour (e.g. response to single losses) can vary.

A possible congestion control algorithm for reliable LCT sessions is specified in [1]. Different session types might require a different congestion control algorithm.
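The rate figures in the weather-map example of Section 2.1 can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names and the loss model are assumptions and not part of LCT. It assumes an FEC code for which any 5000 distinct packets suffice to rebuild the object, ignores reception overhead, and treats loss as a fixed fraction of packets.

```python
def transfer_time_seconds(object_packets: int, rate_packets_per_s: float,
                          loss_fraction: float = 0.0) -> float:
    """Time to collect `object_packets` packets at a given aggregate rate.

    Assumption (not from the draft): loss removes a fixed fraction of packets
    and any `object_packets` distinct packets are enough to decode.
    """
    if rate_packets_per_s <= 0 or not 0.0 <= loss_fraction < 1.0:
        raise ValueError("rate must be positive and loss in [0, 1)")
    return object_packets / (rate_packets_per_s * (1.0 - loss_fraction))


def max_tolerable_loss(object_packets: int, rate_packets_per_s: float,
                       period_s: float) -> float:
    """Largest loss fraction that still completes within one update period."""
    return 1.0 - object_packets / (rate_packets_per_s * period_s)


OBJECT_PACKETS = 5000   # object size from the example, in 1 KB packets
BASE_RATE = 5           # packets/s on the lowest LCT group
FULL_RATE = 100         # packets/s aggregated over all LCT groups
UPDATE_PERIOD = 3600    # seconds between object updates (hourly)

slowest = transfer_time_seconds(OBJECT_PACKETS, BASE_RATE)   # lowest group only
fastest = transfer_time_seconds(OBJECT_PACKETS, FULL_RATE)   # all groups joined
headroom = max_tolerable_loss(OBJECT_PACKETS, BASE_RATE, UPDATE_PERIOD)
print(slowest, fastest, round(headroom, 3))
```

Under these assumptions a receiver on the lowest group alone needs 1000 seconds of a 3600 second update period, which is where the margin for loss or late joining in the example comes from.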
3. Packet Formats

The primary type of packet used by LCT sessions is the "Data Packet". Data packets are sent by the data sender(s) to an LCT group.

Some instances of LCT sessions may require the generation of feedback from the receivers to the sender. Such information is carried in Request packets, which are OPTIONAL and have the sole purpose of implementing the "transmission extension" mechanism described in Section 3.4.

The LCT packet format described in this document is intended to be used in conjunction with the UDP transport protocol [14], or other transport protocols that satisfy the requirements stated in Section 1.2, specifically about demultiplexing and delivery of packet size information.

LCT Data packets consist of an LCT header and an optional payload, as shown in Figure 1. When present, the LCT payload immediately follows the LCT header. LCT Request Packets only consist of an LCT header, as shown in Figure 2.

LCT Packet Headers have variable size, which is specified by a length field in the 3rd byte of the header. In the LCT Packet Header, all integer fields are carried in "big-endian" or "network order" format, that is, most significant byte (octet) first. Bits designated as "padding" or "reserved" (r) MUST be set to 0 by senders and ignored by receivers. Unless otherwise noted, numeric constants in this specification are in decimal (base 10).

3.1. Data-Packet format

The format of LCT Data Packets is depicted in Figure 1.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |D|r|T|X|Trans.Obj.Id.(TOI) |    HDR_LEN    | Codepoint (CP)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Congestion Control Information (CCI)              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Demux Label (DL, if D = 1)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Sender Current Time (SCT, if T = 1)              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Expected Residual Time (ERT, if T = 1)             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Header Extensions (only present if X = 1)           |
|                                                               |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |
|                            Payload                            |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 1 - LCT Data Packet format

3.2. Request-Packet format

When using on-demand service, LCT receivers MAY request that the sender extend the transmission of the packets pertaining to a given object. Requests should only be sent in response to data packets which are carrying the TEI field (have the T bit set). Request packets MUST be unicast to the node (designated out of band) in charge of receiving Request packets.

The format of Request Packets is shown in Figure 2.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |D|r|1|X| Trans.Obj.Id.(TOI)|    HDR_LEN    |       0       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Desired End Time (DET)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Demux Label (DL, if D = 1)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Header Extensions (only present if X = 1)           |
|                                                               |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

              Figure 2 - LCT Request Packet format

3.3. LCT Packet header fields

The function of each field in LCT packet headers is as follows. Fields marked as "1" mean that the corresponding bits MUST be set to "1" by the generating agent. Fields marked as "r" or "0" mean that the corresponding bits MUST be set to "0" by the generating agent.

LCT version number (V): 2 bits

   Indicates the LCT protocol version. The LCT version number for this specification is 0.

Demux Label Present flag (D): 1 bit

   D = 1 indicates that the Demux Label (DL) field is present. D = 0 indicates "no DL field". The DL field contains a 32-bit identifier which can be used to filter packets belonging to the session.

Transmission Extension Information Present flag (T): 1 bit

   T = 1 indicates that the Transmission Extension Information (TEI) field is present. T = 0 indicates "no TEI field". The TEI field is inserted by senders when they are willing to accept Transmission Extension Request packets from the receivers.

Header Extension Present flag (X): 1 bit

   X = 1 indicates that Header Extensions are present. X = 0 indicates "no Header Extensions". Header Extensions are used in LCT to accommodate optional header fields which are not always used or have variable size.

Transport Object Identifier (TOI): 10 bits

   The Transport Object Identifier (TOI) field indicates which Transport Object within the session this Data packet or Request packet pertains to. For example, a source might send a number of files in the same session, using TOI=0 for the first file, TOI=1 for the second one, etc.

LCT Header Length (HDR_LEN): 8 bits

   Length of the variable portion of the LCT header in units of 32-bit words (excluding IP or UDP headers). The total LCT header length is (HDR_LEN+2) 32-bit words. This field can be used for direct access to the beginning of the LCT payload.

Codepoint (CP): 8 bits

   An opaque identifier which is passed to the payload decoder to convey information on the codec being used for the payload. The mapping between the codepoint and the actual codec is defined on a per session basis and communicated out-of-band as part of the session description information. The use of the CP field is similar to the Payload Type (PT) field in RTP headers [].

Congestion Control Information (CCI): 32 bits

   Used to carry Congestion Control Information, e.g. for the scheme described in [1] or other congestion control schemes. This field is opaque for the purpose of this specification.

Demux Label (DL): 32 bits (OPTIONAL)

   Used to carry a 32-bit identifier to be used for filtering purposes. All LCT packets belonging to the same LCT group MUST have the same DL value that has been communicated out of band to receivers as part of the session description information. Receivers MUST discard packets with a non-matching DL. In order to minimize the amount of information to be supplied out of band, it is suggested that the same DL be used for all LCT layers in the same session.

Sender Current Time (SCT): 32 bits (OPTIONAL)

   This field represents the current clock at the sender at the time this packet was transmitted, measured in units of 1ms and computed modulo 2^32 units. SCT is used, in conjunction with the ERT and DET fields, to support receiver request generation as described in Section 3.4. This field is only present in Data Packets when T=1.

Expected Residual Time (ERT): 32 bits (OPTIONAL)

   This field represents the expected residual transmission time for the current object, measured in units of 1ms. Senders MUST NOT include SCT and ERT if the transmission of the current object is expected to last for more than 2^32-1 time units (approximately 49 days). See Section 3.4 for a detailed description on the use of this field. This field is only present in Data Packets when T=1.

Desired End Time (DET): 32 bits

   This field represents the desired finish time for the transmission of the object, measured in units of 1ms and computed modulo 2^32 time units. See Section 3.4 for a detailed description on the use of this field. This field is only present in Request Packets when T=1.

3.4. Transmission Extensions

Four fields in the packet headers are used to support the Transmission Extension mechanism: T, SCT, ERT, DET. These fields have the following purposes:

   o  to communicate to receivers the expected finish time for the transmission of the current object

   o  to let receivers produce requests to extend transmission which are idempotent.

When a sender is willing to accept extension requests, it will set T=1 in the data packets, and also include the SCT and ERT fields as follows:

   o  SCT is set to the current time known at the sender, measured in units of 1ms, and computed modulo 2^32 units.

   o  ERT is set to the expected residual transmission time, as known to the sender, and measured in units of 1ms. The maximum value that can be accommodated in this field is approximately 49 days. A sender MUST NOT generate these fields if the residual transmission time is larger than this maximum value.

The Expected Finish Time (EFT) of the transmission at the sender site can be computed as EFT = SCT + ERT.

A receiver can determine the Desired Residual Time (DRT) based on external information, such as the amount of missing data and the incoming data rate. DRT is the (estimated) transmission extension needed, measured from the time of estimation to the time of the desired end of transmission. The maximum value for DRT is 2^32-1 units of 1ms each. Higher values must be capped at 2^32-1.

A receiver MUST NOT generate Request packets if the reception is likely to complete before the expected end of the session, i.e. if DRT << ERT.

If a receiver needs to extend the transmission, it computes the Desired End Time value to be put into Request packets as DET = SCT + DRT.

The above procedures make requests idempotent.

3.5. Header-Extension Fields

To allow for additional header fields and to extend the size of some of the predefined fields, the LCT header contains an additional header field flag, "X". If "X" is set to 0 then no additional header fields are included within the LCT header beyond the predefined fields. When additional headers beyond the predefined fields are used, the value of "X" within the LCT header MUST be set to 1.

Examples of use of header extensions include:

   o  extended-size versions of already existing header fields.

   o  Sender and Receiver authentication information.

If present, Header Extensions must be processed before performing any congestion control procedure or otherwise accepting the packet.
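The predefined header fields of Section 3.3 and the Transmission Extension arithmetic of Section 3.4 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, Header Extensions are skipped rather than parsed, and reducing EFT modulo 2^32 is an assumption made here because SCT itself is carried modulo 2^32.

```python
import struct

MOD32 = 1 << 32  # SCT and DET are computed modulo 2^32 units of 1 ms


def parse_data_header(packet: bytes) -> dict:
    """Parse the predefined fields of an LCT Data Packet header (Figure 1)."""
    # First two 32-bit words: flags and TOI (16 bits), HDR_LEN, CP, then CCI.
    first16, hdr_len, cp, cci = struct.unpack_from("!HBBI", packet, 0)
    fields = {
        "V": first16 >> 14,          # 2-bit version
        "D": (first16 >> 13) & 1,    # Demux Label present
        "T": (first16 >> 11) & 1,    # TEI present (bit 3 is reserved)
        "X": (first16 >> 10) & 1,    # Header Extensions present
        "TOI": first16 & 0x3FF,      # 10-bit Transport Object Identifier
        "HDR_LEN": hdr_len,
        "CP": cp,
        "CCI": cci,
    }
    offset = 8
    if fields["D"]:
        (fields["DL"],) = struct.unpack_from("!I", packet, offset)
        offset += 4
    if fields["T"]:
        fields["SCT"], fields["ERT"] = struct.unpack_from("!II", packet, offset)
        offset += 8
    # Total header length is (HDR_LEN + 2) 32-bit words.
    payload_offset = (hdr_len + 2) * 4
    if payload_offset < offset or payload_offset > len(packet):
        raise ValueError("HDR_LEN inconsistent with flags or packet size")
    fields["payload_offset"] = payload_offset
    return fields


def expected_finish_time(sct: int, ert: int) -> int:
    """EFT = SCT + ERT (wrap-around modulo 2^32 is an assumption)."""
    return (sct + ert) % MOD32


def desired_end_time(sct: int, drt: int) -> int:
    """DET = SCT + DRT, with DRT capped at 2^32-1 and DET taken modulo 2^32."""
    return (sct + min(drt, MOD32 - 1)) % MOD32


# Example: V=0, D=1, T=1, X=0, TOI=7, HDR_LEN=3 (DL + SCT + ERT), CP=5.
example = struct.pack("!HBBIIII", (1 << 13) | (1 << 11) | 7, 3, 5, 0,
                      42, 1000, 5000) + b"data"
header = parse_data_header(example)
print(header["TOI"], header["payload_offset"],
      expected_finish_time(header["SCT"], header["ERT"]))
```

Because DET depends only on the SCT echoed from a received packet and the receiver's own DRT, repeating a request built from the same inputs yields the same DET, which is the idempotence property described in Section 3.4.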
Packets with unrecognised Header Extensions MUST be discarded by the receiving agent, hence the expected use of extensions should be signalled out-of-band before session startup.

There are two formats for Header Extension fields, as depicted below. The first format is used for variable-length extensions, with HET values between 0 and 63. The second format is used for fixed-length (one word) extensions, using HET values from 64 to 127.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|L| HET (<=63)  |      HEL      |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
.                                                               .
.                Header Extension Content (HEC)                 .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|L| HET (>=64)  |        Header Extension Content (HEC)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 5 - format of additional headers

The explanation of each sub-field is the following.

Last Header Extension (L): 1 bit

   MUST be set to 1 in the last Header Extension field present in a packet header, MUST be set to 0 in all the others.

Header Extension Type (HET): 7 bits

   The type of the header extension. This document defines a number of possible types. Additional types may be defined in future versions of this specification. HET values from 0 to 63 are used for variable-length Header Extensions. HET values from 64 to 127 are used for fixed-length Header Extensions.

Header Extension Length (HEL): 8 bits (OPTIONAL)

   The length of the whole Header Extension field, expressed in multiples of 32-bit words. This field is only present for
   variable-length extensions (HET between 0 and 63).

Header Extension Content (HEC): variable length
   The content of the Header Extension. The format of this sub-field
   depends on the header extension type. For fixed-length header
   extensions, the HEC is 24 bits. For variable-length header
   extensions, the HEC field has variable size, as specified by the
   HEL field. Note that the length of each Header Extension field
   MUST be a multiple of 32 bits. Also note that the total size of
   all header extensions plus optional header fields cannot exceed
   255 32-bit words.

The originator of a packet with header extensions MUST NOT leave
additional space between the end of the last Header Extension and the
beginning of the LCT payload.

All LCT agents MUST support the EXT_NOP header extension.

The following header extension types are defined:

EXT_NOP=0
   No-Operation extension. The information present in this extension
   field MUST be ignored by receivers.

EXT_CCI=1
   Congestion Control Information extension. This extension field
   extends the CCI field present in the fixed part of the header. It
   is used when the congestion control information does not fit in
   the 32-bit CCI field. When this option is present, receivers MUST
   ignore the CCI field and use the value provided in this option
   instead. The interpretation of the data contained in EXT_CCI MUST
   be negotiated out-of-band.

EXT_TOI=2
   Transport Object Identifier extension. This extension field
   extends the TOI field of the fixed header. It is used when the
   Transport Object Identifier does not fit in 10 bits. When this
   option is present, receivers MUST ignore the TOI field in the
   fixed header and use the value provided in this option instead.
   The interpretation of the data contained in EXT_TOI MUST be
   negotiated out-of-band.

EXT_AUTH=3
   Authentication Extension. Information used to authenticate the
   source of the packet.
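As an illustration only, the two Header Extension formats of Figure 5
could be walked as in the following Python sketch. The function name
and the return shape are hypothetical, and the sketch assumes that the
L bit is the most significant bit of the first octet, as the figure's
bit numbering suggests. It does not enforce the 255-word limit on the
total size of the extensions.

```python
from typing import List, Tuple

# Extension types defined in this section.
EXT_NOP, EXT_CCI, EXT_TOI, EXT_AUTH = 0, 1, 2, 3
KNOWN_TYPES = {EXT_NOP, EXT_CCI, EXT_TOI, EXT_AUTH}


def parse_header_extensions(buf: bytes) -> Tuple[List[Tuple[int, bytes]], int]:
    """Walk the Header Extensions found at the start of buf.

    Returns ([(HET, HEC), ...], bytes_consumed). Raises ValueError for a
    malformed or unrecognised extension, in which case the caller is
    expected to discard the packet.
    """
    extensions = []
    offset = 0
    while True:
        if offset + 4 > len(buf):
            raise ValueError("truncated Header Extension")
        last = bool(buf[offset] & 0x80)  # L: assumed to be the top bit
        het = buf[offset] & 0x7F         # HET: remaining 7 bits
        if het <= 63:
            # Variable-length format: HEL counts the whole field in words.
            hel = buf[offset + 1]
            end = offset + 4 * hel
            if hel == 0 or end > len(buf):
                raise ValueError("invalid HEL")
            hec = buf[offset + 2:end]
        else:
            # Fixed-length format: one word carrying 24 bits of content.
            end = offset + 4
            hec = buf[offset + 1:end]
        if het not in KNOWN_TYPES:
            raise ValueError(f"unrecognised HET {het}")
        extensions.append((het, hec))
        offset = end
        if last:
            return extensions, offset
```

The returned byte count marks where the LCT payload would begin, since
no space may be left after the last Header Extension.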
   If EXT_AUTH is present, its format and its processing must be
   communicated out-of-band as part of the session description.

   It is recommended that senders and receivers provide some form of
   authentication on the packets they transmit. If EXT_AUTH is
   present, any authentication checks that can be performed
   immediately upon reception of the packet must be performed before
   accepting the packet and performing any congestion control-related
   action on it. Some authentication schemes impose a delay of
   several seconds between when a packet is received and when the
   packet is fully authenticated. Any congestion control related
   action that is appropriate must not be delayed by any such full
   authentication delay.

4. Procedures

4.1. Sender Operation

Before a session starts, an LCT sender MUST make available all
applicable information regarding the session, including but not
limited to:

o  number of LCT groups;

o  addresses, port numbers and data rates used for each LCT group;

o  the format of the payload (for example, the mapping of codepoints
   used in the session to FEC codec types and parameters);

o  the congestion control scheme being used;

o  the Demux Label (DL) value(s) used for the session;

o  the authentication scheme being used, and all relevant information
   which is necessary for sender authentication purposes;

o  the address of the node in charge of receiving Request packets.

The session description could be in a form such as SDP [8], XML
metadata, HTTP/MIME headers, etc. It might be carried in a session
announcement protocol such as SAP [9], located on a Web page with
scheduling information, or conveyed via E-mail or other out-of-band
methods. Discussion of session description format, and distribution
of session descriptions, is beyond the scope of this document.
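Purely as an illustration, the items listed above could be held in a
structure such as the following Python sketch. This document does not
define a session description format; every class, field and function
name here is hypothetical, and treating the authentication scheme and
the Request address as optional is an assumption based on the word
"applicable".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class LctGroup:
    address: str   # address used for this LCT group
    port: int      # port number used for this LCT group
    rate_bps: int  # data rate used for this LCT group


@dataclass
class SessionDescription:
    groups: List[LctGroup]             # one entry per LCT group
    codepoints: Dict[int, str]         # codepoint -> FEC codec type/parameters
    congestion_control: str            # congestion control scheme in use
    demux_labels: List[int]            # DL value(s) used for the session
    auth_scheme: Optional[str] = None  # assumed optional: only if used
    request_address: Optional[Tuple[str, int]] = None  # node receiving Requests

    @property
    def num_groups(self) -> int:
        return len(self.groups)


def missing_items(sd: SessionDescription) -> List[str]:
    """Name the items assumed to be always applicable that sd lacks."""
    required = {
        "groups": sd.groups,
        "codepoints": sd.codepoints,
        "congestion_control": sd.congestion_control,
        "demux_labels": sd.demux_labels,
    }
    return [name for name, value in required.items() if not value]
```

A receiver-side tool could call missing_items() on a parsed
description before attempting to join the session.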

Within an LCT session, an LCT sender transmits a sequence of data
packets, each containing a payload encoded according to one of the
codecs defined in the session description. Data packets are sent over
one or more LCT groups which together constitute a session.
Transmission rates may be different in different groups.

This document does not specify the policy used to place symbols into
packets, nor the order in which packets are transmitted, nor the
scheduling of packets in multiple groups. Although these issues
affect the efficiency of the protocol, they affect neither the
correctness nor the interoperability between senders and receivers.

Multiple transport objects can be carried within the same LCT
session. Each object is identified by a unique Transport Object
Identifier (TOI). Objects MUST be transmitted sequentially, and the
TOIs MUST be used in strict sequential order. A sender is not allowed
to transmit packets for old objects after starting the transmission
of packets for a new one. Note that despite this restriction, both
the network and the underlying protocol layers can cause some
reordering of packets, especially when sent over different LCT
groups, and thus receivers MUST NOT assume that the reception of a
packet for a new object means that there are no more packets in
transit for the previous one, at least for some amount of time.

Typically, the sender(s) continue to send data packets in a session
until the transmission is considered complete. The transmission may
be considered complete when some time has expired, a certain number
of packets have been sent, or some out-of-band signal (possibly from
a higher level protocol) has indicated completion by a sufficient
number of receivers. Feedback through LCT Request packets MAY also be
used to determine the end of the session.

The specification of the processing of the payload carried in LCT
packets is beyond the scope of this document.
LCT acts only as a transport layer: it implements congestion control
and conveys the payload and associated information (Codepoint and
TOI) to the receivers.

For the reasons mentioned above, this document does not impose any
restriction on packet sizes. However, network efficiency
considerations recommend that the sender use as large a payload size
as possible, but in such a way that packets do not exceed the
network's maximum transmission unit size (MTU), or fragmentation
coupled with packet loss might introduce severe inefficiency in the
transmission.

It is also recommended that all packets have the same or very similar
sizes, as packet size variation can have a severe impact on the
effectiveness of congestion control schemes such as the one described
in [1].

An LCT sender MUST implement the sender-side part of a congestion
control scheme that is in accordance with RFC 2357, and the
corresponding receiver congestion control scheme MUST be communicated
out-of-band and implemented by any receivers participating in the
session.

If a sender implements the Transmission Extensions, then it MUST
operate as described in Section 3.4.

4.2. Receiver Operation

Receivers can operate differently depending on the delivery service
model. For example, for an on demand service model receivers may join
a session, obtain the necessary encoding symbols to reproduce the
object, and then leave the session. As another example, for a
streaming service model a receiver may be continuously joined to a
set of multicast groups to download all objects in a session.

To be able to participate in a session, a receiver MUST first obtain
the relevant session description information as listed in Section
4.1.

To be able to participate in a session, a receiver MUST implement the
congestion control algorithm specified in the session description.
If a receiver is not able to implement the congestion control
algorithm used in the session, it MUST NOT join the session.

If source authentication information is present in data packets, it
must be used as specified in Section 3.5. If a receiver is unable to
implement the authentication mechanism used by the session, it MUST
NOT join the session.

To be able to participate in a session, receivers MUST be able to
process the payload of the packets. At a minimum this involves the
ability to forward or store the payload, and possibly (in the case of
a reliable LCT session) determine when an object can be successfully
recovered. If a receiver is not able to process the payload of
packets, it MUST either drop from the session, or reduce the receive
bandwidth to the minimum value allowed by the congestion control
algorithm being used.

When the session is transmitted on multiple LCT groups, receivers
MUST join the groups according to the specified startup behaviour of
the congestion control algorithm itself. For a layered transmission
on multiple groups, this typically means that a receiver will only
join a minimal set of LCT groups, possibly a single one. This rule
has the purpose of preventing receivers from starting at high data
rates.

If the Transmission Extension Information field is present in data
packets, receivers MAY originate Request packets to extend the
transmission of an object as specified in Section 3.4. Receivers MUST
NOT originate transmission extension requests if the T flag in
incoming data packets is set to 0.
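The decision of whether a Request may be originated, and the DET value
it would carry, can be sketched in Python as follows. This is an
illustration under stated assumptions, not a normative procedure: the
function and parameter names are hypothetical, the SCT and ERT values
are assumed to come from the most recent data packet, and the
comparison conservatively suppresses the request whenever DRT does not
exceed ERT. Wraparound of the resulting time value is not modelled.

```python
from typing import Optional

MAX_RESIDUAL_MS = 2**32 - 1  # largest DRT value, in units of 1 ms


def desired_end_time(sct_ms: int, ert_ms: int, drt_ms: int,
                     t_flag: bool) -> Optional[int]:
    """Return the DET for a Request packet, or None if none may be sent.

    sct_ms and ert_ms are taken from the sender's data packets; drt_ms is
    the receiver's own estimate of the residual time it still needs.
    """
    if not t_flag:
        # T flag set to 0: extension requests MUST NOT be originated.
        return None
    drt_ms = min(drt_ms, MAX_RESIDUAL_MS)  # cap DRT at 2^32-1
    if drt_ms <= ert_ms:
        # Assumption: reception is expected to finish by EFT = SCT + ERT,
        # so no extension is requested.
        return None
    return sct_ms + drt_ms  # DET = SCT + DRT


# Example: 30 s of transmission remain, but the receiver needs 45 s.
print(desired_end_time(sct_ms=1_000_000, ert_ms=30_000,
                       drt_ms=45_000, t_flag=True))  # prints 1045000
```

Repeating the call with the same inputs yields the same DET, which is
what makes duplicated requests harmless.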

Receivers which generate Request packets MUST implement
feedback-implosion avoidance procedures as follows:

o  Receivers must use the Expected Finishing Time advertised by the
   sender(s) to predict whether or not they will be able to recover
   the object from the packets they have already received and from
   the packets they can expect to receive in the future. This
   prediction SHOULD also consider data-rate fluctuations caused by
   congestion control adaptations.

o  When a receiver predicts that the residual object transmission
   time is not sufficient to successfully recover the object, it MAY
   schedule the transmission of an extension request at a random time
   in the future, before the scheduled end of the transmission.

o  When a receiver has a pending extension request scheduled for
   transmission, it must keep monitoring the progress of the
   reception and cancel the pending request if either of the
   following happens:

   -  The residual object transmission time becomes larger than the
      predicted time needed to complete the reception.

   -  A Data packet for the object of interest is received with the T
      flag set to 0.

o  A receiver MUST cancel pending extension requests when the
   transmission time of an object is over.

The rules stated above are not sufficient to obtain good implosion
prevention in all cases. For improved performance the following
guidelines SHOULD be followed:

o  Extension requests should be *scheduled* only when the reception
   of the object is in an advanced state of completion (e.g. more
   than 50%). This improves the accuracy of the receivers'
   prediction, reducing the chance that an extension is requested
   uselessly.

o  The time needed for a Request to suppress pending Requests from
   other receivers is approximately a packet round trip time (unicast
   request to the sender and multicast data packets to the
   receivers).
   Using random-time scheduling for requests is an effective
   suppression mechanism only if the length of the interval from
   which the transmission time is selected is much larger than a
   round trip time. For this reason extension requests should be
   *scheduled* at least a few seconds before the end of transmission.

5. Security Considerations

LCT can be subject to denial-of-service attacks by attackers who try
to confuse the congestion control mechanism, or who send forged
packets to the session that would prevent successful reconstruction
of large portions of the data stream.

The same problems are present in TCP, where an attacker can forge
packets and either slow down or increase the throughput of the
session, or replace parts of the data stream with forged data. If the
stream is carrying compressed or otherwise coded data, even a single
forged packet could also cause incorrect reconstruction of the rest
of the data stream.

It is therefore recommended that LCT agents implement some form of
authentication to protect themselves against such attacks.

6. IANA Considerations

No information in this specification is subject to IANA registration.

Building block components used by LCT may introduce additional IANA
considerations.

7. Intellectual Property Issues

No specific codec or congestion control scheme is specified or
referenced as mandatory in this document.

LCT may be used with congestion control protocols and codecs which
are proprietary, or have pending or granted patents.

8. Acknowledgments

Thanks to Bruce Lueckenhoff, Hayder Radha and Vincent Roca for
detailed comments on this document.

[1]  Luby, M., Vicisano, L., Haken, A., "Layered congestion control
     building block", draft-ietf-rmt-bb-lcc-00.txt, November 2000.

[2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
     Levels", RFC 2119, http://www.rfc-editor.org/rfc/rfc2119.txt

[3]  Byers, J.W., Luby, M., Mitzenmacher, M., and Rege, A., "A Digital
     Fountain Approach to Reliable Distribution of Bulk Data",
     Proceedings ACM SIGCOMM '98, Vancouver, Canada, Sept 1998.

[4]  Cain, B., Speakman, T., and Towsley, D., "Generic Router Assist
     (GRA) Building Block, Motivation and Architecture", Internet
     Draft draft-ietf-rmt-gra-arch-00.txt, a work in progress.

[5]  Deering, S., "Host Extensions for IP Multicasting", RFC 1054,
     Stanford University, Stanford, CA, 1988.

[6]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Berners-Lee,
     T., "Hypertext Transfer Protocol - HTTP/1.1", RFC 2068,
     http://www.rfc-editor.org/rfc/rfc2068.txt

[7]  Gemmell, J., Schooler, E., and Gray, J., "FCast Scalable
     Multicast File Distribution: Caching and Parameters
     Optimizations", Technical Report MSR-TR-99-14, Microsoft
     Research, Redmond, WA, April 1999.

[8]  Handley, M., and Jacobson, V., "SDP: Session Description
     Protocol", RFC 2327, April 1998.

[9]  Handley, M., "SAP: Session Announcement Protocol", Internet
     Draft, IETF MMUSIC Working Group, Nov 1996.

[10] Holbrook, H., Cheriton, D., "IP Multicast Channels: EXPRESS
     Support for Large-scale Single-source Applications", ACM
     SIGCOMM'99.

[11] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., Handley, M.,
     Crowcroft, J., "The use of Forward Error Correction in Reliable
     Multicast", Internet Draft draft-ietf-rmt-info-fec-00.txt,
     November 2000.

[12] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., Handley, M.,
     Crowcroft, J., "RMT BB: Forward Error Correction Codes", Internet
     Draft draft-ietf-rmt-bb-fec-01.txt, November 2000.

[13] Mankin, A., Romanow, A., Bradner, S., Paxson, V., "IETF Criteria
     for Evaluating Reliable Multicast Transport and Application
     Protocols", RFC 2357, June 1998.

[14] J.
     Postel, "User Datagram Protocol", RFC 768, August 1980.

[15] Rizzo, L., and Vicisano, L., "Reliable Multicast Data
     Distribution protocol based on software FEC techniques",
     Proceedings of the Fourth IEEE Workshop on the Architecture and
     Implementation of High Performance Communication Systems,
     HPCS'97, Chalkidiki, Greece, June 1997.

[16] Rizzo, L., "Effective Erasure Codes for Reliable Computer
     Communication Protocols", ACM SIGCOMM Computer Communication
     Review, Vol.27, No.2, pp.24-36, Apr 1997.

[17] Vicisano, L., Rizzo, L., Crowcroft, J., "TCP-like Congestion
     Control for Layered Multicast Data Transfer", IEEE Infocom '98,
     San Francisco, CA, Mar.28-Apr.1 1998.

[18] Vicisano, L., "Notes On a Cumulative Layered Organization of Data
     Packets Across Multiple groups with Different Rates", University
     College London Computer Science Research Note RN/98/25, Work in
     Progress (May 1998).

9. Authors' Addresses

   Michael Luby
   luby@digitalfountain.com
   Digital Fountain
   600 Alabama Street
   San Francisco, CA, USA, 94110

   Jim Gemmell
   jgemmell@microsoft.com
   Microsoft Research
   301 Howard St., #830
   San Francisco, CA, USA, 94105

   Lorenzo Vicisano
   lorenzo@cisco.com
   Cisco Systems, Inc.
   170 West Tasman Dr.,
   San Jose, CA, USA, 95134

   Luigi Rizzo
   luigi@iet.unipi.it
   ACIRI/ICSI,
   1947 Center St, Berkeley, CA, USA, 94704
   and
   Dip. Ing. dell'Informazione,
   Univ. di Pisa
   via Diotisalvi 2, 56126 Pisa, Italy

   Mark Handley
   mjh@aciri.org
   ACIRI,
   1947 Center St, Berkeley, CA, USA, 94704

   Jon Crowcroft
   J.Crowcroft@cs.ucl.ac.uk
   Department of Computer Science
   University College London
   Gower Street,
   London WC1E 6BT, UK

10. Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.