Internet Engineering Task Force 				  RMT WG
INTERNET-DRAFT				     Luigi Rizzo/ACIRI & U. Pisa
draft-ietf-rmt-bb-pgmcc-00.txt	    Gianluca Iannaccone/Sprint & U. Pisa
						  Lorenzo Vicisano/Cisco
						      Mark Handley/ACIRI
							23 February 2001
						    Expires: August 2001


	    PGMCC single rate multicast congestion control:
			 Protocol Specification


Status of this Document

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time.  It is
inappropriate to use Internet-Drafts as reference material or to cite
them other than as a "work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

To view the list Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html.

This document is a product of the IETF RMT WG.	Comments should be
addressed to the authors, or the WG's mailing list at rmt@lbl.gov.


				Abstract


     This document describes PGMCC, a single rate multicast
     congestion control scheme which is TCP-friendly and achieves
     scalability, stability and fast response to variations in
     network conditions.  PGMCC is suitable for both non-reliable


Rizzo/Iannaccone/Vicisano/Handley				[Page 1]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


     and reliable data transfers.  It is mainly designed for NAK-
     based multicast protocols, and uses a window-based, TCP-like
     controller based on positive ACKs and run between the sender
     and one group's representative, called the ACKER, which is
     elected dynamically and may change over time.

     PGMCC is made of two components: a window-based control loop,
     which closely mimics TCP behaviour, and a fast and low-
     overhead procedure to select (and track changes of) the acker.
     The scheme is robust to measurement errors, and supports fast
     response to changes in the receiver set and/or network
     conditions.


Rizzo/Iannaccone/Vicisano/Handley				[Page 2]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


			   Table of Contents


     1. Introduction. . . . . . . . . . . . . . . . . . . . . .   4
      1.1. Terminology. . . . . . . . . . . . . . . . . . . . .   4
     2. Protocol Overview . . . . . . . . . . . . . . . . . . .   4
      2.1. Packet Contents. . . . . . . . . . . . . . . . . . .   6
       2.1.1. Data Packets. . . . . . . . . . . . . . . . . . .   6
       2.1.2. Feedback Packets. . . . . . . . . . . . . . . . .   7
      2.2. Window-based controller. . . . . . . . . . . . . . .   8
      2.3. Acker Selection. . . . . . . . . . . . . . . . . . .   9
      2.4. TCP Throughput Equation. . . . . . . . . . . . . . .  10
      2.5. RTT measurement. . . . . . . . . . . . . . . . . . .  11
       2.5.1. Explicit Timestamp. . . . . . . . . . . . . . . .  11
       2.5.2. Implicit timestamp. . . . . . . . . . . . . . . .  11
       2.5.3. Sequence numbers. . . . . . . . . . . . . . . . .  12
       2.5.4. Recommendations . . . . . . . . . . . . . . . . .  13
      2.6. Loss rate measurement. . . . . . . . . . . . . . . .  13
      2.7. Interaction with feedback suppression
      schemes . . . . . . . . . . . . . . . . . . . . . . . . .  14
     3. Procedures - Sender . . . . . . . . . . . . . . . . . .  14
     4. Procedures -- Receiver. . . . . . . . . . . . . . . . .  16
     5. Security Considerations . . . . . . . . . . . . . . . .  16
     6. Authors' Addresses. . . . . . . . . . . . . . . . . . .  17
     7. Acknowledgments . . . . . . . . . . . . . . . . . . . .  18
     8. Full Copyright Statement. . . . . . . . . . . . . . . .  19


Rizzo/Iannaccone/Vicisano/Handley				[Page 3]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


1.  Introduction

This document describes PGMCC, a single rate multicast congestion
control scheme which is TCP-friendly and achieves scalability, stability
and fast response to variations in network conditions.

PGMCC is designed for multicast sessions with one sender and one or more
receivers, and normally using NAK-based protocols to collect feedback
from the receivers.  The congestion control scheme implemented by PGMCC
mimics very closely the behaviour of TCP, as it uses a window-based
control loop which is run between the sender and a selected receiver
called the ACKER. The role of the ACKER is to provide timely feedback in
the same way as a TCP receiver; additionally, the ACKER is selected
dynamically among the receivers as the one which would experience the
lowest throughput if a separate TCP session were run between the sender
and each of the receivers.

Scalability in PGMCC comes from the use of negative acknowledgements
(NAKs) for collecting feedback from receivers other than the ACKER.  As
a consequence, the usual techniques for NAK suppression and aggregation
can be used to reduce the amount of feedback to the source and improve
the scalability of the scheme.

PGMCC is designed to completely decouple congestion control from data
integrity. As a consequence, the the scheme can work with both reliable
data transfer and unreliable communication protocols such as those used
for video or audio streaming.

While designed with multicast in mind, PGMCC can be equally used as a
replacement for TCP for unicast sessions (such as UDP transfers) with
application which require a lower degree of reliability than what TCP
offers.


1.1.  Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" are to be interpreted as described in RFC 2119 and indicate
requirement levels for compliant PGMCC implementations.


2.  Protocol Overview

PGMCC is based on two separate but complementary mechanism:

  o A window-based control loop which closely emulates TCP congestion
    control.


Rizzo/Iannaccone/Vicisano/Handley		    Section 2.	[Page 4]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


    The window based control loop is simply an adaptation of the TCP
    congestion control scheme to a protocol where missing (because of
    network errors or congestion) data packets are not necessarily
    retransmitted, and so the congestion control scheme cannot rely on
    cumulative acknowledgements.  In PGMCC, the ``window'' is simulated
    using a token-based scheme which permits to decouple congestion
    control from retransmission state. One of the receivers in the group
    operates as the ACKER, i.e. the node in charge of sending positive
    acknowledgements back to the source and thus controlling the
    transfer.


  o A procedure to select the ACKER.
    The purpose of this procedure is to make sure that, in presence of
    multiple receivers, the ACKER is dynamically selected to be the
    receiver which would have the lowest throughput if a separate TCP
    session were run between the sender and each receiver.
    For the acker selection mechanism, PGMCC uses a throughput equation
    to determine the expected throughput for a given receiver as a
    function of the loss rate and round-trip time. Unlike other schemes
    [3], the TCP throughput equation is not used to determine the actual
    sending rate, which is completely controlled by the window-based
    control loop.


In principle, PGMCC's congestion control mechanism works as follows:


  o Receivers measure the loss rate and feed this information back to
    the sender, either in ACK or NAK messages.


  o The sender also uses these feedback messages to measure the round-
    trip time (RTT) to the source of each feedback message.


  o The loss rate and RTT are then fed into PGMCC's throughput equation,
    to determine the expected throughput to that receiver.


  o The sender then selects as the acker the receiver with the lowest
    expected throughput, as computed by the equation.

The dynamics of the acker selection mechanism are sensitive to how the
measurements are performed and applied. In the rest of this document we
suggest specific mechanisms to perform and apply these measurements.
Other mechanisms are possible, but it is important to understand how the
interactions between mechanisms affect the dynamics of PGMCC.


Rizzo/Iannaccone/Vicisano/Handley		    Section 2.	[Page 5]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


2.1.  Packet Contents

Before specifying the sender and receiver functionality, we describe the
information required by PGMCC to perform its tasks. This information is
carried in the data packets sent by the sender, and in the feedback
packets sent by the receiver.  As PGMCC will be used along with some
transport protocol, the actual data and feedback packets will contain
further information for use by the protocol itself. For this reason, we
do not specify packet formats, as these depend on the details of the
transport protocol used.

Note that the requirements of the transport protocol in terms of packet
generation may differ from those of PGMCC. As an example, most NAK-based
reliable multicast protocols do not use positive acknowledgements, but
PGMCC requires ACKs for clocking out data packets; unreliable transport
protocols might have no interest in generating NAKs for data integrity
purposes, yet PGMCC depends on NAKs to reach the source in order to
elect the ACKER.


Implementors may decide to insert PGMCC-related information in already
existing protocol packets whenever possible, but in cases such as the
ones described above it might be necessary to define and generate new
packets exclusively for congestion control purposes. As an example, in a
prototype implementation of PGMCC on top of the PGM protocol [10], some
of the information used by PGMCC is already present in the original
protocol packets, and PGMCC-specific information is carried as PGM
options in data and NAK packets. However, a new packet type has been
defined for ACKs, which are generated according to the rules defined in
this document.


2.1.1.	Data Packets

Each data packet sent by the data sender contains the following
information:


  o A sequence number. This number is incremented by one for each data
    packet transmitted.  The field must be sufficiently large that it
    does not wrap causing two different packets with the same sequence
    number to be in the receiver's recent packet history at the same
    time.


  o A timestamp (or equivalent information) indicating when the packet
    with this sequence number has been sent.  There is no requirement
    for synchronized clocks between the sender and the receivers.  The


Rizzo/Iannaccone/Vicisano/Handley		Section 2.1.1.	[Page 6]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


    resolution of the timestamp should typically be in the order of the
    maximum packet interarrival time.  Because PGMCC does not rely on a
    precise measurement of the RTT, the timestamp can be omitted, and
    its purpose can be fulfilled by using the sequence number, as
    described in Section 2.5.

  o The identity of the ACKER, i.e. the receiver in charge of sending an
    acknowledgement for this data packet.  The ACKER is elected as a
    result of the process described in Section 2.3. A special value is
    used to indicate that no ACKER is designated for this packet -- this
    can happen at the beginning of a session or when the current ACKER
    leaves the group.  Receivers interpret this value as a request to
    elect a new acker.


2.1.2.	Feedback Packets

There are two types of feedback packets used by PGMCC: ACK packets and
NAK packets.
ACK packets are generated by the current ACKER, and are used to detect
loss or successful delivery of packets, and to regulate the throughput
accordingly. ACK packets also contain information used to determine the
TCP-equivalent throughput for the ACKER.
NAK packets are sent by any receiver who experiences loss.  They contain
information used to determine the TCP-equivalent throughput for that
receiver. In an actual protocol instantiation such as PGM [10], NAK
packets might also be used by the protocol to request the retransmission
of specific packets, and indicate the identity of the packet being
requested.

Both ACK and NAK packets are sent by data receivers, and contain the
following information:


  o The timestamp (or equivalent information) for the most recently
    received data packet.
    This value is used by the sender to measure the RTT to the receiver
    who generated this feedback packet.

    When feedback is delayed by the receivers, the value echoed as a
    timestamp SHOULD be incremented by the time elapsed between the
    reception of the most recent data packet and the actual transmission
    of the feedback packet.

    If timestamps are not present in data packets, then the value echoed
    by the receiver in place of the timestamp is the highest sequence
    number among received data packets, as explained in Section 2.5.


Rizzo/Iannaccone/Vicisano/Handley		Section 2.1.2.	[Page 7]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


  o The receiver's current estimate of the loss rate, p.  The loss rate
    is represented as a 16-bit unsigned integer, in network format, with
    0 indicating no loss and 2^16-1 indicating 100% loss.  The loss rate
    is measured by receivers as described in Section 2.6

In addition to the above, ACK packets (sent by the acker designated in
the corresponding data packets) must also contain the following
information:

  o The highest sequence number among received data packets, RX_MAX, and
    a bitmap indicating the receive status of the 32 data packets with
    sequence numbers RX_MAX-31 to RX_MAX.
    This information is used by the sender to record which packets have
    been received or lost, and manipulate the transmit window
    accordingly.  Note that each ACK packet contains information about
    32 packets, and this increases the robustness of the scheme to loss
    of ACK packets.  This is necessary because ACKs are not sent
    reliably (unlike TCP's ACKs, which are cumulative).


2.2.  Window-based controller

In a window-based flow/congestion control scheme such as TCP, the
``window'' represents the amount of unacknowledged data at the sender.
As soon as acknowledgements arrive to the sender, they advance the
window and enable the sender to transmit more data packets.

The sender may dynamically change the size of the window, according to
the congestion control scheme being used. In TCP, and PGMCC, we use an
``Additive Increase Multiplicative Decrease'' (AIMD) scheme, meaning
that in absence of loss the window is increased by some fixed amount
(tipically one packet) per rount trip time (RTT), whereas upon loss the
window is reduced to a fraction of its original value (typically halved)
in each RTT in which a loss event is experienced.

Because PGMCC may be used with non-reliable communication, the window
must be simulated using a token-based mechanism, controlled by two
variable:

  o A ``Token Count'', T, which indicates the number of packets that can
    be transmitted;

  o A ``Window Size'', W, which controls the rate of increase/decrease
    of the window.

The information contained in each ACK is used to determine how many new
packets are acknowledged by the incoming ACK, and whether there are
unacknowledged packets which were not reported in previous ACKs.  The


Rizzo/Iannaccone/Vicisano/Handley		  Section 2.2.	[Page 8]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


sender also schedules a timeout to react in case no ACKs are received.

The sender behaves as follows:


  o INITIALIZATION
    At startup, or after a timeout, both W and T are set to 1.


  o ACK RECEPTION, NO LOSS DETECTED
    If the incoming ACK reports new acknowledged packets, and no loss
    (as defined in the next paragraph) is detected, then the window is
    inflated by one packet per RTT. This is achieved by modifying T and
    W as follows:

		   T = T + N * ( 1 + 1/W ) ,  W = W + N/W

    where N is the number of packets acknowledged by the ACK.


  o ACK RECEPTION, LOSS DETECTED
    If the incoming ACK reports an unacknowledged data packet which is
    followed by at least 3 acknowledged data packets, then the packet is
    assumed to be lost and PGMCC reacts by halving the window, in the
    same way as TCP after 3 duplicate acknowledgements.  This is
    achieved by modifying T and W as follows:

			   T = T - W/2 , W = W/2

    to simulate the multiplicative decrease.
    Additionally, all window manipulation is suspended for the
    subsequent RTT.  This is achieved by recording the current transmit
    sequence number, and canceling any further manipulation of the
    window until feedback is received for the next transmitted packet.


2.3.  Acker Selection

The ACKER selection process in PGMCC aims at locating the receiver which
would have the lowest throughput if each receiver were using a separate
TCP connection to transfer data.

Because the steady-state throughput of a TCP connection can be
characterised in a reasonably accurate way in terms of its loss rate and
round trip time [5], the throughput for each receiver can be estimated
by using these two parameters.


Rizzo/Iannaccone/Vicisano/Handley		  Section 2.3.	[Page 9]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


Whenever an ACK or NAK packet from any of the receivers reaches the
sender, the latter is able to compute the expected throughput T_i for
that receiver by using the equation shown in Section 2.4, with the round
trip time RTT and loss rate p and measured as described in Sections 2.5
and 2.6, respectively.	At any given time, the sender stores the
expected throughput for the current ACKER, T_acker. This value is
updated every time an ACK or NAK from the current ACKER is received
(note that, after a new ACKER is selected, the sender will typically
receive ACKs from the old ACKER for one RTT, and the feedback from
different ACKERs might be interleaved if the paths leading to them have
different round trip times).

Whenever an ACK or NAK is received from another node i (a previous ACKER
or some other receiver), the expected throughput T_i for that node is
computed, and compared with T_acker.  Node i is elected as the new acker
if

			   T_i < C * T_acker

where the constant C between 0 and 1 provides some histeresys and avoids
too frequent oscillations in selecting the acker.  A suggested value for
C is 0.75.

As an implementation detail, note that it is cheaper to compute T_i
^(-2), so the above equation must be rewritten accordingly.


2.4.  TCP Throughput Equation

Any realistic equation of TCP throughput as a function of loss event
rate and RTT should be suitable for use in PGMCC.  Unlike other schemes
[3] where the throughput equation directly controls the transmit rate,
in PGMCC the equation is used only for acker selection purposes, and the
throughput values are only compared among themselves.  As a consequence,
it is acceptable to use the simplified equation below:

					  1
			T_i = -----------------
			      RTT_i * sqrt(p_i)


where

    T_i is proportional to the throughput for the i-th receiver;

    RTT_i is the round trip time for the i-th receiver;


Rizzo/Iannaccone/Vicisano/Handley		 Section 2.4.  [Page 10]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


    p_i is the loss rate for the i-th receiver, between 0 and 1.0;

and multiplying constants are omitted.	The above equation is accurate
at low loss rates, where retransmission timeouts do not have a
significant impact on the behaviour of the protocol.

In future, different TCP equations may be substituted for this equation.
The requirement is that the throughput equation be a reasonable
approximation of the sending rate of TCP for conformant TCP congestion
control.

The parameters p and RTT need to be measured or calculated by a PGMCC
implementation.  The measurement of RTT is specified in Section 2.5; the
measurement of p is specified in Secion 2.6.

2.5.  RTT measurement

In PGMCC, the RTT is measured by the sender making use of the timestamp
(or equivalent information) echoed back by each receiver in feedback
messages. Three procedures are possible to measure the RTT, as follows.
In no cases it is required to have clock synchronization between sender
and receivers.


2.5.1.	Explicit Timestamp

This first technique relies on the transmission of a timestamp TS_j with
each data packet j.
The receiver will record the most recently received timestamp, and will
echo it back to the source when generating an ACK or a NAK. If the
feedback is delayed, the time elapsed between the reception of the
timestamp and the generation of the feedback should be added to the
echoed timestamp.
The sender computes the RTT by subtracting the received timestamp from
the current value of the clock.

The resolution of the timestamp value should be in the order of the
minimum packet interarrival time in order to achieve precise
measurements. If receivers need to apply correction for delayed
feedback, it is necessary that receiver know the resolution of the
timestamp clock. A suggested value is 1ms.


2.5.2.	Implicit timestamp

With this technique, the sender will record a timestamp TS_j for each
transmitted data packet j, but the timestamp will not be transmitted
with the packet itself.


Rizzo/Iannaccone/Vicisano/Handley	       Section 2.5.2.  [Page 11]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


The receiver will record the most recently received sequence number, and
will echo it back to the source when generating an ACK or a NAK.
The sender computes the RTT by looking up the timestamp associated with
the sequence number received in the feedback packet, and subtracting it
from the current clock value.

If the feedback from the receiver is delayed, as it is commonly the case
for NAKs, the receiver can compute, and send back to the source, a
correction term corresponding to the time elapsed between the reception
of the timestamp and the generation of the feedback. The correction term
will then be subtracted by the sender in order to obtain the correct
estimate of the RTT.

This RTT measurement technique is equivalent to the previous one, but it
saves some space in data packets as the timestamp does not need to be
sent explicitly. Feedback packets might become larger if the correction
value is transmitted explicitly; but in many cases, the sequence number
will already be present for other reasons (e.g. ACK packets), and
wherever space is a concern the sequence number and the correction term
can be packed in a single 32-bit word without loss of precision.


2.5.3.	Sequence numbers

This technique is the least precise, but it does not rely on the
presence of a high resolution clock on the nodes.
The sender will not compute any timestamp, and just send data packets
with their sequence number j.
The receiver will record the most recently received sequence number, and
will echo it back to the source when generating an ACK or a NAK.
The sender computes the RTT as the difference between the most recently
sent sequence number and the sequence number received from the ACK or
NAK packet.

Note that in this case the RTT is not measured in seconds, but in
"sequence numbers", which are monotonically, but not uniformly,
increasing with time. The two measurements are equivalent if the sender
transmits at a constant rate. When the data rate changes over time (as
it is normally happens, given that PGMCC controls the actual data rate),
then the "measured" RTT values grow with the actual transmit rate. This
can influence the correctness of the results when comparing two
measurement done over different and only partially overlapping time (and
sequence number) intervals where the transmit rate incurs a significant
change.


Rizzo/Iannaccone/Vicisano/Handley	       Section 2.5.3.  [Page 12]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


2.5.4.	Recommendations

Whenever possible, the measurement of the RTT should be carried out
using either explicit or implicit timestamps, and by keeping track of
the "correction term" (the delay between data reception and feedback
generation).

If the receiver does not have a clock with a suitable resolution, the
correction term might not be present (or be inaccurate). In this case
the timestamps received by the sender on NAK packets might be in error,
in the worst case, by as much as the packet interarrival time.	This
error will normally not be present on ACK packets, which are sent
immediately.  A suitable correction should be applied by the sender in
order to avoid systematic errors.

The measurement based on sequence numbers is less accurate, but also
less sensitive to errors due to the lack of the correction term. In
fact, the measurement error induced by the lack of the correction term
can be at most one unit.  This suggests that, when the correction term
is not available, measurements based on sequence numbers should be
favoured.  Simulations have shown that the acker selection mechanism
performs moderately better when the RTT measurement is based on
timestamps, but performance is reasonably good also with measurements
based on sequence numbers.


2.6.  Loss rate measurement


The loss measurement in PGMCC is entirely performed by receivers.  The
measurement results do not directly influence the transmit rate, but are
only used for comparison purposes. As a consequence, the scheme is
reasonably robust to different measurement techniques, as long as they
are not influenced too strongly by single loss events.

The main method suggested for loss measurement is Exponentially Moving
Weighted Average (EVMA), which is formally equivalent to a single-pole
digital low pass filter applied to a binary signal x_i, where x_i = 1 if
packet i is lost, x_i = 0 if packet i is successfully received.

The loss rate p_i upon reception or detection of loss of packet i is
computed as


		  p_i = c_p * p_i-1 + (1 - c_p ) * p_i

where the constant c_p between 0 and 1 is related to the bandpass of the


Rizzo/Iannaccone/Vicisano/Handley		 Section 2.6.  [Page 13]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


filter. Experiments have shown good performance with c = 500/65536, and
computations performed with fixed point arithmetic and 16 fractional
digits.

As an alternative to EWMA, the technique used in TFRC [3] can be
adopted. Simulations have shown a moderate improvement in the acker
selection mechanism by measuring loss using the TFRC technique. On the
other hand, the TFRC technique is less robust in presence of packet
reordering, whereas misordered packets can be easily accounted for with
the EWMA loss estimator.


2.7.  Interaction with feedback suppression schemes


Several schemes are used by NAK-based multicast protocols to reduce the
amount of feedback directed toward the source and make the protocol
scale with large populations of receivers.  Such schemes typically rely
on randomly delaying NAK generation, and suppressing pending NAKs when
an equivalent NAK or a retransmission is heard; or, intermediate nodes
such as routers can implement some form of feedback aggregation and
filtering.

Such schemes might prevent NAKs from potential ACKER candidates from
reaching the source. This filtering might impact the speed at which
PGMCC selects the correct ACKER, though initial experience from
simulations seem to suggest that PGMCC behaviour is not severely
affected by NAK suppression schemes.


3.  Procedures - Sender

The following pseudo-code specifies the complete behaviour of the sender
in PGMCC.


Rizzo/Iannaccone/Vicisano/Handley		   Section 3.  [Page 14]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


initialization:
  T = 1 ; W = 1 ; // initialize window and number of tokens
  RETRY = 0 ;
  < initialize p, RTT and SRTT for acker to default values >
  ACKER = NO_ACKER; // no acker is known
  < initialize sequence numbers >
  QUEUED = 0; /* packets waiting to be transmitted */

on transmission request:
  send_packet() ;

on timeout expiration :
  T = 1 ; W = 1 ; // initialize window and number of tokens
  RETRY = min (RETRY_MAX, RETRY + 1) ;
  send_packet() ;
  ACKER = NO_ACKER ; // no acker is known

on ACK/NAK reception from receiver I :
  < compute p and RTT for source of this ACK, see Sec. 2.5 and 2.6 >
  RETRY = 0 ;
  if (ACKER == NO_ACKER) {// select current as acker is no other known
    ACKER = I ;
    T = T + 1 ;
  }
  if (ACKER != I)
    < select acker according to Sec. 2.3 > ;
  else {
    // update acker statistics (p_ACKER, RTT_ACKER, SRTT_ACKER)
    if (packet_type == ACK) {
      < update_window according to Sec.2.4 >
      send_packet ;
      if (ack_pending)
	update_timeout ;
  }
}

send_packet() {
  if (QUEUED > 0 && T >= 1) {
    < transmit one packet >
    T = T - 1 ;
    QUEUED = QUEUED - 1 ;
  }
  if ( <there are unacked packet> )
    <set a timeout, see Sec.2.7 >
}


Rizzo/Iannaccone/Vicisano/Handley		   Section 3.  [Page 15]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


4.  Procedures -- Receiver

The following pseudo-code specifies the complete behaviour of the
receiver in PGMCC.

A receiver only transmits an ACK packet when it receives a data packet
for which the receiver is designated as the ACKER by the data packet
itself.  A receiver can transmit a NAK packet after it has detected that
a data packet is missing and a suitable delay has passed, as dictated by
the feedback suppression rules of the protocol in use.

The data packet contains acknowledgement status about the most recent 32
sequence numbers known to the receiver.


on initialization/session setup:
  < initialize state variables and ACK bitmap >

on DATA packet reception:
  < update p measurement according to Sec.2.4 >
  < record timestamp and packet reception time >
  if (ACKER == this_node) {
    < send an immediate ACK >
  }
  if ( <some data packet is missing and unacknowledged > )
    < schedule a timeout for NAK transmission >

on NAK reception:
  < suppress any pending NAK transmission for the sequence
    number indicated in the NAK >

on timeout:
  if ( < there are missing and unacknowledged packets > ) {
    < send a NAK for one or more of the missing packets >
    < mark such packets as acknowledged >
    if ( <some data packet is missing and unacknowledged > )
      < schedule a timeout for NAK transmission >
  }


5.  Security Considerations

PGMCC is not a transport protocol in its own right, but a congestion
control mechanism that is intended to be used in conjunction with a
transport protocol.  Therefore security primarily needs to be considered
in the context of a specific transport protocol and its authentication
mechanisms.


Rizzo/Iannaccone/Vicisano/Handley		   Section 5.  [Page 16]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


Congestion control mechanisms can potentially be exploited to create
denial of service.  This may occur through spoofed feedback.  Thus any
transport protocol that uses PGMCC should take care to ensure that
feedback is only accepted from the receiver of the data.  The precise
mechanism to achieve this will however depend on the transport protocol
itself.

In addition, congection control mechanisms may potentially be
manipulated by a greedy receiver that wishes to receive more than its
fair share of network bandwidth.  A receiver might do this by claiming
to have received packets that in fact were lost due to congestion.
Possible defenses against such a receiver would normally include some
form of nonce that the receiver must feed back to the sender to prove
receipt.  However, the details of such a nonce would depend on the
transport protocol, and in particular on whether the transport protocol
is reliable or unreliable.


6.  Authors' Addresses

   Luigi Rizzo
   luigi@iet.unipi.it
   ACIRI/ICSI,
   1947 Center St, Berkeley, CA, USA, 94704
   and
   Dip. Ing. dell'Informazione,
   Univ. di Pisa
   via Diotisalvi 2, 56126 Pisa, Italy

   Gianluca Iannaccone
   gianluca@sprintlabs.com
   Sprint ATL
   1 Adrian Ct., Burlingame, CA, USA
   and
   Dip. Ing. dell'Informazione,
   Univ. di Pisa
   via Diotisalvi 2, 56126 Pisa, Italy

   Lorenzo Vicisano
   lorenzo@cisco.com
   cisco Systems, Inc.
   170 West Tasman Dr.,
   San Jose, CA, USA, 95134

   Mark Handley
   mjh@aciri.org
   ACIRI/ICSI,
   1947 Center St, Berkeley, CA, USA, 94704


Rizzo/Iannaccone/Vicisano/Handley		   Section 6.  [Page 17]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


7.  Acknowledgments

We would like to acknowledge feedback and discussions on equation-based
congestion control with a wide range of people, including members of the
Reliable Multicast Research Group, the Reliable Multicast Transport
Working Group, and the End-to-End Research Group.


[1] Bradner, S., Key words for use in RFCs to Indicate Requirement
Levels (IETF RFC 2119) http://www.rfc-editor.org/rfc/rfc2119.txt

[2] Deering, S., "Host Extensions for IP Multicasting", RFC 1058,
Stanford University, Stanford, CA, 1988.

[3] Floyd, S., Handley, M., Padhye, J., Widmer, J., "Equation-Based
Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
Stockholm, Aug. 2000

[4] Holbrook, H., Cheriton, D., "IP Multicast Channels: Experss Support
for Large-scale Single-source Applications", ACM SIGCOMM'99

[5] Padhye, J. and  Firoiu, V. and Towsley, D. and Kurose, J., "Modeling
TCP Throughput: A Simple Model and its Empirical Validation", Proc ACM
SIGCOMM 1998.

[6] Mankin, A., Romanow, A., Brander, S., Paxson, V., "IETF Criteria for
Evaluating Reliable Multicast Transport and Application Protocols,"
RFC2357, June 1998.

[7] Postel, J. "User Datagram Protocol", RFC768, August 1980.

[8] Rizzo, L., "pgmcc: a TCP-friendly single-rate multicast congestion
control scheme", ACM SIGCOMM 2000, Stockholm, Aug.2000

[9] Rizzo, L, and Vicisano, L., "Reliable Multicast Data Distribution
protocol based on software FEC techniques", Proceedings of the Fourth
IEEES Workshop on the Architecture and Implementation of High
Performance Communication Systems, HPCS'97, Chalkidiki, Greece, June
1997.

[10] Speakman, T. et al., "PGM Reliable Transport Protocol
Specification", Internet Draft, draft-speakman-pgm-spec-06.txt, Feb.2001

[11] Vicisano, L., Rizzo, L., Crowcroft, J., "TCP-like Congestion
Control for Layered Multicast Data Transfer", IEEE Infocom '98, San
Francisco, CA, Mar.28-Apr.1 1998.


Rizzo/Iannaccone/Vicisano/Handley		   Section 7.  [Page 18]
^L
INTERNET-DRAFT		  Expires: August 2001		   February 2001


8.  Full Copyright Statement

Copyright (C) The Internet Society (2000).  All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works. However, this document itself
may not be modified in any way, such as by removing the copyright notice
or references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE."


Rizzo/Iannaccone/Vicisano/Handley		   Section 8.  [Page 19]