Internet Engineering Task Force                                A. Charny
Internet-Draft                                             Cisco Systems
Intended status: Informational                                  F. Huang
Expires: January 7, 2010                             Huawei Technologies
                                                          G. Karagiannis
                                                               U. Twente
                                                                M. Menth
                                                 University of Wuerzburg
                                                          T. Taylor, Ed.
                                                     Huawei Technologies
                                                            July 6, 2009


    PCN Boundary Node Behaviour for the Controlled Load (CL) Mode of
                               Operation
                  draft-ietf-pcn-cl-edge-behaviour-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 7, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).



Charny, et al.           Expires January 7, 2010                [Page 1]

Internet-Draft       PCN CL Boundary Node Behaviour            July 2009


   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Pre-congestion notification (PCN) is a means for protecting quality
   of service for inelastic traffic admitted to a Diffserv domain.  The
   overall PCN architecture is described in RFC 5559.  This memo is one
   of a series describing possible boundary node behaviours for a PCN
   domain.  The behaviour described here is that for three-state
   measurement-based load control, known informally as Controlled Load
   (CL).


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Terminology  . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Assumed Core Network Behaviour for CL  . . . . . . . . . . . .  4
   3.  Node Behaviours  . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1.  Overview . . . . . . . . . . . . . . . . . . . . . . . . .  5
     3.2.  Behaviour of the PCN-Egress-Node . . . . . . . . . . . . .  6
       3.2.1.  PCN-Egress-Node Role In Flow Admission . . . . . . . .  6
       3.2.2.  PCN-Egress-Node Role in Flow Termination . . . . . . .  7
     3.3.  Behaviour of the PCN-Ingress-Node  . . . . . . . . . . . .  9
       3.3.1.  PCN-Ingress-Node Role In Flow Admission  . . . . . . .  9
       3.3.2.  PCN-Ingress-Node Role In Flow Termination  . . . . . .  9
   4.  Specification of Diffserv Per-Domain Behaviour . . . . . . . . 10
     4.1.  Applicability  . . . . . . . . . . . . . . . . . . . . . . 10
     4.2.  Technical Specification  . . . . . . . . . . . . . . . . . 10
     4.3.  Attributes . . . . . . . . . . . . . . . . . . . . . . . . 10
     4.4.  Parameters . . . . . . . . . . . . . . . . . . . . . . . . 10
     4.5.  Assumptions  . . . . . . . . . . . . . . . . . . . . . . . 10
     4.6.  Example Uses . . . . . . . . . . . . . . . . . . . . . . . 10
     4.7.  Environmental Concerns . . . . . . . . . . . . . . . . . . 11
     4.8.  Security Considerations  . . . . . . . . . . . . . . . . . 11
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 11
   6.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 11
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 11
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 11
     8.2.  Informative References . . . . . . . . . . . . . . . . . . 12
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12


1.  Introduction

   The objective of Pre-Congestion Notification (PCN) is to protect the
   quality of service (QoS) of inelastic flows within a Diffserv domain,
   in a simple, scalable, and robust fashion.  Two mechanisms are used:
   admission control, to decide whether to admit or block a new flow
   request, and (in abnormal circumstances) flow termination, to decide
   whether to terminate some of the existing flows.  To achieve this,
   the overall rate of PCN-traffic is metered on every link in the
   domain, and PCN-packets are appropriately marked when certain
   configured rates are exceeded.  These configured rates are below the
   rate of the link, thus providing notification to boundary nodes about
   overloads before any congestion occurs (hence "pre-congestion"
   notification).  The level of marking allows boundary nodes to decide
   whether to admit new flows or terminate existing ones.  For more
   details see [RFC5559].

   Boundary node behaviours specify a detailed set of algorithms and
   edge node behaviours used to implement the PCN mechanisms.  Since the
   algorithms depend on specific metering and marking behaviour at the
   interior nodes, it is also necessary to specify the assumptions made
   about interior node behaviour.  Finally, because PCN uses DSCP values
   to carry its markings, a specification of boundary node behaviour
   must include the per domain behaviour (PDB) template specified in
   [RFC3086], filled out with the appropriate content.  The present
   document accomplishes these tasks for the controlled load (CL) mode
   of operation.

1.1.  Terminology

   In addition to the terms defined in [RFC5559], this document uses the
   following terms:

   Policy Decision Point (PDP)
      The node that provides policy input regarding admission and
      termination of flows.

   PCN-admission-state
      The state ("admit" or "block") derived by the PCN-egress-node for
      a given ingress-egress-aggregate based on PCN packet marking
      statistics.  The PCN-ingress-node admits or blocks new flows
      offered to the aggregate based on the current value of the PCN-
      admission-state.  Individual decisions may be modified by policy
      input from the PDP.  For further details see Section 3.2.1 and
      Section 3.3.1.

   Congestion level estimate (CLE)
      A value derived from the measurement of PCN packets received at a
      PCN-egress-node for a given ingress-egress-aggregate, representing
      the ratio of marked to total PCN traffic (measured in octets) over
      a short period.  In this specification the CLE is an exponentially
      weighted moving average of the ratios observed in successive
      fixed-length measurement intervals.  For further details see
      Section 3.2.1.

   Admission decision threshold
      A fractional value to which the CLE is compared to determine the
      PCN-admission-state.  If the CLE is below the admission decision
      threshold the PCN-admission-state is set to "admit".  If the CLE
      is above the admission decision threshold the PCN-admission-state
      is set to "block".  For further details see Section 3.2.1.

   Normal regime
      The operating state of the PCN-egress-node with respect to a given
      ingress-egress-aggregate during periods when no excess-traffic-
      marked packets are received within that aggregate.

   Excess traffic regime
      The operating state of the PCN-egress-node with respect to a given
      ingress-egress-aggregate during periods when excess-traffic-marked
      packets are being received within that aggregate.  The transition
      from normal to excess traffic regime occurs when an excess-
      traffic-marked packet is received within the given ingress-egress-
      aggregate.  The transition from excess traffic regime to normal
      regime occurs when a complete measurement interval passes without
      receipt of an excess-traffic-marked packet within the given
      ingress-egress-aggregate.  For further details see Section 3.2.2.


2.  Assumed Core Network Behaviour for CL

   This section describes the assumed behaviour for nodes of the PCN-
   domain when acting in their role as PCN-interior-nodes.  The CL mode
   of operation assumes that:

   o  encoding of PCN status within individual packets is based on
      [ID.PCN-baseline], extended to provide a third PCN encoding state.
      Possible extensions for this purpose are documented in
      [ID.PCN3state] or alternatively [ID.PCN3in1];

   o  the domain satisfies the conditions specified in the applicable
      encoding extension document;

   o  each link has been configured with a PCN-threshold-rate having a
      value equal to the PCN-admissible-rate for the link;

   o  each link has been configured with a PCN-excess-rate having a
      value equal to the PCN-supportable-rate for the link;

   o  PCN-interior-nodes perform threshold-marking and excess-traffic-
      marking of packets according to the rules specified in
      [ID.PCN-marking], and any additional rules specified in the
      applicable encoding extension document.

   According to [ID.PCN-baseline], the encoding extension documents
   should specify the allowable transitions between marking states.
   However, to be absolutely clear, these allowable transitions are
   specified here.  At any interior node, the only permitted transitions
   are these:

   o  a PCN packet which is not marked (NM) MAY be threshold-marked
      (ThM) or excess-traffic-marked (ETM);

   o  a PCN packet which is threshold-marked (ThM) MAY be excess-
      traffic-marked (ETM).

   An interior node MUST NOT re-mark a packet from PCN to non-PCN, or
   vice versa.
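
   The permitted re-marking transitions listed above can be expressed
   as a small lookup.  The following is a minimal sketch only; the names
   Marking, PERMITTED_REMARKINGS, and remarking_permitted are
   illustrative and do not come from any PCN specification.  Treating
   "no change" as permitted is an assumption, since the text lists only
   changes of state.

```python
from enum import Enum


class Marking(Enum):
    """PCN marking states; the short names follow the text above."""
    NM = "not-marked"
    THM = "threshold-marked"
    ETM = "excess-traffic-marked"


# Re-markings an interior node may perform, as listed above.
PERMITTED_REMARKINGS = {
    (Marking.NM, Marking.THM),
    (Marking.NM, Marking.ETM),
    (Marking.THM, Marking.ETM),
}


def remarking_permitted(old: Marking, new: Marking) -> bool:
    """Return True if an interior node may change `old` to `new`.

    Leaving the marking unchanged is treated as permitted (assumption).
    """
    return old == new or (old, new) in PERMITTED_REMARKINGS
```

   Under this sketch a marking can only become more severe: for
   example, remarking_permitted(Marking.ETM, Marking.THM) is False.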


3.  Node Behaviours

3.1.  Overview

   The Controlled Load (CL) mode of operation supports flow admission
   based on the smoothed ratio of threshold-marked to total PCN-traffic
   observed by the PCN-egress-node (the congestion level estimate, see
   Section 1.1) for each ingress-egress-aggregate.  When the PCN-
   admission-state (see Section 1.1) for a given ingress-egress-
   aggregate changes from "admit" to "block" or vice versa, the PCN-
   egress-node reports this change.  The PCN-ingress-node admits or
   blocks new PCN flows offered to a given ingress-egress-aggregate
   based on the PCN-admission-state, possibly modified by policy
   direction from the Policy Decision Point (PDP).

   Flow termination is triggered when the PCN-egress-node observes
   excess-traffic-marked packets within a given ingress-egress-
   aggregate.  The PCN-egress-node performs measurements to produce an
   estimate of the edge-to-edge supportable PCN traffic rate for the
   ingress-egress-aggregate concerned and reports this estimate.  When
   it is alerted to the need for flow termination, the PCN-ingress-node
   similarly performs measurements to determine the rate at which it is
   admitting PCN traffic to the same ingress-egress-aggregate.  The
   difference between the measured admission rate and the estimated
   edge-to-edge supportable rate is an estimate of the total amount of
   flow that must be terminated.  The PCN-ingress-node under the
   guidance of the PDP terminates selected previously-admitted flows
   within the affected ingress-egress-aggregate until no more excess-
   traffic-marked packets are observed at the PCN-egress-node.

      Flow termination may be spread out over a period of time to avoid
      over-termination.

   When Equal Cost Multipath (ECMP) routing has been configured in the
   network, it is possible that some flows within a given ingress-
   egress-aggregate pass through the bottleneck that is resulting in
   excess-traffic-marking, while others do not.  To ensure that the
   right set of flows is terminated, the PCN-egress-node supplies a list
   of excess-traffic-marked flows along with its estimate of the edge-
   to-edge supportable PCN traffic rate.  The PDP gives preference to
   this list when determining which flows to terminate.
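
   This memo does not specify how the PDP selects flows.  Purely as an
   illustration of "giving preference" to the reported list, the sketch
   below picks flows until the amount to be terminated is covered,
   taking excess-traffic-marked flows first.  The function name, the
   per-flow rate table, and the first-come ordering within each group
   are all assumptions made for this example.

```python
from typing import Dict, List, Set


def select_flows_to_terminate(
    flow_rates: Dict[str, float],   # hypothetical: flow id -> rate (octets/s)
    marked_flows: Set[str],         # flow ids reported by the egress node
    excess_rate: float,             # amount of traffic to terminate (octets/s)
) -> List[str]:
    """Choose flows to terminate, preferring excess-traffic-marked flows."""
    preferred = [f for f in flow_rates if f in marked_flows]
    others = [f for f in flow_rates if f not in marked_flows]

    selected: List[str] = []
    remaining = excess_rate
    for flow in preferred + others:
        if remaining <= 0:
            break
        selected.append(flow)
        remaining -= flow_rates[flow]
    return selected
```

   A real PDP would also apply policy (for example, flow priority) and
   might spread terminations over time, as noted above.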

3.2.  Behaviour of the PCN-Egress-Node

   The egress node generates reports for individual ingress-egress-
   aggregates based on measurements of PCN-packets it receives.
   Processing of measurements proceeds in two regimes (states).  In the
   normal regime (Section 1.1) no excess-traffic-marked packets are
   being observed within the ingress-egress-aggregate.  The reports
   generated are reports of changes in the PCN-admission-state
   (Section 1.1) for that aggregate.  In the excess traffic regime
   (Section 1.1), excess-traffic-marked packets are being observed
   within the ingress-egress-aggregate.  The reports generated contain
   estimates of the edge-to-edge supportable rate of PCN-traffic for the
   ingress-egress-aggregate.  The following sections give details of the
   processing within the regimes and the actions taken upon transition
   from one regime to the other.

3.2.1.  PCN-Egress-Node Role In Flow Admission

   For each ingress-egress-aggregate, while no excess-traffic-marked
   packets are observed (normal regime), the egress node continuously
   measures the following quantities over successive intervals of equal
   duration.  That duration is suggested to be in the range of 100 to
   500ms to provide a reasonable tradeoff between signalling demands on
   the network and the time taken to react to impending congestion.

   NM-count:
      Number of octets of PCN-traffic contained in received packets
      which are neither threshold-marked nor excess-traffic-marked.

   ThM-count:
      Number of octets of PCN-traffic contained in received packets
      which are threshold-marked.

   At the end of each measurement interval, the egress node calculates a
   ratio R. If both counts are zero for the interval, the ratio R is set
   to zero.  Otherwise, the egress node calculates the ratio as:

      R = ThM-count / (NM-count + ThM-count).

   The egress node then updates a congestion level estimate (CLE, see
   Section 1.1) with this ratio using exponential smoothing:

      new_CLE = k*R + (1-k)*old_CLE,

   where k is a constant chosen to put most (say 80%) of the weight in
   the accumulated average on the most recent 1 to 3 seconds of data.
   The value of k thus depends on the length of the measurement
   interval.
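
   One way to choose k, offered here as a sketch rather than a
   requirement: in an exponentially weighted moving average the most
   recent n samples carry a combined weight of 1 - (1-k)^n, so placing
   a weight w on the most recent n intervals gives
   k = 1 - (1-w)^(1/n).  The function names and the default w = 0.8 are
   illustrative assumptions.

```python
def smoothing_constant(interval_s: float, window_s: float,
                       weight: float = 0.8) -> float:
    """Return k such that the most recent `window_s` seconds of data
    carry `weight` of the total weight in the moving average."""
    if interval_s <= 0 or window_s < interval_s:
        raise ValueError("need 0 < interval_s <= window_s")
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must be strictly between 0 and 1")
    n = window_s / interval_s          # number of intervals in the window
    return 1.0 - (1.0 - weight) ** (1.0 / n)


def update_cle(old_cle: float, ratio: float, k: float) -> float:
    """Apply new_CLE = k*R + (1-k)*old_CLE."""
    return k * ratio + (1.0 - k) * old_cle
```

   For example, with a 200 ms measurement interval and a 2 second
   window (n = 10), smoothing_constant(0.2, 2.0) gives k of roughly
   0.149.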

   The next step is to examine the relationship of old_CLE and new_CLE
   to a configured admission decision threshold (Section 1.1).  If
   old_CLE is below the threshold and new_CLE is above it, the egress
   node reports that the PCN-admission-state is now "block" for the
   ingress-egress-aggregate.  If old_CLE is above the threshold and
   new_CLE is below it, the egress node reports that the PCN-admission-
   state is now "admit" for the ingress-egress-aggregate.  In the
   absence of one of these two threshold-crossing events, the egress
   node issues no report.

      Simulation results show that the process is not sensitive to the
      value of the decision threshold.  A value in the order of 0.5
      seems reasonable.
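
   The end-of-interval processing in the normal regime can be sketched
   as follows.  The function names are illustrative.  The text does not
   say how a CLE exactly equal to the threshold is treated; this sketch
   assumes that equality maps to "admit".

```python
from typing import Optional, Tuple


def marked_ratio(nm_count: int, thm_count: int) -> float:
    """R = ThM-count / (NM-count + ThM-count), or 0 if both are zero."""
    total = nm_count + thm_count
    return 0.0 if total == 0 else thm_count / total


def admission_state(cle: float, threshold: float) -> str:
    # Assumption: a CLE exactly equal to the threshold counts as "admit".
    return "block" if cle > threshold else "admit"


def end_of_interval(nm_count: int, thm_count: int, old_cle: float,
                    k: float, threshold: float) -> Tuple[float, Optional[str]]:
    """Return (new_CLE, report); report is None if no threshold crossing."""
    ratio = marked_ratio(nm_count, thm_count)
    new_cle = k * ratio + (1.0 - k) * old_cle
    old_state = admission_state(old_cle, threshold)
    new_state = admission_state(new_cle, threshold)
    report = new_state if new_state != old_state else None
    return new_cle, report
```

   With k = 0.5 and a threshold of 0.5, an interval in which all
   traffic is threshold-marked moves a CLE of 0.4 to about 0.7 and
   yields a "block" report.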

3.2.2.  PCN-Egress-Node Role in Flow Termination

   When the PCN-egress-node detects an excess-traffic-marked packet, it
   transitions to the excess traffic regime with respect to the ingress-
   egress-aggregate concerned.  As a consequence of this transition, it
   immediately resets NM-count and ThM-count and begins a new
   measurement interval.  In addition, it begins to collect a third
   quantity:

   ETM-count:
      Number of octets of PCN-traffic contained in received packets
      which are excess-traffic-marked.

   Finally, if so configured (i.e., because ECMP routing is being used),
   the PCN-egress-node begins to record flow identifiers of individual
   flows for which excess-traffic-marked packets have been observed.

   At the end of the new measurement interval, the PCN-egress-node
   calculates the sum

      NM-count + ThM-count,

   normalizes it to a rate in octets per second, and reports it as an
   estimate of the edge-to-edge supportable PCN traffic rate for the
   ingress-egress-aggregate concerned.  If flow identifiers of excess-
   traffic-marked flows were collected, these are also reported.
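
   As a minimal sketch of the normalization step (the function name is
   illustrative), the estimate is the octet count of packets that were
   not excess-traffic-marked, divided by the interval length:

```python
def supportable_rate(nm_count: int, thm_count: int,
                     interval_s: float) -> float:
    """Estimated edge-to-edge supportable PCN rate in octets per second.

    Octets in excess-traffic-marked packets (ETM-count) are excluded.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (nm_count + thm_count) / interval_s
```

   For instance, 150,000 unmarked octets and 50,000 threshold-marked
   octets received in a 200 ms interval give an estimate of about
   1,000,000 octets per second.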

   The PCN-egress-node also calculates the ratio:

      R = (ThM-count + ETM-count) / (NM-count + ThM-count + ETM-count)

   and then proceeds to update the CLE estimate:

      new_CLE = k*R + (1-k)*old_CLE,

   with the same value of k as in Section 3.2.1.  However, the PCN-
   egress-node does not derive or report PCN-admission-state while
   excess-traffic-marked packets are being observed.

   The PCN-egress-node repeats the above procedures for successive
   measurement intervals until no more excess-traffic-marked packets are
   observed.  At the end of the first interval during which ETM-count is
   zero, the PCN-egress-node transitions to the normal regime.  As part
   of the transition, after updating the CLE with the latest results, it
   immediately reports the PCN-admission-state for the ingress-egress-
   aggregate based on the updated CLE.  This report serves to inform the
   PCN-ingress-node and PDP that no more flow termination is required.
   The PCN-egress-node then reverts to the normal regime procedures
   described in Section 3.2.1.
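
   The two regimes and the transitions between them can be sketched as
   a small per-aggregate state object.  This is an illustration under
   stated assumptions, not a definitive implementation: the class and
   report names are invented here, timer handling and flow identifier
   recording are omitted, an interval with no traffic is given a ratio
   of zero in both regimes, and no supportable-rate report is sent in
   the interval that ends the excess traffic regime.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Report = Tuple[str, Union[str, float]]


@dataclass
class EgressAggregate:
    k: float                 # smoothing constant
    threshold: float         # admission decision threshold
    interval_s: float        # measurement interval length
    cle: float = 0.0
    excess_regime: bool = False
    nm: int = 0
    thm: int = 0
    etm: int = 0

    def _state(self) -> str:
        # Assumption: CLE equal to the threshold counts as "admit".
        return "block" if self.cle > self.threshold else "admit"

    def on_packet(self, octets: int, marking: str) -> None:
        if marking == "ETM":
            if not self.excess_regime:
                # Transition: reset counts; the caller is assumed to
                # restart the measurement interval timer.
                self.excess_regime = True
                self.nm = self.thm = self.etm = 0
            self.etm += octets
        elif marking == "ThM":
            self.thm += octets
        else:
            self.nm += octets

    def end_interval(self) -> List[Report]:
        reports: List[Report] = []
        old_state = self._state()
        total = self.nm + self.thm + self.etm   # etm is 0 in normal regime
        ratio = 0.0 if total == 0 else (self.thm + self.etm) / total
        self.cle = self.k * ratio + (1.0 - self.k) * self.cle
        if self.excess_regime and self.etm > 0:
            rate = (self.nm + self.thm) / self.interval_s
            reports.append(("supportable-rate", rate))
        elif self.excess_regime:
            # First interval with ETM-count of zero: back to normal regime.
            self.excess_regime = False
            reports.append(("admission-state", self._state()))
        elif self._state() != old_state:
            reports.append(("admission-state", self._state()))
        self.nm = self.thm = self.etm = 0
        return reports
```

   Because ETM-count is zero throughout the normal regime, the single
   ratio expression used here reduces to the formula of Section 3.2.1
   in that regime.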

   The duration of measurement intervals during the excess traffic
   regime is the same as during the normal regime.  The only difference
   in measurement behaviour is the restart of measurements upon
   transition to the excess traffic regime and the collection of ETM-
   count while that regime prevails.

      ETM-count could be collected within the normal regime, but by
      definition would always be zero.

3.3.  Behaviour of the PCN-Ingress-Node

   The PCN-related functions of the PCN-ingress-node are described
   briefly in section 4.2 of [RFC5559].  This section focusses on the
   specific behaviour associated with admission and flow termination.

   Procedures at the PCN-ingress-node for a given ingress-egress-
   aggregate can also be classed as occurring within a normal regime or
   during an excess traffic regime.  The transition from the normal
   regime to the excess traffic regime occurs when the PCN-ingress-node
   receives a report indicating the estimated edge-to-edge supportable
   rate of PCN traffic for the aggregate.  The transition back to normal
   regime occurs when the PCN-ingress-node receives a report of the
   current PCN-admission-state for the aggregate.

3.3.1.  PCN-Ingress-Node Role In Flow Admission

   When the PCN-ingress-node receives a report indicating that the PCN-
   admission-state for a given ingress-egress-aggregate is "admit", it
   admits new flows to that aggregate.  When the PCN-ingress-node
   receives a report indicating that the PCN-admission-state for a given
   ingress-egress-aggregate is "block", it ceases to admit new flows to
   that aggregate.  These actions may be modified by policy input from
   the Policy Decision Point (PDP).

3.3.2.  PCN-Ingress-Node Role In Flow Termination

   When the PCN-ingress-node receives a report providing an estimate of
   the edge-to-edge supportable PCN traffic rate for a given ingress-
   egress-aggregate, it ceases to admit new flows to that aggregate
   (unless directed otherwise for specific flows by the PDP).  It
   immediately begins to measure the rate of PCN-traffic that it is
   currently admitting to that aggregate.  When it has accumulated
   sufficient data for a reliable estimate [see Table 1 giving traffic
   rate vs. measurement period, to be provided later], it normalizes the
   result to a rate in octets per second.  It reports the difference
   between this result and the estimated edge-to-edge supportable PCN
   traffic rate for the aggregate to the PDP.
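
   The quantity reported to the PDP can be sketched as below.  The
   function name is illustrative, and the length of the measurement
   period is left to the caller because the table referred to above has
   not yet been provided.

```python
def termination_estimate(admitted_octets: int, measurement_s: float,
                         supportable_rate: float) -> float:
    """Difference between the measured admitted rate and the reported
    edge-to-edge supportable rate, both in octets per second."""
    if measurement_s <= 0:
        raise ValueError("measurement_s must be positive")
    admitted_rate = admitted_octets / measurement_s
    return admitted_rate - supportable_rate
```

   For example, 600,000 octets admitted over 0.5 seconds against a
   reported supportable rate of 1,000,000 octets per second gives a
   difference of 200,000 octets per second.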

   Subsequently, the PCN-ingress-node terminates already-admitted flows
   as directed by the PDP.  When it receives a report indicating the
   current PCN-admission-state for the ingress-egress-aggregate, it
   resumes admitting or blocking new flows as described in
   Section 3.3.1.
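
   The ingress-side regime changes described in Section 3.3 can be
   sketched as follows.  The class, its method names, the initial
   "admitting" value, and the way a PDP override is passed in are all
   assumptions made for illustration; rate measurement and the actual
   termination of flows are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IngressAggregate:
    admitting: bool = True        # assumption: start in "admit"
    excess_regime: bool = False

    def on_supportable_rate_report(self, rate: float) -> None:
        """Egress reported a supportable-rate estimate: enter the excess
        traffic regime and cease admitting new flows."""
        self.excess_regime = True
        self.admitting = False

    def on_admission_state_report(self, state: str) -> None:
        """Egress reported the PCN-admission-state: return to (or stay
        in) the normal regime and follow the reported state."""
        if state not in ("admit", "block"):
            raise ValueError("state must be 'admit' or 'block'")
        self.excess_regime = False
        self.admitting = state == "admit"

    def admit_new_flow(self, pdp_override: Optional[bool] = None) -> bool:
        """Decide on a new flow; a PDP decision, if given, takes
        precedence over the PCN-admission-state."""
        return self.admitting if pdp_override is None else pdp_override
```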




4.  Specification of Diffserv Per-Domain Behaviour

   This section provides the specification required by [RFC3086] for a
   per-domain behaviour.

4.1.  Applicability

   This section draws heavily upon points made in the PCN architecture
   document, [RFC5559].

   The PCN CL boundary node behaviour specified in this document is
   applicable to inelastic traffic (particularly video and voice) where
   quality of service for admitted flows is protected primarily by
   admission control at the ingress to the domain.  In exceptional
   circumstances (e.g. due to network failures) already-admitted flows
   may be terminated to protect the quality of service of the remainder.
   The CL boundary node behaviour is less likely to terminate too many
   flows under such circumstances than some alternative PCN boundary
   node behaviours.

4.2.  Technical Specification

   The technical specification of the PCN CL per domain behaviour is
   provided by the contents of [RFC5559], [ID.PCN-baseline],
   [ID.PCN-marking], the specification of the encoding extension (e.g.
   [ID.PCN3state], [ID.PCN3in1]), and the present document.

4.3.  Attributes

   TBD -- basically low loss, low jitter.  Low delay would be nice but
   has to be quantified.

4.4.  Parameters

   TBD.  Don't think RFC 3086 is looking for the list of configurable
   parameters given in the architecture document.

4.5.  Assumptions

   It is assumed that a specific portion of link capacity has been
   reserved for PCN traffic.  It is also assumed that recovery from
   overloads by flow termination should happen within 1-3 seconds.

4.6.  Example Uses

   The PCN CL behaviour may be used to carry real-time traffic,
   particularly voice and video.

4.7.  Environmental Concerns

   In some markets, traffic preemption is considered to be
   impermissible.  In such environments, flow termination would not be
   enabled.

4.8.  Security Considerations

   Please see the security considerations in Section 5 as well as those
   in [RFC2474] and [RFC2475].


5.  Security Considerations

   [RFC5559] provides a general description of the security
   considerations for PCN.  This memo introduces no new considerations.


6.  IANA Considerations

   This memo includes no request to IANA.


7.  Acknowledgements

   Excluding the appendices, the content of this memo is drawn from
   [ID.briscoe-CL].  The authors of that document were Bob Briscoe,
   Philip Eardley, and Dave Songhurst of BT, Anna Charny and Francois Le
   Faucheur of Cisco, Jozef Babiarz, Kwok Ho Chan, and Stephen Dudley of
   Nortel, Georgios Karagiannis of U. Twente and Ericsson, and Attila
   Bader and Lars Westberg of Ericsson.


8.  References

8.1.  Normative References

   [ID.PCN-baseline]
              Moncaster, T., Briscoe, B., and M. Menth, "Baseline
              Encoding and Transport of Pre-Congestion Information (Work
              in progress)", May 2009.

   [ID.PCN-marking]
              Eardley, P., "Metering and marking behaviour of PCN-nodes
              (Work in progress)", June 2009.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              December 1998.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, December 1998.

   [RFC5559]  Eardley, P., "Pre-Congestion Notification (PCN)
              Architecture", RFC 5559, June 2009.

8.2.  Informative References

   [ID.PCN3in1]
              Briscoe, B., "PCN 3-State Encoding Extension in a single
              DSCP (expired Internet Draft)", October 2008.

   [ID.PCN3state]
              Moncaster, T., Briscoe, B., and M. Menth, "A PCN encoding
              using 2 DSCPs to provide 3 or more states (Work in
              progress)", April 2009.

   [ID.briscoe-CL]
              Briscoe, B., "An edge-to-edge Deployment Model for Pre-
              Congestion Notification:  Admission Control over a
              DiffServ Region (expired Internet Draft)", 2006.

   [RFC3086]  Nichols, K. and B. Carpenter, "Definition of
              Differentiated Services Per Domain Behaviors and Rules for
              their Specification", RFC 3086, April 2001.


Authors' Addresses

   Anna Charny
   Cisco Systems
   300 Apollo Drive
   Chelmsford, MA  01824
   USA

   Email: acharny@cisco.com


   Fortune Huang
   Huawei Technologies
   Section F, Huawei Industrial Base,
   Bantian Longgang, Shenzhen  518129
   P.R. China

   Phone: +86 15013838060
   Email: fqhuang@huawei.com


   Georgios Karagiannis
   U. Twente


   Phone:
   Email: karagian@cs.utwente.nl


   Michael Menth
   University of Wuerzburg
   Am Hubland
   Wuerzburg  D-97074
   Germany

   Phone: +49-931-888-6644
   Email: menth@informatik.uni-wuerzburg.de


   Tom Taylor (editor)
   Huawei Technologies
   1852 Lorraine Ave
   Ottawa, Ontario  K1H 6Z8
   Canada

   Phone: +1 613 680 2675
   Email: tom.taylor@rogers.com