PCN                                                              T. Tsou
Internet-Draft                                                 T. Taylor
Expires: May 20, 2008                                             Huawei
                                                       November 17, 2007


                      PCN Boundary Node Behaviour
                  draft-tsou-pcn-boundary-behav-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 20, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).


Tsou & Taylor             Expires May 20, 2008                  [Page 1]

Internet-Draft         PCN Boundary Node Behaviour         November 2007


Abstract

   The Pre-Congestion Notification Architecture document defines a PCN
   domain and the PCN-ingress and PCN-egress nodes that form its
   boundary.  The present document is an attempt to describe the
   detailed behaviour of the PCN boundary nodes.  It is a contribution
   toward the PCN WG milestone: "Suggested Flow Admission and
   Termination Boundary Mechanisms".  This first version is expected to
   evolve with discussion and further thought toward a more precise and
   prescriptive view of boundary node behaviour.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Overview of Boundary Node Functions  . . . . . . . . . . . . .  5
   4.  Details of Boundary Node Functions . . . . . . . . . . . . . .  7
     4.1.  PCN-ingress-node Functions . . . . . . . . . . . . . . . .  7
     4.2.  PCN-egress-node Functions  . . . . . . . . . . . . . . . .  8
     4.3.  Peer Address Determination and Flow Aggregation  . . . . .  9
   5.  Communication Behavior between Boundary Nodes  . . . . . . . . 11
     5.1.  PCN-ingress-node to PCN-egress-node  . . . . . . . . . . . 11
     5.2.  PCN-egress-node to PCN-ingress-node  . . . . . . . . . . . 11
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 13
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 14
   8.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 16
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 16
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17
   Intellectual Property and Copyright Statements . . . . . . . . . . 18



1.  Introduction

   The Pre-Congestion Notification Architecture document [I-D.PCNarch]
   defines a PCN domain and the PCN-ingress and PCN-egress nodes that
   form its boundary.  The present document is an attempt to describe
   the detailed behaviour of the PCN boundary nodes.  As part of this
   effort, it deals with the following issues:

   o  mapping from flows to aggregates;

   o  discovery of peer addresses;

   o  processing of observations at the PCN-egress-node;

   o  transmission policy for congestion level estimates (CLE).



2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The formal definitions of "PCN-domain", "PCN-boundary-node", "PCN-
   interior-node", "PCN-ingress-node" and "PCN-egress-node" are given in
   section 2 of [I-D.PCNarch].  These terms are used here, generally
   without the hyphens, with the same meaning.

   This memo uses the following abbreviations:

   CE     Congestion Experienced (ECN marking)

   CLE    Congestion level estimate

   DSCP   Differentiated Services Codepoint

   ECMP   Equal cost multi-path

   ECN    Explicit congestion notification

   PCN    Pre-congestion notification



3.  Overview of Boundary Node Functions

   A PCN domain is a self-controlled group of nodes, some of which, the
   PCN boundary nodes, connect the PCN domain to other network domains.
   PCN boundary nodes are central to the PCN mechanism: they are in
   charge of flow admission and flow termination, which protect the
   traffic already admitted to the PCN domain.  PCN boundary nodes
   fulfill two functional roles, those of PCN-ingress-node and of PCN-
   egress-node.  A given flow enters the PCN domain through the PCN-
   ingress-node and leaves it through the PCN-egress-node.  In physical
   terms, the flow passes through an ingress-egress pair.  This natural
   pairing of the PCN boundary nodes through which a given flow passes
   is bi-directional: the PCN boundary node that serves as the PCN-
   ingress-node for one flow also serves as the PCN-egress-node for a
   flow in the opposite direction and vice versa.

   As [I-D.PCNarch] specifies, a PCN domain is a Diffserv domain, within
   which there are traffic classes of different priority.  When a flow
   is presented to the PCN domain, the PCN-ingress-node should determine
   whether it is a PCN flow or not.  If it is, the PCN-ingress-node
   marks the packets of the flow accordingly.  When these packets leave
   the domain, the PCN-egress-node should remove the PCN markings so
   that they do not confuse a subsequent domain.

   The PCN-ingress-node necessarily supports flow recognition: without
   it, the node could not enforce flow admission and termination (policy
   action).

   The PCN-egress-node measures the aggregate flow for each PCN-ingress/
   PCN-egress pair for which it is the PCN-egress-node.  In some
   circumstances it could be useful for the PCN-egress-node to support
   flow recognition as well.  At present we do not require this, for
   reasons of scalability and simplicity: if the PCN-egress-node
   measured each individual flow, the cached flow table would become too
   costly to maintain.

   As part of its measurements for a given ingress-egress aggregate, the
   PCN-egress-node obtains a congestion level estimate (CLE).  The PCN-
   egress-node sends the CLE in a message to the PCN-ingress-node or to
   a node acting as a centralized collection point.  The question of
   when CLE messages are sent is discussed in Section 5.2.  The PCN-
   ingress-node, or the centralized collector node acting on its behalf,
   makes flow admission or termination decisions based on these CLE
   messages.

   When the aggregate flow from some PCN-ingress-node carries no traffic
   or too little traffic, congestion cannot be measured at all or can be
   measured only inaccurately.  Either no CLE message can be sent, or
   the CLE message that is sent is too inaccurate to support the right
   decision.  In this case the PCN-ingress-node needs to send a probe
   message to the PCN-egress-node to gain more information.  According
   to the design requirements, ECMP within a PCN domain can also best be
   handled by means of probe messages.  For the sake of efficiency, the
   probing operation requires careful design to ensure that it does not
   significantly affect the existing load within the domain.  Current
   discussion has concluded that probing is a topic that should be
   considered after the basic mechanisms have been defined.

   The following sections deal with three major topics.  Section 4 is a
   detailed functional definition of the PCN-ingress-node and PCN-
   egress-node.  This part will draw upon the existing content of
   [I-D.PCNarch], but will identify specific issues that have to be
   addressed.  Section 4.3 deals with the specific issues of flow-to-
   aggregate mapping and peer address determination.  Finally, Section 5
   considers the control and communication messaging that must occur
   between the PCN-ingress-node and PCN-egress-node.



4.  Details of Boundary Node Functions

4.1.  PCN-ingress-node Functions

   The PCN-ingress-node enforces flow admission and flow termination
   decisions on flows offered to the PCN domain.  Quoting from section
   5.2 of [I-D.PCNarch], its functions are:

   o  Packet classify

   o  Police

   o  PCN-color

   o  PCN-meter

   These basic actions imply some additional requirements on the PCN-
   ingress-node:

   a.  The PCN-ingress-node should know the address of the PCN-egress-
       node for each new flow request that arrives.  This is the key
       that allows the PCN-ingress-node to select the applicable CLE
       information from the messages it has received.  See Section 4.3
       for a discussion of how the PCN-ingress-node and PCN-egress-node
       determine each other's address and associate individual flows
       with the aggregate flow between them.

   b.  When the admission decision function is implemented in the PCN-
       ingress-node, that node should know the congestion level toward
       each PCN-egress-node to which it has admitted flows.  This is
       needed to perform flow admission and flow termination.
       Generally, the PCN-ingress-node should possess the latest
       congestion level information.  The information is saved, and is
       refreshed as new CLE messages are received.  If too long a
       period elapses without receipt of congestion level information
       from a given PCN-egress-node, the PCN-ingress-node may have to
       take an active part in gathering the information, by polling or
       by sending a probing message.

          It would seem undesirable to add to network load by polling or
          probing unless there is a decision to be made.  Admission
          decisions (and consequent probes) are covered by the preceding
          bullet.  The one case that may require thought is where CLE
          messages fail to get through because of congestion in the
          egress-to-ingress direction, and this is matched by congestion
          in the ingress-to-egress direction that would call for flow
          termination.


       Although it is out of scope of the current charter, the admission
       decision function may be implemented in a centralized control
       node.  Maintaining and refreshing the CLE information will then
       be the responsibility of this centralized node.  In that case,
       the PCN-ingress-node will act as a policy enforcement point only,
       admitting or rejecting flows in accordance with the policy
       provided by the centralized control node.

   c.  Currently, the PCN architectural requirements do not include
       support of ECN.  [I-D.PCNarch] spells out the present assumptions
       about the interaction between ECN and PCN.

   d.  When the PCN-ingress-node terminates a flow, it should send a
       signaling message to notify the flow source about the congestion
       condition and reason for termination.
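
   As an illustration of item b above, the following Python sketch
   shows one way a PCN-ingress-node might keep the latest CLE for each
   PCN-egress-node and treat an entry as stale when no CLE message has
   arrived for too long.  It is not taken from any PCN specification:
   the names CleTable and admission_action, the max_age limit, the
   admit_threshold value and the three returned action names are all
   assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CleEntry:
    cle: float          # latest congestion level estimate, in [0, 1]
    received_at: float  # time of receipt, in seconds


class CleTable:
    """Latest CLE per PCN-egress-node address (hypothetical structure)."""

    def __init__(self, max_age: float) -> None:
        self.max_age = max_age  # assumed staleness limit, in seconds
        self._entries: Dict[str, CleEntry] = {}

    def update(self, egress: str, cle: float, now: float) -> None:
        # Called whenever a CLE message arrives from `egress`.
        self._entries[egress] = CleEntry(cle, now)

    def fresh_cle(self, egress: str, now: float) -> Optional[float]:
        # Return the stored CLE, or None if it is missing or stale.
        entry = self._entries.get(egress)
        if entry is None or now - entry.received_at > self.max_age:
            return None
        return entry.cle


def admission_action(table: CleTable, egress: str, now: float,
                     admit_threshold: float) -> str:
    """Return "probe", "admit" or "reject" for a new flow request."""
    cle = table.fresh_cle(egress, now)
    if cle is None:
        return "probe"   # no usable information: poll or probe first
    return "admit" if cle < admit_threshold else "reject"
```

   With max_age set to 5 seconds, a flow request that arrives 6 seconds
   after the last CLE message yields "probe" rather than a decision
   based on stale data.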

4.2.  PCN-egress-node Functions

   The PCN-egress-node is in charge of aggregate flow measurement and
   emission of CLE messages.  The basic functions of the PCN-egress-node
   are listed in [I-D.PCNarch]: packet classify, PCN-meter, and PCN-
   color.  This section adds more detail.

   a.  Packet classify - determine which PCN-ingress-node a PCN-packet
       has come from.  This is a prerequisite for measurement.  After
       packet classification, all packets arriving at the PCN-egress-
       node are grouped into their respective ingress-egress aggregate
       flows.  In the case of tunnelled packets, the PCN-egress-node
       differentiates ingress nodes according to the ingress node
       address in the tunnel encapsulation header.  Otherwise the PCN-
       egress-node must use the source address and possibly other
       information within the packet header.

       In this latter case, the PCN-egress-node must conceptually keep a
       boundary node address table, in which it stores the mapping from
       flow source/destination prefix to PCN-ingress-node address.  By
       searching on the flow source/destination address, the PCN-egress-
       node can obtain the PCN-ingress-node address.  For a discussion
       of how this table can be set up, see Section 4.3.

   b.  PCN-meter - make "measurements of PCN-traffic".  The measurements
       are made on the aggregate flow of all PCN-packets from a
       particular PCN-ingress-node.  Smoothing is done over time, using
       for example the EWMA (Exponentially Weighted Moving Average)
       method applied separately to numerator and denominator of the
       congestion ratio.  The measurement period for individual
       observations requires careful calculation.  Shorter measurement
       periods increase the amount of computation required at the PCN-
       egress-node while increasing the volatility of the results.  Too
       long a measurement period reduces the responsiveness of the
       system to signs of approaching congestion.  Smoothing has a
       similar effect to lengthening the measurement period, but gives
       more weight to more recent measurements.

       Instead of smoothing, one might consider looking at the process
       as one of statistical estimation of a marking probability that is
       step-wise time-varying.  One assesses each new observation to
       decide whether it represents a continuation of the previous
       regime or is the result of a new value of the estimated
       probability.  The decision would use a standard deviation based
       on the assumption of a binomial probability distribution.

       Once the PCN-marking rate calculations have been carried out, the
       PCN-egress-node must send a CLE message back to the PCN-ingress-
       node providing the results.  Again there is a requirement to know
       the peer address.  See Section 4.3.

   c.  PCN-color - for PCN-packets, set the DSCP field or DSCP and ECN
       fields to the appropriate value(s) for use outside the PCN-
       domain.
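
   The smoothing described in item b can be sketched in Python as
   follows.  The sketch applies an EWMA separately to the count of PCN-
   marked packets (numerator) and the count of all PCN-packets
   (denominator) in each measurement period, and reports their ratio as
   the CLE.  It also shows the alternative check, based on a binomial
   standard deviation, of whether a new observation is consistent with
   the previously estimated marking probability.  The weight alpha, the
   multiplier z and all names are assumptions chosen for illustration;
   this memo does not prescribe them.

```python
import math


class EwmaMeter:
    """EWMA applied separately to marked and total packet counts."""

    def __init__(self, alpha: float) -> None:
        self.alpha = alpha    # weight of the newest observation (assumed)
        self.marked = 0.0     # smoothed count of PCN-marked packets
        self.total = 0.0      # smoothed count of all PCN-packets
        self._primed = False

    def observe(self, marked: int, total: int) -> float:
        """Add one measurement period and return the smoothed CLE."""
        if not self._primed:
            # First period: start from the raw counts.
            self.marked, self.total = float(marked), float(total)
            self._primed = True
        else:
            a = self.alpha
            self.marked = a * marked + (1 - a) * self.marked
            self.total = a * total + (1 - a) * self.total
        return self.marked / self.total if self.total > 0 else 0.0


def is_new_regime(p_prev: float, marked: int, total: int,
                  z: float = 3.0) -> bool:
    """True if `marked` of `total` is unlikely under probability p_prev.

    Uses the binomial standard deviation sqrt(n * p * (1 - p)).
    """
    expected = total * p_prev
    sigma = math.sqrt(total * p_prev * (1.0 - p_prev))
    return abs(marked - expected) > z * sigma
```

   A larger alpha makes the meter follow recent periods more closely, in
   the same way that a shorter measurement period would.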

4.3.  Peer Address Determination and Flow Aggregation

   Both at ingress and at egress the boundary nodes are faced with the
   problem of classifying flows by aggregate.  As mentioned in the
   previous section, this is conceptually equivalent to having a table
   in each node, mapping source/destination prefix pair to the identity
   of the peer node.  Using tunnels between ingress and egress requires
   the equivalent information, since otherwise the ingress node does not
   know the tunnel to which a given flow should be directed.  The PCN-
   egress-node must in addition have a mapping from the PCN-ingress-node
   identity to its address.  This mapping is trivial if the node address
   is used to represent its identity.
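
   The conceptual table just described can be pictured with a short
   Python sketch that uses the standard ipaddress module.  Each row maps
   a source/destination prefix pair to the identity of the peer node,
   and a lookup prefers the most specific matching pair.  The class
   name, the rule for choosing among several matches and the addresses
   (taken from documentation ranges) are assumptions for illustration
   only.

```python
import ipaddress
from typing import List, Optional, Tuple

Row = Tuple[ipaddress.IPv4Network, ipaddress.IPv4Network, str]


class PeerTable:
    """Maps (source prefix, destination prefix) to a peer node identity."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def add(self, src_prefix: str, dst_prefix: str, peer: str) -> None:
        self._rows.append((ipaddress.IPv4Network(src_prefix),
                           ipaddress.IPv4Network(dst_prefix), peer))

    def lookup(self, src: str, dst: str) -> Optional[str]:
        """Return the peer for a flow, preferring the longest prefixes."""
        s = ipaddress.IPv4Address(src)
        d = ipaddress.IPv4Address(dst)
        best: Optional[Tuple[int, str]] = None
        for src_net, dst_net, peer in self._rows:
            if s in src_net and d in dst_net:
                score = src_net.prefixlen + dst_net.prefixlen
                if best is None or score > best[0]:
                    best = (score, peer)
        return best[1] if best is not None else None
```

   When the node address is used as the node identity, the string
   returned by lookup is directly the peer address.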

   The table just described is equivalent to the information required to
   establish full-mesh direct routing between the boundary nodes.  It
   would be unfortunate if it were really necessary to bypass the
   information-hiding benefits of routing through interior nodes.  Let
   us consider the possibilities for acquiring the necessary mappings,
   to see if we can do better.

   o  The mappings could be installed in each boundary node by
      configuration.  This raises obvious concerns for scalability and
      responsiveness to changes in the external prefixes served by each
      boundary node.


   o  The mappings could be acquired by an automatic peer-to-peer
      discovery procedure tied to the exchange of routing data within
      the PCN-domain.

   o  The mapping for each flow could be determined as it is offered to
      the PCN-ingress-node, through use of an RSVP PATH message or NSIS
      equivalent.

   The first two methods have the disadvantage that they require the
   persistent storage of a full mapping table at each boundary node.
   Their advantage is that they would require less messaging and
   associated resource consumption than the third approach.  Moreover,
   when a new flow is offered to the PCN-ingress-node, it is in a
   position to make an immediate decision to admit or not, rather than
   having to wait a round-trip for a response from the PCN-egress-node.

   The third method, per-flow mapping query, works best if flows tend to
   be focussed between specific ingress-egress pairs rather than spread
   uniformly around the network.  The per-flow burden will be reduced to
   the extent that flow mappings can be cached and reused at the ingress
   and egress nodes.  Since in general new flows will not be between the
   same source and destination addresses as existing ones, reusability
   of the mapping data requires that the information exchanged between
   the ingress and egress nodes be in the form of the prefixes routed
   through the respective nodes rather than the specific addresses
   involved in the flow that triggered the information exchange.
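
   The benefit of exchanging prefixes rather than specific addresses can
   be illustrated by a small caching sketch in Python.  The first flow
   between a pair of prefixes triggers a query; later flows that fall
   within the same prefixes reuse the cached answer.  The query callback
   stands in for the RSVP PATH or NSIS exchange and is purely
   hypothetical, as are fake_query, its /24 answer and all other names.

```python
import ipaddress
from typing import Callable, List, Tuple

Prefix = ipaddress.IPv4Network
# A query result: (source prefix, destination prefix, peer identity).
Mapping = Tuple[Prefix, Prefix, str]


class MappingCache:
    def __init__(self, query: Callable[[str, str], Mapping]) -> None:
        self._query = query          # stand-in for the per-flow exchange
        self._cache: List[Mapping] = []
        self.queries_sent = 0

    def peer_for(self, src: str, dst: str) -> str:
        s = ipaddress.IPv4Address(src)
        d = ipaddress.IPv4Address(dst)
        for src_net, dst_net, peer in self._cache:
            if s in src_net and d in dst_net:
                return peer          # reuse a mapping learned earlier
        mapping = self._query(src, dst)
        self.queries_sent += 1
        self._cache.append(mapping)
        return mapping[2]


def fake_query(src: str, dst: str) -> Mapping:
    # Hypothetical answer: the peer reports the /24 prefixes it serves.
    return (ipaddress.IPv4Network(src + "/24", strict=False),
            ipaddress.IPv4Network(dst + "/24", strict=False),
            "egress-X")
```

   Had the answer carried only the specific addresses of the triggering
   flow, every new flow would miss the cache and cause another query.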

   While the use of one or more centralized collector nodes is out of
   scope of the current PCN charter, one can visualize a system wherein
   such nodes acquire the list of served prefixes from each boundary
   node and provide aggregate identifiers in response to per-flow
   queries from ingress and egress nodes.  The collector nodes receive
   the CLE messages from the PCN-egress-nodes, with metered results
   presented for each aggregate identifier active at the egress node.
   They forward the CLE results to the PCN-ingress-nodes after mapping
   from aggregate identifier to PCN-ingress-node address.  It would be
   desirable not to exclude this model of operation when creating the
   basic PCN design.



5.  Communication Behavior between Boundary Nodes

5.1.  PCN-ingress-node to PCN-egress-node

   The discussion of the previous section suggests that, except when
   configuration is used to provide the flow-to-aggregate and
   aggregate-to-peer-address mappings, the PCN-ingress-node will have to
   send some sort of message to the PCN-egress-node to establish these
   mappings, for a specific flow or all flows that could occur between
   them.  We have already remarked on the possible use of the RSVP PATH
   message (modified to carry source prefix information as well as the
   specific source and destination addresses) in the ingress-to-egress
   direction on a per-flow basis.  In the reverse direction, the RESV
   (which may contain a CLE message) would carry the destination prefix
   served by the PCN-egress-node.  Whether the prefix information in
   each direction is merely the range within which the specific source
   or destination address of the flow lies or the complete set of
   prefixes served by the respective node is for further discussion.

   In addition to the creation of mappings, the PCN-ingress-node may
   also send probe messages to the PCN-egress-node.  Current list
   discussion seems to lean toward putting off any consideration of
   probing in our initial work, but we may come back to it in the
   future.  Probe messages are useful when there is no traffic between
   ingress and egress or too little traffic for the PCN-egress-node to
   measure accurately.  Probing would be initiated at the PCN-ingress-
   node if after mapping an offered flow to an aggregate it found stale
   or no CLE information for that aggregate.  The design of the probing
   operation should consider the appropriate action if there is
   excessive delay in receiving the probe response.

5.2.  PCN-egress-node to PCN-ingress-node

   The messaging that the PCN-egress-node may have to do as part of the
   flow-to-aggregate mapping procedure has already been discussed.  The
   previous section also suggested that the PCN-egress-node will have to
   respond to probe messages.  The nature of that response depends on
   the particular marking behaviour and algorithms used in the network.

   Aside from helping to generate mappings and responding to probes, the
   PCN-egress-node must report the results of its measurements.  As
   indicated already, this is done by sending a CLE message to each PCN-
   ingress-node (or to a central collector node on its behalf).  The
   content of these messages provides the basis for the PCN-ingress-node
   or some other policy entity to make flow admission or flow
   termination decisions.

   One question that must be addressed is: when should the CLE message
   be sent?  There are three basic possibilities:

   o  The CLE message is sent in response to polling by the PCN-ingress-
      node.  An interesting variant of this is that the trigger for
      sending the CLE is receipt of an RSVP PATH message or NSIS
      equivalent, sent to acquire the mapping between a specific flow
      and an aggregate.  The CLE information would thus be made
      available precisely when it is needed to make the admission
      decision.  This fails to take care of the requirements for flow
      termination, however, so something more would be needed.

   o  The second possibility is that the CLE message is sent
      autonomously whenever the congestion level estimate crosses pre-
      configured lower and upper thresholds.  Because the CLE messages
      transmitted carry less redundant information than in the other
      methods, this approach requires that the CLE message be delivered
      reliably.  This method could be used to supplement the polling
      approach described in the previous bullet, with the CLE messages
      being sent autonomously only for transitions across the upper
      threshold.

   o  The final possibility is that the CLE message is sent periodically
      at a fixed interval.  This method is more robust than the others
      when CLE messages go missing, since the PCN-ingress-node has past
      data which it can extrapolate until the next measurement arrives.
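
   The second possibility above can be sketched in Python as a small
   state machine at the PCN-egress-node.  A CLE message is due only when
   the estimate rises above the upper threshold or, having done so,
   falls below the lower threshold.  Reading "crosses" as hysteresis
   between the two thresholds is one possible interpretation, and the
   class name, threshold values and return convention are assumptions
   made for this example.

```python
class ThresholdReporter:
    """Decides when a CLE message is due under threshold-triggered sending."""

    def __init__(self, lower: float, upper: float) -> None:
        if lower > upper:
            raise ValueError("lower threshold must not exceed upper")
        self.lower = lower
        self.upper = upper
        self.congested = False   # state last reported to the ingress

    def should_send(self, cle: float) -> bool:
        """Return True if this CLE value crosses a threshold."""
        if not self.congested and cle > self.upper:
            self.congested = True    # upward crossing of upper threshold
            return True
        if self.congested and cle < self.lower:
            self.congested = False   # downward crossing of lower threshold
            return True
        return False
```

   Because each crossing produces a single message, a lost message
   leaves the PCN-ingress-node with the wrong state until the next
   crossing, which is why this approach depends on reliable delivery.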



6.  Security Considerations

   PCN-ingress-nodes and PCN-egress-nodes have to deal with distributed
   denial of service (DDoS) attacks and similar threats.  When a DDoS
   attack arrives at a PCN-ingress-node, the node should detect it and
   take action to protect both the existing traffic and itself from
   failure.



7.  IANA Considerations

   This memo presents no IANA considerations.



8.  Acknowledgements

   Thanks to Gabriele Corliano for ideas contributed in preliminary
   discussion, and to Philip Eardley for an excellent review of an
   earlier version of this memo.



9.  References

9.1.  Normative References

   [I-D.PCNarch]
              Eardley, P., "Pre-Congestion Notification Architecture",
              October 2007.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [I-D.encodComp]
              Chan, K. and G. Karagiannis, "Pre-Congestion Notification
              Encoding Comparison", July 2007.

              draft-chan-pcn-encoding-comparison-00.txt

              (Work in progress.)

   [I-D.tsvwgCLarch]
              Briscoe, B., "An edge-to-edge Deployment Model for Pre-
              Congestion Notification: Admission Control over a DiffServ
              Region", October 2006.

              draft-briscoe-tsvwg-cl-architecture-04.txt

              (Expired work in progress.)

   [RFC1633]  Braden, B., Clark, D., and S. Shenker, "Integrated
              Services in the Internet Architecture: an Overview",
              RFC 1633, June 1994.



Authors' Addresses

   Tina Tsou
   Huawei Technologies
   F3-5-089S, R&D Center,
   Longgang District
   Shenzhen  518129
   China

   Email: tena@huawei.com


   Tom Taylor
   Huawei Technologies
   1852 Lorraine Ave
   Ottawa, Ontario  K1H 6Z8
   Canada

   Phone: +1 613 680 2675
   Email: tom.taylor@rogers.com



Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

