Congestion and Pre Congestion                               T. Moncaster
Internet-Draft                                                        BT
Intended status: Standards Track                              B. Briscoe
Expires: December 25, 2008                                      BT & UCL
                                                                M. Menth
                                                 University of Wuerzburg
                                                           June 23, 2008

     Baseline Encoding and Transport of Pre-Congestion Information
                draft-moncaster-pcn-baseline-encoding-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.  Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on December 25, 2008.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Abstract

Pre-congestion notification (PCN) provides information to support
admission control and flow termination in order to protect the
Quality of Service of inelastic flows.  It does this by marking
packets when traffic load on a link is approaching or has exceeded a
rate threshold below the physical link rate.  This document specifies
how such marks are to be encoded into the IP header.  The baseline
encoding described here provides for only two PCN encoding states.
Another document describes an extended encoding scheme that allows
for three encoding states.

Status

This memo is posted as an Internet-Draft with an intent to eventually
progress to standards track.

Table of Contents

1.  Introduction
2.  Requirements notation
3.  Terminology
4.  Encoding two PCN States in IP
  4.1.  Rationale for Encoding
  4.2.  PCN-Enabled DiffServ Codepoints
    4.2.1.  Implications of re-using a DiffServ Codepoint
  4.3.  Valid and Invalid Encoding Transitions at a PCN Node
5.  Backwards Compatibility
6.  IANA Considerations
7.  Security Considerations
8.  Conclusions
9.  Acknowledgements
10. Comments Solicited
11. References
  11.1.  Normative References
  11.2.  Informative References
Appendix A.  Tunnelling Constraints
Appendix B.  Deployment Scenarios for PCN Using Baseline Encoding
Authors' Addresses
Intellectual Property and Copyright Statements

1. Introduction

Pre-congestion notification (PCN) provides information to support
admission control and flow termination in order to protect the
quality of service (QoS) of inelastic flows.
This is achieved by marking packets according to the level of
pre-congestion at nodes within the PCN-domain.  Two algorithms exist
for that purpose.  Excess traffic marking marks all PCN packets
exceeding a certain reference rate on a link, while threshold marking
marks all PCN packets on a link when the PCN traffic rate exceeds the
reference rate.  These markings are evaluated by the egress nodes of
the PCN-domain.  [PCN-arch] describes how PCN packet markings can be
used to assure the QoS of inelastic flows within a single DiffServ
domain.

This document specifies how these PCN marks are encoded into the IP
header.  It also describes how packets are identified as belonging to
a PCN flow.  Some deployment models require two PCN encoding states,
others require three.  The baseline encoding described here only
provides for two PCN encoding states.  An extended encoding described
in [PCN-3-enc-state] provides for three PCN encoding states.

Changes from previous drafts (to be removed by the RFC Editor)

From -00 to -01:

o  Change of title from "Encoding and Transport of (Pre-)Congestion
   Information from within a DiffServ Domain to the Egress"

o  Extensive changes to Introduction and abstract.

o  Added a section on the implications of re-using a DSCP.

o  Added appendix listing possible operator scenarios for using this
   baseline encoding.

o  Minor changes throughout.

2. Requirements notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

3. Terminology

The following terms are used in this document:

o  not-PCN - packets that are not PCN capable.

o  PCN-marked - codepoint indicating packets that have been marked at
   a PCN interior node using some PCN marking behaviour.  Also PM.

o  Not-Marked - codepoint indicating packets that are PCN capable but
   are not PCN-marked.  Also NM.

o  PCN-Capable codepoints - collective term for all the NM and PM
   codepoints.

o  PCN enabled Diffserv codepoint - a Diffserv codepoint for which
   PCN has been enabled on a particular machine.

In addition the document uses the terminology described in
[PCN-arch].

4. Encoding two PCN States in IP

The PCN encoding states are defined using a combination of the DSCP
field and ECN field in the IP header.  The baseline PCN encoding
closely follows the semantics of ECN [RFC3168].  It allows the
encoding of two PCN states: Not Marked and PCN-Marked.  It also
allows for traffic that is not PCN capable to be marked as such
(not-PCN).  The following table defines how to encode these states in
IP:

+--------+--------------+-------------+-------------+---------+
|  DSCP  | Not-ECT (00) | ECT(0) (10) | ECT(1) (01) | CE (11) |
+--------+--------------+-------------+-------------+---------+
| DSCP n |   not-PCN    |     NM      |     NM      |   PM    |
+--------+--------------+-------------+-------------+---------+

Where DSCP n is a PCN-enabled DiffServ codepoint (see Section 4.2)

                  Table 1: Encoding PCN in IP

The following rules apply to all PCN traffic:

o  PCN traffic MUST be marked with a DiffServ codepoint that
   indicates PCN is enabled.  To conserve DSCPs, DiffServ Codepoints
   SHOULD be chosen that are already defined for use with admission
   controlled traffic, such as the Voice-Admit codepoint defined in
   [voice-admit].

o  Any packet that is not PCN capable (not-PCN) but which shares the
   same DiffServ codepoint as PCN capable traffic MUST have the ECN
   field set to 00.

o  Any packet that belongs to a PCN capable flow MUST have the ECN
   field set to one of the two ECT codepoints 10 or 01 at the
   PCN-ingress-node.

o  Any packet that is PCN capable and has been PCN-marked by a
   PCN-interior-node MUST have the ECN field set to 11.
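
For illustration only, the mapping in Table 1 can be sketched as a
small lookup.  This is a minimal sketch under stated assumptions, not
normative text: the names PcnState, PCN_ENABLED_DSCPS, ECN_TO_PCN and
classify are hypothetical, and the DSCP value 44 is an arbitrary
placeholder for whichever codepoint an operator has enabled for PCN.

```python
from enum import Enum


class PcnState(Enum):
    NOT_PCN = "not-PCN"  # ECN field 00
    NM = "NM"            # Not-Marked: ECN field 10 or 01
    PM = "PM"            # PCN-marked: ECN field 11


# Hypothetical node configuration: DSCPs for which PCN has been enabled.
# The value 44 is an arbitrary placeholder, not assigned by this document.
PCN_ENABLED_DSCPS = {44}

# Table 1, keyed by the two-bit ECN field.
ECN_TO_PCN = {
    0b00: PcnState.NOT_PCN,
    0b10: PcnState.NM,
    0b01: PcnState.NM,
    0b11: PcnState.PM,
}


def classify(dscp, ecn):
    """Return the PCN state of a packet, or None if PCN semantics do not apply."""
    if not 0 <= ecn <= 0b11:
        raise ValueError("the ECN field is two bits wide")
    if dscp not in PCN_ENABLED_DSCPS:
        # Not a PCN-enabled DSCP: the ECN field keeps its RFC 3168 meaning.
        return None
    return ECN_TO_PCN[ecn]
```

With this sketch, classify(44, 0b11) yields PcnState.PM, and a packet
carrying a DSCP outside the configured set yields None because the
baseline encoding only applies to PCN-enabled DSCPs.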

4.1. Rationale for Encoding

The exact choice of encoding was dictated by the constraints imposed
by existing IETF RFCs, in particular [RFC3168] and [RFC4774].  Full
details are contained in [pcn-enc-compare].  One of the tightest
constraints was the need for any PCN encoding to survive being
tunnelled through either an IP-in-IP tunnel or an IPsec tunnel.
Appendix A explains this in detail.  The main effect of this
constraint was that any PCN marking has to use the ECN field set to
11 (CE codepoint).  If the packet is being tunnelled then only the CE
codepoint gets copied into the inner header upon decapsulation.  An
additional constraint was the need to minimise the use of DiffServ
codepoints as these are in increasingly short supply.  Section 4.2
explains how we have minimised this still further by reusing
pre-existing DiffServ codepoint(s) such that non-PCN traffic can
still be distinguished from PCN traffic.

The encoding scheme (Table 1) that best addresses the above
constraints ends up looking very similar to ECN.  This is perhaps not
surprising given the similarity in architectural intent between PCN
and ECN.

4.2. PCN-Enabled DiffServ Codepoints

Equipment complying with the baseline PCN encoding MUST allow PCN to
be enabled for a certain DiffServ codepoint or codepoints.  This
document defines the term 'PCN-Enabled DiffServ Codepoint' for such a
DSCP.  Enabling PCN for a DSCP switches on PCN marking behaviour for
packets with that DSCP, but only if those packets also have their ECN
field set to a codepoint other than not-PCN.

Enabling PCN marking behaviour disables any other marking behaviour
(e.g. enabling PCN also disables the default ECN marking behaviour
introduced in [RFC3168]).  The scheduling behaviour used for a packet
does not change whether PCN is enabled for a DSCP or not and whatever
the setting of the ECN field.

4.2.1. Implications of re-using a DiffServ Codepoint

[RFC4774] requires that packets for which alternate ECN semantics
(PCN semantics) are used are clearly distinguished from packets to
which the semantics according to [RFC3168] apply.  This is done by
using a DSCP to indicate that the ECN field is to be interpreted in
the PCN context instead of the ECN context by PCN-enabled nodes.
Non-PCN-enabled forwarding nodes outside or inside the PCN domain
treat packets with a PCN-enabled DSCP like ECN traffic if appropriate
ECN codepoints are set in the IP header.  This has several
consequences.

o  Care must be taken that the PCN encoding of packets is not falsely
   interpreted by forwarding nodes as ECN encoding, and that no harm
   is done if this were to happen.  To that end, appropriate marking
   and re-marking is performed at the ingress and the egress of a PCN
   domain.

o  The re-used DSCP should be able to serve its original purpose
   which was not PCN support.  This is achieved by marking the
   packets of such flows with a not-PCN codepoint.

o  The scheduling behaviour is coupled with the DSCP only.
   Therefore, the same scheduling and buffer management rules are
   applied for non-PCN-capable and PCN-capable traffic using the same
   PCN-enabled DSCP.

o  Once the ECN field of a packet is used for PCN encoding, it has
   lost its previous information unless this information was
   tunnelled through the PCN domain.  Therefore, the baseline PCN
   encoding disables ECN for PCN-enabled DSCPs.  [PCN-3-enc-state]
   provides end-to-end ECN support where this is needed.

4.3. Valid and Invalid Encoding Transitions at a PCN Node

PCN edge node behaviour compliant with the PCN baseline encoding:

o  Any packets with the ECN field already marked as CE or ECT
   arriving at a PCN ingress node SHOULD be dropped or alternatively
   MAY be tunnelled through the PCN-domain.  They MUST NOT be
   admitted to the PCN-domain directly.

o  On leaving the PCN-domain the ECN bits MUST be set to 00
   (Not-ECT).

PCN interior node behaviour compliant with the PCN baseline encoding:

o  PCN interior nodes MUST NOT change not-PCN to another codepoint
   and they MUST NOT change a PCN-Capable codepoint to not-PCN.

o  PCN interior nodes that are in a pre-congestion state above the
   configured level MUST set the PM codepoint by changing the ECN
   bits of NM marked packets to 11.

o  The PM codepoint MUST NOT be changed to NM.

5. Backwards Compatibility

BCP 124 [RFC4774] gives guidelines for specifying alternative
semantics for the ECN field.  It sets out a number of factors that
must be taken into consideration.  It also suggests various
techniques to allow the co-existence of default ECN and alternative
ECN semantics.  The alternative semantics specified here are
compliant with this BCP:

o  they use a DSCP to allow routers to distinguish that traffic uses
   the alternate ECN semantics;

o  these semantics are defined for use within a controlled domain;

o  ECN marked traffic is blocked from entering the PCN domain
   directly (though it might be tunnelled through the domain).

6. IANA Considerations

This document makes no request to IANA.  It does however suggest a
change to the default ([RFC3168]) behaviour for the ECN field for the
Voice-Admit [voice-admit] DSCP.

7. Security Considerations

Packets claim entitlement to be PCN marked by carrying a PCN-enabled
DSCP and a PCN-Capable ECN codepoint.  This encoding document is
intended to stand independently of the architecture used to determine
whether specific packets are authorised to be PCN marked, which will
be described in a future separate document on PCN edge-node
behaviour.

The PCN working group has initially been chartered to only consider a
PCN-domain to be entirely under the control of one operator, or a set
of operators who trust each other [PCN-charter].
However there is a requirement to keep inter-domain scenarios in mind
when defining the PCN encoding.  One way to extend to multiple
domains would be to concatenate PCN-domains and use
PCN-boundary-nodes back to back at borders.  Then any one domain's
security against its neighbours would be described as part of the
edge-node behaviour document as above.  One proposal on the table
allows one to extend PCN across multiple domains without PCN edge
nodes back-to-back at borders [re-PCN].  It is believed that the
encoding described here would not be incompatible with the security
framework described there.

8. Conclusions

This document defines the baseline PCN encoding utilising a
combination of a PCN-enabled DSCP and the ECN field in the IP header.
This baseline encoding allows the existence of two PCN encoding
states, Not Marked and PCN-Marked.  It also allows for the
co-existence of non-PCN traffic within the same DSCP.  The encoding
scheme is conformant with [RFC4774].

9. Acknowledgements

This document builds extensively on work done in the PCN working
group by Kwok Ho Chan, Georgios Karagiannis, Philip Eardley and
others.  Full details of the alternative schemes that were considered
for adoption can be found in the document [pcn-enc-compare].  Thanks
to Ruediger Geib for providing comments on this document.

10. Comments Solicited

Comments and questions are encouraged and very welcome.  They can be
addressed to the IETF Transport Area working group mailing list,
and/or to the authors.

11. References

11.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4774]  Floyd, S., "Specifying Alternate Semantics for the
           Explicit Congestion Notification (ECN) Field", BCP 124,
           RFC 4774, November 2006.

11.2. Informative References

[PCN-3-enc-state]
           Moncaster, T., Briscoe, B., and M.
           Menth, "A three state extended PCN encoding scheme",
           draft-moncaster-pcn-3-state-encoding-00 (work in
           progress), June 2008.

[PCN-arch]
           Eardley, P., "Pre-Congestion Notification Architecture",
           draft-ietf-pcn-architecture-03 (work in progress),
           February 2008.

[PCN-charter]
           "IETF Charter for Congestion and Pre-Congestion
           Notification Working Group".

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, September 2001.

[RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
           Internet Protocol", RFC 4301, December 2005.

[pcn-enc-compare]
           Chan, K., Karagiannis, G., Moncaster, T., Menth, M.,
           Eardley, P., and B. Briscoe, "Pre-Congestion Notification
           Encoding Comparison",
           draft-chan-pcn-encoding-comparison-03 (work in progress),
           February 2008.

[re-PCN]   Briscoe, B., "Emulating Border Flow Policing using Re-ECN
           on Bulk Data", draft-briscoe-re-pcn-border-cheat-00 (work
           in progress), July 2007.

[voice-admit]
           Baker, F., Polk, J., and M. Dolly, "DSCPs for
           Capacity-Admitted Traffic",
           draft-ietf-tsvwg-admitted-realtime-dscp-04 (work in
           progress), February 2008.

Appendix A. Tunnelling Constraints

The rules that govern the behaviour of the ECN field for IP-in-IP
tunnels were defined in [RFC3168].  This allowed for two tunnel modes
to exist.  The limited functionality mode sets the outer header to
Not-ECT, regardless of the value of the inner header.  The full
functionality mode copies the inner ECN field into the outer header
if the inner header is Not-ECT or either of the two ECT codepoints.
If the inner header is CE then the outer header is set to ECT(0).  On
decapsulation, if the CE codepoint is set on the outer header then
this is copied into the inner header.  Otherwise the inner header is
left unchanged.
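
The two tunnel modes summarised above can be sketched as a pair of
functions.  This is only an illustrative sketch: the names
encapsulate and decapsulate and the full_functionality flag are
hypothetical, and the logic follows the summary given in this
appendix rather than the complete rule set of [RFC3168].

```python
# Two-bit ECN codepoints as used in Table 1.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def encapsulate(inner_ecn, full_functionality=True):
    """Return the outer-header ECN field chosen at the tunnel ingress."""
    if not full_functionality:
        # Limited functionality mode: outer header is always Not-ECT.
        return NOT_ECT
    if inner_ecn == CE:
        # Full functionality mode: CE is not copied, ECT(0) is used instead.
        return ECT0
    # Not-ECT, ECT(0) and ECT(1) are copied to the outer header.
    return inner_ecn


def decapsulate(inner_ecn, outer_ecn):
    """Return the inner-header ECN field after the tunnel egress."""
    if outer_ecn == CE:
        # A CE mark set inside the tunnel is copied to the inner header.
        return CE
    # Any other outer value is discarded with the outer header.
    return inner_ecn
```

For example, encapsulate(CE) gives ECT0, so a mark already carried by
the inner header is not visible in the outer header while the packet
is inside the tunnel, whereas decapsulate(ECT0, CE) gives CE.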

The apparent reason for blocking CE from being copied to the outer
header was to prevent this from being used as a covert channel
through IPsec tunnels.

The IPsec protocol [RFC4301] changed the ECN tunnelling rule to allow
IPsec tunnels to simply copy the inner header into the outer header.
On decapsulation the outer header is discarded and the ECN field is
only copied down if it is set to CE.

Because of the possible existence of tunnels, only CE (11) can be
used as a PCN marking as it is the only mark that will survive
decapsulation.

There is a further issue involving tunnelling.  In [RFC3168],
IP-in-IP tunnels are expected to set the ECN field to ECT(0) if the
inner ECN field is set to CE.  This leads to the possibility that
some packets within the PCN-domain that have already been marked may
have that mark concealed further into the domain.  This is
undesirable for many PCN schemes and thus standard IP-in-IP tunnels
SHOULD NOT be used within a PCN-domain.

Appendix B. Deployment Scenarios for PCN Using Baseline Encoding

We illustrate the use of PCN baseline encoding for different PCN
deployment scenarios and explain also a case for which baseline
encoding is not applicable.  {Note this appendix is provided for
information only}

1.  An operator may wish to use PCN-based admission control only.  To
    that end, threshold marking based on admissible rates may be used
    as the only PCN metering and marking algorithm.  As a
    consequence, the packet marks M are interpreted as admission-stop
    (AS) marks.  The admission-control algorithm is based on
    "admissible-rate overload".

2.  An operator may wish to use PCN-based flow termination only.  To
    that end, excess rate marking based on supportable rates may be
    used as the only PCN metering and marking algorithm.  As a
    consequence, the packet marks M are interpreted as excess-traffic
    (ET) marks.
    The flow termination algorithm is based on "supportable-rate
    overload".

3.  An operator may wish to use both PCN-based admission control and
    flow termination.  To that end, excess rate marking based on
    admissible rates may be used as the only PCN metering and marking
    algorithm.  As a consequence, the packet marks are interpreted as
    admission-stop (AS) marks.  Both the admission control and the
    flow termination algorithm are based on "admissible-rate
    overload".

4.  An operator may wish to implement admission control based on
    threshold marking at admissible rates and flow termination based
    on excess rate marking at supportable rates because these methods
    are believed to work better with small ingress-egress aggregates.
    Then two different markings are needed that cannot be recorded by
    the PCN baseline encoding.

Authors' Addresses

Toby Moncaster
BT
B54/70, Adastral Park
Martlesham Heath
Ipswich  IP5 3RE
UK

Phone: +44 1473 648734
Email: toby.moncaster@bt.com
URI:   http://www.cs.ucl.ac.uk/staff/B.Briscoe/

Bob Briscoe
BT & UCL
B54/77, Adastral Park
Martlesham Heath
Ipswich  IP5 3RE
UK

Phone: +44 1473 645196
Email: bob.briscoe@bt.com

Michael Menth
University of Wuerzburg
room B206, Institute of Computer Science
Am Hubland
Wuerzburg  D-97074
Germany

Phone: +49 931 888 6644
Email: menth@informatik.uni-wuerzburg.de

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights.  Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard.  Please address the information to the IETF at
ietf-ipr@ietf.org.

Acknowledgments

Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).  This document was produced
using xml2rfc v1.32 (of http://xml.resource.org/) from a source in
RFC-2629 XML format.