idnits 2.17.1 

draft-briscoe-tsvwg-cl-phb-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 27.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1901.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1878.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1885.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     1905), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 49.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure
     Invitation -- however, there's a paragraph with a matching beginning.
     Boilerplate error?


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-ascii characters in the document.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([CL-ARCH]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 437 has weird spacing: '...reshold    thr...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CL-ARCH' is mentioned on line 1494, but not defined

  == Unused Reference: 'GSPa' is defined on line 1722, but no explicit
     reference was found in the text

  == Unused Reference: 'GSP- TR' is defined on line 1726, but no explicit
     reference was found in the text

  == Unused Reference: 'GSP-TR' is defined on line 1728, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2474' is defined on line 1751, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2597' is defined on line 1760, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-04) exists of
     draft-briscoe-tsvwg-cl-architecture-02

  -- Possible downref: Normative reference to a draft: ref. 'CL-arch' 

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DCAC'

  == Outdated reference: A later version (-02) exists of
     draft-floyd-ecn-alternates-00

  -- Possible downref: Normative reference to a draft: ref. 'Floyd' 

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GSPa'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GSP- TR'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GSP-TR'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Hovell'

  == Outdated reference: A later version (-01) exists of
     draft-briscoe-tsvwg-re-ecn-border-cheat-00

  -- Possible downref: Normative reference to a draft: ref. 'Re-PCN' 

  ** Downref: Normative reference to an Informational RFC: RFC 2475

  ** Downref: Normative reference to an Historic RFC: RFC 3540

  == Outdated reference: A later version (-20) exists of
     draft-ietf-nsis-rmd-06

  ** Downref: Normative reference to an Experimental draft:
     draft-ietf-nsis-rmd (ref. 'RMD')

  -- Possible downref: Normative reference to a draft: ref. 'RTECN' 

  -- Possible downref: Normative reference to a draft: ref. 'Westberg' 


     Summary: 12 errors (**), 0 flaws (~~), 14 warnings (==), 16 comments
     (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	TSVWG                                                        B. Briscoe
2	Internet Draft                                               P. Eardley
3	draft-briscoe-tsvwg-cl-phb-01.txt                           D. Songhurst
4	Expires: September 2006                                              BT

6	                                                        F. Le Faucheur
7	                                                              A. Charny
8	                                                              V. Liatsos
9	                                                    Cisco Systems, Inc

11	                                                           J. Babiarz
12	                                                                K. Chan
13	                                                            S. Dudley
14	                                                               Nortel

16	                                                         6 March, 2006

18	                    Pre-Congestion Notification marking
19	                     draft-briscoe-tsvwg-cl-phb-01.txt

21	Status of this Memo

23	   By submitting this Internet-Draft, each author represents that
24	   any applicable patent or other IPR claims of which he or she is
25	   aware have been or will be disclosed, and any of which he or she
26	   becomes aware will be disclosed, in accordance with Section 6 of
27	   BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF), its areas, and its working groups.  Note that
31	   other groups may also distribute working documents as Internet-
32	   Drafts.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   The list of current Internet-Drafts can be accessed at
40	        http://www.ietf.org/ietf/1id-abstracts.txt

42	   The list of Internet-Draft Shadow Directories can be accessed at
43	        http://www.ietf.org/shadow.html

45	   This Internet-Draft will expire in September 2006.

47	Copyright Notice

49	   Copyright (C) The Internet Society (2006).  All Rights Reserved.

51	Abstract

53	   Pre-Congestion Notification (PCN) builds on the concepts of RFC 3168,
54	   "The addition of Explicit Congestion Notification to IP". However,
55	   Pre-Congestion Notification aims at providing notification before any
56	   congestion actually occurs. Pre-Congestion Notification is applied to
57	   real-time flows (such as voice, video and multimedia streaming) in
58	   DiffServ networks. As described in [CL-ARCH], it enables "pre"
59	   congestion control through two procedures, flow admission control and
60	   flow pre-emption. The draft proposes algorithms that determine when a
61	   PCN-enabled router writes Admission Marking and Pre-emption Marking
62	   in a packet header, depending on the traffic level. The draft also
63	   proposes how to encode these markings. We present simulation results
64	   with PCN working in an edge-to-edge scenario using the marking
65	   algorithms described. Other marking algorithms will be investigated
66	   in the future.

68	Conventions used in this document

70	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
71	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
72	   document are to be interpreted as described in [RFC2119].

74	Table of Contents

76	   1. Overview
77	      1.1. Introduction
78	      1.2. Terminology
79	   2. Admission Marking algorithm
80	      2.1. Outline
81	      2.2. Virtual queue based algorithm for Admission Marking
82	      2.3. Admission control within a CL-region using Pre-Congestion
83	      Notification
84	   3. Pre-emption Marking
85	      3.1. Outline
86	      3.2. Token bucket based algorithm for Pre-emption Marking
87	      3.3. Flow pre-emption within a CL-region using Pre-Congestion
88	      Notification
89	   4. Simulation results
90	   5. Encoding the Admission Marked and Pre-emption Marked states
91	   6. Acknowledgements
92	   7. Comments solicited
93	   8. Changes from earlier version of the draft
94	   9. Appendix A: Explicit Congestion Notification
95	   10. Appendix B - Details of simulations
96	      10.1. Network and signalling model
97	      10.2. Simulated Traffic types
98	         10.2.1. Voice CBR
99	         10.2.2. On-off traffic approximating voice with silence
100	         compression
101	         10.2.3. High-rate on-off traffic
102	      10.3. Admission Control Simulations
103	         10.3.1. Summary of the key parameters for CAC
104	            10.3.1.1. Virtual Queue settings
105	            10.3.1.2. Egress measurement parameters
106	         10.3.2. Overview of the Admission Control Results
107	         10.3.3. Sensitivity to Poisson Arrivals assumption
108	         10.3.4. Sensitivity to marking parameters
109	         10.3.5. Sensitivity to RTT
110	         10.3.6. Future Work for Admission Control Experiments
111	      10.4. Flow Pre-emption Simulations
112	         10.4.1. Flow Pre-emption Model and key parameters
113	         10.4.2. Summary of Flow Pre-emption Experiments
114	         10.4.3. Future Work on Flow Pre-emption Experiments
115	   11. Appendix C - Alternative ways of encoding the Admission Marked
116	   and Pre-emption Marked States
117	      11.1. Alternative 1
118	      11.2. Alternative 2
119	      11.3. Alternative 3
120	      11.4. Alternative 4
121	      11.5. Alternative 5
122	      11.6. Comparison of Alternatives
123	         11.6.1. How compatible is the encoding scheme with RFC 3168
124	         ECN?
125	         11.6.2. Does the encoding scheme allow an "ECN-nonce"?
126	         11.6.3. Does the encoding scheme require new DSCP(s)?
127	         11.6.4. Impact on measurements
128	         11.6.5. Other issues
129	   12. References
130	   Authors' Addresses
131	   Intellectual Property Statement
132	   Disclaimer of Validity
133	   Copyright Statement

135	1. Overview

137	1.1. Introduction

139	   Pre-Congestion Notification builds on the concepts of RFC 3168, "The
140	   addition of Explicit Congestion Notification to IP". Pre-Congestion
141	   Notification is applied to real-time flows (such as voice, video and
142	   multimedia streaming) in DiffServ-enabled networks. The reader is
143	   referred to [CL-ARCH] for a description of how PCN enables "pre"
144	   congestion control through two procedures, flow admission control and
145	   flow pre-emption. Flow admission control determines whether a new
146	   microflow is added into the network. Flow pre-emption reduces the
147	   current traffic load by terminating selected microflows.

149	   Note that this draft concerns the admission control and pre-emption of
150	   *flows*, not of packets.

152	   Appendix A provides a brief summary of Explicit Congestion
153	   Notification (ECN) [RFC3168]. RFC3168 specifies that a router sets the ECN
154	   field to the Congestion Experienced (CE) value as a warning of
155	   incipient congestion. RFC3168 doesn't specify a particular algorithm
156	   for setting the CE codepoint, although RED (Random Early Detection)
157	   is expected to be used. RFC3168 states that "specifications for
158	   Diffserv PHBs [RFC2475] MAY provide more specifics" on the CE marking
159	   algorithm. This document can be seen as effectively providing such
160	   "specifics" for PHBs targeting real time services. We imagine future
161	   specifications for Diffserv PHBs could define their ECN marking
162	   algorithm by reference to this document. In particular we imagine a
163	   CL PHB definition would refer to EF [RFC3246] for its scheduling
164	   behaviour and to this draft for its ECN marking behaviour. However,
165	   currently this draft merely documents pre-congestion notification
166	   algorithms and encoding schemes that we believe are reasonably good,
167	   but not necessarily the best. On-going work will consider various
168	   alternatives and reach rough consensus on the best.

170	   This draft does not propose to change the name of the ECN field. The
171	   term PCN is solely used for the marking process. So we say pre-
172	   congestion marking is applied to the ECN field (not to the PCN
173	   field). We also keep the names of the ECN codepoints, except wherever
174	   new codepoint semantics are required. When we talk of PCN-routers, we
175	   mean routers arranged so that they will use PCN to mark packets
176	   carrying specific, configured DSCPs. PCN routers will still use
177	   default ECN semantics to mark packets carrying other DSCPs.

179	   A router enabled with Pre-Congestion Notification marks packets at a
180	   lower traffic level than an ECN-router, when there still isn't any
181	   significant build-up of real-time packets in the queue. So PCN-marked
182	   packets act as an "early warning" that the volume of traffic flowing
183	   is getting close to the engineered capacity and hence indicate to the
184	   admission control system that requests to admit new real-time flows
185	   should be rejected.

187	   In addition to admission control, another essential Quality of
188	   Service feature in deployed networks is the ability to cope with
189	   failures of nodes and links. In this situation the network's capacity
190	   is reduced and selected flows may need to be terminated (pre-empted)
191	   in order to preserve the quality of service of the remaining real-
192	   time flows. Therefore PCN-routers also include the ability to PCN-
193	   mark packets to signal that flow pre-emption may be needed.

195	   So a PCN-router needs to be configured with two reference rates:

197	   o configured-admission-rate

199	   o configured-pre-emption-rate

201	   Clearly flow pre-emption should happen at a higher traffic rate than
202	   admission control. Both these rates will be lower than the physical
203	   line rate.
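
The ordering of the two reference rates and the line rate can be checked mechanically. The following is a minimal sketch, assuming a hypothetical per-link configuration record; the class name PcnLinkConfig and its field names are illustrative and are not defined by this draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcnLinkConfig:
    """Hypothetical per-link PCN configuration (all rates in bits per second)."""

    line_rate: float
    configured_admission_rate: float
    configured_pre_emption_rate: float

    def validate(self) -> None:
        # Pre-emption must trigger at a higher rate than admission control,
        # and both reference rates must be below the physical line rate.
        if not (0 < self.configured_admission_rate
                < self.configured_pre_emption_rate
                < self.line_rate):
            raise ValueError(
                "expected 0 < configured-admission-rate "
                "< configured-pre-emption-rate < line rate"
            )


# Example values are arbitrary: a 1 Gbit/s link with reference rates
# of 600 Mbit/s and 800 Mbit/s.
PcnLinkConfig(1e9, 6e8, 8e8).validate()
```

The check uses strict inequalities, which is an assumption of the sketch; the text above only states that pre-emption happens at a higher rate than admission control and that both rates are lower than the line rate.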

205	   Note that admission control is the primary mechanism used to prevent
206	   congestion from occurring and flow pre-emption would rarely be
207	   invoked under normal conditions; it is a safety mechanism to prevent
208	   congestion from persisting after link failures, re-routes and other
209	   similar events.

211	   Together, admission control and flow pre-emption protect the
212	   forwarding service offered to admitted and non-pre-empted flows, as
213	   well as protecting service to the traffic classes using the remainder
214	   of the link capacity.

216	   Note well that a PCN-router does not achieve admission control or
217	   flow pre-emption on its own. Just like ECN, a PCN router requires a
218	   feedback system in order to control the load causing the congestion
219	   it is suffering. [CL-ARCH] describes a framework to achieve an end-
220	   to-end controlled load service by using - within a large region of
221	   the Internet - DiffServ and edge-to-edge distributed measurement-
222	   based admission control and flow pre-emption. Controlled load (CL)
223	   service is a quality of service (QoS) closely approximating the QoS
224	   that the same flow would receive from a lightly loaded network
225	   element [RFC2211]. The edge-to-edge region (which we call the CL-
226	   region) is a controlled environment, in that all routers in the CL-
227	   region are enabled with Pre-Congestion Notification and packets can
228	   only enter / leave the CL-region through (enhanced) gateways. PCN-
229	   marked packets are detected by an egress gateway and associated
230	   information is sent to the relevant ingress gateway to decide whether
231	   to admit a new flow, or even pre-empt an existing flow. [CL-ARCH]
232	   also describes a number of assumptions about the CL-region, such as
233	   that there are a large number of real-time flows between each pair of
234	   gateways; hence the CL-region is typically the backbone of an
235	   operator.

237	   We would also like to use PCN-routers in other frameworks, such as:

239	   o Where the CL-region spans networks run by different operators.

241	   o End-host to end-host, i.e. a similar architecture to that
242	      described in [RTECN].

244	   o A similar architecture to that described in [RMD].

246	   These scenarios are for further study as some of the assumptions made
247	   about the CL-region in [CL-ARCH] no longer hold. We plan later drafts
248	   to describe if and how PCN can work in these frameworks.

250	   This document describes Pre-Congestion Notification:

252	   o (Section 2) The algorithm that determines when a packet is marked
253	      so as to warn the admission control mechanism that admission
254	      control may be needed

256	   o (Section 3) The algorithm that determines when a packet is marked
257	      so as to warn the pre-emption mechanism that pre-emption may be
258	      needed

260	   o (Section 4 & Appendix B) Simulation results that demonstrate the
261	      effectiveness of stateless admission control and flow pre-emption.
262	      The results were obtained using the algorithms of Sections 2 and
263	      3. The pdf version of this document includes graphs of simulation
264	      results that aren't in the text version. It can be found at
265	      http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/ipe2eqos/gqs/papers/draft-briscoe-tsvwg-cl-phb-01.pdf

268	   o (Section 5 & Appendix C) How to encode the markings, i.e. what
269	      change to make to which bits of a packet so as to convey the
270	      admission marking and pre-emption marking to the admission control
271	      and pre-emption mechanisms on the egress gateway

273	   Sections 2 and 3 describe the algorithms a PCN-enabled router uses to
274	   decide whether it needs to set a packet into the Admission Marked or
275	   Pre-emption Marked state. The algorithms are driven by the amount of
276	   traffic in the specified real-time service class. Note that the
277	   measurement is made on an aggregate basis, i.e. it doesn't
278	   distinguish between real-time microflows. We present example
279	   implementations but the same effect may be implemented in different
280	   ways. Indeed, both the admission control and pre-emption algorithms
281	   could have been implemented as variants of token buckets, but the
282	   former is implemented as a virtual queue, to present an alternative
283	   (yet still fairly similar) implementation.

285	                          +------------+
286	                          |   Result   |
287	                          |            V
288	                      +-------+    +--------+
289	                      | Bulk  |    |  PCN   |
290	       Packets    ===>| Meter |===>| Marker |===> Marked Packets
291	                      |       |    |        |
292	                      +-------+    +--------+

294	   Figure 1: Block Diagram of Meter and Marker Function

296	   In Sections 2 and 3 we also hint at how Pre-Congestion Notification
297	   can be used within the CL-region, in order to achieve measurement-
298	   based admission control and flow pre-emption "edge-to-edge" across
299	   the CL-region. Details are in [CL-ARCH].

301	   Section 4 reports some simulation results obtained using these
302	   algorithms in the CL-region framework. Note that the aim of our
303	   simulations is to demonstrate to the IETF community that these
304	   measurement-based admission control and flow pre-emption mechanisms
305	   work successfully. It isn't to show that the particular marking
306	   algorithms simulated are the optimum ones; although we believe they
307	   are a reasonably good choice, on-going work will compare them with
308	   various alternatives.

310	   Section 5 presents one possibility for how to encode the markings.
311	   Although we believe it is a reasonable choice, there are other
312	   possibilities, some of which are listed and discussed in Appendix C.
313	   We seek advice and debate as to what scheme should be standardised.
314	   Note that the choice of how to encode the markings is non-trivial
315	   because we have five things we potentially want to encode, and only
316	   have four states in the two bits of the ECN field:

318	   o Admission Marking - the traffic level is such that the router
319	      Admission Marks the packet

321	   o Pre-emption Marking - the traffic level is such that the router
322	      Pre-emption Marks the packet

324	   o ECT(0) - the first ECT codepoint, for backwards compatibility with
325	      the ECN nonce

327	   o ECT(1) - the other ECT codepoint, for backwards compatibility with
328	      the ECN nonce

330	   o Not ECT - to indicate to a router that the traffic is not PCN-
331	      capable.
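
The counting problem can be made concrete. The sketch below lists the four ECN field codepoints as assigned by RFC 3168 and compares them with the five states listed above. It only illustrates the shortfall; it does not represent the encoding proposed in Section 5 or any alternative in Appendix C, and the Python names are illustrative.

```python
from enum import IntEnum


class EcnCodepoint(IntEnum):
    """The two-bit ECN field values as assigned by RFC 3168."""

    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


# The five things this draft would potentially like to encode.
DESIRED_STATES = [
    "Admission Marking",
    "Pre-emption Marking",
    "ECT(0)",
    "ECT(1)",
    "Not ECT",
]

available = len(EcnCodepoint)               # two bits give four codepoints
shortfall = len(DESIRED_STATES) - available

print(f"{len(DESIRED_STATES)} states wanted, {available} codepoints available, "
      f"short by {shortfall}")
```

Any encoding therefore has to give up one of the five states, overload a codepoint, or use bits outside the ECN field.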

333	1.2. Terminology

335	   o Pre-Congestion Notification (PCN): two new algorithms that
336	      determine when a PCN-enabled router Admission Marks and Pre-
337	      emption Marks a packet, depending on the traffic level.

339	   o Admission Marking condition - the traffic level is such that the
340	      router Admission Marks packets. The router provides an "early
341	      warning" that the load is nearing the engineered admission control
342	      capacity, before there is any significant build-up of CL packets
343	      in the queue.

345	   o Pre-emption Marking condition - the traffic level is such that the
346	      router Pre-emption Marks packets. The router warns explicitly that
347	      pre-emption may be needed.

349	   o Configured-admission-rate - the reference rate used by the
350	      admission marking algorithm in a PCN-enabled router.

352	   o Configured-pre-emption-rate - the reference rate used by the pre-
353	      emption marking algorithm in a PCN-enabled router.

355	2. Admission Marking algorithm

357	2.1. Outline

359	   A PCN-enabled router monitors the aggregate traffic in the specified
360	   real-time service class. Based on this measurement, the probability
361	   that the router sets a packet into the Admission Marked state is
362	   determined by the algorithm detailed below, configured to use the
363	   configured-admission-rate. The algorithm ensures that packets are set
364	   into the Admission Marked state before the actual queue builds up,
365	   but when it is in danger of doing so soon; the probability increases
366	   with the danger. Hence such packets act as an "early warning" that
367	   the engineered capacity is nearly reached, and that no more real-time
368	   flows should be admitted.

370	2.2. Virtual queue based algorithm for Admission Marking

372	   In order to make the description more specific we assume a virtual
373	   queue is used; other implementations are possible. By a virtual queue
374	   we mean a *conceptual* queue - it doesn't store packets, it is just
375	   an integer. The integer represents the dynamically changing length of
376	   a queue that would exist if the real-time packets were drained at the
377	   configured-admission-rate instead of the real scheduling rate for the
378	   relevant PHB. Note that there is a virtual queue for each outgoing
379	   link and it operates in bulk and not per microflow, i.e. the same
380	   virtual queue is used for all the real-time packets on that link. The
381	   virtual queue could be implemented, for example, with a variation of
382	   a leaky bucket.

384	   The virtual queue is:

386	   o Emptied at the configured-admission-rate, which is slower (perhaps
387	      considerably slower) than the link speed and the relevant PHB
388	      scheduling rate. This provides a safety margin to minimise the
389	      chances of unnecessarily triggering the pre-emption mechanism, for
390	      instance.

392	   o Filled when a packet arrives carrying a DSCP that has been
393	      configured for PCN (even if the packet is already admission or
394	      pre-emption marked). The amount added is the same as the number of
395	      octets in the packet.
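
The two rules above (drain at the configured-admission-rate, fill by packet size) can be sketched as a single counter that is updated on each packet arrival. This is a minimal illustration, not a definitive implementation: the class name, the use of a floating-point counter instead of an integer, and the choice to drain lazily at arrival time are all assumptions of the sketch.

```python
class VirtualQueue:
    """Conceptual queue for one outgoing link: a counter, no packets stored."""

    def __init__(self, configured_admission_rate_bps: float) -> None:
        # Convert the reference rate to octets per second.
        self.drain_rate = configured_admission_rate_bps / 8.0
        self.length_octets = 0.0
        self.last_update_s = 0.0

    def on_packet(self, now_s: float, packet_octets: int) -> float:
        """Update the queue for a packet carrying a PCN-configured DSCP."""
        # Drain for the time elapsed since the previous arrival,
        # never going below an empty queue.
        elapsed = max(0.0, now_s - self.last_update_s)
        self.length_octets = max(
            0.0, self.length_octets - elapsed * self.drain_rate
        )
        self.last_update_s = now_s
        # Fill by the packet size, even if the packet is already
        # admission or pre-emption marked.
        self.length_octets += packet_octets
        return self.length_octets


# Example: an 8 kbit/s reference rate drains 1000 octets per second.
vq = VirtualQueue(8000)
vq.on_packet(0.0, 500)    # queue holds 500 octets
vq.on_packet(0.25, 500)   # 250 octets drained, then 500 added
```

One virtual queue of this kind would exist per outgoing link and would be shared by all real-time packets on that link.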

397	   The procedure is visualised in Figure 2:

399	            _________________      _________________      ____________
400	PCN        |increment length |    | calculate       |    |decide      |
401	packet --> |of virtual queue | -> |probability of   | -> |whether to  |
402	arrives    | by size of      |    |admission marking|    |admission   |
403	           |   packet        |    | packet          |    |mark packet |
404	            -----------------      -----------------      ------------
405	Figure 2: Router action to support admission marking

407	   The router computes the probability that the packet should be set
408	   into the Admission Marked state according to the size of the virtual
409	   queue, using the following RED-like algorithm:

411	   Size of virtual queue < min-marking-threshold, probability = 0;

413	   min-marking-threshold <= Size of virtual queue <= max-marking-
414	   threshold,

416	     probability = (Size of virtual queue - min-marking-threshold) /
417	     (max-marking-threshold - min-marking-threshold);

419	   Size of virtual queue > max-marking-threshold, probability = 1.
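
The piecewise rule above can be written as one small function. This is a sketch under stated assumptions: the function and parameter names are illustrative, sizes are in octets, and a queue size exactly equal to a threshold is treated as the corresponding end of the linear ramp (probability 0 at the minimum, 1 at the maximum).

```python
def admission_marking_probability(
    vq_size: float,
    min_marking_threshold: float,
    max_marking_threshold: float,
) -> float:
    """RED-like marking probability as a function of virtual queue size."""
    if not 0 <= min_marking_threshold < max_marking_threshold:
        raise ValueError("need 0 <= min-marking-threshold < max-marking-threshold")
    if vq_size <= min_marking_threshold:
        return 0.0
    if vq_size >= max_marking_threshold:
        return 1.0
    # Linear ramp between the two thresholds.
    return (vq_size - min_marking_threshold) / (
        max_marking_threshold - min_marking_threshold
    )


# Example with arbitrary thresholds of 1000 and 3000 octets:
# a virtual queue of 2000 octets sits halfway up the ramp.
p = admission_marking_probability(2000, 1000, 3000)
```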

421	Probability   ^
422	of setting    |
423	packet into   |
424	Admission   1_|                   _______________
425	Marked        |                  /
426	state         |                 /
427	              |                /
428	              |               /
429	              |              /
430	              |             /
431	              |            /
432	            0_|___________/
433	              |
434	               -----------|-------|-------------->
435	                         min-    max-          Size of virtual queue
436	                     marking-    marking-
437	                    threshold    threshold

439	Figure 3: Probability of router setting a packet into the Admission
440	Marked state

441	   So if the CL traffic is sustained at a level greater than the
442	   configured-admission-rate then all packets are eventually admission
443	   marked. However, a short burst of traffic at greater than the
444	   configured-admission-rate (measured over the burst) may not trigger
445	   any admission marking if the burst is sufficiently short that the
446	   virtual queue doesn't grow beyond the min-marking-threshold.

448	   A packet that is already pre-emption marked is never re-marked to the
449	   admission marked state. The decision whether to set a particular
450	   packet into the Admission Marked state is made on a per-packet basis,
451	   i.e. independently of the decision for the previous packet.
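
The per-packet decision can be sketched as an independent random draw against the marking probability, with pre-emption marked packets left untouched. The names below are illustrative. Leaving an already admission marked packet in that state is an assumption of the sketch; the text above only rules out re-marking pre-emption marked packets.

```python
import random
from enum import Enum


class PcnState(Enum):
    """Hypothetical labels for the marking state carried by a packet."""

    UNMARKED = "unmarked"
    ADMISSION_MARKED = "admission-marked"
    PRE_EMPTION_MARKED = "pre-emption-marked"


def apply_admission_marking(
    state: PcnState, probability: float, rng: random.Random
) -> PcnState:
    """Decide the outgoing state of one packet, independently of earlier packets."""
    if state is PcnState.PRE_EMPTION_MARKED:
        # Never re-mark a pre-emption marked packet to admission marked.
        return state
    if state is PcnState.ADMISSION_MARKED:
        # Assumption: an upstream admission mark is preserved.
        return state
    # rng.random() is in [0.0, 1.0), so probability 1 always marks
    # and probability 0 never marks.
    if rng.random() < probability:
        return PcnState.ADMISSION_MARKED
    return state


rng = random.Random()
outgoing = apply_admission_marking(PcnState.UNMARKED, 0.5, rng)
```

Because each call draws a fresh random number, the outcome for one packet does not depend on the outcome for the previous one.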

453	2.3. Admission control within a CL-region using Pre-Congestion
454	   Notification

456	   As an example of how the Admission Marking algorithm enables
457	   admission control, we briefly consider the edge-to-edge framework
458	   described in [CL-ARCH]. As real-time packets enter a CL-region, they
459	   are re-marked with the CL DSCP and the appropriate ECT codepoint to
460	   enable PCN marking. As these CL-packets travel across the edge-to-
461	   edge CL-region, nodes may set the packets into the Admission Marked
462	   state, as determined by the algorithm described above. The egress
463	   gateway of the region measures the fraction of the real-time traffic
464	   that is in the Admission Marked state, with a separate measurement
465	   made for traffic from each ingress gateway. It calculates the
466	   fraction as an exponentially weighted moving average (which we term
467	   Congestion-Level-Estimate, or CLE). When signalling for a new flow
468	   arrives at the egress gateway, the gateway reports the CLE to the
469	   CL-region's ingress gateway, piggy-backed on the signalling. The
470	   ingress gateway only admits the new real-time microflow if the CLE is
471	   less than the CLE-threshold. Hence previously accepted microflows are
472	   protected and so suffer minimal queuing delay, jitter and loss.
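
The egress measurement and the ingress decision can be sketched as follows. This is an illustration under assumptions, not the method specified in [CL-ARCH]: the per-packet form of the moving average, the weight parameter, and all class and function names are hypothetical.

```python
from collections import defaultdict


class EgressCleMeter:
    """Per-ingress Congestion-Level-Estimate kept at an egress gateway."""

    def __init__(self, weight: float) -> None:
        if not 0 < weight <= 1:
            raise ValueError("weight must be in (0, 1]")
        self.weight = weight
        # One CLE per ingress gateway, starting at 0 (no marked traffic seen).
        self.cle = defaultdict(float)

    def on_packet(self, ingress_id: str, admission_marked: bool) -> float:
        # Exponentially weighted moving average of the marked fraction:
        # each packet contributes a sample of 1 (marked) or 0 (unmarked).
        sample = 1.0 if admission_marked else 0.0
        self.cle[ingress_id] += self.weight * (sample - self.cle[ingress_id])
        return self.cle[ingress_id]


def ingress_admits(reported_cle: float, cle_threshold: float) -> bool:
    """Ingress gateway rule: admit only if the CLE is below the CLE-threshold."""
    return reported_cle < cle_threshold


meter = EgressCleMeter(weight=0.5)
meter.on_packet("ingress-A", True)
meter.on_packet("ingress-A", True)
admit = ingress_admits(meter.cle["ingress-A"], cle_threshold=0.5)
```

A weight as large as 0.5 is used here only to keep the arithmetic easy to follow; a deployed meter would presumably use a much smaller weight so that the estimate reflects many packets.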

474	3. Pre-emption Marking

476	3.1. Outline

478	   A PCN-enabled router monitors the aggregate traffic in the specified
479	   real-time service class. Based on this measurement, when the rate of
480	   real-time traffic exceeds the configured-pre-emption-rate for some
481	   time, the router will set packets into the Pre-emption Marked state
482	   as determined by the algorithm detailed below. The configured-pre-
483	   emption-rate is less than the link speed and less than the relevant
484	   PHB scheduling rate, so that Pre-emption Marked packets act as an
485	   explicit alert that the engineered capacity is nearly reached, and
486	   that some real-time flows may need to be pre-empted. This minimises
487	   the chances of a router randomly dropping packets, and hence the
488	   Quality of Service of the remaining flows is fully preserved. Also,
489	   service is preserved to traffic in other service classes using the
490	   remaining capacity.

492	   Pre-emption Marking of packets is similar in motivation to ECN-
493	   marking of packets in [RFC3168]. With [RFC3168], feedback of an ECN-
494	   marked packet causes the TCP source to halve its effective rate,
495	   whereas in our mechanism feedback of pre-emption marking enables an
496	   upstream node to terminate real-time flow(s). Pre-emption is
497	   therefore more aggressive against selected flows, but the gain is
498	   that it enables the full QoS of the remaining flows to be preserved.
499	   Note that in [RFC3168] ECN-marking a given packet is intended to
500	   result in rate adjustment of the flow to which the packet belongs;
501	   while in this draft and [CL-ARCH], Pre-emption marking a packet
502	   simply provides an indication that pre-emption may be needed and the
503	   pre-emption algorithm will then select flows to be pre-empted
504	   independently of which flow the marked packet belonged to.

506	3.2. Token bucket based algorithm for Pre-emption Marking

508	   In order to make the description more specific we assume a token
509	   bucket is used; other implementations are possible.

511	   All PCN routers maintain a token bucket per outgoing link:

513	   o Tokens are added at the configured-pre-emption-rate, which is
514	      slower than the link speed (and the relevant PHB scheduling rate).

516	   o Tokens are removed when a real-time packet arrives; the amount
517	      removed is the same as the number of octets in the packet.
518	      However, if the real-time packet has already been Pre-emption
519	      marked, then tokens are not removed. Also, if there are
520	      insufficient tokens (because removing them would cause a negative
521	      number of tokens in the token bucket), then tokens are not removed
522	      and the packet is set into the Pre-emption Marked state. This
523	      procedure is visualised in Figure 4.

525	                _   _
526	               /     \
527	              /packet \           ----------------
528	RT packet    /  in     \     Y   |Don't remove    |
529	arrives --->/Pre-emption\ -----> |any tokens from |
530	            \ Marked    /        |token bucket    |
531	             \ state?  /          ----------------
532	              \       /                  ^
533	               \_   _/                   |
534	                  |                      |
535	                N |               ---------------
536	                  |              | Set pkt into  |
537	                  |              | Pre-emption   |
538	                  |              | Marked state  |
539	                  |                --------------
540	                  v                      ^
541	                _   _                    |
542	               /     \                   |
543	              / are   \                  |
544	             / there   \                N|
545	            /sufficient \----------------+
546	            \ tokens in /               Y|        -------------------
547	             \ token   /                 |       |  Remove tokens    |
548	              \bucket?/                  +-----> | (= octets in pkt) |
549	               \_   _/                           | from token bucket |
550	                                                  ------------------

552	Figure 4: Router action to support explicit pre-emption alerting

553	   The router computes the probability that an 'unmarked' packet should
554	   be set into the Pre-emption Marked state according to the amount of
555	   tokens in the token bucket:

557	   Size of packet <= tokens in token bucket, probability = 0;

559	   Size of packet >  tokens in token bucket, probability = 1.

561	   'Unmarked' here means 'not in the Pre-emption Marked state'.

563	Probability
564	of setting      ^
565	unmarked-packet |
566	into            |
567	Pre-emption   1_|___________
568	Marked          |           |
569	state           |           |
570	                |           |
571	                |           |
572	                |           |
573	                |           |
574	                |           |
575	                |           |
576	                |           |
577	              0_|           |__________________
578	                |
579	                 -----------|------------------>
580	                           size of          Amount of tokens
581	                           packet           in token bucket

583	Figure 5: Probability of router setting a packet into Pre-emption
584	Marked state

586	   So if the CL traffic is sustained at a level greater than the
587	   configured-pre-emption-rate then 'unmarked' packet arrivals in excess
588	   of this rate (but not those below it) are pre-emption marked.
589	   However, a short burst of traffic at greater than the configured-pre-
590	   emption-rate (measured over the burst) may not trigger any pre-
591	   emption marking if the burst is sufficiently short that the token
592	   bucket doesn't run out of tokens.
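
   The token bucket behaviour of this section can be sketched as
   follows. The sketch is an assumption-laden illustration, not a
   specification: the class name, the explicit bucket depth and the
   time-driven refill are hypothetical choices, since the text above
   only fixes the fill rate and the rules for removing tokens.

```python
class PreemptionMarker:
    """Hypothetical token bucket for one outgoing link."""

    def __init__(self, rate_bytes_per_s, depth_bytes):
        self.rate = rate_bytes_per_s  # configured-pre-emption-rate
        self.depth = depth_bytes      # assumed maximum bucket size
        self.tokens = depth_bytes     # bucket starts full (assumption)
        self.last = 0.0               # time of the previous update

    def on_packet(self, now, size, already_marked):
        """Return True if the packet leaves in the Pre-emption Marked state."""
        # Add tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if already_marked:
            return True           # marked upstream: no tokens are removed
        if size <= self.tokens:
            self.tokens -= size   # sufficient tokens: remove, leave unmarked
            return False
        return True               # insufficient tokens: mark, remove nothing
```

   Because already-marked packets consume no tokens, a downstream
   router only meters the traffic that upstream routers considered
   sustainable.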

594	3.3. Flow pre-emption within a CL-region using Pre-Congestion
595	   Notification

597	   As an example of how the Pre-emption Marking algorithm enables flow
598	   pre-emption, we briefly consider the edge-to-edge framework described
599	   in [CL-ARCH]. As real-time packets travel across the edge-to-edge CL-
600	   region, nodes may set the packets into the Pre-emption Marked state,
601	   as determined by the algorithm described above.

603	   When the egress gateway of the region detects a Pre-emption Marked
604	   packet, it measures the rate of real-time traffic *excluding* any
605	   packets that are set into the Pre-emption Marked state. Hence it
606	   measures the amount of traffic that the network can actually support
607	   safely (which we term Sustainable-Aggregate-Rate). The measurement is
608	   made for traffic from a particular ingress gateway, and then reported
609	   to that ingress gateway. When it receives this message, the ingress
610	   gateway measures the aggregate-rate of real-time traffic that is
611	   being sent towards the particular egress gateway. If this measured
612	   aggregate-rate exceeds the Sustainable-Aggregate-Rate, then the
613	   ingress gateway pre-empts a sufficient number of real-time flows to
614	   bring down the aggregate-rate to (approximately) the Sustainable-
615	   Aggregate-Rate.

617	   Different implementations of the rate measurement (and the timescale
618	   of this measurement) at the egress and ingress nodes are possible.
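
   The ingress gateway's reaction can be illustrated with the sketch
   below. The function name and the selection policy (taking flows in
   the order given until the aggregate fits) are assumptions made for
   the example; this document does not prescribe which flows are
   chosen.

```python
def flows_to_preempt(flow_rates, sustainable_aggregate_rate):
    """Pick flows to pre-empt so the aggregate fits the reported rate.

    flow_rates: dict mapping a flow id to its measured rate towards one
    egress gateway (same unit as sustainable_aggregate_rate).
    """
    aggregate = sum(flow_rates.values())
    victims = []
    # Assumed policy: walk the flows in the given order. A real ingress
    # would apply its own policy (e.g. flow priority).
    for flow, rate in flow_rates.items():
        if aggregate <= sustainable_aggregate_rate:
            break
        victims.append(flow)
        aggregate -= rate
    return victims
```

   For example, with three flows of 3 units each and a reported
   Sustainable-Aggregate-Rate of 6.5 units, one flow is pre-empted.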

620	4. Simulation results

622	   We have performed an initial set of simulations of admission control
623	   and flow pre-emption mechanisms described in this document and
624	   consistent with [CL-ARCH].

626	   We investigated the performance of the admission control and flow
627	   pre-emption mechanisms with traffic modelling CBR voice, on-off
628	   traffic approximating voice with silence compression, and more
629	   aggressive on-off traffic with larger packet sizes and peak and mean
630	   rates approximating that of video traffic.

632	   In summary, both the admission control and flow pre-emption
633	   mechanisms worked well for all of these traffic types under the
634	   assumptions of [CL-ARCH] (in particular under the assumption that
635	   there are many micro-flows between any pair of ingress / egress
636	   gateways, which, in turn, translates into the assumption that
637	   relatively high speed links are used). Details of the simulation
638	   study are given in Appendix B. In the pdf version of this document,
639	   Appendix B also includes graphs of simulation results. The pdf can
640	   be found at
641	   http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/ipe2eqos/gqs/papers/
642	   draft-briscoe-tsvwg-cl-phb-01.pdf

644	   So far the simulations have been run with a sensible estimate of
645	   suitable parameters. While a limited amount of work has been done to
646	   evaluate sensitivity of the results to the simulation parameters (see
647	   Appendix B), investigating further the sensitivity to these
648	   parameters is the next step.

650	   Due to time constraints, we were able to simulate a single
651	   "congestion point" only, i.e. there was a single node where pre-
652	   congestion notification for admission control and/or pre-emption was
653	   triggered. Furthermore, admission control and flow pre-emption
654	   simulations were performed independently.  A study of the interaction
655	   of admission control and flow pre-emption is also a subject of future
656	   work.

658	5. Encoding the Admission Marked and Pre-emption Marked states

660	   In this Section we describe one proposal for how to encode the
661	   Admission Marking and Pre-emption Marking states in a packet, i.e.
662	   what change to make to which bits of a packet.

664	   The encoding scheme uses the two ECN (Explicit Congestion
665	   Notification) bits in the IP header. The four ECN codepoints are used
666	   as follows:

668	         +-----+-----+
669	         | ECN FIELD |
670	         +-----+-----+
671	         bit 6  bit 7
672	            0     0         Admission Marking
673	            0     1         ECT(1)
674	            1     0         ECT(0)
675	            1     1         Pre-emption Marking
676	          Other DSCPs       Non-PCN-Capable

678	   Figure 6: Pre-Congestion Notification's use of the ECN Field in IP

680	   To explain this, we assume that Pre-Congestion Notification is being
681	   used in the architecture described in [CL-ARCH]. It is therefore a
682	   controlled environment, with all routers in the CL-region upgraded
683	   with the PCN capability. Within the CL-region, this encoding meets
684	   the requirements of [Floyd] because a router knows a packet is PCN-
685	   capable if

687	   o Its differentiated services codepoint (DSCP) is one configured for
688	      PCN marking.

690	   When an ingress gateway gets a packet that it has agreed to treat as
691	   part of a PCN-capable microflow, then it sets the ECN field to either
692	   ECT(0) or ECT(1) as it chooses, and if necessary it sets the DSCP to
693	   a PCN-capable Diffserv codepoint. Packets with this DSCP indicate a
694	   PCN-capable transport if any of the four ECN codepoints are set.

696	   When a router gets a PCN-capable packet, then (if necessary) it re-
697	   sets the ECN field to '00' to indicate Admission Marking or to '11'
698	   to indicate Pre-emption Marking. Packets with Admission Marking may
699	   be re-marked to Pre-emption Marking, but not vice-versa.
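
   The re-marking rule for this encoding can be sketched as follows.
   The constant and function names are hypothetical, and the sketch
   assumes the caller has already checked that the packet's DSCP is
   one configured for PCN marking.

```python
# ECN field values (bits 6 and 7 of the DS/ECN octet) under this encoding.
ADMISSION_MARKED = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
PREEMPTION_MARKED = 0b11


def remark(ds_ecn_octet, new_state):
    """Return the DS/ECN octet after a router applies a PCN marking.

    Assumes the DSCP in ds_ecn_octet is PCN-capable (not checked here).
    """
    if new_state not in (ADMISSION_MARKED, PREEMPTION_MARKED):
        raise ValueError("routers only set the two marked states")
    if ds_ecn_octet & 0b11 == PREEMPTION_MARKED:
        # Pre-emption Marking is never re-marked to Admission Marking.
        return ds_ecn_octet
    # Keep the six DSCP bits, overwrite the two ECN bits.
    return (ds_ecn_octet & 0xFC) | new_state
```

   The DSCP bits are left untouched, so the packet continues to be
   recognised as PCN-capable by downstream routers.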

701	   Other frameworks would be very similar. For example, in a framework
702	   where Pre-Congestion Notification operates from one end-host to
703	   another, then the sending end-host would set the ECN field to either
704	   ECT(0) or ECT(1).

706	   One advantage of this encoding scheme is that it allows the use of
707	   the ECN nonce, thus providing similar protection against a cheater as
708	   [RFC3540]. However, if PCN marking is desired on traffic with a pre-
709	   existing scheduling behaviour (such as EF) a drawback is that a new
710	   DSCP will be required to distinguish PCN-capable traffic from traffic
711	   that isn't PCN-capable, so that a router can identify which traffic
712	   it should PCN mark.

714	   Note that although we believe the encoding scheme is reasonable, it
715	   is not our final proposal. Alternatives are listed and discussed in
716	   Appendix C. We welcome advice and comments as to the most appropriate
717	   scheme.

719	6. Acknowledgements

721	   This work has evolved from several previous independent efforts:

723	   o Guaranteed QoS Synthesis [Hovell], which evolved from the
724	      Guaranteed Stream Provider developed in the M3I project [GSPa,
725	      GSP-TR], which in turn was based on the theoretical work of
726	      Gibbens and Kelly [DCAC]

728	   o RTECN (Real-Time Explicit Congestion Notification) [RTECN]

730	   o RMD (Resource Management in DiffServ) [RMD] and [Westberg]

732	7. Comments solicited

734	   Comments and questions are encouraged and very welcome. They can be
735	   sent to the Transport Area Working Group's mailing list,
736	   tsvwg@ietf.org, and/or to the authors.

738	8. Changes from earlier version of the draft

740	   The main changes are:

742	   From -00 to -01

744	   How to use pre-congestion notification marking in a CL-region is
745	   now described in [CL-ARCH].

747	   Only one admission marking algorithm is now described.

749	   A pre-emption marking scheme has been added.

751	   Various options for encoding the marking are described and discussed
752	   in Appendix C.

754	   Simulation results are described in Appendix B and summarised in
755	   Section 4.

757	9. Appendix A: Explicit Congestion Notification

759	   This Appendix provides a brief summary of Explicit Congestion
760	   Notification (ECN).

762	   [RFC3168] specifies the incorporation of ECN to TCP and IP, including
763	   ECN's use of two bits in the IP header. It specifies a method for
764	   indicating incipient congestion to end-nodes (e.g. as in RED, Random
765	   Early Detection), where the notification is through ECN marking
766	   packets rather than dropping them.

768	   ECN uses two bits in the IP header of both IPv4 and IPv6 packets:

770	            0     1     2     3     4     5     6     7
771	         +-----+-----+-----+-----+-----+-----+-----+-----+
772	         |          DS FIELD, DSCP           | ECN FIELD |
773	         +-----+-----+-----+-----+-----+-----+-----+-----+

775	           DSCP: differentiated services codepoint
776	           ECN:  Explicit Congestion Notification

778	   Figure A.1: The Differentiated Services and ECN Fields in IP.

780	   The two bits of the ECN field have four ECN codepoints, '00' to '11':
781	         +-----+-----+
782	         | ECN FIELD |
783	         +-----+-----+
784	           ECT   CE
785	            0     0         Not-ECT
786	            0     1         ECT(1)
787	            1     0         ECT(0)
788	            1     1         CE

790	   Figure A.2: The ECN Field in IP.

792	   The not-ECT codepoint '00' indicates a packet that is not using ECN.

794	   The CE codepoint '11' is set by a router to indicate congestion to
795	   the end nodes. The term 'CE packet' denotes a packet that has the CE
796	   codepoint set.

798	   The ECN-Capable Transport (ECT) codepoints '10' and '01' (ECT(0) and
799	   ECT(1) respectively) are set by the data sender to indicate that the
800	   end-points of the transport protocol are ECN-capable. Routers treat
801	   the ECT(0) and ECT(1) codepoints as equivalent. Senders are free to
802	   use either the ECT(0) or the ECT(1) codepoint to indicate ECT, on a
803	   packet-by-packet basis. The use of both the two codepoints for ECT is
804	   motivated primarily by the desire to allow mechanisms for the data
805	   sender to verify that network elements are not erasing the CE
806	   codepoint, and that data receivers are properly reporting to the
807	   sender the receipt of packets with the CE codepoint set.

809	   ECN requires support from the transport protocol, in addition to the
810	   functionality given by the ECN field in the IP packet header.
811	   [RFC3168] addresses the addition of ECN Capability to TCP, specifying
812	   three new pieces of functionality: negotiation between the endpoints
813	   during connection setup to determine if they are both ECN-capable; an
814	   ECN-Echo (ECE) flag in the TCP header so that the data receiver can
815	   inform the data sender when a CE packet has been received; and a
816	   Congestion Window Reduced (CWR) flag in the TCP header so that the
817	   data sender can inform the data receiver that the congestion window
818	   has been reduced.

820	   The transport layer (e.g. TCP) must respond, in terms of congestion
821	   control, to a *single* CE packet as it would to a packet drop.

823	   The advantage of setting the CE codepoint as an indication of
824	   congestion, instead of relying on packet drops, is that it allows the
825	   receiver(s) to receive the packet, thus avoiding the potential for
826	   excessive delays due to retransmissions after packet losses.

828	10. Appendix B - Details of simulations

830	   This section provides some details on the simulation study referenced
831	   in Section 4.

833	   Note that the pdf version of this document includes graphs of
834	   simulation results that aren't in the text version. It can be found
835	   at
836	   http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/ipe2eqos/gqs/papers/
837	   draft-briscoe-tsvwg-cl-phb-01.pdf

839	10.1. Network and signalling model

841	   In most simulations, the network is modelled as a single link between
842	   an ingress and an egress node, all flows sharing the same link.
843	   Figure B.1 shows the modelled network. A is the ingress node and B is
844	   the egress node.

846	         A --- B

848	Figure B.1: Simulated Single Link Network.

850	                           A

852	                            \

854	                          B  - D - F

856	                              /

858	                           C

860	   Figure B.2: Simulated Multi Link Network.

862	   A subset of simulations uses a network structured similarly to the
863	   one shown in Figure B.2. A set of ingresses (A, B, C) is connected to
864	   an interior node in the network (D) with links of different
865	   propagation delay. This node in turn is connected to the egress (F).
866	   In this topology, different sets of flows between each ingress and
867	   the egress converge on the single link, where the pre-congestion
868	   notification algorithm is enabled. In our simulations, the network
869	   has 100 ingress nodes, each connected to the interior node with a
870	   different propagation delay (1ms to 100ms). The point of congestion
871	   is taken to be the link (D-F) connecting the interior node to the
872	   egress node. This link is modelled with a 10ms propagation delay.
873	   Therefore the range of RTTs is from 22ms to 220ms.

875	   The simple network topology was chosen because time for the
876	   simulations was limited.

878	   Our simulations concentrated primarily on the range of capacities of
879	   'bottleneck' links with sufficient aggregation - above 10 Mbps for
880	   voice and 622 Mbps for "video", up to 1 Gbps. But we also
881	   investigated slower 'bottleneck' links down to 512 kbps.

883	   In the simulation model, a call request arrives at the ingress and
884	   immediately sends a message to the egress. The message arrives at the
885	   egress after the propagation time plus link processing time (but no
886	   queuing delay). When the egress receives this message, it immediately
887	   responds to the ingress with the current Congestion-Level-Estimate.
888	   If the Congestion-Level-Estimate is below the specified CLE-
889	   threshold, the call is admitted, otherwise it is rejected.

891	   The life of a call outside the domain described above is not
892	   modelled. Propagation delay from source to the ingress and from
893	   destination to the egress is assumed negligible and is not modelled.

895	10.2. Simulated Traffic types

897	   Three types of traffic were simulated: CBR voice, on-off traffic
898	   approximating voice with silence compression, and on-off traffic with
899	   higher peak and mean rates (we termed the latter "video" as the
900	   chosen peak and mean rates were similar to those of an MPEG video
901	   stream, although no attempt was made to match any other parameters of
902	   this traffic to those of a video stream). Flow durations were
903	   exponentially distributed with a mean of 2 minutes, regardless of
904	   the traffic type. In most of the experiments flows arrived
905	   according to a Poisson process with a mean arrival rate
906	   chosen to achieve a desired amount of overload over the configured-
907	   pre-emption-rate or configured-admission-limit in each experiment.
908	   Overloads in the range 2x to 5x have been investigated.

910	   In addition, some experiments investigated a batch Poisson model.
911	   Here the batch represented a set of calls arriving at almost the same
912	   time. The batch arrival process was Poisson, and the batch size was
913	   geometrically distributed with a mean of up to 5 calls per batch.
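
   A batch Poisson arrival process of this kind can be generated as in
   the sketch below. The function name and parameters are hypothetical
   and this is not the simulator used for the study; the sketch assumes
   a geometric batch size on {1, 2, ...} whose mean is the configured
   mean batch size.

```python
import random


def batch_poisson_arrivals(batch_rate, mean_batch_size, horizon, rng):
    """Return a list of (arrival_time, calls_in_batch) up to time horizon.

    batch_rate: mean number of batches per unit time (Poisson process).
    mean_batch_size: mean of the geometric batch size, must be >= 1.
    rng: a random.Random instance, passed in for reproducibility.
    """
    p = 1.0 / mean_batch_size  # success probability of the geometric law
    t = 0.0
    arrivals = []
    while True:
        # Exponential gaps between batches give Poisson batch arrivals.
        t += rng.expovariate(batch_rate)
        if t > horizon:
            break
        size = 1
        while rng.random() >= p:  # count trials until the first success
            size += 1
        arrivals.append((t, size))
    return arrivals
```

   The mean call arrival rate is the batch rate multiplied by the mean
   batch size, so the batch rate has to be reduced accordingly when
   comparing against plain Poisson arrivals at the same demand.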

915	   For on-off traffic, on and off periods were exponentially distributed
916	   with the specified mean.

918	   Traffic parameters for each flow are summarized below:

920	10.2.1. Voice CBR

922	   * Average rate 64 Kbps

924	   * Packet length 160 bytes

926	   * Packet inter-arrival time 20ms

928	10.2.2. On-off traffic approximating voice with silence compression

930	   * Packet length 160 bytes

932	   * Long-term average rate 21.76 Kbps

934	   * On Period mean duration 340ms; during the on period traffic is sent
935	   with the CBR voice parameters described above

937	   * Off Period mean duration 660ms; no traffic is sent during the off
938	   period.

940	10.2.3. High-rate on-off traffic

942	   * Long term average rate 4 Mbps

944	   * On Period mean duration 340ms; during the on-period the packets are
945	   sent at 12 Mbps (1500 byte packets, packet inter-arrival: 1ms)

947	   * Off Period mean duration 660ms
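
   As a consistency check on the traffic parameters of Section 10.2,
   the rates can be recomputed from the packet sizes, inter-arrival
   times and on/off periods. The helper names below are hypothetical.
   Note that the high-rate model works out to 4.08 Mbps, which the
   list above rounds to 4 Mbps.

```python
def peak_rate_bps(packet_bytes, inter_arrival_ms):
    # Bits per packet divided by the packet spacing, in bits per second.
    return packet_bytes * 8 * 1000 / inter_arrival_ms


def on_off_mean_bps(peak_bps, on_ms, off_ms):
    # Long-term average: peak rate scaled by the fraction of time "on".
    return peak_bps * on_ms / (on_ms + off_ms)


voice_cbr = peak_rate_bps(160, 20)                   # CBR voice
voice_on_off = on_off_mean_bps(voice_cbr, 340, 660)  # silence compression
video_peak = peak_rate_bps(1500, 1)                  # high-rate on period
video_mean = on_off_mean_bps(video_peak, 340, 660)   # high-rate average
```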

949	10.3. Admission Control Simulations

951	10.3.1. Summary of the key parameters for CAC

953	10.3.1.1. Virtual Queue settings

955	   Most of the simulations were run with the following Virtual Queue
956	   thresholds:

958	   * min-marking-threshold: 5ms at link speed,

959	   * max-marking-threshold: 15ms at link speed,

961	   * virtual-queue-upper-limit: 20ms at link speed.

963	   The virtual-queue-upper-limit puts an upper bound on how much the
964	   virtual queue can grow.

966	   Note that the virtual queue is drained at a configured rate smaller
967	   than the link speed. Most of the simulations were set with the
968	   configured-admission-rate of the virtual queue at half the link
969	   speed.

971	   Note that as long as there is no packet loss, the admission control
972	   scheme successfully keeps the load of admitted flows at the desired
973	   level regardless of the actual setting of the configured-admission-
974	   limit.  However, it is not clear if this remains true when the
975	   configured-admission-rate is close to the link speed/actual queue
976	   service rate.  Further work is necessary to quantify the performance
977	   of the scheme with smaller service rate/virtual queue rate ratio,
978	   where packet loss may be an issue.

980	10.3.1.2. Egress measurement parameters

982	   In our simulations, the CLE-threshold was chosen as 0.5. The CLE is
983	   computed as an exponentially weighted moving average (EWMA) with a
984	   weight of 0.01. The CLE is computed on a per-packet basis.
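
   To give a feel for the timescale implied by these settings, the
   sketch below computes how many consecutive Admission Marked packets
   are needed for the CLE to reach the CLE-threshold. It assumes the
   CLE starts at 0 and that every packet is marked, so that after n
   packets the CLE is 1 - (1 - weight)^n. The function names are
   hypothetical.

```python
import math


def packets_to_reach(threshold, weight):
    # Closed form: smallest n with 1 - (1 - weight)**n >= threshold.
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - weight))


def packets_to_reach_by_iteration(threshold, weight):
    # Same quantity obtained by running the per-packet EWMA update.
    cle, n = 0.0, 0
    while cle < threshold:
        cle = (1.0 - weight) * cle + weight  # sample is 1: packet is marked
        n += 1
    return n
```

   With a weight of 0.01 and a CLE-threshold of 0.5, both functions
   give 69 packets.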

986	10.3.2. Overview of the Admission Control Results

988	   We found that on links of capacity from 10Mbps to OC3, congestion
989	   control for CBR voice and on-off voice traffic works reliably with
990	   the range of parameters we simulated, with both Poisson and Batch
991	   call arrivals. As the performance of the algorithm was quite good at
992	   these speeds, and generally improves with the degree of aggregation
993	   of traffic, we chose not to investigate higher link speeds for CBR
994	   and on-off voice, within the time constraints of this effort.

997	   For higher-rate on-off "video" traffic, due to time limitations we
998	   simulated 1Gbps and OC12 (622 Mbps) links and Poisson arrivals only.
999	   Note that due to the high mean and peak rates of this traffic model,
1000	   slower links are unlikely to yield a sufficient level of aggregation
1001	   of this type of traffic to satisfy the flow aggregation assumptions
1003	   of [CL-ARCH]. Our simulations indicated that this model also behaved
1004	   quite well, although the deviation from the configured-admission-rate
1005	   is slightly higher in this case than for the less bursty traffic
1006	   models.

1008	   For these link speeds and traffic models, we investigated the demand
1009	   overload of 2x-5x.

1011	   Table B.1 below summarizes the worst case difference between the
1012	   admitted load vs. configured-admission-rate. The worst case
1013	   difference was taken over all experiments with the corresponding
1014	   range of link speeds and demand overloads. In general, the higher the
1015	   demand, the more challenging it is for the admission control
1016	   algorithm due to a larger number of near-simultaneous arrivals at
1017	   higher overloads, and as a result the worst case results in Table B.1
1018	   correspond to the 5x demand overload experiments.

1020	------------------------------------------------------------------
1021	|               |         |           | diff between  |          |
1022	| Link type     | traffic | call      | mean admitted | standard |
1023	|               | type    | arrival   | load &        | deviation|
1024	|               |         | process   | conf-adm-rate |          |
1025	------------------------------------------------------------------
1026	|T3,100Mbps,OC3 | CBR     | POISSON   |    0.5%       |   0.5%   |
1027	------------------------------------------------------------------
1029	|T3,100Mbps,OC3 |ON-OFF V | POISSON   |    2.5%       |   2.5%   |
1030	------------------------------------------------------------------
1031	|T3,100Mbps,OC3 | CBR     |  BATCH    |    1.0%       |   1.0%   |
1032	------------------------------------------------------------------
1033	|T3,100Mbps,OC3 |ON-OFF V |  BATCH    |    3.0%       |   3.0%   |
1034	------------------------------------------------------------------
1035	|  1Gbps        | "Video" |  POISSON  |    2.0%       |   8.0%   |
1036	------------------------------------------------------------------
1037	|  OC12         | "Video" |  POISSON  |    0.0%       |  10.0%   |
1038	------------------------------------------------------------------
1039	Table B.1. Summary of the admission control results for links at T3
1040	speeds and above
1041	Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

1043	   Sample simulation graphs for the experiments summarized in Table B.1
1044	   can be viewed in the PDF version of this draft. It can be found at
1045	   http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/ipe2eqos/gqs/papers/
1046	   draft-briscoe-tsvwg-cl-phb-01.pdf

1048	   On slower links, the accuracy of the admission control algorithm was
1049	   lower with Poisson arrivals, and especially so with burstier
1050	   Batch arrivals. This is described in Section 10.3.3 below.

1052	   In general, we find that the admission control algorithm performs
1053	   better the larger the degree of aggregation of traffic on the link.
1054	   The algorithm performs well in the range of link speeds we expect to
1055	   see in a CL region.

1057	10.3.3. Sensitivity to Poisson Arrivals assumption

1059	   We investigated whether making the call arrival process burstier than
1060	   Poisson has an effect on the performance of the admission control
1061	   algorithm. To that end we investigated the comparative performance of
1062	   the algorithm with Poisson and Batch call arrival processes,
1063	   described in section 10.2. The mean call arrival rate was the same
1064	   for both processes, with the demand overloads ranging from 2x to 5x.

1066	   We found that the admission control algorithm works reliably for both
1067	   CBR and VBR on links of 10Mbps and above for up to 5x overloads for
1068	   both Poisson and Batch call arrivals. We also found that the
1069	   admission control algorithm only works reasonably well on links of 1
1070	   Mb/s if we assume CBR traffic and Poisson arrivals. At T1 speeds and
1071	   below, Batch arrivals resulted in over-admission, the degree of which
1072	   increased on slower links.

1074	   Table B.2 below summarizes the difference between the admitted load
1075	   and the configured-admission-rate for CBR Voice in the case of
1076	   Poisson and Batch arrivals. Table B.3 provides a similar summary for
1077	   on-off traffic simulating voice with silence compression. The results
1078	   in the tables correspond to the worst case across all overload
1079	   factors (and when multiple links speeds are listed, across all those
1080	   link speeds).

1082	-------------------------------------------------------------
1083	|              |             | diff between  |              |
1084	| Link type    |  arrival    | mean admitted | standard     |
1085	|              |  model      | load &        | deviation    |
1086	|              |             | conf-adm-rate |              |
1087	-------------------------------------------------------------
1088	| 1Mbps, T1    |    BATCH    |      30.0%    |      30.0%   |
1089	-------------------------------------------------------------
1090	|  10 Mbps     |    BATCH    |       5.0%    |       8.0%   |
1091	-------------------------------------------------------------
1092	|T3,100Mbps,OC3|    BATCH    |       1.0%    |       1.0%   |
1093	-------------------------------------------------------------
1094	|  1Mbps, T1   |  POISSON    |       5.0%    |      10.0%   |
1095	-------------------------------------------------------------
1096	| 10 Mbps      |  POISSON    |       1.0%    |       2.0%   |
1097	-------------------------------------------------------------
1098	|T3,100Mbps,OC3|  POISSON    |       0.5%    |       0.5%   |
1099	-------------------------------------------------------------
1100	Table B.2. Comparison of Poisson and Batch call arrival models for CBR
1101	voice.   Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

1102	-------------------------------------------------------------
1103	|              |             | diff between  |              |
1104	| Link type    |  arrival    | mean admitted | standard     |
1105	|              |  model      | load &        | deviation    |
1106	|              |             | conf-adm-rate |              |
1107	-------------------------------------------------------------
1108	| 1Mbps, T1    |    BATCH    |      40.0%    |      30.0%   |
1109	-------------------------------------------------------------
1110	|  10 Mbps     |    BATCH    |       8.0%    |       6.0%   |
1111	-------------------------------------------------------------
1112	|T3,100Mbps,OC3|   BATCH     |       3.0%    |       3.0%   |
1113	-------------------------------------------------------------
1114	|  1Mbps, T1   |  POISSON    |      15.0%    |      20.0%   |
1115	-------------------------------------------------------------
1116	| 10 Mbps      |  POISSON    |       7.0%    |       6.0%   |
1117	-------------------------------------------------------------
1118	|T3,100Mbps,OC3|  POISSON    |       2.5%    |       2.5%   |
1119	-------------------------------------------------------------
1120	Table B.3. Comparison of Poisson and Batch call arrival models for on-
1121	off voice with silence compression.
1122	Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

1123	10.3.4. Sensitivity to marking parameters

1125	   The behaviour of the congestion control algorithm in all simulation
1126	   experiments did not substantially differ depending on whether the
1127	   marking was "ramp", i.e. whether a separate min-marking-threshold and
1128	   max-marking-threshold were used, with linear marking probability
1129	   between these thresholds, or whether the marking was "step" with the
1130	   min-marking-threshold and max-marking-threshold collapsed at the max-
1131	   marking-threshold value, and marking all packets with probability 1
1132	   above this collapsed threshold.
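
   As an illustration only, the two marking behaviours compared above
   can be sketched as a single function of the virtual queue level. The
   function name, the use of milliseconds and the example thresholds
   (taken from the settings listed in Table B.4) are assumptions made
   for this sketch, not part of the simulated implementation.

```python
def marking_probability(vq_level_ms, min_threshold_ms, max_threshold_ms):
    """Hypothetical helper: probability of marking a packet.

    "ramp": min < max, probability rises linearly between the thresholds.
    "step": min == max, probability is 1 above the collapsed threshold.
    """
    if vq_level_ms <= min_threshold_ms:
        return 0.0  # at or below the lower threshold: never mark
    if vq_level_ms >= max_threshold_ms:
        return 1.0  # at or above the upper threshold: always mark
    # Strictly between the thresholds, so max > min and the division is safe.
    return (vq_level_ms - min_threshold_ms) / (max_threshold_ms - min_threshold_ms)


# "ramp" with thresholds of 5 ms and 15 ms, "step" collapsed at 15 ms.
ramp_mid = marking_probability(10, 5, 15)
step_below = marking_probability(10, 15, 15)
step_above = marking_probability(16, 15, 15)
```

   With these settings a virtual queue level of 10 ms is marked with
   probability 0.5 under "ramp" but is not marked at all under "step".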

1134	   However, the difference between "ramp" and "step" may be more visible
1135	   in the multiple congestion point case (recall that only single-
1136	   congestion-point experiments have been performed so far).

1138	   Another possible reason for this apparent lack of difference between
1139	   "ramp" and "step" may relate to the choice of the egress measurement
1140	   parameters and a relatively high CLE threshold of 50%. Choosing a
1141	   lower CLE-acceptance threshold and a faster measurement timescale may
1142	   result in a better sensitivity to lower levels of marked traffic.
1143	   Investigating the interaction between settings of the marking
1144	   thresholds, the CLE-threshold, and the measurement parameters at the
1145	   egress is an area of future investigation.

1147	   In contrast, the limited number of simulation experiments we
1148	   performed indicates that the choice of the absolute values of the
1149	   min-marking-threshold, the max-marking-threshold and the virtual-
1150	   queue-upper-limit can have an effect on the algorithm performance.
1151	   Specifically, choosing the min-marking-threshold and the max-marking-
1152	   threshold too small may cause substantial underutilization,
1153	   especially on the slow links. However, at larger values of the min-
1154	   marking-threshold and the max-marking-threshold, preliminary
1155	   experiments suggest the algorithm's performance is insensitive to
1156	   their values. The choice of the virtual-queue-upper-limit affects the
1157	   amount of over-admission (above the configured-admission-rate
1158	   threshold) in some cases, although this effect is not consistent
1159	   throughout the experiments.

1161	   Table B.4 below gives a summary of the difference between the
1162	   admitted load and the configured-admission-rate as a function of the
1163	   virtual queue parameters, for the 4 Mbps on-off traffic model.  The
1164	   results in the table represent the worst-case result among the
1165	   experiments with different degrees of demand overload in the range of
1166	   2x-5x. Typically, a higher deviation of admitted load from the
1167	   configured-admission-rate occurs for a higher degree of demand
1168	   overload.

1170	-------------------------------------------------------------
1171	|            |               | diff between  |              |
1172	| Link type  |min-threshold, | mean admitted | standard     |
1173	|            |max-threshold, | load &        | deviation    |
1174	|            |upper-limit(ms)| conf-adm-rate |              |
1175	------------------------------------------------------------
1176	|  1Gbps     |5, 15, 20      |       6.0%    |       8.0%   |
1177	-------------------------------------------------------------
1178	|  1Gbps     |1, 5, 10       |       2.0%    |       7.0%   |
1179	-------------------------------------------------------------
1180	|  1Gbps     |5, 15, 45      |       2.0%    |       8.0%   |
1181	-------------------------------------------------------------
1182	|  OC12      |5, 15, 20      |       5.0%    |      11.0%   |
1183	-------------------------------------------------------------
1184	|  OC12      |1, 5, 10       |       2.0%    |      13.0%   |
1185	-------------------------------------------------------------
1186	|  OC12      |5, 15, 45      |       0.0%    |      10.0%   |
1187	-------------------------------------------------------------
1188	Table B.4. Sensitivity of 4 Mbps on-off "video" traffic to the virtual
1189	queue settings.
1190	Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

1192	   The impact of the virtual queue parameter settings is a subject of
1193	   further study.

1195	10.3.5. Sensitivity to RTT

1197	   We performed a limited study of the sensitivity of the admission
1198	   control algorithm to the range of round-trip propagation times (which
1199	   are the dominant component of the control delay in a typical
1200	   environment using pre-congestion notification).

1202	   Specifically, we studied the case where different groups of flows
1203	   sharing a single bottleneck link in the network have a range of
1204	   round-trip delays between 22 and 220 ms, as shown in Figure B.2.

1206	   The results were good for all types of traffic tested, implying that
1207	   the admission control algorithm is not sensitive to either the
1208	   absolute value of the round-trip propagation time or the relative
1209	   value of the round-trip propagation time, at least in the range of
1210	   values tested. We expect this to remain true for a wider range of
1211	   round-trip propagation times.

1213	10.3.6. Future Work for Admission Control Experiments

1215	   Areas of future investigation include extending the study of
1216	   sensitivity to multiple congestion points and topologies, and further
1217	   investigation of sensitivity to factors such as marking parameters,
1218	   implementation details and time scale of egress measurements, and the
1219	   CLE-threshold. Variations on the marking algorithm will also be
1220	   studied.

1222	   Another area of investigation is to understand the sensitivity to the
1223	   ratio of configured-admission-rate to the actual queue service
1224	   rate/link speed, and specifically study how close the configured-
1225	   admission-rate can be to the actual queue draining rate. A related
1226	   investigation is to understand the effect of packet loss on the
1227	   admission control mechanisms. Packet loss can occur if the
1228	   configured-admission-rate is sufficiently close to the actual queue
1229	   rate.

1231	   More realistic video modelling and the mix of video and voice traffic
1232	   in the same queue are also areas for further study.

1234	10.4. Flow Pre-emption Simulations

1236	10.4.1. Flow Pre-emption Model and key parameters

1238	   The same single-congestion-point network model as described in
1239	   section 10.1 for admission control is used for flow pre-emption. Flow
1240	   arrival and traffic models are also the same as for the admission
1241	   control simulations.

1243	   In all flow pre-emption simulations, flows arrive at the ingress
1244	   according to a Poisson distribution, with the mean load of
1245	   "unrestricted" arrivals exceeding the pre-emption threshold by a
1246	   factor of 2 to 5. However, as explained below, the pre-emption
1247	   simulations involve a very sudden surge of traffic to simulate a
1248	   network failure scenario.

1250	   In the simulation, the router implementing PCN Pre-emption Marking
1251	   operates as described in section 3, marking packets which find no
1252	   token in the token bucket. When an egress gateway receives a marked
1253	   packet from the ingress, it will start measuring its Sustainable-
1254	   Aggregate-Rate for this ingress, if it is not already in the pre-
1255	   emption mode.

1257	   If a marked packet arrives while the egress is already in the pre-
1258	   emption mode, the packet is ignored.

1260	   The measurement is interval-based, with a 100ms measurement interval
1261	   chosen in all simulations.

1263	   At the end of the measurement interval, the egress sends the measured
1264	   Sustainable-Aggregate-Rate to the ingress, and leaves the pre-emption
1265	   mode.
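
   The egress behaviour described above can be sketched as follows.
   This is a simplified illustration under stated assumptions, not the
   simulated implementation: the class and method names are
   hypothetical, the Sustainable-Aggregate-Rate is assumed to be the
   rate of unmarked traffic received during the interval, and the end
   of the interval is detected on the next packet arrival rather than
   by a timer.

```python
class EgressMeter:
    """Hypothetical per-ingress egress state for pre-emption measurement."""

    def __init__(self, interval_s=0.1):
        self.interval_s = interval_s  # 100ms interval, as in the simulations
        self.start = None             # None: not in pre-emption mode
        self.unmarked_bits = 0

    def on_packet(self, now, size_bits, marked):
        """Process one packet.

        Returns the measured rate in bits/s when an interval has ended
        (this is the value reported to the ingress), otherwise None.
        """
        report = None
        if self.start is not None and now >= self.start + self.interval_s:
            # Interval over: report the rate and leave pre-emption mode.
            report = self.unmarked_bits / self.interval_s
            self.start = None
        if self.start is None:
            if marked:
                # A marked packet outside pre-emption mode starts a measurement.
                self.start = now
                self.unmarked_bits = 0
        elif not marked:
            # Assumption: only unmarked traffic counts towards the rate.
            self.unmarked_bits += size_bits
        # A marked packet arriving in pre-emption mode is ignored.
        return report
```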

1267	   When the ingress receives the sustainable rate from the egress, it
1268	   starts its own interval immediately (unless it is already in a
1269	   measurement interval), and measures its sending rate to that egress.
1270	   Then at the end of that measurement interval, it pre-empts the
1271	   necessary amount of traffic. The ingress then leaves the pre-emption
1272	   mode until the next time it receives the sustainable rate estimate
1273	   from the egress.

1275	   Due to time limitations, in all our simulations the ingress used the
1276	   same length of the measurement interval as the egress. Investigation
1277	   of the impact of different measurement intervals is an important area
1278	   of future work.

1280	   To avoid excessive pre-emption due to rate measurement errors, we
1281	   used two error factors, Error1 and Error2, to decide when to pre-empt
1282	   and how much to pre-empt at the ingress. To that end, the ingress did
1283	   not trigger pre-emption unless the sending rate it measured was
1284	   greater than SAR + Error1 (SAR = Sustainable-Aggregate-Rate).
1285	   Similarly, the ingress pre-empted enough flows to reduce its sending
1286	   rate to SAR - Error2. Both Error1 and Error2 in all simulations were
1287	   in the range of 2-5%.
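
   A minimal sketch of this ingress decision is given below. It assumes
   that Error1 and Error2 are expressed as fractions of the
   Sustainable-Aggregate-Rate (the text gives them only as percentages),
   and the function name and the example rates are hypothetical.

```python
def rate_to_preempt(sending_rate, sar, error1=0.05, error2=0.05):
    """Return how much traffic (same units as the rates) to pre-empt.

    Assumption: error1 and error2 are fractions of SAR, e.g. 0.05 for 5%.
    """
    if sending_rate <= sar * (1.0 + error1):
        return 0.0  # within the measurement error margin: do not pre-empt
    # Pre-empt enough to bring the sending rate down to SAR - Error2.
    return sending_rate - sar * (1.0 - error2)


# Example: the egress reports 100 Mbps as sustainable.
no_action = rate_to_preempt(103.0, 100.0)  # below SAR + 5%: nothing to do
excess = rate_to_preempt(120.0, 100.0)     # about 25 Mbps to pre-empt
```

   The gap between the trigger level (SAR + Error1) and the target level
   (SAR - Error2) gives the mechanism some hysteresis, so that small
   measurement errors do not cause repeated pre-emption.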

1289	   The configured-pre-emption-rate was set to 50% of link speed. Token
1290	   bucket depth was set to 64 packets for CBR and 128 packets for on-off
1291	   traffic.

1293	   We only tested on the network shown in Figure B.1 and we experimented
1294	   with different propagation delay values: 10ms, 50ms and 100ms.

1296	   Due to time limitations, only links above T3 rate were simulated in
1297	   the pre-emption experiments.

1299	   In all pre-emption experiments, we simulated a base load of traffic
1300	   below the pre-emption threshold. At some point during the experiment,
1301	   the load was suddenly increased to simulate a sudden overload such as
1302	   might occur after a link failure causes rerouting of some traffic to
1303	   a previously un-congested link. In order to model the fact that a
1304	   link failure may cause flows to be rerouted to a particular link over
1305	   a period of time, we simulated a "one-wave" traffic surge, where the
1306	   extra flows arrived near-simultaneously, and a "three-wave" traffic
1307	   surge, where there are two surges of traffic arriving close together
1308	   (within one measurement interval), followed by a third surge at a
1309	   later time.

1311	10.4.2. Summary of Flow Pre-emption Experiments

1313	   Our initial simulations demonstrated that, in general, the
1314	   performance of the flow pre-emption mechanism was good, and the
1315	   appropriate amount of traffic was pre-empted in all simulated cases,
1316	   as long as the depth of the pre-emption token bucket was set
1317	   appropriately (64 packets for CBR, 128 or higher for on-off traffic).
1318	   Pre-emption always occurred very fast (in particular, in the
1319	   simulation graphs shown in the PDF version of this document, with a
1320	   time granularity of 1 second, pre-emption looks instantaneous).

1322	   Perhaps the most useful result of the simulation experiments we were
1323	   able to run so far was the importance of choosing a token bucket
1324	   depth deep enough to accommodate the expected burstiness of CL
1325	   traffic. If the token bucket depth is too small, instantaneous bursts
1326	   may cause false pre-emption events. Note that if traffic load is
1327	   stable or decreasing, then marking some packets erroneously during an
1328	   unexpected short burst does not cause any false pre-emption, because
1329	   the rate measurement of the sustained rate is not affected by a small
1330	   amount of pre-emption-marked packets.  However, if the traffic load
1331	   is increasing (while still remaining below the pre-emption level on
1332	   average), a packet marked for pre-emption because it found no tokens
1333	   in the too-shallow token bucket may cause a false pre-emption
1334	   event.
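
   The effect of bucket depth on a short burst can be illustrated with
   the sketch below. It is not the marking algorithm of section 3: it
   assumes that tokens are counted in whole packets, that unmarked
   packets consume one token each and that marked packets consume none.
   The class name, the helper function, the token rate and the burst
   size of 100 packets are all hypothetical.

```python
class TokenBucketMarker:
    """Hypothetical packet-counting token bucket for pre-emption marking."""

    def __init__(self, rate_pps, depth_packets):
        self.rate_pps = rate_pps            # token fill rate, packets/s
        self.depth = float(depth_packets)   # bucket depth, packets
        self.tokens = float(depth_packets)  # bucket starts full
        self.last = 0.0

    def on_packet(self, now):
        """Return True if the packet finds no token and is marked."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate_pps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return False
        return True


def marked_in_burst(depth_packets, burst_packets, rate_pps=1000.0):
    """Count marked packets when a burst arrives back-to-back at time 0."""
    marker = TokenBucketMarker(rate_pps, depth_packets)
    return sum(marker.on_packet(0.0) for _ in range(burst_packets))


shallow = marked_in_burst(64, 100)   # 36 packets find no token
deep = marked_in_burst(128, 100)     # no packet is marked
```

   In this toy case the burst is absorbed by the 128-packet bucket, while
   the 64-packet bucket marks part of it even though the average load may
   be well below the pre-emption level.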

1336	10.4.3. Future Work on Flow Pre-emption Experiments

1338	   Further work is required to study potential ways of reducing the
1339	   sensitivity of the algorithm to the token bucket depth. One potential
1340	   approach is to smooth out the pre-emption signal by requiring a
1341	   certain number of pre-emption-marked packets to arrive at the egress
1342	   before measurement of the sustainable rate is triggered. An obvious
1343	   trade-off to be quantified is the corresponding increase in the
1344	   reaction time to receiving a pre-emption-marked packet.

1346	   Further quantification of the sensitivity to traffic burstiness and
1347	   rate measurement implementation and time scales is an important area
1348	   for future work.

1350	   More realistic video modelling and the mix of video and voice traffic
1351	   in the same queue are also areas for further study.

1353	   Another area of further investigation is the interaction of flow pre-
1354	   emption and admission control, and specifically understanding how
1355	   close the admission and pre-emption rates can be on one link. A
1356	   related topic is the interaction of flow pre-emption and admission
1357	   control triggered by different links for the same ingress-egress
1358	   pair.

1360	   The exact algorithm for selecting which flows to pre-empt in the case
1361	   of variable-rate flows and a mixture of traffic profiles is a subject
1362	   of further study.

1364	   Representative graphs for pre-emption experiments are presented in
1365	   the PDF version of this draft. It can be found at
1366	   http://www.cs.ucl.ac.uk/staff/B.Briscoe/projects/ipe2eqos/gqs/papers/
1367	   draft-briscoe-tsvwg-cl-phb-01.pdf

1369	11. Appendix C - Alternative ways of encoding the Admission Marked and
1370	   Pre-emption Marked States

1372	   In this Appendix we list and discuss alternative ways of encoding the
1373	   Admission Marked and Pre-emption Marked states. We ignore minor
1374	   variants such as swapping the encoding for the Admission Marked and
1375	   Pre-emption Marked states.

1377	11.1. Alternative 1

1379	   The first alternative is the one given in Section 5 above.

1381	         +-----+-----+
1382	         | ECN FIELD |
1383	         +-----+-----+
1384	         bit 6  bit 7
1385	            0     0         Admission Marking
1386	            0     1         ECT(1)
1387	            1     0         ECT(0)
1388	            1     1         Pre-emption Marking

1390	         Other DSCPs        Not ECN capable

1392	   Figure C.1: Encoding scheme Alternative 1

1394	11.2. Alternative 2

1396	   In the second alternative, both Admission Marking and Pre-emption
1397	   Marking are encoded as '11', depending on the original ECT marking:

1399	   o Setting the ECN field of an ECT(1) packet to '11' indicates
1400	      Admission Marking

1402	   o Setting the ECN field of an ECT(0) packet to '11' indicates Pre-
1403	      emption Marking
1404	         +-----+-----+
1405	         | ECN FIELD |
1406	         +-----+-----+
1407	         bit 6  bit 7
1408	            0     0         Not-ECT
1409	            0     1         ECT(1/A)  re-mark ECT(1) to '11' to encode
1410	                                      Admission Marking
1411	            1     0         ECT(0/P)  re-mark ECT(0) to '11' to encode
1412	                                      Pre-emption Marking
1413	            1     1         Admission Marking or Pre-emption Marking

1415	   Figure C.2: Encoding scheme Alternative 2

1417	11.3. Alternative 3

1419	   The third alternative is a combination of the previous two schemes.

1421	         +-----+-----+
1422	         | ECN FIELD |
1423	         +-----+-----+
1424	         bit 6  bit 7
1425	            0     0         Admission Marking
1426	            0     1         ECT(1/A)  re-mark ECT(1) to '00' to encode
1427	                                      Admission Marking
1428	            1     0         ECT(0/P)  re-mark ECT(0) to '11' to encode
1429	                                      Pre-emption Marking
1430	            1     1         Pre-emption Marking

1432	         Other DSCPs        Not ECN capable

1434	   Figure C.3: Encoding scheme Alternative 3

1436	11.4. Alternative 4

1438	   In the fourth alternative a packet is re-marked with a new DSCP to
1439	   indicate Pre-emption Marking.

1441	         +-----+-----+
1442	         | ECN FIELD |
1443	         +-----+-----+
1444	         bit 6  bit 7
1445	            0     0         Not ECN capable
1446	            0     1         ECT(1)
1447	            1     0         ECT(0)
1448	            1     1         Admission Marking

1450	            New DSCP        Pre-emption Marking

1452	   Figure C.4: Encoding scheme Alternative 4

1454	11.5. Alternative 5

1456	   The fifth alternative doesn't include the ECN nonce.

1458	         +-----+-----+
1459	         | ECN FIELD |
1460	         +-----+-----+
1461	         bit 6  bit 7
1462	            0     0         Not ECN capable
1463	            0     1         PCN capable
1464	            1     0         Admission Marking
1465	            1     1         Pre-emption Marking

1467	   Figure C.5: Encoding scheme Alternative 5
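
   For comparison, the codepoint assignments of Figures C.1 to C.5 can
   be collected into one lookup table. The sketch below is only a
   restatement of those figures as data; the table and function names
   are hypothetical. Note that Alternative 4 signals Pre-emption Marking
   with a new DSCP, which is outside the ECN field and so does not
   appear in the table.

```python
# ECN field value ("bit 6" then "bit 7") -> meaning, per alternative.
ALTERNATIVES = {
    1: {"00": "Admission Marking", "01": "ECT(1)",
        "10": "ECT(0)", "11": "Pre-emption Marking"},
    2: {"00": "Not-ECT", "01": "ECT(1/A)",
        "10": "ECT(0/P)", "11": "Admission Marking or Pre-emption Marking"},
    3: {"00": "Admission Marking", "01": "ECT(1/A)",
        "10": "ECT(0/P)", "11": "Pre-emption Marking"},
    4: {"00": "Not ECN capable", "01": "ECT(1)",
        "10": "ECT(0)", "11": "Admission Marking"},
    5: {"00": "Not ECN capable", "01": "PCN capable",
        "10": "Admission Marking", "11": "Pre-emption Marking"},
}


def meaning(alternative, bit6, bit7):
    """Look up what an ECN field value means under one alternative."""
    return ALTERNATIVES[alternative][f"{bit6}{bit7}"]


def alternatives_where(codepoint, state):
    """List the alternatives in which `codepoint` means exactly `state`."""
    return [n for n, table in sorted(ALTERNATIVES.items())
            if table[codepoint] == state]


preempt_on_11 = alternatives_where("11", "Pre-emption Marking")
admit_on_00 = alternatives_where("00", "Admission Marking")
```

   Here preempt_on_11 lists Alternatives 1, 3 and 5, and admit_on_00
   lists Alternatives 1 and 3.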

1469	11.6. Comparison of Alternatives

1471	   In this section we compare the encoding alternatives against various
1472	   criteria. No scheme is perfect. We would like feedback and advice
1473	   from the IETF community as to which is most suitable. The choice of
1474	   how to encode the markings is non-trivial because we have five things
1475	   we want to encode, and only have four states available in the two
1476	   bits of the ECN field:

1478	   o Admission Marking - the traffic level is such that the router
1479	      Admission Marks the packet

1481	   o Pre-emption Marking - the traffic level is such that the router
1482	      Pre-emption Marks the packet

1484	   o ECT(0) - the first ECT codepoint, for backwards compatibility with
1485	      the ECN nonce

1487	   o ECT(1) - the other ECT codepoint, for backwards compatibility with
1488	      the ECN nonce

1490	   o Not ECN - to indicate to a router that the traffic is not ECN-
1491	      capable, and indeed not PCN-capable.

1493	   Some of the issues won't be relevant in particular scenarios. For
1494	   example, with the CL-region framework [CL-arch], the edge-to-edge
1495	   region is a controlled environment so an ECN (RFC3168) packet should
1496	   never encounter a PCN-enabled router.

1498	   Occasionally we use the terminology of the CL-region framework. This
1499	   is merely to make the language more specific.

1501	11.6.1. How compatible is the encoding scheme with RFC 3168 ECN?

1503	   All the encoding schemes for Pre-Congestion Notification use the ECN
1504	   field, so there will be interactions between PCN and ECN. Three
1505	   aspects are:

1507	   o What happens if an ECN (RFC3168) packet encounters a PCN-enabled
1508	      router?

1510	   o What happens if a PCN-capable packet encounters an ECN-enabled
1511	      router?

1513	   o What happens if a flow that has been admitted, using the PCN-based
1514	      admission control mechanism, wants to use ECN (i.e. from end-point
1515	      to end-point as in RFC3168)?

1517	   The first two bullets are about an "unusual" situation, perhaps where
1518	   re-routing means that a PCN-enabled packet gets routed onto an ECN
1519	   router - or perhaps where one of the CL-region's ingress gateways is
1520	   misconfigured so that it allows ECN packets into the CL traffic
1521	   class. The third bullet is when the end-point wants its flow, which
1522	   has been reserved using PCN-based admission control, to also use ECN-
1523	   congestion control. There has been some discussion (and disagreement)
1524	   about whether this is a realistic requirement [Floyd] [tsvwg-ml].

1526	   o What happens if an ECN (RFC3168) packet encounters a PCN-enabled
1527	      router?

1529	   The main issue here is if traffic at the PCN-router is above the
1530	   admission or pre-emption threshold, and what then happens when the
1531	   ECN packet reaches the RFC3168 ECN end-point.

1533	   Alternatives 2 and 4 are very safe. If the PCN-router Admission Marks
1534	   a packet ('11'), the ECN end-point interprets this as the CE
1535	   codepoint. The admission threshold is lower (perhaps much lower) than
1536	   an ECN threshold would be.

1538	   Alternative 3 is also safe. If the PCN-router Pre-emption Marks a
1539	   packet ('11'), the ECN end-point interprets this as the CE codepoint.
1540	   The pre-emption threshold is likely to be lower than an ECN threshold
1541	   would be, and is definitely lower than the traffic level at which
1542	   packets would start to be dropped.

1544	   Alternative 5 is probably OK. However if the level of RFC3168 traffic
1545	   is above the PCN router's configured-admission-rate but below its
1546	   configured-pre-emption-rate, then packets are admission marked (to
1547	   '10') but not pre-emption marked (to '11'). Therefore the ECN traffic
1548	   would tend to block new PCN flows, but not reduce its own rate. This
1549	   would be safer with the encodings for admission marking and pre-
1550	   emption marking swapped.

1552	   With Alternatives 1 and 3, if traffic is above the admission
1553	   threshold then packets will be re-marked to '00'. A subsequent ECN
1554	   router will therefore think the packet isn't ECN-capable.

1556	   With Alternative 5 packets are admission marked to '10', which could
1557	   confuse an ECN RFC3168 end-point using the ECN nonce.

1559	   o What happens if a PCN-capable packet encounters an ECN-enabled
1560	      router?

1562	   The main issue is if the ECN-router is becoming congested, so it
1563	   changes the ECN field to '11', to indicate Congestion Experienced
1564	   (CE).

1566	   With Alternatives 1, 3 and 5 '11' will be interpreted as Pre-emption
1567	   Marking, so the pre-emption mechanism will be triggered.

1569	   With Alternative 2 either the pre-emption or admission mechanism
1570	   would be triggered (depending whether it was originally a '10' or
1571	   '01' packet).

1573	   With Alternative 4 the admission control mechanism will be triggered.

1575	   Interpretation of '11' as pre-emption marking is probably safer than
1576	   interpreting it as admission marking, because it then pre-empts flows
1577	   going through a congested ECN router. However, it isn't clear-cut
1578	   what 'safe' means in this context.

1580	   o What happens if a flow that has been admitted, using the PCN-based
1581	      admission control mechanism, wants to use ECN (i.e. from end-point
1582	      to end-point as in RFC3168)?

1584	   For instance with the CL-region framework, it isn't clear what the
1585	   ingress gateway should do if it gets a packet with the CE codepoint,
1586	   '11'. All the PCN encoding schemes have the same issue. Some options:

1588	   - the ingress gateway could re-set a '11' packet to one of the ECT
1589	      codepoints. However, as far as the ECN-end-point is concerned, the
1590	      CE information is lost.

1592	   - The ingress gateway could pre-empt the flow. This is safer, but
1593	      perhaps harsh as the flow would now be handled by the non-PCN-
1594	      capable class within the CL-region, and by the non-ECN-capable
1595	      class after that.

1597	   - Tunnelling between the ingress and egress gateways, e.g. all PCN-
1598	      capable traffic could be tunnelled. This preserves both the ECN
1599	      and PCN functionality, but at the cost of the tunnelling.

1601	11.6.2. Does the encoding scheme allow an "ECN-nonce"?

1603	   The Explicit Congestion Notification (ECN)-nonce is an optional
1604	   addition to ECN that protects against accidental or malicious
1605	   concealment of marked packets from the TCP sender. It uses the two
1606	   ECN-Capable Transport (ECT) codepoints in the ECN field of the IP
1607	   header. It improves the robustness of congestion control by enabling
1608	   co-operative senders to prevent receivers from exploiting ECN to gain
1609	   an unfair share of network bandwidth.

1611	   Pre-Congestion Notification is targeted at real-time traffic, which
1612	   we'd expect to use UDP or DCCP rather than TCP. However, we imagine
1613	   an "ECN-nonce" could be defined for DCCP and perhaps UDP with similar
1614	   functionality to the ECN-nonce.

1616	   Analysing the encoding schemes in the context of an ECN-nonce:

1618	   o Alternatives 2 and 4 would allow an ECN-nonce

1620	   o Alternatives 1 and 3 would partly allow an ECN-nonce - in terms of
1621	      the edge-to-edge framework, an egress gateway would be able to
1622	      detect a cheating ingress gateway, but it wouldn't detect an
1623	      interior router re-marking the ECN field from '11' to '00'.

1625	   o Alternative 5 wouldn't allow an ECN-nonce

1627	   An alternative scheme intended to prevent cheating when using ECN for
1628	   admission control is proposed in [Re-PCN]. This scheme claims to
1629	   provide protection against a much wider range of cheating strategies
1630	   than the ECN-nonce, including against cheating ingress nodes or
1631	   senders, whereas the ECN-nonce requires the sender to be trusted.
1632	   This scheme uses a bit outside the ECN field, so Alternative 5
1633	   combined with that scheme could solve the problem of fitting five
1634	   states into four codepoints.

1636	11.6.3. Does the encoding scheme require new DSCP(s)?

1638	   o Alternatives 2 and 5 do not.

1640	   o Alternative 1 does not allow indication of a non-PCN-capable
1641	      transport within the same DSCP as used by PCN-capable transports.
1642	      Therefore, if the PCN-routers are used with a pre-existing
1643	      scheduling behaviour (such as EF) an extra DSCP would have to be
1644	      used to indicate the combination of PCN marking with EF
1645	      scheduling.

1647	   o Alternative 4 needs a new DSCP so a PCN-router can Pre-emption
1648	      Mark a packet.

1650	   In Section 5 we suggested that the Expedited Forwarding DSCP might be
1651	   used to indicate to a PCN-router that a packet is part of a PCN-
1652	   capable flow. However PCN could be used similarly to add admission
1653	   control and flow pre-emption to other DSCP classes. With Alternative
1654	   4 a new DSCP would be needed for each PCN-enabled class.

1656	   It's not clear to what extent the requirement for extra DSCP(s)
1657	   matters. DSCPs are plentiful in an IP network, but scarce in an MPLS
1658	   network where the DSCP/ECN byte is mapped to the three MPLS header
1659	   EXP bits [MPLS/EXP]. However, note that there is at least no need to
1660	   encode the ECN-nonce in the MPLS EXP field, as it is sufficient to
1661	   encode the ECN-nonce in the underlying IP header.

1663	11.6.4. Impact on measurements

1665	   With some of the Alternatives, the measurements made by the egress
1666	   gateway, for instance, have to be modified:

1668	   With Alternatives 2 and 3, it has to measure the rate of ECT(1/A) in
1669	   order to deduce the total number of bits in admission marked packets.

1671	   With Alternative 2, the egress moves into the pre-emption alert state
1672	   if the rate of ECT(0/P) is significantly less than 50%. This is
1673	   slower than the other Alternatives which are triggered by a single
1674	   pre-emption marked packet. It also makes it more likely that the
1675	   egress moves into the pre-emption alert state when the traffic level
1676	   actually doesn't justify this.

1678	   With Alternative 4 the egress has to monitor the new DSCP in order to
1679	   measure pre-emption marked packets.

1681	11.6.5. Other issues

1683	   With Alternatives 2 and 3, Admission Marking means re-marking the ECN
1684	   field of a '01' packet and Pre-emption Marking means re-marking a
1685	   '10' packet. Therefore extra work is required compared with the other
1686	   Alternatives; exactly what the work is depends on the details of the
1687	   framework using PCN.

1689	   With Alternatives 1 and 5 Pre-emption Marking overwrites Admission
1690	   Marking.

1692	   With Alternative 4 Pre-emption Marking is indicated by a new DSCP.
1693	   Some ECMP (Equal Cost Multipath Routing) algorithms use the DSCP
1694	   field as one of the input fields used to calculate which link to
1695	   forward a packet on. Therefore, with a network running ECMP there is
1696	   a danger that a Pre-emption Marked packet might be forwarded on a
1697	   different path to other PCN-capable packets. The extent to which this
1698	   matters is for further study. It is not an issue for the other
1699	   encoding Alternatives.

1701	12. References

1703	   A later version will distinguish normative and informative
1704	   references.

1706	   [CL-arch]     B. Briscoe, P. Eardley, D. Songhurst, F. Le Faucheur,
1707	                 A. Charny, S. Dudley, J. Babiarz, K. Chan, "A
1708	                 Framework for Admission Control over DiffServ using
1709	                 Pre-Congestion Notification", draft-briscoe-tsvwg-cl-
1710	                 architecture-02.txt (work in progress), March 2006

1712	   [DCAC]        Richard J. Gibbens and Frank P. Kelly "Distributed
1713	                 connection acceptance control for a connectionless
1714	                 network", In: Proc. International Teletraffic Congress
1715	                 (ITC16), Edinburgh, pp. 941-952 (1999).

1717	   [Floyd]       S. Floyd, 'Specifying Alternate Semantics for the
1718	                 Explicit Congestion Notification (ECN) Field', draft-
1719	                 floyd-ecn-alternates-00.txt (work in progress), April
1720	                 2005

1722	   [GSPa]        Karsten (Ed.), Martin "GSP/ECN Technology &
1723	                 Experiments", Deliverable: 15.3 PtIII, M3I Eu Vth
1724	                 Framework Project IST-1999-11429, URL:
1725	                 http://www.m3i.org/ (February, 2002) (superseded by
1726	                 [GSP-TR])

1728	   [GSP-TR]      Martin Karsten and Jens Schmitt, "Admission Control
1729	                 Based on Packet Marking and Feedback Signalling --
1730	                 Mechanisms, Implementation and Experiments", TU-
1731	                 Darmstadt Technical Report TR-KOM-2002-03, URL:
1732	                 http://www.kom.e-technik.tu-
1733	                 darmstadt.de/publications/abstracts/KS02-5.html (May,
1734	                 2002)

1736	   [Hovell]      P. Hovell, R. Briscoe, G. Corliano, "Guaranteed QoS
1737	                 Synthesis - an example of a scalable core IP quality
1738	                 of service solution", BT Technology Journal, Vol 23 No
1739	                 2, April 2005

1741	   [Re-PCN]      B. Briscoe, "Emulating Border Flow Policing using Re-
1742	                 ECN on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-
1743	                 cheat-00 (work in progress), February 2006

1745	   [RFC2119]     Bradner, S., "Key words for use in RFCs to Indicate
1746	                 Requirement Levels", BCP 14, RFC 2119, March 1997.

1748	   [RFC2211]     J. Wroclawski, "Specification of the Controlled-Load
1749	                 Network Element Service", RFC 2211, September 1997.

1751	   [RFC2474]     Nichols, K., Blake, S., Baker, F. and D. Black,
1752	                 "Definition of the Differentiated Services Field (DS
1753	                 Field) in the IPv4 and IPv6 Headers", RFC 2474,
1754	                 December 1998

1756	   [RFC2475]     Blake, S., Black, D., Carlson, M., Davies, E., Wang,
1757	                 Z. and W. Weiss, "An Architecture for Differentiated
1758	                 Services", RFC 2475, December 1998.

1760	   [RFC2597]     Heinanen, J., Baker, F., Weiss, W. and J. Wroclawski,
1761	                 "Assured Forwarding PHB Group", RFC 2597, June 1999.

1763	   [RFC3168]     Ramakrishnan, K., Floyd, S. and D. Black "The Addition
1764	                 of Explicit Congestion Notification (ECN) to IP", RFC
1765	                 3168, September 2001.

1767	   [RFC3246]     B. Davie, A. Charny, J.C.R. Bennet, K. Benson, J.Y. Le
1768	                 Boudec, W. Courtney, S. Davari, V. Firoiu, D.
1769	                 Stiliadis, 'An Expedited Forwarding PHB (Per-Hop
1770	                 Behavior)', RFC 3246, March 2002.

1772	   [RFC3540]     N. Spring, D. Wetherall, D. Ely, "Robust Explicit
1773	                 Congestion Notification (ECN) Signaling with Nonces",
1774	                 RFC 3540, June 2003.

1776	   [RMD]         A. Bader, L. Westberg, G. Karagiannis, C. Kappler, T.
1777	                 Phelan, "RMD-QOSM - The Resource Management in
1778	                 DiffServ QoS model", draft-ietf-nsis-rmd-06 (work in
1779	                 progress), February 2006.

1781	   [RTECN]       Babiarz, J., Chan, K. and V. Firoiu, "Congestion
1782	                 Notification Process for Real-Time Traffic", draft-
1783	                 babiarz-tsvwg-rtecn-05 (work in progress), October 2005.

1785	   [tsvwg-ml]    Discussion on the TSVWG mailing list, Nov/Dec 2005.

1787	   [Westberg]    L. Westberg, Z. R. Turanyi, D. Partain, A. Bader, G.
1788	                 Karagiannis, "Load Control of Real-Time Traffic",
1789	                 draft-westberg-loadcntr-04.txt (work in progress),
1790	                 December 2005.

1792	Authors' Addresses

1794	   Bob Briscoe
1795	   BT Research
1796	   B54/77, Sirius House
1797	   Adastral Park
1798	   Martlesham Heath
1799	   Ipswich, Suffolk
1800	   IP5 3RE
1801	   United Kingdom
1802	   Email: bob.briscoe@bt.com

1804	   Dave Songhurst
1805	   BT Research
1806	   B54/69, Sirius House
1807	   Adastral Park
1808	   Martlesham Heath
1809	   Ipswich, Suffolk
1810	   IP5 3RE
1811	   United Kingdom
1812	   Email: dsonghurst@jungle.bt.co.uk

1814	   Philip Eardley
1815	   BT Research
1816	   B54/77, Sirius House
1817	   Adastral Park
1818	   Martlesham Heath
1819	   Ipswich, Suffolk
1820	   IP5 3RE
1821	   United Kingdom
1822	   Email: philip.eardley@bt.com

1824	   Vassilis Liatsos
1825	   Cisco Systems, Inc.
1826	   1414 Massachusetts Avenue
1827	   Boxborough,
1828	   MA 01719,
1829	   USA
1830	   Email: vliatsos@cisco.com

1832	   Francois Le Faucheur
1833	   Cisco Systems, Inc.
1834	   Village d'Entreprise Green Side - Batiment T3
1835	   400, Avenue de Roumanille
1836	   06410 Biot Sophia-Antipolis
1837	   France
1838	   Email: flefauch@cisco.com
1839	   Anna Charny
1840	   Cisco Systems, Inc.
1841	   1414 Massachusetts Ave
1842	   Boxborough,
1843	   MA 01719
1844	   USA
1845	   Email: acharny@cisco.com

1847	   Jozef Babiarz
1848	   Nortel Networks
1849	   3500 Carling Avenue
1850	   Ottawa, Ont.  K2H 8E9
1851	   Canada
1852	   Email: babiarz@nortel.com

1854	   Kwok Ho Chan
1855	   Nortel Networks
1856	   600 Technology Park Drive
1857	   Billerica, MA 01821
1858	   USA
1859	   Email: khchan@nortel.com

1861	   Stephen Dudley
1862	   Nortel Networks
1863	   4001 E. Chapel Hill Nelson Highway
1864	   P.O. Box 13010, ms 570-01-0V8
1865	   Research Triangle Park, NC 27709
1866	   USA
1867	   Email: smdudley@nortel.com

1869	Intellectual Property Statement

1871	   The IETF takes no position regarding the validity or scope of any
1872	   Intellectual Property Rights or other rights that might be claimed to
1873	   pertain to the implementation or use of the technology described in
1874	   this document or the extent to which any license under such rights
1875	   might or might not be available; nor does it represent that it has
1876	   made any independent effort to identify any such rights.  Information
1877	   on the procedures with respect to rights in RFC documents can be
1878	   found in BCP 78 and BCP 79.

1880	   Copies of IPR disclosures made to the IETF Secretariat and any
1881	   assurances of licenses to be made available, or the result of an
1882	   attempt made to obtain a general license or permission for the use of
1883	   such proprietary rights by implementers or users of this
1884	   specification can be obtained from the IETF on-line IPR repository at
1885	   http://www.ietf.org/ipr.

1887	   The IETF invites any interested party to bring to its attention any
1888	   copyrights, patents or patent applications, or other proprietary
1889	   rights that may cover technology that may be required to implement
1890	   this standard.  Please address the information to the IETF at
1891	   ietf-ipr@ietf.org.

1893	Disclaimer of Validity

1895	   This document and the information contained herein are provided on an
1896	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1897	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1898	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1899	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1900	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1901	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1903	Copyright Statement

1905	   Copyright (C) The Internet Society (2006).

1907	   This document is subject to the rights, licenses and restrictions
1908	   contained in BCP 78, and except as set forth therein, the authors
1909	   retain all their rights.