idnits 2.17.1 draft-ietf-pwe3-congestion-frmwk-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1164. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1175. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1182. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1188. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 28, 2008) is 5810 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: A later version (-16) exists of draft-ietf-pwe3-fc-encap-07 -- Obsolete informational reference (is this intentional?): RFC 2001 (Obsoleted by RFC 2581) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 3448 (Obsoleted by RFC 5348) -- Obsolete informational reference (is this intentional?): RFC 4447 (Obsoleted by RFC 8077) -- Obsolete informational reference (is this intentional?): RFC 4960 (Obsoleted by RFC 9260) Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group S. Bryant 3 Internet-Draft B. Davie 4 Intended status: Standards Track L. Martini 5 Expires: November 29, 2008 E. Rosen 6 Cisco Systems, Inc. 
May 28, 2008

                Pseudowire Congestion Control Framework
                draft-ietf-pwe3-congestion-frmwk-01.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 29, 2008.

Abstract

   Pseudowires are sometimes used to carry non-TCP data flows.  In
   these circumstances the service payload will not react to network
   congestion by reducing its offered load.  Pseudowires should
   therefore reduce their network bandwidth demands in the face of
   significant packet loss, including, if necessary, completely ceasing
   transmission.  Since it is difficult to determine a priori the
   number of equivalent TCP flows that a pseudowire represents, a
   suitably "fair" rate of back-off cannot be pre-determined.  This
   document describes the pseudowire congestion problem and provides
   guidance on the development of suitable solutions.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1.  Introduction . . . 3
   2.  PW Congestion Context . . . 3
       2.1.  Congestion in IP Networks . . . 3
       2.2.  A Short Tutorial on TCP Congestion Control . . . 4
       2.3.  Alternative Approaches to Congestion Control . . . 5
       2.4.  Pseudowires and TCP . . . 6
       2.5.  Arguments Against PW Congestion as a Practical Problem . . . 7
       2.6.  Goals and non-goals of this draft . . . 8
   3.  Challenges for PW Congestion Management . . . 9
       3.1.  Scale . . . 9
       3.2.  Interaction among control loops . . . 9
       3.3.  Determining the Appropriate Rate . . . 10
       3.4.  Constant Bit Rate PWs . . . 10
       3.5.  Valuable and Vulnerable PWs . . . 11
   4.  Congestion Control Mechanisms . . . 11
   5.  Detecting Congestion . . . 12
       5.1.  Using Sequence Numbers to Detect Congestion . . . 13
       5.2.  Using VCCV to Detect Congestion . . . 14
       5.3.  Explicit Congestion Notification . . . 15
   6.  Feedback from Receiver to Transmitter . . . 15
       6.1.  Control Plane Feedback . . . 16
       6.2.  Using Reverse Data Packets for Feedback . . . 17
       6.3.  Reverse VCCV Traffic . . . 17
   7.  Responding to Congestion . . . 17
       7.1.  Interaction with TCP . . . 18
   8.  Rate Control per Tunnel vs. per PW . . . 19
   9.  Constant Bit Rate Services . . . 20
   10. Managed vs Unmanaged Deployment . . . 20
   11. Related Work: Pre-Congestion Notification . . . 21
   12. Conclusion . . . 22
   13. Informative References . . . 22
   Authors' Addresses . . . 24
   Intellectual Property and Copyright Statements . . . 26

1.  Introduction

   Pseudowires are sometimes used to emulate network services that
   carry non-TCP data flows.  In these circumstances the service
   payload will not react to network congestion by reducing its offered
   load.  In order to protect the network, pseudowires (PWs) should
   therefore reduce their network bandwidth demands in the face of
   significant packet loss, including, if necessary, completely ceasing
   transmission.

   Pseudowires are commonly deployed as part of the managed
   infrastructure operated by a service provider.  It is likely in
   these cases that the PW will be used to carry many TCP equivalent
   flows.  In these environments the operator needs to make a judgement
   call on a fair estimate of the number of equivalent flows that the
   PW represents.  The counterpoint to a well managed network is a plug
   and play network, of the type typically found in a consumer
   environment.  Whilst PWs have been designed to provide an
   infrastructure tool to the service provider, we cannot be sure that
   they will not find some consumer or similar plug and play
   application.  In this environment a conservative approach is needed,
   with PW congestion avoidance enabled by default, and the throughput
   set by default to the equivalent of a single TCP flow.

   This framework first considers the context in which PWs operate, and
   hence the context in which PW congestion control needs to operate.
   It then explores the issues of detection, feedback and load control
   in the PW context, and the special issues that apply to constant bit
   rate services.  Finally, it considers the applicability of
   congestion control in various network environments.

   This document defines the framework to be used in designing a
   congestion control mechanism for PWs.  It does not prescribe a
   solution for any particular PW type, and it is possible that more
   than one mechanism may be required, depending on the PW type and the
   environment in which it is to be deployed.

2.  PW Congestion Context

2.1.  Congestion in IP Networks

   Congestion in an IP or MPLS enabled IP network occurs when the
   amount of traffic that needs to use a particular network resource
   exceeds the capacity of that resource.  This results first in long
   queues within the network, and then in packet loss.  If the amount
   of traffic is not then reduced, the packet loss rate will climb,
   potentially until it reaches 100%.
   If the situation is left unabated, the network enters a state where
   it is no longer able to sustain any productive information flows,
   and the result is "congestive collapse".

   To prevent this sort of "congestive collapse", there must be
   congestion control: a feedback loop by which the presence of
   congestion anywhere on the path from the transmitter to the receiver
   forces the transmitters to reduce the amount of traffic being sent.
   As a connectionless protocol, IP has no way to push back directly on
   the originator of the traffic.  Procedures for (a) detecting
   congestion, (b) providing the necessary feedback to the
   transmitters, and (c) adjusting the transmission rates, are thus
   left to the higher protocol layers such as TCP.

2.2.  A Short Tutorial on TCP Congestion Control

   TCP includes an elaborate congestion control mechanism that causes
   the end systems to reduce their transmission rates when congestion
   occurs.  For those readers not intimately familiar with the details
   of TCP congestion control, we give below a brief summary, greatly
   simplified and not entirely accurate, of TCP's very complicated
   feedback mechanism.  The details of TCP congestion control can be
   found in [RFC2581].  [RFC2001] is an earlier but more accessible
   discussion.  [RFC2914] articulates a number of general principles
   governing congestion control in the Internet.

   In TCP congestion control, a lost packet is considered to be an
   indication of congestion.  Roughly, TCP considers a given packet to
   be lost if that packet is not acknowledged within a specified time,
   or if three subsequent packets arrive at the receiver before the
   given packet.  The latter condition manifests itself at the
   transmitter as the arrival of three duplicate acknowledgements in a
   row.  The algorithm by which TCP detects congestion is thus highly
   dependent on the mechanisms used by TCP to ensure reliable and
   sequential delivery.

   Once a TCP transmitter becomes aware of congestion, it halves its
   transmission rate.  If congestion still occurs at the new rate, the
   rate is halved again.  When a rate is found at which congestion no
   longer occurs, the rate is increased by one MSS ("Maximum Segment
   Size") per RTT ("Round Trip Time").  The rate is increased each RTT
   until congestion is encountered again, or until something else
   limits it (e.g., the flow control window is reached, the application
   is transmitting at its maximum desired rate, or the line rate is
   reached).

   This sort of mechanism is known as an "Additive Increase,
   Multiplicative Decrease" (AIMD) mechanism.  Congestion causes
   relatively rapid decreases in the transmission rate, while the
   absence of congestion causes relatively slow increases in the
   allowed transmission rate.

2.3.  Alternative Approaches to Congestion Control

   [RFC2914] defines the notion of a "TCP-compatible flow":

      "A TCP-compatible flow is responsive to congestion notification,
      and in steady state uses no more bandwidth than a conformant TCP
      running under comparable conditions (drop rate, RTT [round trip
      time], MTU [maximum transmission unit], etc.)"

   TCP-compatible flows respond to congestion in much the way TCP does,
   so that they do not starve the TCP flows or otherwise obtain an
   unfair advantage.
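   To give a concrete sense of the reference rate implied by this
   definition, the sketch below evaluates the TCP throughput equation
   of [RFC3448], Section 3.1, for a given loss event rate, round-trip
   time, and segment size.  It is purely illustrative; the function and
   variable names are ours and are not taken from any PW specification.

      from math import sqrt

      def tcp_compatible_rate(s, rtt, p, b=1):
          """Approximate steady-state rate (bytes/sec) of a conformant
          TCP flow, using the throughput equation of RFC 3448, Sec 3.1.
          s: segment size (bytes); rtt: round-trip time (seconds);
          p: loss event rate (0 < p <= 1); b: packets per ACK (usually 1).
          """
          t_rto = 4 * rtt  # retransmission timeout, per RFC 3448's simplification
          denom = (rtt * sqrt(2 * b * p / 3) +
                   t_rto * 3 * sqrt(3 * b * p / 8) * p * (1 + 32 * p * p))
          return s / denom

      # Example: 1500-byte segments, 100 ms RTT, 1% loss events
      # -> roughly 170 kbytes/s, i.e. on the order of 1.3 Mbit/s.
      print(tcp_compatible_rate(1500, 0.100, 0.01))

   A PW judged to be carrying the equivalent of N TCP flows would, by
   the same logic, be expected to keep its aggregate rate within
   roughly N times this value; estimating N is discussed in
   Section 3.3.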
[RFC2914] further points out: 201 "any form of congestion control that successfully avoids a high 202 sending rate in the presence of a high packet drop rate should be 203 sufficient to avoid congestion collapse from undelivered packets." 205 "This does not mean, however, that concerns about congestion collapse 206 and fairness with TCP necessitate that all best-effort traffic deploy 207 congestion control based on TCP's Additive-Increase Multiplicative- 208 Decrease (AIMD) algorithm of reducing the sending rate in half in 209 response to each packet drop." 211 "However, the list of TCP-compatible congestion control procedures is 212 not limited to AIMD with the same increase/ decrease parameters as 213 TCP. Other TCP-compatible congestion control procedures include 214 rate-based variants of AIMD; AIMD with different sets of increase/ 215 decrease parameters that give the same steady-state behavior; 216 equation-based congestion control where the sender adjusts its 217 sending rate in response to information about the long-term packet 218 drop rate ... and possibly other forms that we have not yet begun to 219 consider." 221 The AIMD procedures are not mandated for non-TCP traffic, and might 222 not be optimal for non-TCP PW traffic. Choosing a proper set of 223 procedures which are TCP-compatible while being optimized for a 224 particular type of traffic is no simple task. 226 [RFC3448], "TCP Friendly Rate Control (TFRC)" provides an 227 alternative: TFRC is designed to be reasonably fair when competing 228 for bandwidth with TCP flows, where a flow is "reasonably fair" if 229 its sending rate is generally within a factor of two of the sending 230 rate of a TCP flow under the same conditions. However, TFRC has a 231 much lower variation of throughput over time compared with TCP, which 232 makes it more suitable for applications such as telephony or 233 streaming media where a relatively smooth sending rate is of 234 importance." 235 "For its congestion control mechanism, TFRC directly uses a 236 throughput equation for the allowed sending rate as a function of the 237 loss event rate and round-trip time. In order to compete fairly with 238 TCP, TFRC uses the TCP throughput equation, which roughly describes 239 TCP's sending rate as a function of the loss event rate, round-trip 240 time, and packet size." 242 "Generally speaking, TFRC's congestion control mechanism works as 243 follows: 245 o The receiver measures the loss event rate and feeds this 246 information back to the sender. 248 o The sender also uses these feedback messages to measure the round- 249 trip time (RTT). 251 o The loss event rate and RTT are then fed into TFRC's throughput 252 equation, giving the acceptable transmit rate. 254 o The sender then adjusts its transmit rate to match the calculated 255 rate." 257 Note that the TFRC procedures require the transmitter to calculate a 258 throughput equation. For these procedures to be feasible as a means 259 of PW congestion control, they must be computationally efficient. 260 Section 8 of [RFC3448] describes an implementation technique that 261 appears to make it efficient to calculate the equation. 263 In the context of pseudowires, we note that TFRC aims to achieve 264 comparable throughput in the long term to a single TCP connection 265 experiencing similar loss and round trip time. As noted above, this 266 may not be an appropriate rate for a PW that is carrying many flows. 268 2.4. 
Pseudowires and TCP 270 Currently, traffic in IP and MPLS networks is predominantly TCP 271 traffic. Even the layer 2 tunnelled traffic (e.g., PPP frames 272 tunnelled through L2TP) is predominantly TCP traffic from the end- 273 users. If pseudowires (PWs) [RFC3985] were to be used only for 274 carrying TCP flows, there would be no need for any PW-specific 275 congestion mechanisms. The existing TCP congestion control 276 mechanisms would be all that is needed, since any loss of packets on 277 the PW would be detected as loss of packets on a TCP connection, and 278 the TCP flow control mechanisms would ensure a reduction of 279 transmission rate. 281 If a PW is carrying non-TCP traffic, then there is no feedback 282 mechanism to cause the end-systems to reduce their transmission rates 283 in response to congestion. When congestion occurs, any TCP traffic 284 that is sharing the congested resource with the non-TCP traffic will 285 be throttled, and the non-TCP traffic may "starve" the TCP traffic. 286 If there is enough non-TCP traffic to congest the network all by 287 itself, there is nothing to prevent congestive collapse. 289 The non-TCP traffic in a PW can belong to any higher layer 290 whatsoever, and there is no way to ensure that a TCP-like congestion 291 control mechanisms will be present to regulate the traffic rate. 292 Hence it appears that there is a need for an edge-to-edge (i.e., PE- 293 to-PE) feedback mechanism which forces a transmitting PE to reduce 294 its transmission rate in the face of network congestion. 296 As TCP uses window-based flow control, controlling the rate is really 297 a matter of limiting the amount of traffic which can be "in flight" 298 (i.e., transmitted but not yet acknowledged) at any one time. Where 299 a non-windowed protocol is used for transmitting data on a PW a 300 different technique is obviously required to control the transmission 301 rate. 303 2.5. Arguments Against PW Congestion as a Practical Problem 305 One may argue that congestion due to non-TCP PW traffic is only a 306 theoretical problem and that no congestion control mechanism is 307 needed. For example the following cases have been put forward: 309 o "99.9% of all the traffic in PWs is really IP traffic" 311 If this is the case, then the traffic is either TCP traffic, which 312 is already congestion-controlled, or "other" IP traffic. While 313 the congestion control issue may exist for the "other" IP traffic, 314 this is a general issue and hence is outside the scope of the 315 pseudowire design. 317 Unfortunately, we cannot be sure that this is the case. It may 318 well be the case for the PW offerings of certain providers, but 319 perhaps not for others. It does appear that many providers want 320 to be able to use PWs for transporting "legacy traffic" of various 321 non-IP protocols. Constant bit-rate services are an example of 322 this, and raise particular issues for congestion control 323 (discussed below). 325 o "PW traffic usually stays within one SP's network, and an SP 326 always engineers its network carefully enough so that congestion 327 is an impossibility" 329 Perhaps this will be true of "most" PWs, but inter-provider PWs 330 are certainly expected to have a significant presence. 332 Even within a single provider's network, the provider might 333 consider whether he is so confident of his network engineering 334 that he does not need a feedback loop reducing the transmission 335 rate in response to congestion. 
      There is also the issue of keeping the network running (i.e., out
      of congestive collapse) after an unexpected reduction of
      capacity.

      Finally, the PW may be deployed on the Internet rather than in an
      SP environment.

   o  "If one provider accepts PW traffic from another, policing will
      be done at the entry point to the second provider's network, so
      that the second provider is sure that the first provider is not
      sending too much traffic.  This policing, together with the
      second provider's careful network engineering, makes congestion
      an impossibility"

      This could be the case given carefully controlled bilateral
      peering arrangements.  Note though that if the second provider is
      merely providing transit services for a PW whose endpoints are in
      other providers' networks, it may be difficult for the transit
      provider to tell which traffic is the PW traffic and which is
      "ordinary" IP traffic.

   o  "The only time we really need a general congestion control
      mechanism is when traffic goes through the public Internet.
      Obviously this will never be the case for PW traffic."

      It is not at all difficult to imagine someone using an IPsec
      tunnel across the public Internet to transport a PW from one
      private IP network to another.

      Nor is it difficult to imagine some enterprise implementing a PW
      and transporting it across some SP's backbone, e.g., if that SP
      is providing VPN service to that enterprise.

   The arguments that non-TCP traffic in PWs will never make any
   significant contribution to congestion thus do not seem to be
   totally compelling.

2.6.  Goals and non-goals of this draft

   As a framework, this draft aims to explore the issues surrounding PW
   congestion and to lay out some of the design trade-offs that will
   need to be made if a solution is to be developed.  It does not
   intend to propose a particular solution, but it does point out some
   of the problems that will arise with certain solution approaches.

   The over-riding technical goal of this work is to avoid scenarios in
   which the deployment of PW technology leads to congestive collapse
   of the PSN over which the PWs are tunnelled.  It is a non-goal of
   this work to ensure that PWs receive a particular Quality of Service
   (QoS) level in order to protect them from congestion (Section 3.5).
   While such an outcome may be desirable, it is beyond the scope of
   this draft.

3.  Challenges for PW Congestion Management

3.1.  Scale

   It might appear at first glance that an easy solution to PW
   congestion control would be to run the PWs through a TCP connection.
   This would provide congestion control automatically.  However, the
   overhead is prohibitive for the PW application.  The PWE3 data plane
   may be implemented in a micro-coded or hardware engine which needs
   to support thousands of PWs, and needs to do as little as possible
   for each data packet; running a TCP state machine, and implementing
   TCP's flow control procedures, would impose too high a processing
   and memory cost in this environment.  Nor do we want to add the
   large overhead of TCP to the PWs -- the large headers, the plethora
   of small acknowledgements in the reverse direction, etc.  In fact,
   we need to avoid acknowledgements altogether.  These same
   considerations lead us away from using, e.g., DCCP [RFC4340] or SCTP
   [RFC4960], even in its partially reliable mode [RFC3758].
   Therefore we will investigate some PW-specific solutions for
   congestion control.

   The PW design also needs to minimise the amount of interaction
   between the data processing path (which is likely to be distributed
   among a set of line cards) and the control path, and it must be
   especially careful of interactions which might require atomic
   read/modify/write operations from the control path, or which might
   require atomic read/modify/write operations between different
   processors in a multiprocessing implementation, as such interactions
   can cause scaling problems.

   Thus, feasible solutions for PW-specific congestion will require
   scalable means to detect congestion and to reduce the amount of
   traffic sent into the network when congestion is detected.  These
   topics are discussed in more detail in subsequent sections.

3.2.  Interaction among control loops

   As noted above, much of the traffic that is carried on PWs is likely
   to be TCP traffic, and will therefore be subject to the congestion
   control mechanisms of TCP.  It will typically be difficult for a PW
   endpoint to tell whether or not this is the case.  Thus there is the
   risk that the PE-PE congestion control mechanisms applied over the
   PW may interact in undesirable ways with the end-to-end congestion
   control mechanisms of TCP.  The PW-specific congestion control
   mechanisms should be designed to minimise the negative impact of
   such interaction.

3.3.  Determining the Appropriate Rate

   TCP tends to share the bandwidth of a bottleneck among flows
   somewhat evenly, although flows with shorter round-trip times and
   larger MSS values will tend to get more throughput in the steady
   state than those with longer RTTs or smaller MSS.  TFRC simply tries
   to deliver the same rate to a flow that a TCP flow would obtain in
   the steady state under similar conditions (loss rate, RTT and MSS).
   The challenge in the PW environment is to determine what constitutes
   a "flow".  While it is tempting to consider a single PW to be
   equivalent to a "flow" for the purposes of fair rate estimation, it
   is likely in many cases that one PW will be carrying a large number
   of flows, in which case it would seem quite unfair to throttle that
   PW down to the same rate that a single TCP flow would obtain.

   The issue of what constitutes fairness and the perils of using the
   TCP flow as the basic unit of fairness have been explored at some
   length in [I-D.briscoe-tsvarea-fair].

   In the PW environment some estimate (measured or configured) of the
   number of flows in the PW needs to be applied to the estimate of
   fair throughput as part of the process of determining appropriate
   congestion behavior.

3.4.  Constant Bit Rate PWs

   Some types of PW, for example SAToP (Structure Agnostic TDM over
   Packet) [RFC4553], CESoPSN (Circuit Emulation over Packet Switched
   Networks) [I-D.ietf-pwe3-cesopsn], TDM over IP
   [I-D.ietf-pwe3-tdmoip], SONET/SDH [I-D.ietf-pwe3-sonet], and
   Constant Bit Rate ATM PWs, represent an inelastic constant bit-rate
   (CBR) flow.  Such PWs cannot respond to congestion in the TCP-
   friendly manner prescribed by [RFC2914].  However, this inability to
   respond to congestion by reducing the total amount of bandwidth
   consumed is offset by the fact that such a PW maintains a constant
   bandwidth demand rather than being greedy (in the sense of trying to
   increase its share of a link, as TCP does).
   AIMD or even more gradual TFRC techniques are clearly not applicable
   to such services; it is not feasible to reduce the rate of a CBR
   service without violating the service definition.  Such services are
   also frequently more sensitive to packet loss than connectionless
   packet PWs.  Given that CBR services are not greedy, there may be a
   case for allowing them greater latitude during congestion peaks.  If
   some CBR PWs are not able to endure any significant packet loss or
   reduction in rate without compromising the transported service, such
   PWs must be shut down when the level of congestion becomes
   excessive, but at suitably low levels of congestion they should be
   allowed to continue to offer traffic to the network.

   Some CBR services may be carried over connectionless packet PWs.  An
   example of such a case would be a CBR MPEG-2 video stream carried
   over an Ethernet PW.  One could argue that such a service - provided
   the rate was policed at the ingress PE - should be offered the same
   latitude as a PW that explicitly provided a CBR service.  Likewise,
   there may not be much value in trying to throttle such a service
   rather than cutting it off completely during severe congestion.
   However, this clearly raises the issue of how to know that a PW is
   indeed carrying a CBR service.

3.5.  Valuable and Vulnerable PWs

   Some PWs are a premium service, for which the user has paid the
   provider for higher availability, a lower packet loss rate, etc.
   Some PWs, for example the CBR services described in Section 3.4, are
   particularly vulnerable to packet loss.  In both of these cases it
   may be tempting to relax the congestion considerations so that these
   services can press on as best they can in the event of congestion.
   However, a more appropriate strategy is to engineer the network
   paths, QoS parameters, queue sizes and scheduling parameters so that
   these services do not suffer congestion discard.  If, despite this
   network engineering, these services still experience congestion,
   that is an indication that the network is having difficulty
   servicing their needs, and the services, like any other service,
   should reduce their network load.

4.  Congestion Control Mechanisms

   There are three components of the congestion control mechanism that
   we need to consider:

   1.  Congestion state detection

   2.  Feedback from the receiver to the transmitter

   3.  Method used by the transmitter to respond to congestion state

   References to the congestion state apply both to the detection of
   the onset of the congestion state, and to detection of the return of
   the network to a state in which greater or normal bandwidth is
   available.

   We discuss the design framework for each of these components in the
   sections below.

5.  Detecting Congestion

   In TCP, congestion is detected by the transmitter; the receipt of
   three successive duplicate TCP acknowledgements is taken to be
   indicative of congestion.  What this actually means is that several
   packets in a row were received at the remote end, such that none of
   those packets had the next expected sequence number.  This is
   interpreted as meaning that the packet with the next expected
   sequence number was lost in the network, and the loss of a single
   packet in the network is taken as a sign of congestion.  (Naturally,
   the presence of congestion is also inferred if TCP has to retransmit
   a packet.)
Note that it is possible for misordered packets to be 541 misinterpreted as lost packets, if they do not arrive "soon enough". 543 In TCP, a time-out while awaiting an acknowledgement is also 544 interpreted as a sign of congestion. 546 Since there are normally no acknowledgements on a PW (the only PW 547 design that has so far attempted a sophisticated flow control is 548 [I-D.ietf-pwe3-fc-encap]), the PW-specific congestion control 549 mechanism cannot normally be based on either the presence of or the 550 absence of acknowledgements. Some types of pseudowire (the CBR PWs) 551 have a single bit that indicates that a preset amount of data has 552 been lost, but this is a non-quantitative indicator. CBR PWs have 553 the advantage that there is a constant two way data flow, while other 554 PW types do not have the constant symmetric flow of payload on which 555 to piggyback the congestion notification. Most PW types therefore 556 provide no way for a transmitter to determine (or even to make an 557 educated guess as to) whether any data has been lost. 559 Thus we need to add a mechanism for determining whether data packets 560 on a PW have become lost. There are several possible methods for 561 doing this: 563 o Detect Congestion Using PW Sequence Numbers 565 o Detect Congestion Using Modified VCCV Packets [I-D.ietf-pwe3-vccv] 567 o Rely on Explicit Congestion Notification (ECN) [RFC3168] 569 We discuss each option in turn in the following sections. 571 5.1. Using Sequence Numbers to Detect Congestion 573 When the optional sequencing feature is in use on a PW [RFC4385], it 574 is necessary for the receiver to maintain a "next expected sequence 575 number" for the PW. If a packet arrives with a sequence number that 576 is earlier than the next expected (a "misordered packet"), the packet 577 is discarded; if it arrives with a sequence number that is greater 578 than or equal to the next expected, the packet is delivered, and the 579 next expected sequence number becomes the sequence number of the 580 current packet plus 1. 582 It is easy to tell when there is one or more missing packets (i.e., 583 there is a "gap" in the sequence space) -- that is the case when a 584 packet arrives whose sequence number is greater than the next 585 expected. What is difficult to tell is whether any misordered 586 packets that arrive after the gap are indeed the missing packets. 587 One could imagine that the receiver remembers the sequence number of 588 each missing packet for a period of time, and then checks off each 589 such sequence number if a misordered packet carrying that sequence 590 number later arrives. The difficulty is doing this in a manner which 591 is efficient enough to be done by the microcoded hardware handling 592 the PW data path. This approach does not really seem feasible. 594 One could make certain simplifying assumptions, such as assuming that 595 the presence of any gaps at all indicates congestion. While this 596 assumption makes it feasible to use the sequence numbers to "detect 597 congestion", it also throttles the PW unnecessarily if there is 598 really just misordering and no congestion. Such an approach would be 599 considerably more likely to misinterpret misordering as congestion 600 than would TCP's approach. 602 An intermediate approach would be to keep track of the number of 603 missing packets and the number of misordered packets for each PW. 
604 One could "detect congestion" if the number of missing packets is 605 significantly larger than the number of misordered packets over some 606 sampling period. However, gaps occurring near the end of a sampling 607 period would tend to result in false indications of congestion. To 608 avoid this one might try to smooth the results over several sampling 609 periods; While this would tend to decrease the responsiveness, it is 610 inevitable that there will be a trade-off between the rapidity of 611 responsiveness and the rate of false alarms. 613 One would not expect the hardware or microcode to keep track of the 614 sampling period; presumably software would read the necessary 615 counters from hardware at the necessary intervals. 617 Such a scheme would have the advantage of being based on existing PW 618 mechanisms. However, it has the disadvantage of requiring 619 sequencing, which introduces a fairly complicated interaction between 620 the control processing and the data path, and precludes the use of 621 some pipelined forwarding designs. 623 5.2. Using VCCV to Detect Congestion 625 It is reasonable to suppose that the hardware keeps counts of the 626 number of packets sent and received on each PW. Suppose that the PW 627 periodically inserts VCCV packets into the PE data stream, where each 628 VCCV packet carries: 630 o A sequence number, increasing by 1 for each successive VCCV 631 packet; 633 o The current value of the transmission counter for the PW 635 We assume that the size of the counter is such that it cannot wrap 636 during the interval between n VCCV packets, for some n > 1. 638 When the receiver gets one of these VCCV packets on a PW, it inserts 639 into the packet, or packet metadata the count of received packets for 640 that PW, and then delivers the VCCV packet to the software. The 641 receiving software can now compute, for the inter-VCCV intervals, the 642 count of packets transmitted and the count of packets received. The 643 presence of congestion can be inferred if the count of packets 644 transmitted is significantly greater than the count of packets 645 received during the most recent interval. Even the loss rate could 646 be calculated. The loss rate calculated in this way could be used as 647 input to the TFRC rate equation. 649 VCCV messages would not need to be sent on a PW (for the purpose of 650 detecting congestion) in the absence of traffic on that PW. 652 Of course, misordered packets that are sent during one interval but 653 arrive during the next will throw off the loss rate calculation; 654 hence the difference between sent traffic and received traffic should 655 be "significant" before the presence of congestion is inferred. The 656 value of "significance" can be made larger or smaller depending on 657 the probability of misordering. 659 Note that congestion can cause a VCCV packet to go missing, and 660 anything that misorders packets can misorder a VCCV packet as well as 661 any other. One may not want to infer the presence of congestion if a 662 single VCCV packet does not arrive when expected, as it may just be 663 delayed in the network, even if it hasn't been misordered. However, 664 failure to receive a VCCV packet after a certain amount of time has 665 elapsed since the last VCCV was received (on a particular PW) may be 666 taken as evidence of congestion. This scheme has the disadvantage of 667 requiring periodic VCCV packets, and it requires VCCV packet formats 668 to be modified to include the necessary counts. 
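   To make the per-interval calculation concrete, here is a minimal
   sketch of the receiver-side processing implied by this scheme.  The
   VCCV fields (a sequence number and a transmit count), the threshold
   value and all names are hypothetical, chosen only for illustration;
   the threshold stands in for whatever allowance is made for
   misordering.

      MISORDER_ALLOWANCE = 0.001   # "significance" threshold; tune per deployment

      class PwCongestionMonitor:
          """Per-PW state for VCCV-based congestion detection (illustrative only)."""

          def __init__(self):
              self.last_seq = None        # sequence number of the previous VCCV packet
              self.last_tx_count = None   # transmit counter carried in that packet
              self.last_rx_count = None   # local receive counter sampled at that time

          def on_vccv(self, seq, tx_count, rx_count):
              """Process one congestion-detection VCCV packet.
              Returns (congested, loss_rate) for the most recent interval,
              or None if the interval cannot be evaluated."""
              result = None
              if self.last_seq is not None and seq == self.last_seq + 1:
                  sent = tx_count - self.last_tx_count      # counters assumed not to wrap
                  received = rx_count - self.last_rx_count
                  if sent > 0:
                      loss_rate = max(0.0, (sent - received) / sent)
                      congested = loss_rate > MISORDER_ALLOWANCE
                      result = (congested, loss_rate)
              # A gap in VCCV sequence numbers makes the interval unusable;
              # losing several VCCVs in a row may itself be taken as congestion.
              self.last_seq = seq
              self.last_tx_count = tx_count
              self.last_rx_count = rx_count
              return result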
However, the 669 interaction between the control path and the data path is very 670 simple, as there is no polling of counters, no need for timers in the 671 data path, and no need for the control path to do read-modify-write 672 operations on the data path hardware. A bigger disadvantage may 673 arise from the possible inability to ensure that the transmit counts 674 in the VCCVs are exactly correct. The transmitting hardware may not 675 be able to insert a packet count in the VCCV IMMEDIATELY before 676 transmission of the VCCV on the wire, and if it cannot, the count of 677 transmit packets will only be approximate. 679 Neither scheme can provide the same type of continuous feedback that 680 TCP gets. TCP gets a continuous stream of acknowledgments, whereas 681 this type of PW congestion detection mechanism would only be able to 682 say whether congestion occurred during a particular interval. If the 683 interval is about 1 RTT, the PW congestion control would be 684 approximately as responsive as TCP congestion control, and there does 685 not seem to be any advantage to making it smaller. However, sampling 686 at an interval of 1 RTT might generate excessive amounts of overhead. 687 Sampling at longer intervals would reduce responsiveness to 688 congestion but would not necessarily render the congestion control 689 mechanism "TCP-unfriendly". 691 5.3. Explicit Congestion Notification 693 In networks that support explicit congestion notification (ECN) 694 [RFC3168] the ECN notification provides congestion information to the 695 PEs before the onset of congestion discard. This is particularly 696 useful to PWs that are sensitive to packet loss, since it gives the 697 PE the opportunity to intelligently reduce the offered load. ECN 698 marking rates of packets received on a PW could be used to calculate 699 the TFRC rate for a PW. However ECN is not widely deployed at the 700 time of writing; hence it seems that PEs must also be capable of 701 operating in a network where packet loss is the only indicator of 702 congestion. 704 6. Feedback from Receiver to Transmitter 706 Given that the receiver can tell, for each sampling interval, whether 707 or not a PW's traffic has encountered congestion, the receiver must 708 provide this information as feedback to the transmitter, so that the 709 transmitter can adjust its transmission rate appropriately. The 710 feedback could be as simple as a bit stating whether or not there was 711 any packet loss during the specified interval. Alternatively, the 712 actual loss rate could be provided in the feedback, if that 713 information turns out to be useful to the transmitter (e.g. to enable 714 it to calculate an appropriate rate at which to send). There are a 715 number of possible ways in which the feedback can be provided: 716 control plane, reverse data traffic, or VCCV messages. We discuss 717 each in turn below. 719 6.1. Control Plane Feedback 721 A control message can be sent periodically to indicate the presence 722 or absence of congestion. For example, when LDP is the control 723 protocol [RFC4447], the control message would of course be delivered 724 reliably by TCP. (The same considerations apply for any protocol 725 which has a reliable control channel.) When congestion is detected, 726 a control message can be sent indicating that fact. No further 727 congestion control messages would need to be sent until congestion is 728 no longer detected. If the loss rate is being sent, changes in the 729 loss rate would need to be sent as well. 
When there is no longer any 730 congestion, a message indicating the absence of congestion would have 731 to be sent. 733 Since congestion in the reverse direction can prevent the delivery of 734 these control messages, periodic "no congestion detected" messages 735 would need to be sent whenever there is no congestion. Failure to 736 receive these in a timely manner would lead the control protocol peer 737 to infer that there is congestion. (Actually, there might or might 738 not be congestion in the transmitting direction, but in the absence 739 of any feedback one cannot assume that everything is fine.) If 740 control messages really cannot get through at all, control protocol 741 keep-alives will fail and the control connection will go down anyway. 743 If the control messages simply say whether or not congestion was 744 detected, then given a reliable control channel, periodic messages 745 are not needed during periods of congestion. Of course, if the 746 control messages carry more data, such as the loss rate, then they 747 need to be sent whenever that data changes. 749 If it is desired to control congestion on a per-tunnel basis, these 750 control messages will simply say that there was congestion on some PW 751 (one or more) within the tunnel. If it is desired to control 752 congestion on a per-PW basis, the control message can list the PWs 753 which have experienced congestion, most likely by listing the 754 corresponding labels. If the VCCV method of detecting congestion is 755 used, one could even include the sent/received statistics for 756 particular VCCV intervals. 758 This method is very simple, as one does not have to worry about the 759 congestion control messages themselves getting lost or out of 760 sequence. Feedback traffic is minimized, as a single control message 761 relays feedback about an entire tunnel. 763 6.2. Using Reverse Data Packets for Feedback 765 If a receiver detects congestion on a particular PW, it can set a bit 766 in the data packets that are travelling on that PW in the reverse 767 direction; when no congestion is detected, the bit would be clear. 768 The bit would be ignored on any packet that is received out of 769 sequence, of course. There are several disadvantages to this 770 technique: 772 o There may be no (or insufficient) data traffic in the reverse 773 direction 775 o Sequencing of the data stream is required 777 o The transmission of the congestion indications is not reliable 779 o The most one could hope to convey is one bit of information per PW 780 (if there is even a bit available in the encapsulation). 782 6.3. Reverse VCCV Traffic 784 Congestion indications for a particular PW could be carried in VCCV 785 packets travelling in the reverse direction on that PW. Of course, 786 this would require that the VCCV packets be sent periodically in the 787 reverse direction whether or not there is reverse direction traffic. 788 For congestion feedback purposes they might need to be sent more 789 frequently than they'd need to be sent for OAM purposes. It would 790 also be necessary for the VCCVs to be sequenced (with respect to each 791 other, not necessarily with respect to the data stream). Since VCCV 792 transmission is unreliable, one would want to send multiple VCCVs 793 within whatever period we want to be able to respond in. Further, 794 this method provides no means of aggregating congestion information 795 into information about the tunnel. 797 7. 
Responding to Congestion

   In TCP, one tends to think of the transmission rate in terms of MTUs
   per RTT, which defines the maximum number of unacknowledged packets
   that TCP is allowed to maintain "in flight".  Upon detection of a
   lost packet, this rate is halved ("multiplicative decrease").  It
   will be halved again approximately every RTT until the missing data
   gets through.  Once all missing data has gotten through, the
   transmission rate is increased by one MTU per RTT.  Every time a new
   acknowledgment (i.e., not a duplicate acknowledgment) is received,
   the rate is similarly increased (additive increase).  Thus TCP can
   adjust its transmit rate very rapidly, i.e., it responds on the
   order of an RTT.  By contrast, TCP-friendly rate control adjusts its
   rate rather more gradually.

   For simplicity, this discussion only covers the "congestion
   avoidance" phase of TCP congestion control.  An analog of TCP's
   "slow start" phase would also be needed.

   TCP can easily estimate the RTT, since all its transmissions are
   acknowledged.  In PWE3, the best way to estimate the RTT might be
   via the control protocol.  In fact, if the control protocol is TCP-
   based, getting the RTT estimate from TCP might be a good option.

   TCP's rate control is window-based, expressed as a number of bytes
   that can be in flight.  PWE3's rate control would need to be rate-
   based.  The TFRC specification [RFC3448] provides the equation for
   the TCP-friendly rate for a given loss rate, RTT, and MTU.  Given
   some means of determining the loss rate, as described in Section 5,
   the TCP-friendly rate for a PW or a tunnel can be calculated at the
   ingress PE.  However, as we noted earlier, a PW may be carrying many
   flows, in which case the use of a single-flow TCP rate would be a
   significant underestimate of the application's true fair rate, and
   hence damaging to the operation of the PW.

   If the congestion detection mechanism only produces an approximate
   result, the probability of a "false alarm" (thinking that there is
   congestion when there really is not) for some interval becomes
   significant.  It would be better then to have some algorithm which
   smoothes the result over several intervals.  The TFRC procedures,
   which tend to generate a smoother and less abrupt change in the
   transmission rate than the AIMD procedures, may also be more
   appropriate in this case.

   Once a PE has determined the appropriate rate at which to transmit
   traffic on a given PW or tunnel, it needs some means to enforce that
   rate via policing, shaping, or selective shutting down of PWs.
   There are trade-offs to be made among these options, depending on
   various factors including the higher layer service that is carried.
   The effect of different mechanisms when the higher layer traffic is
   already using TCP is discussed below.

7.1.  Interaction with TCP

   Ideally, there should be no PW-specific congestion control mechanism
   used when the higher layer traffic is already running over TCP and
   is thus subject to TCP's existing congestion control.  However, it
   may be difficult to determine what the higher layer is on any given
   PW.  Thus, interaction between PW-specific congestion control and
   TCP's congestion control needs to be considered.

   As noted in Section 3.2, a PW-specific congestion control mechanism
   may interact poorly with the "outer" control loop of TCP if the PW
   carries TCP traffic.
   A well-documented example of such poor interaction is a token bucket
   policer that drops packets outside the token bucket.  TCP has
   difficulty finding the "bottleneck" bandwidth in such an environment
   and tends to overshoot, incurring heavy losses and consequent loss
   of throughput.

   A shaper that queues packets at the PE and only injects them into
   the network at the appropriate rate may be a better choice, but may
   still interact unpredictably with the "outer control loop" of TCP
   flows that happen to traverse the PW.  This issue warrants further
   study.

   Another possibility is simply to shut down a PW when the rate of
   traffic on the PW significantly exceeds the appropriate rate that
   has been determined for the PW.  While this might be viewed as
   draconian, it does ensure that any PW that is allowed to stay up
   will behave in a predictable manner.  Note that this would also be
   the most likely choice of action for CBR PWs (as discussed in
   Section 9).  Thus all PWs would be treated alike and there would be
   no need to try to determine what sort of upper layer payload a PW is
   carrying.

8.  Rate Control per Tunnel vs. per PW

   Rate controls can be applied on a per-tunnel basis or on a per-PW
   basis.  Applying them on a per-tunnel basis (and obtaining
   congestion feedback on a per-tunnel basis) would seem to provide the
   most efficient and most scalable system.  Achieving fairness among
   the PWs then becomes a local issue for the transmitter.  However, if
   the different PWs follow different paths through the network (e.g.,
   because of ECMP over the tunnel), it is possible that some PWs will
   encounter congestion while some will not.  If rate controls are
   applied on a per-tunnel basis, then if any PW in a tunnel is
   affected by congestion, all the PWs in the tunnel will be throttled.
   While this is sub-optimal, it is not clear that this would be a
   significant problem in practice, and it may still be the best trade-
   off.

   Per-tunnel rate control also has some desirable properties if the
   action taken during congestion is to selectively shut down certain
   PWs.  Since a tunnel will typically carry many PWs, it will be
   possible to make relatively small adjustments in the total bandwidth
   consumed by the tunnel by selectively shutting down or bringing up
   one or more PWs.

   Note again the issue of estimating the correct rate.  We know how
   many PWs there are per tunnel, but we do not know how many flows
   there are per PW.

9.  Constant Bit Rate Services

   As noted above, some PW services may require a fixed rate of
   transmission, and it may be impossible to provide the service while
   throttling the transmission rate.  To provide such services, the
   network paths must be engineered so that congestion is impossible;
   providing such services over the Internet is thus not very likely.
   In fact, as congestion control cannot be applied to such services,
   it may be necessary to prohibit these services from being provided
   in the Internet, except in the case where the payload is known to
   consist of TCP connections or other traffic that is congestion-
   controlled by the end-points.  It is not clear how such a
   prohibition could be enforced.

   The only feasible mechanism for handling congestion affecting CBR
   services would appear to be to selectively turn off PWs, or channels
   within the PW, when congestion occurs.
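   As an illustration of what such selective shutdown might look like
   at the ingress PE, consider the sketch below.  The ordering of PWs
   by local policy, the hold-down timer, and the signalling hooks are
   all assumptions made for the example; they are not taken from any PW
   specification.

      import time

      HOLD_DOWN_SECONDS = 60   # assumed minimum time before a PW may be restored

      class CbrTunnelManager:
          """Illustrative selective shutdown of CBR PWs sharing one PSN tunnel."""

          def __init__(self, pws):
              # pws: PW identifiers ordered by local policy, least
              # important first (the first candidates to be shut down).
              self.active = list(pws)
              self.shut = []            # list of (pw, time it was shut down)

          def on_congestion(self):
              """Shed load by shutting down the least important active PW."""
              if self.active:
                  pw = self.active.pop(0)
                  self.shut.append((pw, time.monotonic()))
                  # signal_pw_down(pw)  -- e.g. a PW status message; not shown

          def on_congestion_cleared(self):
              """Restore at most one PW, and only after a hold-down period,
              so that congestion is not re-introduced too quickly."""
              if self.shut:
                  pw, when = self.shut[0]
                  if time.monotonic() - when > HOLD_DOWN_SECONDS:
                      self.shut.pop(0)
                      self.active.insert(0, pw)
                      # signal_pw_up(pw)  -- not shown

   Shutting down or restoring one PW at a time gives the relatively
   small adjustments in total tunnel bandwidth described in Section 8.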
   Clearly, it is important to avoid "false alarms" in this case.  It
   is also important to avoid bringing PWs (or channels within the PW)
   back up too quickly and re-introducing congestion.

   The idea of controlling rate per tunnel rather than per PW,
   discussed above, seems particularly attractive when some of the PWs
   are CBR.  First, it provides the possibility that non-CBR PWs could
   be throttled before it is necessary to shut down the CBR PWs.
   Second, with the aggregation of multiple PWs on a single rate-
   controlled tunnel, it becomes possible to gradually increase or
   decrease the total offered load on the tunnel by selectively
   bringing up or shutting down PWs.  As noted above, local policies at
   a PE could be used to determine which PWs to shut down or bring up
   first.  Similar approaches would apply if the CBR PW offers a
   channelized service, with selected channels being shut down and
   brought up to control the total rate of the PW.

10.  Managed vs Unmanaged Deployment

   As discussed in Section 2, there is a significant set of scenarios
   in which PW-specific congestion control may not be necessary.  One
   might therefore argue that it does not make sense to require PW-
   specific congestion control to be used on all PWs at all times.  On
   the other hand, if the option of turning off PW-specific congestion
   control is available, there is nothing to stop a provider from
   turning it off in inappropriate situations.  As this may contribute
   to congestive collapse outside the provider's own network, it may
   not be advisable to allow this.

   The circumstances in which it is most vocally argued that congestion
   control is not needed are those where the PW is part of the service
   provider's own network infrastructure.  In cases where the PW is
   deployed in a managed, well-engineered network it is probably
   necessary to permit the operator to take the congestion risk upon
   themselves, if they desire to do so, by disabling any congestion
   control mechanism.  Ironically, work in progress on the design of an
   MPLS Transport Profile indicates that some of these users require
   that OAM is run (end-to-end, or over part of the tunnel (known as
   Tandem monitoring)), and when the OAM detects that the performance
   parameters are breached (delay, jitter or packet loss) this is used
   to trigger a failover to a backup path.  When the backup path
   mechanism operates in 1:1 mode, this moves the congestive traffic to
   another network path, which is the characteristic we require.  [**
   add reference **]

   Further, in such an environment it is likely that the PW will be
   used to carry many TCP equivalent flows.  In these managed
   circumstances the operator will have to make a judgement call on a
   fair estimate of the number of equivalent flows that the PW
   represents and set the congestion parameters accordingly.

   The counterpoint to a well managed network is a plug and play
   network, of the type typically found in a consumer environment.
   Whilst PWs have been designed to provide an infrastructure tool to
   the service provider, we cannot be sure that they will not find some
   consumer or similar plug and play application.  In such an
   environment a conservative approach seems appropriate, and the
   default PW configuration MUST enable the congestion avoidance
   mechanism, with parameters set to the equivalent of a single TCP
   flow.

11.  Related Work: Pre-Congestion Notification
   It has been suggested that Pre-Congestion Notification (PCN)
   [I-D.briscoe-tsvwg-cl-architecture] [I-D.briscoe-tsvwg-cl-phb] might
   provide a basis for addressing the PW congestion control problem.
   Using PCN, it would potentially be possible to determine whether the
   level of congestion currently existing between an ingress and an
   egress PE was sufficiently low to safely allow a new PW to be
   established.  PCN's pre-emption mechanisms could be used to notify a
   PE that one or more PWs need to be brought down, which again could
   be coupled with local policies to determine exactly which PWs should
   be shut down first.  This approach certainly merits further
   examination, but we note that PCN is considerably further away from
   deployment in the Internet than ECN, and thus cannot be considered
   as a near-term solution to the problem of PW-induced congestion in
   the Internet.

12.  Conclusion

   Pseudowires are sometimes used to emulate network services that
   carry non-TCP data flows.  In these circumstances the service
   payload will not react to network congestion by reducing its offered
   load.  Pseudowires SHOULD therefore reduce their network bandwidth
   demands in the face of significant packet loss, including, if
   necessary, completely ceasing transmission.  In some service
   provider network environments the network operator may choose not to
   deploy congestion avoidance mechanisms, but they should make that
   choice in the full knowledge that they are forgoing a network safety
   valve.  In an unmanaged environment a conservative approach to
   congestion is REQUIRED.

   Since it is difficult to determine a priori the number of equivalent
   TCP flows that a pseudowire represents, a suitably "fair" rate of
   back-off cannot easily be pre-determined.  In a managed environment,
   setting the congestion parameters will require some level of
   informed judgement.  In an unmanaged environment the equivalent TCP
   flow count should be set to one.

   The selection of the appropriate mechanism(s) to implement
   congestion avoidance is work in progress in the IETF PWE3 working
   group.

13.  Informative References

   [I-D.briscoe-tsvarea-fair]
              Briscoe, B., "Flow Rate Fairness: Dismantling a
              Religion", draft-briscoe-tsvarea-fair-02 (work in
              progress), July 2007.

   [I-D.briscoe-tsvwg-cl-architecture]
              Briscoe, B., "An edge-to-edge Deployment Model for Pre-
              Congestion Notification: Admission Control over a
              DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04
              (work in progress), October 2006.

   [I-D.briscoe-tsvwg-cl-phb]
              Briscoe, B., "Pre-Congestion Notification marking",
              draft-briscoe-tsvwg-cl-phb-03 (work in progress),
              October 2006.

   [I-D.ietf-pwe3-cesopsn]
              Vainshtein, S., "Structure-aware TDM Circuit Emulation
              Service over Packet Switched Network (CESoPSN)",
              draft-ietf-pwe3-cesopsn-07 (work in progress), May 2006.

   [I-D.ietf-pwe3-fc-encap]
              Roth, M., "Encapsulation Methods for Transport of Fibre
              Channel frames Over MPLS Networks",
              draft-ietf-pwe3-fc-encap-07 (work in progress),
              January 2008.

   [I-D.ietf-pwe3-sonet]
              Malis, A., "SONET/SDH Circuit Emulation over Packet
              (CEP)", draft-ietf-pwe3-sonet-14 (work in progress),
              December 2006.

   [I-D.ietf-pwe3-tdmoip]
              Stein, Y., "TDM over IP", draft-ietf-pwe3-tdmoip-06 (work
              in progress), December 2006.
1062 [I-D.ietf-pwe3-vccv] 1063 Nadeau, T. and C. Pignataro, "Pseudowire Virtual Circuit 1064 Connectivity Verification (VCCV) A Control Channel for 1065 Pseudowires", draft-ietf-pwe3-vccv-15 (work in progress), 1066 September 2007. 1068 [RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast 1069 Retransmit, and Fast Recovery Algorithms", RFC 2001, 1070 January 1997. 1072 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1073 Requirement Levels", BCP 14, RFC 2119, March 1997. 1075 [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion 1076 Control", RFC 2581, April 1999. 1078 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 1079 RFC 2914, September 2000. 1081 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1082 of Explicit Congestion Notification (ECN) to IP", 1083 RFC 3168, September 2001. 1085 [RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP 1086 Friendly Rate Control (TFRC): Protocol Specification", 1087 RFC 3448, January 2003. 1089 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 1090 Conrad, "Stream Control Transmission Protocol (SCTP) 1091 Partial Reliability Extension", RFC 3758, May 2004. 1093 [RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to- 1094 Edge (PWE3) Architecture", RFC 3985, March 2005. 1096 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1097 Congestion Control Protocol (DCCP)", RFC 4340, March 2006. 1099 [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson, 1100 "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for 1101 Use over an MPLS PSN", RFC 4385, February 2006. 1103 [RFC4447] Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G. 1104 Heron, "Pseudowire Setup and Maintenance Using the Label 1105 Distribution Protocol (LDP)", RFC 4447, April 2006. 1107 [RFC4553] Vainshtein, A. and YJ. Stein, "Structure-Agnostic Time 1108 Division Multiplexing (TDM) over Packet (SAToP)", 1109 RFC 4553, June 2006. 1111 [RFC4960] Stewart, R., "Stream Control Transmission Protocol", 1112 RFC 4960, September 2007. 1114 Authors' Addresses 1116 Stewart Bryant 1117 Cisco Systems, Inc. 1118 250 Longwater 1119 Green Park, Reading RG2 6GB 1120 U.K. 1122 Phone: 1123 Fax: 1124 Email: stbryant@cisco.com 1125 URI: 1127 Bruce Davie 1128 Cisco Systems, Inc. 1129 1414 Mass. Ave. 1130 Boxborough, MA 01719 1131 USA 1133 Email: bsd@cisco.com 1134 Luca Martini 1135 Cisco Systems, Inc. 1136 9155 East Nichols Avenue, Suite 400. 1137 Englewood, CO 80112 1138 USA 1140 Email: lmartini@cisco.com 1142 Eric Rosen 1143 Cisco Systems, Inc. 1144 1414 Mass. Ave. 1145 Boxborough, MA 01719 1146 USA 1148 Email: erosen@cisco.com 1150 Full Copyright Statement 1152 Copyright (C) The IETF Trust (2008). 1154 This document is subject to the rights, licenses and restrictions 1155 contained in BCP 78, and except as set forth therein, the authors 1156 retain all their rights. 1158 This document and the information contained herein are provided on an 1159 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 1160 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 1161 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 1162 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 1163 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1164 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
1166 Intellectual Property 1168 The IETF takes no position regarding the validity or scope of any 1169 Intellectual Property Rights or other rights that might be claimed to 1170 pertain to the implementation or use of the technology described in 1171 this document or the extent to which any license under such rights 1172 might or might not be available; nor does it represent that it has 1173 made any independent effort to identify any such rights. Information 1174 on the procedures with respect to rights in RFC documents can be 1175 found in BCP 78 and BCP 79. 1177 Copies of IPR disclosures made to the IETF Secretariat and any 1178 assurances of licenses to be made available, or the result of an 1179 attempt made to obtain a general license or permission for the use of 1180 such proprietary rights by implementers or users of this 1181 specification can be obtained from the IETF on-line IPR repository at 1182 http://www.ietf.org/ipr. 1184 The IETF invites any interested party to bring to its attention any 1185 copyrights, patents or patent applications, or other proprietary 1186 rights that may cover technology that may be required to implement 1187 this standard. Please address the information to the IETF at 1188 ietf-ipr@ietf.org.