Internet Engineering Task Force                            Walter Weiss
Internet Draft                                      Lucent Technologies
Expiration: September 1998                                   March 1998

               Providing Differentiated Services through
              Cooperative Dropping and Delay Indication

Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress."

Please check the 1id-abstracts.txt listing contained in the internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net, nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the current status of any Internet Draft.

Abstract

The current state of the Internet only supports a single class of service. To further the success of the Internet, new capabilities must be deployed which allow for deterministic end to end behavior irrespective of location or the number of domains along the path. Existing signaling-based protocols have proven difficult to deploy due to technical and economic factors.
This document proposes using in-band frame marking to support a diverse set of services. In addition, the mechanisms described here will provide end users and/or enterprise backbones with new capabilities such as service validation, congestion localization, and uniform service irrespective of the type of service contract. For ISPs, this document proposes mechanisms that make more bandwidth available by creating strong incentives for adaptive behavior in applications, as well as mechanisms for providing both sender-based and receiver-based service contracts.

1. Introduction

It is widely recognized that there are many types of services which can be offered and many means for providing them. These services can be reduced to three basic components: the control of bandwidth, the control of delay, and the control of delay variation. The control of delay variation requires frame scheduling at the granularity of flows. This, in turn, requires per-flow state in each hop along the path of the flow. This capability requires configured or signaled reservations. Therefore the management of delay variation is beyond the scope of this document.

Although in-band delay variation is too difficult to support within a Differentiated Services framework, it is feasible to provide bandwidth and delay management using a less sophisticated model. However, any model which attempts to satisfy service contracts without an awareness of available network capacities along the path faces two issues.

First of all, how can end to end (or edge to edge) bandwidth guarantees be satisfied when the capacity and available bandwidth of downstream links are unknown? Some indication of capacity can be gleaned through routing protocols like OSPF.
However, there is no effective mechanism for detecting the level of persistent downstream congestion, whether it results from capacity limits such as cross-oceanic links or from singular but long-lasting events such as NASA's "Life on Mars" announcement. Thus, attempting to use a profile to promise even minimal bandwidth guarantees is virtually impossible, given factors such as distance, time of day, the specific path taken, and link consumption.

VPN and Virtual Leased Line services can be supported by configuring reserved capacity. However, this does not diminish the benefits of congestion awareness. As with traditional network links, Virtual Networks have a large cross section of applications using them at any given time. Some applications may need to reserve strict bandwidth and delay guarantees. However, there are other applications which can adapt to changes in available bandwidth. These adaptive applications depend on effective congestion awareness to operate properly.

In addition, ISPs need service differentiation to deploy Electronic Commerce solutions. ISPs desire the ability to provide individual users with different service offerings. In these cases bandwidth cannot be pre-allocated because the destinations for these services are not pinned. Therefore congestion awareness is crucial to anticipating and adapting to available bandwidth along the path to a destination.

The second issue with a service contract model that does not employ signaling is that one router is not aware of another router's congestion control actions. Hence, when congestion occurs on multiple hops along the path of a particular flow, individual congestion control algorithms could independently drop frames. But far worse, they could collectively drop a sequence of frames, causing stalling or slow start behavior. Hence, to the end user, the service is perceived as erratic. Even if a customer is guaranteed more bandwidth with a more expensive profile, the "perceived" benefit is greatly diminished by choppy service.

This document describes a differentiated services architecture that allows maximal flexibility for specifying bandwidth- and delay-sensitive policies while reducing or eliminating the choppy behavior that exists in the Internet today.

2. The Differentiated Services service model

2.1. Congestion Control

Differentiated services without the use of signaling relies on two basic components: a user or group profile specifying the level of service to which the flow, user, or group is entitled, and a means for encoding this profile into each packet to which the profile is applied. To provide consistency in the network, all traffic within a routing domain must be administered with an associated traffic profile. The most economic way of enforcing this requirement is to apply or enforce the profile on all edges of the domain, as described in Clark/Wroclawski [1]. Network Edge devices are defined in this document to be those devices capable of metering, assigning, and enforcing bandwidth profiles and delay profiles.

Because the amount of traffic on the egress link at any given time is non-deterministic, the encoding of the profile in each packet provides an effective means for taking appropriate action on each packet. By comparing the current congestion conditions of a link with the profile associated with the packet, a consistent action can be applied to the packet. Possible actions on the packet could range from dropping the packet to giving it preferential access to the outbound link.

This document proposes using the TOS octet in the IP header to encode a profile into the packet stream of each session. The profile will be capable of specifying a delay priority as well as a relative bandwidth share.
The relative bandwidth share is the share of bandwidth to which the profile's owner is entitled relative to other profiles. Through a combination of profile enforcement and the judicious use of the TOS octet, share-based bandwidth sharing, as described in USD [2], can be provided without proliferating individual profiles to each router in the network.

The problem of coordinating dropping policies between routers on a per-flow basis is too complex to be feasible. However, to provide effective coordination, each router does not need to know what packets other routers have dropped. Instead, a router can create a set of hierarchical drop Priority Classes for each link. These Priority Classes are only for congestion control; they do not affect the ordering of packets. A new 3-bit field, called the Priority Class field, is provided in the TOS octet to allow Priority Class assignment on a per-packet basis. A flow distributes packets evenly to each Priority Class by having the Priority Class value assigned to the 3-bit field in the TOS octet.

When a congestion threshold is reached, dropping will be initiated for the lowest Priority Class. As congestion increases, more and more packets will be dropped from this Priority Class. If congestion continues to increase, all packets in the Priority Class will be dropped and partial dropping will begin in the next higher Priority Class. However, the packets in the flow that use the remaining, higher Priority Classes will be unaffected.

Dropping packets only within a single Priority Class creates many beneficial side effects. First, routers today have difficulty determining how to drop packets fairly across all flows. The main issue is that routers have no knowledge of the profiles of individual flows, so they also have no knowledge of the relative amount of bandwidth to which the flow is entitled.
Also, most drop algorithms are based on a random discard model. Without per-flow state it is possible (probable for high bandwidth applications) for multiple and successive drops to be performed against the same flow. Even algorithms with per-flow state have limitations due to a lack of profile awareness. However, the Priority Class encoding mechanism described in this document allows routers to drop packets fairly based on the profile encoded in each flow.

Further, this approach provides implicit cooperation between routers. If two or more routers along the path of a flow are dropping packets in the lowest Priority Class, all traffic in the higher Priority Classes is protected irrespective of the number of routers which experience congestion. This provides much more predictable service irrespective of the location or the current condition of the network as a whole.

Also, with profiles which limit the number of packets that can be sent in each Priority Class (or alternatively the time interval between packets in a given Priority Class), a bandwidth share is implicitly assigned to each user. The proportional share to which each user is entitled remains constant irrespective of the level of congestion or the number of flows on the link.

This model provides an implicit congestion notification mechanism to senders for TCP-based applications. When a sender keeps track of the Priority Classes of sent packets, TCP acknowledgments provide information on the level of congestion along the path. This provides end users with an easy tool for service contract verification. Further, it provides equivalent functionality to ECN [3] without consuming additional bits in the TOS octet.

When routers are only dropping packets up to a specific Priority Class, the other Priority Classes are implicitly protected.
This allows Service Providers to charge more accurately for end-to-end Goodput rather than Throughput.

This mechanism can greatly reduce or eliminate bandwidth consumed by packets that will be dropped somewhere along the path. With implicit congestion notification, applications can stop sending packets in Priority Classes that they know will be dropped. In fact, if an application knows that packets in a given Priority Class are guaranteed to be dropped, it benefits by not sending the packets because it can use the in-profile bandwidth in that Priority Class for a different flow to an uncongested destination. If the edge routers which enforce profiles also snoop the TCP sessions (or use the Congestion Check mechanism described below), they could perform aggressive policing by dropping packets in unavailable Priority Classes, thus providing additional network bandwidth and encouraging adaptive behavior in end systems.

While congestion awareness can be used to restrict aggressive or abusive bandwidth consumption, it can also be used to allow bandwidth to grow beyond normal limits when there is no congestion. This has the effect of maximizing bandwidth use whenever capacity is available.

Using this mechanism in conjunction with TraceRoute, end users and network administrators could verify service contracts by identifying the precise location of the highest level of congestion. This clearly fixes blame when service contracts are not met and also easily identifies those links which need to be upgraded.

Current TCP congestion control algorithms grow bandwidth incrementally and cut bandwidth in half when congestion occurs. With an awareness of which Priority Classes are being dropped, TCP growth and cutback algorithms could be applied to the same Priority Class that is performing the partial drop.
This smooths the bandwidth of the flow while still adjusting to current bandwidth availability as it increases and decreases.

Another benefit is that the Priority Class assignment can be sequenced in any order based on transport-specific criteria. If it is desirable to lose a sequence of packets under congestion, a sequence such as 0,1,2,3,4,5,6,7 would drop a block of packets based on the current congestion level. If it is desirable to spread the dropping of packets out, a sequence such as 0,4,1,5,2,6,3,7 provides a very high probability that two packets will not be dropped in a row. If it is desirable to distinguish important packets from less important ones, the Priority Class can be assigned in a more discretionary manner.

The last benefit is that this model is extremely efficient and simple to implement in routers. A router only needs to set a congestion threshold and apply a dropper algorithm to that single Priority Class. If congestion increases, all packets in the current Priority Class are dropped and the dropper algorithm is applied to the next higher Priority Class.

2.2. Delay Control

The other mechanism necessary to support Quality of Service is a means for controlling link access based on a combination of service contract and profile. This document proposes using a 2-bit field in the TOS octet to provide up to 4 Delay Classes. It is believed that 4 classes are adequate for current needs. Vendors may choose to map these classes onto at least 2 internal delay classes.

There are two issues that must be addressed when providing control over packet delay. One is how packet scheduling is handled across delay classes. For example, both Class Based and Priority Queuing provide specific features. These capabilities may be more or less appropriate depending on customer needs.
Therefore, it is left to vendors to choose the technology most appropriate to the specific market.

The other issue is how Delay Classes and Priority Classes work together. Is it better for Priority Class droppers to operate autonomously in each Delay Class, or is it better to have a single Priority Class dropper that is indifferent to which Delay Class a frame belongs to? This issue is in reality identical to the CBQ vs. Priority Queuing issue. The former creates specific bandwidth limits, thereby creating specific delay limits. The latter allows more flexible/dynamic bandwidth allocations at the expense of possible starvation and looser delay guarantees. In certain environments like the Internet, where the number of hops is fairly non-deterministic, it may make more sense to use a single Priority Class dropper across all Delay Classes. However, in most private networks where the number of hops is deterministic, it is feasible to provide specific delay limits. Therefore it also makes sense to support independent Priority Class droppers within each Delay Class. It is therefore left to the vendor to choose the model most appropriate to their market and customers.

3. The TOS octet

This document proposes using the 8-bit TOS field in the IPv4 [4] header to provide differentiated services. The identical format would also be used in the Class field of the IPv6 [5] header. The format of the TOS field is shown below.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | CC|    PC     | RR|  DC   | RB|
   +---+---+---+---+---+---+---+---+

   CC: Congestion Check
   PC: Priority Class
   RR: Request/Response
   DC: Delay Class
   RB: Receiver Billing

The Priority Class field is used to provide congestion control as described earlier. This field allows for 8 possible values.
From a congestion management perspective, this provides congestion/traffic management in 12.5% chunks. The Priority Class semantics are as follows:

   7: Least likely to be dropped
   .
   .
   .
   0: Most likely to be dropped

The Delay Class field is used to specify the delay sensitivity of the packet. It is strongly recommended that a flow not use different Delay Class values. Doing so would create packet ordering problems and make effective congestion management more difficult. The Delay Class semantics are as follows:

   3: Low delay (most delay sensitive)
   2: Moderate delay
   1: Modest delay
   0: High delay (indifferent to delay)

The Congestion Check bit is used as an efficient means for determining the current congestion level along the path to a destination. When this bit is set, each hop will assign the congestion level of the (downstream) target link to the Priority Class field if that congestion level is greater than the value currently assigned to the Priority Class field. When the packet arrives at the destination, the Priority Class field should contain the highest Priority Class on which packets are being dropped.

The usage of the Congestion Check bit is sensitive to the value of the Delay Class field. The Priority Class field will be assigned the congestion level of the delay class specified in the Delay Class field. This bit could be set during connection establishment to optimize the initial windowing and congestion control algorithms. The Priority Class for this packet should be considered to have a value of 7 and should be charged against sender profiles accordingly.

When both the Congestion Check bit and the Request/Response bit are set, this is an indication that the contents of the Priority Class field are the congestion level of traffic in the opposite direction.
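The field layout above can be expressed as simple pack/unpack helpers. This is an illustrative sketch only: the draft fixes the field order (CC, PC, RR, DC, RB), but the exact shift values below, which treat bit 0 as the most significant bit of the octet, are an assumption of this sketch.

```python
# Assumed bit offsets for the proposed TOS octet, MSB first:
# CC (1 bit) | PC (3 bits) | RR (1 bit) | DC (2 bits) | RB (1 bit)

def encode_tos(cc, pc, rr, dc, rb):
    """Pack the five proposed fields into one octet (offsets assumed)."""
    assert 0 <= pc <= 7, "Priority Class is a 3-bit field"
    assert 0 <= dc <= 3, "Delay Class is a 2-bit field"
    return ((cc & 1) << 7) | (pc << 4) | ((rr & 1) << 3) | (dc << 1) | (rb & 1)

def decode_tos(octet):
    """Unpack an octet into (cc, pc, rr, dc, rb)."""
    return ((octet >> 7) & 1, (octet >> 4) & 7,
            (octet >> 3) & 1, (octet >> 1) & 3, octet & 1)
```

A packet carrying only Priority Class 7 (all other bits clear) would thus encode as 0x70 under these assumed offsets.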
The Request/Response bit is used to indicate that the current Priority Class value is a response to a Congestion Check request. The Priority Class field is to be ignored by intermediate routers, and the Priority Class for this packet should be treated as if it contained a value of 7.

It is possible for the response capability to be provided out of band. However, if the end station is not capable of supporting the new TOS octet and the edge router wants to perform the TOS octet assignments on behalf of the end station (or is performing aggressive dropping), this is an effective mechanism for snooping congestion levels without new protocols or extra bandwidth.

Because this Congestion Check request and response mechanism behaves as a packet with a Priority Class of 7, profile meters should treat (and charge for) these packets as Priority Class 7 packets to prevent abuse. Further, because congestion checking is sensitive to the Receiver Billing bit, these request and response packets are always charged to the sender.

The Receiver Billing bit is provided to indicate that bandwidth will be charged to the receiver. This bit may only be set when the receiver's bandwidth profile has been provided to the sender. The mechanisms or protocol extensions used to propagate bandwidth profiles to senders are beyond the scope of this document.

Because Receiver Billing requires a different profile, and Priority Class based dropping may be applied because a profile has been exceeded, two types of Congestion Checks are possible: one for sender billing and one for receiver billing. The Receiver Billing bit is set in conjunction with the Congestion Check bit to determine the congestion level for Receiver Billing packets. More details on profile-based dropping and Receiver Billing are provided later in this document.
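The per-hop Congestion Check behavior described in this section, where a hop overwrites the Priority Class field only when its local congestion level is higher, amounts to carrying a running maximum along the path. A minimal sketch, with congestion levels modeled as plain integers:

```python
def congestion_check_hop(pc_field, local_drop_level):
    """Per-hop handling of a packet with the CC bit set: record the
    local congestion level only if it exceeds the value already carried."""
    return max(pc_field, local_drop_level)

def congestion_check_path(drop_levels):
    """Priority Class value seen at the destination for a path whose
    hops are currently dropping up to the given Priority Classes."""
    pc = 0
    for level in drop_levels:
        pc = congestion_check_hop(pc, level)
    return pc
```

At the destination the field therefore holds the highest Priority Class being dropped anywhere along the path, as the text above states.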
It is not clear at this time whether a Congestion Check Response packet should be charged against the sender or the receiver. It makes sense to charge a Congestion Check Request to the sender when the Receiver Billing bit is reset and to the receiver when the Receiver Billing bit is set. However, if Congestion Check Response packets are charged based on the value of the Receiver Billing bit, then it may preclude concurrent sender and receiver charging within the same flow.

This new use of the IPv4 TOS octet subsumes the previous functionality of this field as described in RFC 1349 [4] and similarly in IPv6 [5]. Current usage of this field leaves little room for the coexistence of the original semantics with the semantics described in this document. This document concurs with Nichols et al. [6] in requiring the remarking of packets between differentiated services networks and non-differentiated services networks. This will minimally require configuration support to demarcate differentiated services network boundaries.

4. Operational Model

The operational model is based on the assumption that all traffic entering a network domain will be verified against a user or group profile. This profile has a number of potential components. One component is the allowable Delay Class(es). Another component may be the maximum bandwidth allocation. Maximum bandwidth allocation is particularly important for receiver billing to prevent excessive sending and overcharging. Another component may be a maximum bandwidth allocation for each given Priority Class. It is generally more useful to distribute bandwidth evenly among all Priority Classes. However, some policy models may choose to block or reserve certain Priority Classes for specific applications.
Alternatively, a policy may provide more bandwidth to a specific Priority Class to support specialized services such as the premium service described in Nichols [7].

4.1. Inter-Domain Edge devices

An Inter-Domain Edge Device is defined in this document as a differentiated services capable device that is part of one differentiated services aware network and connected to another network domain. As service contracts between Inter-Domain Edge Devices usually assume a statistical limit on the bandwidth between domains, the actual bandwidth may be higher or lower at any given time depending on the number of sessions active at the time.

When bandwidth falls below the specified service contract, it can be beneficial to increase the Priority Class values on some or all packets to take optimal advantage of service agreements. This can allow the packets a higher probability of getting through. However, there is only marginal benefit in increasing Priority Class values, because the congestion check mechanism would be unaware of this action and would not increase the bandwidth to take full advantage of this option.

If the service contract between the two domains is exceeded, the correct behavior must be to begin dropping packets in the lowest Priority Class. If the Priority Class values in packets were decremented, there would be potential anomalies between the Congestion Check algorithm and the original Priority Class values assigned to packets.

4.2. Between the End Station and Network Edge devices

End Stations can choose to use the new TOS octet semantics or not. Network Edge devices should be aware of the End Station's TOS field semantics assumptions. If the Network Edge device knows that connected End Stations are performing TOS octet assignments themselves, then the Network Edge device must operate as a profile meter.
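When an edge device (or an end station) performs TOS octet assignments itself, it must pick a Priority Class for each packet of a flow. One option is the interleaved 0,4,1,5,2,6,3,7 ordering from Section 2.1, which makes it unlikely that drops within a single class hit consecutive packets. A sketch; the simple round-robin cycling is an assumption, since the draft deliberately leaves the sequencing policy open:

```python
from itertools import cycle

# Interleaved ordering from Section 2.1: spreads each class's packets
# evenly through the flow, so partial drops rarely hit back-to-back packets.
INTERLEAVED = [0, 4, 1, 5, 2, 6, 3, 7]

def priority_class_assigner(sequence=INTERLEAVED):
    """Yield a Priority Class for each successive packet of a flow."""
    return cycle(sequence)
```

A block-drop policy would simply pass `sequence=[0, 1, 2, 3, 4, 5, 6, 7]` instead, and a discretionary policy could weight important packets toward the higher classes.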
Profile meters are forwarding devices that match each packet with the appropriate profile and verify that the TOS octet assignments are within the profile associated with the user, group, or application. Their behavior should be identical to that of Inter-Domain Edge Devices. When packets are arriving below the bandwidth profile, profile meters may choose to increase the Priority Class of some or all packets. If the arriving packets exceed a maximum bandwidth profile (if any), packets in all Priority Classes must be dropped.

When Network Edge Devices receive packets destined to an End Station which does not support differentiated services, and the bandwidth exceeds the capabilities of the End Station, the Network Edge device connected to the End Station should treat this as congestion and begin dropping low Priority Class packets.

4.3. Bandwidth Scaling

It is important for bandwidth to be able to grow and shrink across the Priority Classes. For example, a server may have a very large bandwidth profile, but the clients it connects to may have drastically different bandwidth limits. Traditionally, bandwidth grows until congestion occurs and is then cut back. So far, there has been detailed discussion about how congestion can be managed. However, there still needs to be a mechanism to determine what measure of bandwidth each end of a connection can tolerate.

The best way to achieve this is to gradually increase the bandwidth across all Priority Classes up to the limit of the profile. In the past, when the receiver became incapable of keeping up with the sender, it usually began dropping packets. This mechanism needs to be refined so that a sender can be notified that a bandwidth limit has been reached. For this scenario, it is reasonable for a receiver to absorb all packets up to its capability.
After that point, it begins to randomly drop packets. When a sender discovers that packets are being randomly discarded, it will throttle its bandwidth back evenly across all Priority Classes. Some research will be required to determine the most appropriate bandwidth growth and cutback rates.

4.4. Receiver Billing

Receiver-based billing is a model that charges bandwidth and delay services to the receiver's profile rather than the sender's. This is an important capability because a service is usually bought or given. The cost of a telephone conversation is not typically shared between both parties. It is usually paid for by the caller or by the callee (an 800 number).

There are three main issues with a Receiver Billing model. First, the sender must know what the profile limits of the receiver are. Second, the receiver must be charged for the traffic that fits the profile. Third, a receiver must be protected from excessive bandwidth sent by a malicious sender.

As mentioned earlier, a sender should not be allowed to set the Receiver Billing bit unless it has received the receiver's profile. The means for sending this profile is beyond the scope of this document. However, there are a number of alternative mechanisms, including static configuration, a standardized profile header in the TCP options header, or extensions to application headers.

In order for the receiver to be charged for the traffic, profiles must be defined in bi-directional terms. It is conceivable that a single profile is an aggregation of both the bandwidth sent and the bandwidth received. However, usually the amount of bandwidth sent is different from the bandwidth received. Therefore, independent accounting will be required at a minimum. Because traffic through a Domain Edge could be charged to a sender or a receiver, different accounting may be required for each.
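Since traffic through a Domain Edge may be charged to either party, the accounting split can be sketched as two independent tallies selected by the Receiver Billing (RB) bit. The class and field names below are hypothetical illustrations, not part of the proposal:

```python
class EdgeAccounting:
    """Charge each packet's bytes to the sender's or the receiver's
    account, selected by the Receiver Billing (RB) bit (names assumed)."""

    def __init__(self):
        self.sender_bytes = {}    # keyed by sender address
        self.receiver_bytes = {}  # keyed by receiver address

    def charge(self, src, dst, length, rb_bit):
        if rb_bit:
            # RB set: the receiver's profile pays for this packet.
            self.receiver_bytes[dst] = self.receiver_bytes.get(dst, 0) + length
        else:
            # RB clear: the sender's profile pays, as today.
            self.sender_bytes[src] = self.sender_bytes.get(src, 0) + length
```

Keeping the two tallies separate is what allows sender-billed and receiver-billed traffic to coexist in one domain, each metered against its own profile.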
As mentioned earlier, the traffic sent and the traffic received are seldom symmetric. Therefore, when sender and receiver billing profiles are defined separately, a Priority Class dropper will need to be supported for each. Providing receiver-based bandwidth management in this way addresses the second issue.

A malicious sender is a sender that sends packets using the receiver's bandwidth. This is a difficult problem to solve because it requires all networks along the path between the sender and the receiver to be aware of the sender's right to send using Receiver Billing. This problem can be broken into three sub-problems: malicious deterioration of the end receiver's bandwidth with no interest in the data; malicious deterioration of intermediate ISP bandwidth with no interest in the data; and an attempt by a sender with an interest in the data to charge bandwidth to the receiver without the receiver's consent.

The problem of users who incorrectly attempt to reverse charges for services is fairly easy to solve. When a receiver determines that a packet arrived with Receiver Billing set and the receiver did not ask for it, the packet can be dropped. This is in effect a denial of service. If a receiver determines that the packet is worth receiving, it can accept it. The question of when Receiver Billing is and is not acceptable will need to be resolved when mechanisms are put in place for propagating profiles to senders.

The first and second sub-problems, malicious deterioration of bandwidth, exist in the Internet today. A subset of these cases can be handled by terminating the session. For other cases, this problem will likely require a protocol between Network Edge Devices that propagates a denial-of-service indication along the path between the sender and receiver back to the source.
This type of protocol is likely to be required irrespective of the Receiver Billing issue in order to address current possibilities for malicious Internet abuse.

5. Supported Service Models

A number of services have been suggested by the Differentiated Services Working Group. One type of service, described by Clark and Wroclawski [1], has commonly been referred to as Assured Service. The premise for this service is that packets can be marked as either "in" profile or "out" of profile. During congestion, the packets marked as out of profile are dropped. The proposals in this document support Assured Service directly using the same model. The only distinction is that this document provides layers, or classes, of assurance. As mentioned earlier, this proposal has the unique additional benefit of allowing cooperative congestion control between forwarding devices.

Another service model, described by Van Jacobson [7], is called Premium Service. This service provides preferential treatment for all traffic marked as premium; all unmarked traffic continues to be treated as Best Effort. Premium traffic, on the other hand, has a guarantee of delivery, provided that the traffic is within profile. All traffic exceeding the profile is dropped by the profile meter. The mechanisms described in this document can satisfy this service through a combination of forced dropping at the profile meter and setting packets to higher (or the highest) Priority Classes as congestion occurs. Premium Service also gives preferential access to all links over Best Effort traffic. This aspect could be accommodated using high Delay Classes.

In addition, other service models can also be supported.
When congestion occurs along the path of a flow, Congestion Check can be used to prevent the sending of all packets that fall below the current highest congestion level. This would leave additional bandwidth available in the profile that could be used to communicate with destinations experiencing less congestion or no congestion.

This strategy provides very flexible and optimized communication throughout the Internet. Further, any combination of Priority Class values is possible. For connections that are considered less important but which must be kept alive, packets with higher Priority Class values could be used to keep the session alive, while lower Priority Class values would be used to send data when congestion decreased enough to permit it.

Another possible service model, one that provides bandwidth guarantees irrespective of the level of congestion, could be supported through a combination of Congestion Checking and adaptive assignment of Priority Class values by the End Station. Various combinations of the services described above can be supported as well.

6. Acknowledgments

This document is a collection of ideas taken from David Clark, Van Jacobson, Zheng Wang, Kalevi Kilkki, Paul Ferguson, and Kathleen Nichols. In addition, many of the opportunities described in this document were inspired by issues surfaced on the Diff-Serv mailing list.

7. References

[1] D. Clark and J. Wroclawski, "An Approach to Service Allocation in the Internet", Internet Draft, July 1997.

[2] Z. Wang, "User-Share Differentiation (USD): Scalable bandwidth allocation for differentiated services", Internet Draft, May 1998.

[3] S. Floyd, "TCP and Explicit Congestion Notification", ACM Computer Communications Review, Vol. 24, No. 5, pp. 10-23, October 1994.

[4] P. Almquist, "Type of Service in the Internet Protocol Suite", RFC 1349, July 1992.
[5] S. Deering and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", Internet Draft, November 1997.

[6] K. Nichols, et al., "Differentiated Services Operational Model and Definitions", Internet Draft, August 1998.

[7] K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit Differentiated Services Architecture for the Internet", Internet Draft, May 1998.

8. Author's Address

Walter Weiss
Lucent Technologies
300 Baker Avenue, Suite 100
Concord, MA 01742-2168, USA
Email: wweiss@lucent.com