idnits 2.17.1 

draft-ietf-intserv-guaranteed-svc-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2], [3], [4], [5], [6], [8],
     [9], [10], [1]), which it shouldn't.  Please replace those with straight
     textual mentions of the documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 94: '...   the path MUST be determined and add...'
     RFC 2119 keyword, line 138: '... network element MUST ensure that the ...'
     RFC 2119 keyword, line 149: '... network element MUST ensure that its ...'
     RFC 2119 keyword, line 159: '... network element MUST ensure that the ...'
     RFC 2119 keyword, line 177: '...an the MTU of the link MUST be policed...'
     (41 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     Policing is done at the edge of the network.  Reshaping is done at
     all heterogeneous source branch points and at all source merge points.  A
     heterogeneous source branch point is a spot where the multicast
     distribution tree from a source branches to multiple distinct paths, and
     the TSpec's of the reservations on the various outgoing links are not all
     the same.  Reshaping need only be done if the TSpec on the outgoing link
     is "less than" (in the sense described in the Ordering section) the TSpec
     reserved on the immediately upstream link.  A source merge point is where
     the distribution paths or trees from two different sources (sharing the
     same reservation) merge.  It is the responsibility of the invoker of the
     service (a setup protocol, local configuration tool, or similar
     mechanism) to identify points where policing is required.  Reshaping may
     be done at other points as well as those described above.  Policing MUST
     not be done except at the edge of the network.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (13 August 1996) is 10118 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '7' is defined on line 841, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'

  -- Possible downref: Non-RFC (?) normative reference: ref. '10'


     Summary: 11 errors (**), 0 flaws (~~), 3 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                   Integrated Services WG
3	INTERNET-DRAFT                         S. Shenker/C. Partridge/R. Guerin
4	draft-ietf-intserv-guaranteed-svc-06.txt                   Xerox/BBN/IBM
5	                                                          13 August 1996
6	                                                        Expires: 2/13/97

8	             Specification of Guaranteed Quality of Service

10	Status of this Memo

12	   This document is an Internet-Draft.  Internet-Drafts are working
13	   documents of the Internet Engineering Task Force (IETF), its areas,
14	   and its working groups.  Note that other groups may also distribute
15	   working documents as Internet-Drafts.

17	   Internet-Drafts are draft documents valid for a maximum of six months
18	   and may be updated, replaced, or obsoleted by other documents at any
19	   time.  It is inappropriate to use Internet-Drafts as reference
20	   material or to cite them other than as ``work in progress.''

22	   To learn the current status of any Internet-Draft, please check the
23	   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
24	   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
25	   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
26	   ftp.isi.edu (US West Coast).

28	   This document is a product of the Integrated Services working group
29	   of the Internet Engineering Task Force.  Comments are solicited and
30	   should be addressed to the working group's mailing list at
31	   int-serv@isi.edu and/or the author(s).

33	   This draft reflects minor changes from the IETF meeting in Los
34	   Angeles and comments received after circulating draft 5.

36	Abstract

38	   This memo describes the network element behavior required to deliver
39	   a guaranteed service (guaranteed delay and bandwidth) in the
40	   Internet.  Guaranteed service provides firm (mathematically provable)
41	   bounds on end-to-end datagram queueing delays.  This service makes it
42	   possible to provide a service that guarantees both delay and
43	   bandwidth.  This specification follows the service specification
44	   template described in [1].

46	Introduction

48	   This document defines the requirements for network elements that
49	   support guaranteed service.  This memo is one of a series of
50	   documents that specify the network element behavior required to
51	   support various qualities of service in IP internetworks.  Services
52	   described in these documents are useful both in the global Internet
53	   and private IP networks.

55	   This document is based on the service specification template given in
56	   [1]. Please refer to that document for definitions and additional
57	   information about the specification of qualities of service within
58	   the IP protocol family.

60	End-to-End Behavior

62	   The end-to-end behavior provided by a series of network elements that
63	   conform to this document is an assured level of bandwidth that, when
64	   used by a policed flow, produces a delay-bounded service with no
65	   queueing loss for all conforming datagrams (assuming no failure of
66	   network components or changes in routing during the life of the
67	   flow).

69	   The end-to-end behavior conforms to the fluid model (described under
70	   Network Element Data Handling below) in that the delivered queueing
71	   delays do not exceed the fluid delays by more than the specified
72	   error bounds.  More precisely, the end-to-end delay bound is
73	   [(b-M)/R*(p-R)/(p-r)]+(M+Ctot)/R+Dtot for p>R>=r, and (M+Ctot)/R+Dtot
74	   for r<=p<=R (where b, r, p, M, R, Ctot, and Dtot are defined later in
75	   this document).
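
As an illustration only, the two cases of the end-to-end bound can be
sketched in Python.  The function name, the argument order, and the
choice to pass Dtot in seconds are assumptions of this sketch, not part
of the specification; for an infinite peak rate the ratio (p-R)/(p-r)
is taken at its limit of 1.

```python
import math


def end_to_end_delay_bound(b, r, p, M, R, Ctot, Dtot_s):
    """Upper bound on end-to-end queueing delay, in seconds.

    b, M and Ctot are in bytes; r, p and R are in bytes per second;
    Dtot_s is Dtot already converted to seconds (sketch assumption).
    """
    if r <= 0 or R < r:
        raise ValueError("need R >= r > 0")
    if p < r:
        raise ValueError("need p >= r")
    rate_terms = (M + Ctot) / R + Dtot_s
    if p > R:
        # Case p > R >= r.  With p infinite, (p-R)/(p-r) is replaced
        # by its limit, 1 (an assumption of this sketch).
        shape = 1.0 if math.isinf(p) else (p - R) / (p - r)
        return (b - M) / R * shape + rate_terms
    # Case r <= p <= R: the burst term vanishes.
    return rate_terms
```

With an infinite peak rate the result reduces to b/R + Ctot/R + Dtot,
which matches the per-element form b/R+C/R+D used later in the text.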

77	      NOTE: While the per-hop error terms needed to compute the end-to-
78	      end delays are exported by the service module (see Exported
79	      Information below), the mechanisms needed to collect per-hop
80	      bounds and make the end-to-end quantities Ctot and Dtot known to
81	      the applications are not described in this specification.  These
82	      functions are provided by reservation setup protocols, routing
83	      protocols or other network management functions and are outside
84	      the scope of this document.

86	   The maximum end-to-end queueing delay (as characterized by Ctot and
87	   Dtot) and bandwidth (characterized by R) provided along a path will
88	   be stable.  That is, they will not change as long as the end-to-end
89	   path does not change.

91	   Guaranteed service does not control the minimal or average delay of
92	   datagrams, merely the maximal queueing delay.  Furthermore, to
93	   compute the maximum delay a datagram will experience, the latency of
94	   the path MUST be determined and added to the guaranteed queueing
95	   delay.  (However, as noted below, a conservative bound of the latency
96	   can be computed by observing the delay experienced by any one
97	   packet).

99	   This service is subject to admission control.

101	Motivation

103	   Guaranteed service guarantees that datagrams will arrive within the
104	   guaranteed delivery time and will not be discarded due to queue
105	   overflows, provided the flow's traffic stays within its specified
106	   traffic parameters.  This service is intended for applications which
107	   need a firm guarantee that a datagram will arrive no later than a
108	   certain time after it was transmitted by its source.  For example,
109	   some audio and video "play-back" applications are intolerant of any
110	   datagram arriving after their play-back time.  Applications that have
111	   hard real-time requirements will also require guaranteed service.

113	   This service does not attempt to minimize the jitter (the difference
114	   between the minimal and maximal datagram delays); it merely controls
115	   the maximal queueing delay.  Because the guaranteed delay bound is a
116	   firm one, the delay has to be set large enough to cover extremely
117	   rare cases of long queueing delays.  Several studies have shown that
118	   the actual delay for the vast majority of datagrams can be far lower
119	   than the guaranteed delay.  Therefore, authors of playback
120	   applications should note that datagrams will often arrive far earlier
121	   than the delivery deadline and will have to be buffered at the
122	   receiving system until it is time for the application to process
123	   them.

125	   This service represents one extreme end of delay control for
126	   networks.  Most other services providing delay control provide much
127	   weaker assurances about the resulting delays.  In order to provide
128	   this high level of assurance, guaranteed service is typically only
129	   useful if provided by every network element along the path (i.e., by
130	   both routers and the links that interconnect the routers).  Moreover,
131	   as described in the Exported Information section, effective provision
132	   and use of the service requires that the set-up protocol or other
133	   mechanism used to request service provides service characterizations
134	   to intermediate routers and to the endpoints.

136	Network Element Data Handling Requirements

138	   The network element MUST ensure that the service approximates the
139	   "fluid model" of service.  The fluid model at service rate R is
140	   essentially the service that would be provided by a dedicated wire of
141	   bandwidth R between the source and receiver.  Thus, in the fluid
142	   model of service at a fixed rate R, the flow's service is completely
143	   independent of that of any other flow.

145	   The flow's level of service is characterized at each network element
146	   by a bandwidth (or service rate) R and a buffer size B.  R represents
147	   the share of the link's bandwidth the flow is entitled to and B
148	   represents the buffer space in the network element that the flow may
149	   consume.  The network element MUST ensure that its service matches
150	   the fluid model at that same rate to within a sharp error bound.

152	   The definition of guaranteed service relies on the result that the
153	   fluid delay of a flow obeying a token bucket (r,b) and being served
154	   by a line with bandwidth R is bounded by b/R as long as R is no less
155	   than r.  Guaranteed service with a service rate R, where now R is a
156	   share of bandwidth rather than the bandwidth of a dedicated line,
157	   approximates this behavior.

159	   Consequently, the network element MUST ensure that the queueing delay
160	   of any datagram be less than b/R+C/R+D, where C and D describe the
161	   maximal local deviation away from the fluid model.  It is important
162	   to emphasize that C and D are maximums.  So, for instance, if an
163	   implementation has occasional gaps in service (perhaps due to
164	   processing routing updates), D needs to be large enough to account
165	   for the time a datagram may lose during the gap in service.  (C and D
166	   are described in more detail in the section on Exported Information).
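
The per-element bound b/R+C/R+D can be written out as a small Python
sketch.  The helper names are hypothetical; the units (C in bytes, D in
microseconds) follow the Exported Information section, and the
conversion of D to seconds is this sketch's own bookkeeping.

```python
def element_delay_bound(b, R, C, D_us):
    """Per-element queueing delay bound b/R + C/R + D, in seconds.

    b and C are in bytes, R is in bytes per second, and D_us is the
    rate-independent error term in microseconds.
    """
    if R <= 0:
        raise ValueError("R must be positive")
    return b / R + C / R + D_us / 1e6


def within_bound(measured_delay_s, b, R, C, D_us):
    """True if a measured queueing delay is strictly below the bound."""
    return measured_delay_s < element_delay_bound(b, R, C, D_us)
```

For example, with b = 8000 bytes, R = 4000 bytes/s, C = 2000 bytes and
D = 500000 microseconds, the three terms are 2 s, 0.5 s and 0.5 s.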

168	      NOTE: Strictly speaking, this memo requires only that the service
169	      a flow receives is never worse than it would receive under this
170	      approximation of the fluid model.  It is perfectly acceptable to
171	      give better service.  For instance, if a flow is currently not
172	      using its share, R, algorithms such as Weighted Fair Queueing that
173	      temporarily give other flows the unused bandwidth are perfectly
174	      acceptable (indeed, are encouraged).

176	   Links are not permitted to fragment datagrams as part of guaranteed
177	   service.  Datagrams larger than the MTU of the link MUST be policed
178	   as nonconformant, which means that they will be policed according to
179	   the rules described in the Policing section below.

181	Invocation Information

183	   Guaranteed service is invoked by specifying the traffic (TSpec) and
184	   the desired service (RSpec) to the network element.  A service
185	   request for an existing flow that has a new TSpec and/or RSpec SHOULD
186	   be treated as a new invocation, in the sense that admission control
187	   SHOULD be reapplied to the flow.  Flows that reduce their TSpec
188	   and/or their RSpec (i.e., their new TSpec/RSpec is strictly smaller
189	   than the old TSpec/RSpec according to the ordering rules described in
190	   the section on Ordering below) SHOULD never be denied service.

192	   The TSpec takes the form of a token bucket plus a peak rate (p), a
193	   minimum policed unit (m), and a maximum datagram size (M).

195	   The token bucket has a bucket depth, b, and a bucket rate, r.  Both b
196	   and r MUST be positive.  The rate, r, is measured in bytes of IP
197	   datagrams per second, and can range from 1 byte per second to as
198	   large as 40 terabytes per second (or close to what is believed to be
199	   the maximum theoretical bandwidth of a single strand of fiber).
200	   Clearly, particularly for large bandwidths, only the first few digits
201	   are significant, and so the use of floating point representations
202	   accurate to at least 0.1% is encouraged.

204	   The bucket depth, b, is also measured in bytes and can range from 1
205	   byte to 250 gigabytes.  Again, floating point representations
206	   accurate to at least 0.1% are encouraged.

208	   The range of values is intentionally large to allow for future
209	   bandwidths.  The range is not intended to imply that a network
210	   element has to support the entire range.

212	   The peak rate, p, is measured in bytes of IP datagrams per second and
213	   has the same range and suggested representation as the bucket rate.
214	   The peak rate is the maximum rate at which the source and any
215	   reshaping points (reshaping points are defined below) may inject
216	   bursts of traffic into the network.  More precisely, it is a
217	   requirement that for all time periods the amount of data sent cannot
218	   exceed M+pT where M is the maximum datagram size and T is the length
219	   of the time period.  Furthermore, p MUST be greater than or equal to
220	   the token bucket rate, r.  If the peak rate is unknown or
221	   unspecified, then p MUST be set to infinity.

223	   The minimum policed unit, m, is an integer measured in bytes.  All IP
224	   datagrams less than size m will be counted, when policed and tested
225	   for conformance to the TSpec, as being of size m.  The maximum
226	   datagram size, M, is the biggest datagram that will conform to the
227	   traffic specification; it is also measured in bytes.  The flow MUST
228	   be rejected if the requested maximum datagram size is larger than the
229	   MTU of the link.  Both m and M MUST be positive, and m MUST be less
230	   than or equal to M.
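
The TSpec constraints stated so far (b and r positive, p no less than
r, m and M positive with m no greater than M, and M no larger than the
link MTU) can be collected into one check.  This is a minimal sketch;
the TSpec class, the tspec_errors function, and the link_mtu argument
are names invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TSpec:
    r: float  # token bucket rate, bytes per second
    b: float  # token bucket depth, bytes
    p: float  # peak rate, bytes per second (math.inf if unknown)
    m: int    # minimum policed unit, bytes
    M: int    # maximum datagram size, bytes


def tspec_errors(t, link_mtu):
    """Return a list of violated constraints (empty if the TSpec is valid)."""
    errors = []
    if not (t.r > 0 and t.b > 0):
        errors.append("b and r must be positive")
    if t.p < t.r:
        errors.append("p must be greater than or equal to r")
    if t.m <= 0 or t.M <= 0:
        errors.append("m and M must be positive")
    if t.m > t.M:
        errors.append("m must be less than or equal to M")
    if t.M > link_mtu:
        errors.append("M exceeds the link MTU; the flow must be rejected")
    return errors
```

An unknown peak rate is passed as math.inf, which satisfies p >= r for
any finite r.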

232	      The guaranteed service uses the general TOKEN_BUCKET_TSPEC
233	      parameter defined in Reference [8] to describe a data flow's
234	      traffic characteristics. The description above is of that
235	      parameter.  The TOKEN_BUCKET_TSPEC is general parameter number
236	      127. Use of this parameter for the guaranteed service TSpec
237	      simplifies the use of guaranteed service in a multi-service
238	      environment.

240	   The RSpec is a rate R and a slack term S, where R MUST be greater
241	   than or equal to r and S MUST be nonnegative.  The RSpec rate can be
242	   bigger than the TSpec rate because higher rates will reduce queueing
243	   delay.  The slack term signifies the difference between the desired
244	   delay and the delay obtained by using a reservation level R.  This
245	   slack term can be utilized by the network element to reduce its
246	   resource reservation for this flow. When a network element chooses to
247	   utilize some of the slack in the RSpec, it MUST follow specific rules
248	   in updating the R and S fields of the RSpec; these rules are
249	   specified in the Ordering and Merging section.  If at the time of
250	   service invocation no slack is specified, the slack term, S, is set
251	   to zero.  No buffer specification is included in the RSpec because
252	   the network element is expected to derive the required buffer space
253	   to ensure no queueing loss from the token bucket and peak rate in the
254	   TSpec, the reserved rate and slack in the RSpec, the exported
255	   information received at the network element, i.e., Ctot and Dtot or
256	   Csum and Dsum, combined with internal information about how the
257	   element manages its traffic.

259	   The TSpec can be represented by three floating point numbers in
260	   single-precision IEEE floating point format followed by two 32-bit
261	   integers in network byte order.  The first floating point value is
262	   the rate (r), the second floating point value is the bucket size (b),
263	   the third floating point is the peak rate (p), the first integer is
264	   the minimum policed unit (m), and the second integer is the maximum
265	   datagram size (M).
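
One possible encoding of this layout uses Python's standard struct
module.  The sketch assumes that the three floating point values are
also sent most significant byte first, like the integers; the text
above states network byte order only for the integers.  The function
names are hypothetical.

```python
import struct

# '!' selects network (big-endian) byte order with no padding:
# three IEEE single-precision floats (r, b, p), then two 32-bit
# unsigned integers (m, M).
_TSPEC_FORMAT = "!fffII"


def pack_tspec(r, b, p, m, M):
    """Encode a TSpec as 20 bytes in the order r, b, p, m, M."""
    return struct.pack(_TSPEC_FORMAT, r, b, p, m, M)


def unpack_tspec(data):
    """Decode 20 bytes back into the tuple (r, b, p, m, M)."""
    return struct.unpack(_TSPEC_FORMAT, data)
```

Values that are not exactly representable in single precision are
rounded when packed, so a round trip may not return the original
Python float bit for bit.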

267	   The RSpec rate term, R, can also be represented using single-
268	   precision IEEE floating point.

270	   The slack term, S, can be represented as a 32-bit integer.

272	   When r, b, p, and R terms are represented as IEEE floating point
273	   values, the sign bit MUST be zero (all values MUST be non-negative).
274	   Exponents less than 127 (i.e., 0) are prohibited.  Exponents greater
275	   than 162 (i.e., positive 35) are discouraged, except for specifying a
276	   peak rate of infinity.  Infinity is represented with an exponent of
277	   all ones (255) and a sign bit and mantissa of all zeroes.
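
The constraints on the sign bit and exponent can be checked by looking
at the raw bits of the single-precision encoding.  The following is a
sketch; the function name and the returned labels are invented for
illustration, and any exponent above 162 other than the infinity
encoding is reported as "discouraged", following a literal reading of
the paragraph above.

```python
import struct


def classify_rate_encoding(value):
    """Classify a value by the IEEE single-precision bits it encodes to."""
    bits = struct.unpack("!I", struct.pack("!f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent, 0..255
    mantissa = bits & 0x7FFFFF
    if sign != 0:
        return "prohibited"          # sign bit must be zero
    if exponent == 255 and mantissa == 0:
        return "infinity"            # permitted for the peak rate
    if exponent < 127:
        return "prohibited"          # unbiased exponent below 0
    if exponent > 162:
        return "discouraged"         # unbiased exponent above 35
    return "ok"
```

Note that values below 1.0 (including 0.0) have a biased exponent
under 127 and are therefore classified as prohibited.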

279	Exported Information

281	   Each guaranteed service module MUST export at least the following
282	   information.  All of the parameters described below are
283	   characterization parameters.

285	   A network element's implementation of guaranteed service is
286	   characterized by two error terms, C and D, which represent how the
287	   element's implementation of the guaranteed service deviates from the
288	   fluid model.  These two parameters have an additive composition rule.

290	   The error term C is the rate-dependent error term.  It represents the
291	   delay a datagram in the flow might experience due to the rate
292	   parameters of the flow.  An example of such an error term is the need
293	   to account for the time taken serializing a datagram broken up into
294	   ATM cells, with the cells sent at a frequency of 1/r.

296	      NOTE: It is important to observe that when computing the delay
297	      bound, parameter C is divided by the reservation rate R.  This
298	      division is done because, as with the example of serializing the
299	      datagram, the effect of the C term is a function of the
300	      transmission rate.  Implementors should take care to confirm that
301	      their C values, when divided by various rates, give appropriate
302	      results.  Delay values that are not dependent on the rate SHOULD
303	      be incorporated into the value for the D parameter.

305	   The error term D is the rate-independent, per-element error term and
306	   represents the worst case non-rate-based transit time variation
307	   through the service element.  It is generally determined or set at
308	   boot or configuration time.  An example of D is a slotted network, in
309	   which guaranteed flows are assigned particular slots in a cycle of
310	   slots.  Some part of the per-flow delay may be determined by which
311	   slots in the cycle are allocated to the flow.  In this case, D would
312	   measure the maximum amount of time a flow's data, once ready to be
313	   sent, might have to wait for a slot.  (Observe that this value can be
314	   computed before slots are assigned and thus can be advertised.  For
315	   instance, imagine there are 100 slots.  In the worst case, a flow
316	   might get all of its N slots clustered together, such that if a
317	   packet was made ready to send just after the cluster ended, the
318	   packet might have to wait 100-N slot times before transmitting.  In
319	   this case one can easily approximate this delay by setting D to 100
320	   slot times).

322	   If the composition function is applied along the entire path to
323	   compute the end-to-end sums of C and D (Ctot and Dtot) and the
324	   resulting values are then provided to the end nodes (presumably by
325	   the setup protocol), the end nodes can compute the maximal datagram
326	   queueing delays.  Moreover, if the partial sums (Csum and Dsum) from
327	   the most recent reshaping point (reshaping points are defined below)
328	   downstream towards receivers are handed to each network element then
329	   these network elements can compute the buffer allocations necessary
330	   to achieve no datagram loss, as detailed in the section Guidelines
331	   for Implementors.  The proper use and provision of this service
332	   requires that the quantities Ctot and Dtot, and the quantities Csum
333	   and Dsum be computed.  Therefore, we assume that usage of guaranteed
334	   service will be primarily in contexts where these quantities are made
335	   available to end nodes and network elements.

337	   The error term C is measured in units of bytes.  An individual
338	   element can advertise a C value between 1 and 2**28 (a little over
339	   250 megabytes) and the total added over all elements can range as
340	   high as (2**32)-1.  Should the sum of the elements' C values exceed
341	   (2**32)-1, the end-to-end error term MUST be set to (2**32)-1.

343	   The error term D is measured in units of one microsecond.  An
344	   individual element can advertise a delay value between 1 and 2**28
345	   (somewhat over two minutes) and the total delay added over all
346	   elements can range as high as (2**32)-1.  Should the sum of the
347	   different elements' delays exceed (2**32)-1, the end-to-end delay MUST
348	   be set to (2**32)-1.
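
The additive composition rule with its ceiling of (2**32)-1 applies in
the same way to C (bytes) and D (microseconds).  A minimal sketch
follows; the function and constant names are hypothetical, and
rejecting per-element values outside 1..2**28 is this sketch's reading
of the advertised range.

```python
MAX_ELEMENT_VALUE = 2 ** 28       # largest value one element can advertise
MAX_COMPOSED_VALUE = 2 ** 32 - 1  # ceiling for the composed total


def compose_error_terms(per_element_values):
    """Sum per-element C (or D) values, saturating at (2**32)-1."""
    total = 0
    for value in per_element_values:
        if not 1 <= value <= MAX_ELEMENT_VALUE:
            raise ValueError("per-element value outside 1..2**28")
        # Clamp after each addition so the total never exceeds the ceiling.
        total = min(total + value, MAX_COMPOSED_VALUE)
    return total
```

Sixteen elements each advertising 2**28 would sum to 2**32, so the
composed value is clamped to (2**32)-1.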

350	   The guaranteed service is service_name 2.

352	   The RSpec parameter is numbered 130.

354	   Error characterization parameters C and D are numbered 131 and 132.
355	   The end-to-end composed values for C and D (Ctot and Dtot) are
356	   numbered 133 and 134.  The since-last-reshaping point composed values
357	   for C and D (Csum and Dsum) are numbered 135 and 136.

359	Policing

361	   There are two forms of policing in guaranteed service.  One form is
362	   simple policing (hereafter just called policing to be consistent with
363	   other documents), in which arriving traffic is compared against a
364	   TSpec.  The other form is reshaping, where an attempt is made to
365	   restore (possibly distorted) traffic's shape to conform to the TSpec,
366	   and the fact that traffic is in violation of the TSpec is discovered
367	   because the reshaping fails (the reshaping buffer overflows).

369	   Policing is done at the edge of the network.  Reshaping is done at
370	   all heterogeneous source branch points and at all source merge
371	   points.  A heterogeneous source branch point is a spot where the
372	   multicast distribution tree from a source branches to multiple
373	   distinct paths, and the TSpec's of the reservations on the various
374	   outgoing links are not all the same.  Reshaping need only be done if
375	   the TSpec on the outgoing link is "less than" (in the sense described
376	   in the Ordering section) the TSpec reserved on the immediately
377	   upstream link.  A source merge point is where the distribution paths
378	   or trees from two different sources (sharing the same reservation)
379	   merge.  It is the responsibility of the invoker of the service (a
380	   setup protocol, local configuration tool, or similar mechanism) to
381	   identify points where policing is required.  Reshaping may be done at
382	   other points as well as those described above.  Policing MUST NOT be
383	   done except at the edge of the network.

385	   The token bucket and peak rate parameters require that traffic MUST
386	   obey the rule that over all time periods, the amount of data sent
387	   cannot exceed M+min[pT, rT+b-M], where r and b are the token bucket
388	   parameters, M is the maximum datagram size, and T is the length of
389	   the time period (note that when p is infinite this reduces to the
390	   standard token bucket requirement).  For the purposes of this
391	   accounting, links MUST count datagrams which are smaller than the
392	   minimum policed unit to be of size m.  Datagrams which arrive at an
393	   element and cause a violation of the M+min[pT, rT+b-M] bound are
394	   considered non-conformant.
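
The conformance rule can be illustrated by checking the bound directly
over every interval that starts and ends at a datagram arrival.  This
is a brute-force sketch for clarity, not a suggested policer design;
the function names and the (arrival time, size) trace format are
assumptions, and the search stops at the first violation.

```python
import math


def max_bytes_allowed(T, r, b, p, M):
    """The bound M + min[pT, rT+b-M] for an interval of length T seconds."""
    # Treat an infinite peak rate explicitly to avoid inf * 0.
    peak_term = math.inf if math.isinf(p) else p * T
    return M + min(peak_term, r * T + b - M)


def first_violation(packets, r, b, p, m, M):
    """Index of the first datagram that violates the bound, or None.

    packets is a list of (arrival_time_s, size_bytes) sorted by time.
    Datagrams smaller than m are counted as being of size m.
    """
    for j, (t_end, _) in enumerate(packets):
        total = 0
        # Grow the window backwards from datagram j to datagram 0.
        for i in range(j, -1, -1):
            t_start, size = packets[i]
            total += max(size, m)
            if total > max_bytes_allowed(t_end - t_start, r, b, p, M):
                return j
    return None
```

With p infinite the bound becomes rT+b, the standard token bucket
requirement; with a finite p, two back-to-back datagrams of size M
already exceed the bound for T = 0.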

396	   At the edge of the network, traffic is policed to ensure it conforms
397	   to the token bucket.  Non-conforming datagrams SHOULD be treated as
398	   best-effort datagrams.  [If and when a marking ability becomes
399	   available, these non-conformant datagrams SHOULD be ''marked'' as
400	   being non-compliant and then treated as best effort datagrams at all
401	   subsequent routers.]

403	   Best effort service is defined as the default service a network
404	   element would give to a datagram that is not part of a flow and was
405	   sent between the flow's source and destination.  Among other
406	   implications, this definition means that if a flow's datagram is
407	   changed to a best effort datagram, all flow control (e.g., RED [2])
408	   that is normally applied to best effort datagrams is applied to that
409	   datagram too.

411	      NOTE: There may be situations outside the scope of this document,
412	      such as when a service module's implementation of guaranteed
413	      service is being used to implement traffic sharing rather than a
414	      quality of service, where the desired action is to discard non-
415	      conforming datagrams.  To allow for such uses, implementors SHOULD
416	      ensure that the action to be taken for non-conforming datagrams is
417	      configurable.

419	   Inside the network, policing does not produce the desired results,
420	   because queueing effects will occasionally cause a flow's traffic
421	   that entered the network as conformant to be no longer conformant at
422	   some downstream network element.  Therefore, inside the network,
423	   network elements that wish to police traffic MUST do so by reshaping
424	   traffic to the token bucket.  Reshaping entails delaying datagrams
425	   until they are within conformance of the TSpec.

427	   Reshaping is done by combining a buffer with a token bucket and peak
428	   rate regulator and buffering data until it can be sent in conformance
429	   with the token bucket and peak rate parameters.  (The token bucket
430	   regulator MUST start with its token bucket full of tokens).  Under
431	   guaranteed service, the amount of buffering required to reshape any
432	   conforming traffic back to its original token bucket shape is
433	   b+Csum+(Dsum*r), where Csum and Dsum are the sums of the parameters C
434	   and D between the last reshaping point and the current reshaping
435	   point.  Note that the knowledge of the peak rate at the reshapers can
436	   be used to reduce these buffer requirements (see the section on
437	   "Guidelines for Implementors" below).  A network element MUST provide
438	   the necessary buffers to ensure that conforming traffic is not lost
439	   at the reshaper.
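
   As a rough illustration of the buffer bound above, the following
   Python sketch evaluates b+Csum+(Dsum*r).  The function name and the
   units (bytes, bytes per second, seconds) are assumptions made for
   this example only; they are not defined by this section.

```python
def reshaping_buffer(b, r, c_sum, d_sum):
    """Buffer (bytes) needed to reshape conforming traffic at a reshaper.

    b      -- token bucket depth, in bytes
    r      -- token rate, in bytes per second
    c_sum  -- sum of the C terms since the last reshaping point, in bytes
    d_sum  -- sum of the D terms since the last reshaping point, in seconds
    """
    return b + c_sum + d_sum * r


# Hypothetical flow: 10 kB bucket, 125 kB/s token rate, and the error
# terms accumulated over the hops since the previous reshaper.
needed = reshaping_buffer(b=10_000, r=125_000, c_sum=3_000, d_sum=0.002)
print(f"reshaping buffer: about {needed:.0f} bytes")
```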

441	   If a datagram arrives and finds the reshaping buffer full, then the
442	   datagram is non-conforming.  Observe that this means that a reshaper
443	   is effectively policing too.  As with a policer, the reshaper SHOULD
444	   relegate non-conforming datagrams to best effort.  [If marking is
445	   available, the non-conforming datagrams SHOULD be marked.]

447	      NOTE: As with policers, it SHOULD be possible to configure how
448	      reshapers handle non-conforming datagrams.
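
   The overflow rule and the configurable action can be sketched as
   follows.  This is a minimal illustration, not a complete reshaper:
   it models only the buffer admission decision and omits the token
   bucket and peak rate regulator that release queued datagrams.  The
   class and member names are hypothetical.

```python
from collections import deque
from enum import Enum


class NonConformingAction(Enum):
    BEST_EFFORT = "best-effort"  # default handling described in the text
    DISCARD = "discard"          # optional, e.g. for traffic sharing


class ReshapingBuffer:
    """Fixed-size reshaping buffer; overflow is treated as non-conformance."""

    def __init__(self, capacity_bytes, action=NonConformingAction.BEST_EFFORT):
        self.capacity_bytes = capacity_bytes
        self.action = action
        self.queue = deque()     # sizes of datagrams awaiting release
        self.used_bytes = 0

    def enqueue(self, size_bytes):
        """Return 'queued', or the configured action if the buffer is full."""
        if self.used_bytes + size_bytes > self.capacity_bytes:
            return self.action.value
        self.queue.append(size_bytes)
        self.used_bytes += size_bytes
        return "queued"


buf = ReshapingBuffer(capacity_bytes=3_000)
print([buf.enqueue(1_500) for _ in range(3)])
```

   Making the action a constructor argument mirrors the recommendation
   that the handling of non-conforming datagrams be configurable.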

450	   Note that while the large buffer makes it appear that reshapers add
451	   considerable delay, this is not the case.  Given a valid TSpec that
452	   accurately describes the traffic, reshaping will cause little extra
453	   actual delay at the reshaping point (and will not affect the delay
454	   bound at all).  Furthermore, in the normal case, reshaping will not
455	   cause the loss of any data.

457	   However (typically at merge or branch points), it may happen that
458	   the TSpec is smaller than the actual traffic.  If this happens,
459	   reshaping will cause a large queue to develop at the reshaping point,
460	   which both causes substantial additional delays and forces some
461	   datagrams to be treated as non-conforming.  This scenario makes an
462	   unpleasant denial of service attack possible, in which a receiver who
463	   is successfully receiving a flow's traffic via best effort service is
464	   pre-empted by a new receiver who requests a reservation for the flow,
465	   but with an inadequate TSpec and RSpec.  The flow's traffic will now
466	   be policed and possibly reshaped.  If the policing function was
467	   chosen to discard datagrams, the best-effort receiver would stop
468	   receiving traffic.  For this reason, in the normal case, policers are
469	   simply to treat non-conforming datagrams as best effort (and to mark
470	   them if marking is implemented).  While this protects against denial
471	   of service, it is still true that the bad TSpec may cause queueing
472	   delays to increase.

474	      NOTE: To minimize problems of reordering datagrams, reshaping
475	      points may wish to forward a best-effort datagram from the front
476	      of the reshaping queue when a new datagram arrives and the
477	      reshaping buffer is full.

479	      Readers should also observe that reclassifying datagrams as best
480	      effort (as opposed to dropping the datagrams) also makes support
481	      for elastic flows easier.  They can reserve a modest token bucket
482	      and when their traffic exceeds the token bucket, the excess
483	      traffic will be sent best effort.

485	   A related issue is that at all network elements, datagrams bigger
486	   than the MTU of the network element MUST be considered non-conformant
487	   and SHOULD be classified as best effort (and will then either be
488	   fragmented or dropped according to the element's handling of best
489	   effort traffic).  [Again, if marking is available, these reclassified
490	   datagrams SHOULD be marked.]

492	Ordering and Merging

494	   TSpecs are ordered according to the following rules.

496	   TSpec A is a substitute ("as good or better than") for TSpec B if (1)
497	   both the token rate r and bucket depth b for TSpec A are greater than
498	   or equal to those of TSpec B; (2) the peak rate p is at least as
499	   large in TSpec A as it is in TSpec B; (3) the minimum policed unit m
500	   is at least as small for TSpec A as it is for TSpec B; and (4) the
501	   maximum datagram size M is at least as large for TSpec A as it is for
502	   TSpec B.

504	   TSpec A is "less than or equal" to TSpec B if (1) both the token rate
505	   r and bucket depth b for TSpec A are less than or equal to those of
506	   TSpec B; (2) the peak rate p in TSpec A is at least as small as the
507	   peak rate in TSpec B; (3) the minimum policed unit m is at least as
508	   large for TSpec A as it is for TSpec B; and (4) the maximum datagram
509	   size M is at least as small for TSpec A as it is for TSpec B.
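
   The two ordering rules above can be written as simple predicates.
   In this sketch the TSpec container and the function names are
   assumptions for illustration; the field names follow the symbols
   used in the text (r, b, p, m, M).

```python
from typing import NamedTuple


class TSpec(NamedTuple):
    r: float  # token rate
    b: float  # bucket depth
    p: float  # peak rate
    m: int    # minimum policed unit
    M: int    # maximum datagram size


def is_substitute(a: TSpec, other: TSpec) -> bool:
    """True if `a` is "as good or better than" `other`."""
    return (a.r >= other.r and a.b >= other.b and a.p >= other.p
            and a.m <= other.m and a.M >= other.M)


def is_less_or_equal(a: TSpec, other: TSpec) -> bool:
    """True if `a` is "less than or equal" to `other`."""
    return (a.r <= other.r and a.b <= other.b and a.p <= other.p
            and a.m >= other.m and a.M <= other.M)


small = TSpec(r=100.0, b=400.0, p=150.0, m=128, M=1000)
large = TSpec(r=150.0, b=500.0, p=200.0, m=64, M=1500)
mixed = TSpec(r=120.0, b=600.0, p=150.0, m=64, M=1500)

print(is_substitute(large, small), is_less_or_equal(small, large))
print(is_less_or_equal(mixed, large), is_less_or_equal(large, mixed))
```

   The ordering is only partial: in the example, `mixed` has a lower
   rate but a deeper bucket than `large`, so neither is less than or
   equal to the other.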

511	   A merged TSpec may be calculated over a set of TSpecs by taking (1)
512	   the largest token bucket rate, (2) the largest bucket size, (3) the
513	   largest peak rate, (4) the smallest minimum policed unit, and (5) the
514	   largest maximum datagram size across all members of the set.  This
515	   use of the word "merging" is similar to that in the RSVP protocol
516	   [10]; a merged TSpec is one which is adequate to describe the traffic
517	   from any one of the constituent TSpecs.

519	   A summed TSpec may be calculated over a set of TSpecs by computing
520	   (1) the sum of the token bucket rates, (2) the sum of the bucket
521	   sizes, (3) the sum of the peak rates, (4) the smallest minimum
522	   policed unit, and (5) the maximum packet size parameter.
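
   The merge and sum rules above might be sketched as follows.  The
   TSpec container and function names are hypothetical.  For the summed
   TSpec, item (5) is read here as the largest maximum packet size in
   the set; that reading is an assumption of this example.

```python
from typing import Iterable, NamedTuple


class TSpec(NamedTuple):
    r: float  # token rate
    b: float  # bucket depth
    p: float  # peak rate
    m: int    # minimum policed unit
    M: int    # maximum datagram size


def merge_tspecs(tspecs: Iterable[TSpec]) -> TSpec:
    """Merged TSpec: adequate for the traffic of any one member."""
    ts = list(tspecs)
    return TSpec(r=max(t.r for t in ts), b=max(t.b for t in ts),
                 p=max(t.p for t in ts), m=min(t.m for t in ts),
                 M=max(t.M for t in ts))


def sum_tspecs(tspecs: Iterable[TSpec]) -> TSpec:
    """Summed TSpec: describes the aggregate of all members."""
    ts = list(tspecs)
    return TSpec(r=sum(t.r for t in ts), b=sum(t.b for t in ts),
                 p=sum(t.p for t in ts), m=min(t.m for t in ts),
                 M=max(t.M for t in ts))  # assumption: largest M


flows = [TSpec(100.0, 400.0, 150.0, 128, 1000),
         TSpec(150.0, 300.0, 200.0, 64, 1500)]
print(merge_tspecs(flows))
print(sum_tspecs(flows))
```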

524	   The minimum of two TSpecs differs according to whether the TSpecs can
525	   be ordered.  If one TSpec is less than the other TSpec, the smaller
526	   TSpec is the minimum.  Otherwise, the minimum TSpec of two TSpecs is
527	   determined by comparing the respective values in the two TSpecs and
528	   choosing (1) the smaller token bucket rate, (2) the larger token
529	   bucket size, (3) the smaller peak rate, (4) the smaller minimum
530	   policed unit, and (5) the smaller maximum packet size.
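
   A minimal sketch of the minimum rule follows.  It applies the
   component choices exactly as listed above (including the larger
   token bucket size) when the two TSpecs cannot be ordered.  The
   container and function names are assumptions for illustration.

```python
from typing import NamedTuple


class TSpec(NamedTuple):
    r: float  # token rate
    b: float  # bucket depth
    p: float  # peak rate
    m: int    # minimum policed unit
    M: int    # maximum datagram size


def is_less_or_equal(a: TSpec, other: TSpec) -> bool:
    return (a.r <= other.r and a.b <= other.b and a.p <= other.p
            and a.m >= other.m and a.M <= other.M)


def min_tspec(x: TSpec, y: TSpec) -> TSpec:
    """Minimum of two TSpecs, following the rules in the text."""
    if is_less_or_equal(x, y):
        return x
    if is_less_or_equal(y, x):
        return y
    # The TSpecs cannot be ordered: choose component by component.
    return TSpec(r=min(x.r, y.r), b=max(x.b, y.b), p=min(x.p, y.p),
                 m=min(x.m, y.m), M=min(x.M, y.M))


x = TSpec(r=100.0, b=500.0, p=200.0, m=64, M=1500)
y = TSpec(r=150.0, b=400.0, p=180.0, m=128, M=1000)
print(min_tspec(x, y))
```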

532	   RSpecs are merged in a manner similar to TSpecs, i.e., a set of
533	   RSpecs is merged into a single RSpec by taking the largest rate R
534	   and the smallest slack S.  More precisely, RSpec A is a substitute
535	   for RSpec B if the value of reserved service rate, R, in RSpec A is
536	   greater than or equal to the value in RSpec B, and the value of the
537	   slack, S, in RSpec A is smaller than or equal to that in RSpec B.
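
   The RSpec rules can be sketched in the same style.  The RSpec
   container and function names below are hypothetical, and the slack
   is assumed to be expressed in seconds.

```python
from typing import Iterable, NamedTuple


class RSpec(NamedTuple):
    R: float  # reserved service rate
    S: float  # slack term


def rspec_is_substitute(a: RSpec, other: RSpec) -> bool:
    """True if `a` can stand in for `other`: rate no lower, slack no larger."""
    return a.R >= other.R and a.S <= other.S


def merge_rspecs(rspecs: Iterable[RSpec]) -> RSpec:
    """Merged RSpec: largest rate R and smallest slack S."""
    rs = list(rspecs)
    return RSpec(R=max(x.R for x in rs), S=min(x.S for x in rs))


requests = [RSpec(R=100_000.0, S=0.05), RSpec(R=80_000.0, S=0.02)]
merged = merge_rspecs(requests)
print(merged, all(rspec_is_substitute(merged, q) for q in requests))
```

   By construction, the merged RSpec is a substitute for every RSpec
   in the set.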

539	   Each network element receives a service request of the form (TSpec,
540	   RSpec), where the RSpec is of the form (Rin, Sin).  The network
541	   element processes this request and performs one of two actions:

543	    a. it accepts the request and returns a new RSpec of the form
544	       (Rout, Sout);
545	    b. it rejects the request.

547	   The processing rules for generating the new RSpec are governed by the
548	   delay constraint:

550	          Sout + b/Rout + Ctoti/Rout <= Sin + b/Rin + Ctoti/Rin,

552	   where Ctoti is the cumulative sum of the error terms, C, for all the
553	   network elements that are upstream of the current element, i.  In
554	   other words, this element consumes (Sin - Sout) of slack and can use
555	   it to reduce its reservation level, provided that the above
556	   inequality is satisfied.  Rin and Rout MUST also satisfy the
557	   constraint:

559	                             r <= Rout <= Rin.
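
   A network element could validate a candidate (Rout, Sout) against
   both constraints along the following lines.  This is a sketch under
   assumed units (bytes, bytes per second, seconds); the function and
   parameter names are not defined by this document.

```python
def rspec_update_allowed(b, r, c_tot_i, r_in, s_in, r_out, s_out):
    """Check a candidate (Rout, Sout) against the two constraints.

    b        -- token bucket depth from the TSpec
    r        -- token rate from the TSpec
    c_tot_i  -- cumulative C of the elements upstream of this element
    """
    if not (r <= r_out <= r_in):
        return False
    lhs = s_out + b / r_out + c_tot_i / r_out
    rhs = s_in + b / r_in + c_tot_i / r_in
    return lhs <= rhs


common = dict(b=10_000, r=50_000, c_tot_i=2_000, r_in=100_000, s_in=0.1)
print(rspec_update_allowed(r_out=80_000, s_out=0.05, **common))
print(rspec_update_allowed(r_out=80_000, s_out=0.10, **common))
print(rspec_update_allowed(r_out=40_000, s_out=0.00, **common))
```

   In the example, the second candidate lowers the rate without giving
   up any slack and therefore violates the delay constraint; the third
   falls below the token rate r.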

561	   When several RSpecs (Rj, Sj), j=1,2,..., are to be merged at a
562	   split point, the value of Rout is the maximum over all the rates Rj,
563	   and the value of Sout is the minimum over all the slack terms Sj.

565	      NOTE: The various TSpec functions described above are used by
566	      applications which desire to combine TSpecs.  It is important to
567	      observe, however, that the properties of the actual reservation
568	      are determined by combining the TSpec with the RSpec rate (R).

570	      Because the guaranteed reservation requires both the TSpec and the
571	      RSpec rate, there exist some difficult problems for shared
572	      reservations in RSVP, particularly where two or more source
573	      streams meet.  Upstream of the meeting point, it would be
574	      desirable to reduce the TSpec and RSpec to use only as much
575	      bandwidth and buffering as is required by the individual source's
576	      traffic.  (Indeed, it may be necessary if the sender is
577	      transmitting over a low bandwidth link).

579	      However, the RSpec's rate is set to achieve a particular delay
580	      bound (and is not just a function of the TSpec), so changing the
581	      RSpec may cause the reservation to fail to meet the receiver's
582	      delay requirements.  At the same time, not adjusting the RSpec
583	      rate means that "shared" RSVP reservations using guaranteed
584	      service will fail whenever the bandwidth available at a particular
585	      link is less than the receiver's requested rate R, even if the
586	      bandwidth is adequate to support the number of senders actually
587	      using the link.  At this time, this limitation is an open problem
588	      in using the guaranteed service with RSVP.

590	 Guidelines for Implementors

592	   This section discusses a number of important implementation issues in
593	   no particular order.

595	   It is important to note that individual subnetworks are network
596	   elements and both routers and subnetworks MUST support the guaranteed
597	   service model to achieve guaranteed service.  Since subnetworks
598	   typically are not capable of negotiating service using IP-based
599	   protocols, as part of providing guaranteed service, routers will have
600	   to act as proxies for the subnetworks they are attached to.

602	   In some cases, this proxy service will be easy.  For instance, on a
603	   leased line managed by a WFQ scheduler on the upstream node, the
604	   proxy need simply ensure that the sum of all the flows' RSpec rates
605	   does not exceed the bandwidth of the line, and needs to advertise the
606	   rate-based and non-rate-based delays of the link as the values of C
607	   and D.

609	   In other cases, this proxy service will be complex.  In an ATM
610	   network, for example, it may require establishing an ATM VC for the
611	   flow and computing the C and D terms for that VC.  Readers may
612	   observe that the token bucket and peak rate used by guaranteed
613	   service map directly to the Sustained Cell Rate, Burst Size, and Peak
614	   Cell Rate of ATM's Q.2931 QoS parameters for Variable Bit Rate
615	   traffic.

617	   The assurance that datagrams will not be lost is obtained by setting
618	   the router buffer space B to be equal to the token bucket b plus some
619	   error term (described below).

621	   Another issue related to subnetworks is that the TSpec's token bucket
622	   rates measure IP traffic and do not (and cannot) account for link
623	   level headers.  So the subnetwork network elements MUST adjust the
624	   rate and possibly the bucket size to account for adding link level
625	   headers.  Tunnels MUST also account for the additional IP headers
626	   that they add.

628	   For datagram networks, a maximum header rate can usually be computed
629	   by dividing the rate and bucket sizes by the minimum policed unit.
630	   For networks that do internal fragmentation, such as ATM, the
631	   computation may be more complex, since one MUST account for both
632	   per-fragment overhead and any wastage (padding bytes transmitted) due
633	   to mismatches between datagram sizes and fragment sizes.  For
634	   instance, a conservative estimate of the additional data rate imposed
635	   by ATM AAL5 plus ATM segmentation and reassembly is

637	                         ((r/48)*5)+((r/m)*(8+52))

639	   which represents the rate divided into 48-byte cells multiplied by
640	   the 5-byte ATM header, plus the maximum datagram rate (r/m)
641	   multiplied by the cost of the 8-byte AAL5 header plus the maximum
642	   space that can be wasted by ATM segmentation of a datagram (which is
643	   the 52 bytes wasted in a cell that contains one byte).  But this
644	   estimate is likely to be wildly high, especially if m is small, since
645	   ATM wastage is usually much less than 52 bytes.  (ATM implementors
646	   should be warned that the token bucket may also have to be scaled
647	   when setting the VC parameters for call setup and that this example
648	   does not account for overhead incurred by encapsulations such as
649	   those specified in RFC 1483).
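
   To show how strongly the conservative estimate depends on m, the
   sketch below evaluates the formula for two values of the minimum
   policed unit.  The function name is hypothetical, and r is assumed
   to be in bytes per second with m in bytes.

```python
def atm_aal5_overhead_rate(r, m):
    """Conservative extra data rate for ATM AAL5 plus segmentation.

    r -- token rate of the flow (bytes per second)
    m -- minimum policed unit (bytes)
    """
    cell_headers = (r / 48) * 5          # 5-byte header per 48-byte cell
    per_datagram = (r / m) * (8 + 52)    # AAL5 header plus worst-case waste
    return cell_headers + per_datagram


for m in (60, 480):
    extra = atm_aal5_overhead_rate(r=48_000, m=m)
    print(f"m={m}: about {extra:.0f} extra bytes per second")
```

   With the small m, the per-datagram term dominates and the estimate
   exceeds the flow's own rate, which illustrates why the text calls
   the estimate wildly high in that case.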

651	   To ensure no loss, network elements will have to allocate some
652	   buffering for bursts.  If every hop implemented the fluid model
653	   perfectly, this buffering would simply be b (the token bucket size).
654	   However, as noted in the discussion of reshaping earlier,
655	   implementations are approximations and we expect that traffic will
656	   become more bursty as it goes through the network.  As with
657	   reshaping, the amount of buffering required to handle the burstiness
658	   is bounded by b+Csum+Dsum*R.  If one accounts for the peak rate, this
659	   can be further reduced to

661	                  M + (b-M)(p-X)/(p-r) + (Csum/R + Dsum)X

663	   where X is set to r if (b-M)/(p-r) is less than Csum/R+Dsum and X is
664	   R if (b-M)/(p-r) is greater than or equal to Csum/R+Dsum and p>R;
665	   otherwise, X is set to p.  This reduction comes from the fact that
666	   the peak rate limits the rate at which the burst, b, can be placed in
667	   the network.  Conversely, if a non-zero slack term, Sout, is returned
668	   by the network element, the buffer requirements are increased by
669	   adding Sout to Dsum.
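
   The selection of X and the resulting buffer size can be sketched as
   follows.  The function names, the units (bytes, bytes per second,
   seconds), and the restriction to p > r are assumptions of this
   example.  The slack is handled by adding Sout to Dsum, as described
   above.

```python
def simple_buffer_bound(b, R, c_sum, d_sum):
    """Bound that ignores the peak rate: b + Csum + Dsum*R."""
    return b + c_sum + d_sum * R


def buffer_with_peak_rate(b, r, p, M, R, c_sum, d_sum, s_out=0.0):
    """Buffer bound that takes the peak rate p into account."""
    if p <= r:
        raise ValueError("this sketch assumes p > r")
    d_eff = d_sum + s_out            # a returned slack is added to Dsum
    t_burst = (b - M) / (p - r)
    t_err = c_sum / R + d_eff
    if t_burst < t_err:
        X = r
    elif p > R:
        X = R
    else:
        X = p
    return M + (b - M) * (p - X) / (p - r) + t_err * X


args = dict(b=10_000, r=50_000, p=200_000, M=1_000, R=100_000,
            c_sum=2_000, d_sum=0.01)
print(buffer_with_peak_rate(**args))
print(simple_buffer_bound(args["b"], args["R"], args["c_sum"], args["d_sum"]))
```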

671	   While sending applications are encouraged to set the peak rate
672	   parameter and reshaping points are required to conform to it, it is
673	   always acceptable to ignore the peak rate for the purposes of
674	   computing buffer requirements and end-to-end delays.  The result is
675	   simply an overestimate of the buffering and delay.  As noted above,
676	   if the peak rate is unknown (and thus potentially infinite), the
677	   buffering required is b+Csum+Dsum*R.  The end-to-end delay without
678	   the peak rate is b/R+Ctot/R+Dtot.

680	   The parameter D for each network element SHOULD be set to the maximum
681	   datagram transfer delay variation (independent of rate and bucket
682	   size) through the network element.  For instance, in a simple router,
683	   one might compute the difference between the worst case and best case
684	   times it takes for a datagram to get through the input interface to
685	   the processor, and add it to any variation that may occur in how long
686	   it would take to get from the processor to the outbound link
687	   scheduler (assuming the queueing schemes work correctly).

689	   For datagramized weighted fair queueing, D is set to the link MTU
690	   divided by the link bandwidth, to account for the possibility that a
691	   packet arrives just as a maximum-sized packet begins to be
692	   transmitted, and that the arriving packet should have departed before
693	   the maximum-sized packet.  For a frame-based, slotted system such as
694	   Stop and Go queueing, D is the maximum number of slots a datagram may
695	   have to wait before getting a chance to be transmitted.

697	   Note that multicasting may make determining D more difficult.  In
698	   many subnets, ATM being one example, the properties of the subnet may
699	   depend on the path taken from the multicast sender to the receiver.
700	   There are a number of possible approaches to this problem.  One is to
701	   choose a representative latency for the overall subnet and set D to
702	   the (non-negative) difference from that latency.  Another is to
703	   estimate subnet properties at exit points from the subnet, since the
704	   exit point presumably is best placed to compute the properties of its
705	   path from the source.

707	      NOTE: It is important to note that there is no fixed set of rules
708	      about how a subnet determines its properties, and each subnet
709	      technology will have to develop its own set of procedures to
710	      accurately compute C and D and slack values.

712	   D is intended to be distinct from the latency through the network
713	   element.  Latency is the minimum time through the device (the speed
714	   of light delay in a fiber or the absolute minimum time it would take
715	   to move a packet through a router), while parameter D is intended to
716	   bound the variability in non-rate-based delay.  In practice, this
717	   distinction is sometimes arbitrary (the latency may be minimal) -- in
718	   such cases it is perfectly reasonable to combine the latency with D
719	   and to advertise any latency as zero.

721	      NOTE: It is implicit in this scheme that to get a complete
722	      guarantee of the maximum delay a packet might experience, a user
723	      of this service will need to know both the queueing delay
724	      (provided by C and D) and the latency.  The latency is not
725	      advertised by this service but is a general characterization
726	      parameter (advertised as specified in [8]).

728	      However, even if latency is not advertised, this service can still
729	      be used.  The simplest approach is to measure the delay
730	      experienced by the first packet (or the minimum delay of the first
731	      few packets) received and treat this delay value as an upper bound
732	      on the latency.

734	   The parameter C is the data backlog resulting from the vagaries of
735	   how a specific implementation deviates from a strict bit-by-bit
736	   service. So, for instance, for datagramized weighted fair queueing, C
737	   is set to M to account for packetization effects.
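
   For datagramized weighted fair queueing, the two error terms
   described above could be computed as in this sketch.  The function
   name and the units (bytes, bytes per second, seconds) are
   assumptions for illustration.

```python
def wfq_error_terms(link_mtu, link_bandwidth, max_datagram_size):
    """Return (C, D) for a datagramized WFQ hop, per the text.

    C -- set to M, to account for packetization effects (bytes)
    D -- link MTU divided by link bandwidth (seconds)
    """
    C = max_datagram_size
    D = link_mtu / link_bandwidth
    return C, D


# Hypothetical 10 Mbit/s link (1,250,000 bytes/s) with a 1500-byte MTU.
C, D = wfq_error_terms(link_mtu=1_500, link_bandwidth=1_250_000,
                       max_datagram_size=1_500)
print(f"C = {C} bytes, D = {D * 1000:.2f} ms")
```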

739	   If a network element uses a certain amount of slack, Si, to reduce
740	   the amount of resources that it has reserved for a particular flow,
741	   i, the value Si SHOULD be stored at the network element.
742	   Subsequently, if reservation refreshes are received for flow i, the
743	   network element MUST use the same slack Si without any further
744	   computation. This guarantees consistency in the reservation process.

746	   As an example for the use of the slack term, consider the case where
747	   the required end-to-end delay, Dreq, is larger than the maximum delay
748	   of the fluid flow system. The latter is obtained by setting R=r in
749	   the fluid delay formula (for stability, R>=r must be true), and is
750	   given by

752	                           b/r + Ctot/r + Dtot.

754	   In this case the slack term is

756	                     S = Dreq - (b/r + Ctot/r + Dtot).
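
   The slack computation for this case can be sketched as below.  The
   function name and units (bytes, bytes per second, seconds) are
   assumptions.  The sketch covers only the case discussed here, where
   Dreq is at least the maximum fluid delay.

```python
def initial_slack(d_req, b, r, c_tot, d_tot):
    """Slack S = Dreq - (b/r + Ctot/r + Dtot), for a reservation at R = r."""
    max_fluid_delay = b / r + c_tot / r + d_tot
    if d_req < max_fluid_delay:
        # Outside the case discussed in the text: a rate R > r would be
        # needed to meet the requested delay.
        raise ValueError("requested delay is below the R = r delay bound")
    return d_req - max_fluid_delay


S = initial_slack(d_req=0.4, b=10_000, r=50_000, c_tot=2_000, d_tot=0.01)
print(f"slack: about {S * 1000:.0f} ms")
```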

758	   The slack term may be used by the network elements to adjust their
759	   local reservations, so that they can admit flows that would otherwise
760	   have been rejected. An intermediate network element that can
761	   internally differentiate between delay and rate guarantees can now
762	   take advantage of this information to lower the amount of resources
763	   allocated to this flow. For example, by taking an
764	   amount of slack s <= S, an RCSD scheduler [5] can increase the local
765	   delay bound, d, assigned to the flow, to d+s. Given an RSpec, (Rin,
766	   Sin), it would do so by setting Rout = Rin and Sout = Sin - s.

768	   Similarly, a network element using a WFQ scheduler can decrease its
769	   local reservation from Rin to Rout by using some of the slack in the
770	   RSpec. This can be accomplished by using the transformation rules
771	   given in the previous section, which ensure that the reduced
772	   reservation level will not increase the overall end-to-end delay.
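
   The two uses of slack described above can be contrasted in a short
   sketch.  The function names are hypothetical.  The WFQ case solves
   the delay constraint of the previous section for the smallest
   permitted Rout, which is one possible policy and not a procedure
   mandated by this document.

```python
def delay_scheduler_use_slack(r_in, s_in, s):
    """Delay-based scheduler: keep the rate, consume s of the slack."""
    if not 0 <= s <= s_in:
        raise ValueError("cannot consume more slack than is available")
    return r_in, s_in - s


def wfq_use_slack(b, r, c_tot_i, r_in, s_in, s):
    """Rate-based scheduler: consume s of the slack to lower the rate.

    From  Sout + (b+Ctoti)/Rout <= Sin + (b+Ctoti)/Rin  with
    Sout = Sin - s, it follows that
    Rout >= (b+Ctoti) / (s + (b+Ctoti)/Rin).
    """
    if not 0 <= s <= s_in:
        raise ValueError("cannot consume more slack than is available")
    burst = b + c_tot_i
    r_out = burst / (s + burst / r_in)
    r_out = max(r, min(r_out, r_in))     # enforce r <= Rout <= Rin
    return r_out, s_in - s


print(delay_scheduler_use_slack(r_in=100_000, s_in=0.1, s=0.03))
print(wfq_use_slack(b=10_000, r=50_000, c_tot_i=2_000,
                    r_in=100_000, s_in=0.1, s=0.03))
```

   Because the computed Rout sits exactly on the boundary of the
   inequality, an implementation using floating point would likely
   round the rate up slightly before installing it.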

774	Evaluation Criteria

776	   The scheduling algorithm and admission control algorithm of the
777	   element MUST ensure that the delay bounds are never violated and
778	   datagrams are not lost, when a source's traffic conforms to the
779	   TSpec.  Furthermore, the element MUST ensure that misbehaving flows
780	   do not affect the service given to other flows.  Vendors are
781	   encouraged to formally prove that their implementation is an
782	   approximation of the fluid model.

784	 Examples of Implementation

786	   Several algorithms and implementations exist that approximate the
787	   fluid model.  They include Weighted Fair Queueing (WFQ) [2], Jitter-
788	   EDD [4], Virtual Clock [3] and a scheme proposed by IBM [5].  A nice
789	   theoretical presentation that shows these schemes are part of a large
790	   class of algorithms can be found in [6].

792	 Examples of Use

794	   Consider an application that is intolerant of any lost or late
795	   datagrams.  It uses the advertised values Ctot and Dtot and the TSpec
796	   of the flow, to compute the resulting delay bound from a service
797	   request with rate R. Assuming R < p, it then sets its playback point
798	   to [(b-M)/R*(p-R)/(p-r)]+(M+Ctot)/R+Dtot.
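
   The playback point computation might look as follows.  The function
   names and units (bytes, bytes per second, seconds) are assumptions,
   and the sketch covers only the case r <= R < p considered above.
   The bound that ignores the peak rate is included for comparison.

```python
def delay_bound_with_peak(b, r, p, M, R, c_tot, d_tot):
    """Delay bound [(b-M)/R*(p-R)/(p-r)] + (M+Ctot)/R + Dtot."""
    if not (r <= R < p):
        raise ValueError("this sketch covers only r <= R < p")
    return ((b - M) / R) * ((p - R) / (p - r)) + (M + c_tot) / R + d_tot


def delay_bound_without_peak(b, R, c_tot, d_tot):
    """Delay bound b/R + Ctot/R + Dtot, ignoring the peak rate."""
    return b / R + c_tot / R + d_tot


with_peak = delay_bound_with_peak(b=10_000, r=50_000, p=200_000, M=1_000,
                                  R=100_000, c_tot=2_000, d_tot=0.01)
no_peak = delay_bound_without_peak(b=10_000, R=100_000,
                                   c_tot=2_000, d_tot=0.01)
print(f"playback point: about {with_peak * 1000:.0f} ms "
      f"(about {no_peak * 1000:.0f} ms if the peak rate is ignored)")
```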

800	Security Considerations

802	   This memo discusses how this service could be abused to permit denial
803	   of service attacks.  The service, as defined, does not allow denial
804	   of service (although service may degrade under certain
805	   circumstances).

807	Appendix 1: Use of the Guaranteed service with RSVP

809	   The use of guaranteed service in conjunction with the RSVP resource
810	   reservation setup protocol is specified in reference [9]. That
811	   document gives the format of RSVP FLOWSPEC, SENDER_TSPEC, and ADSPEC
812	   objects needed to support applications desiring guaranteed service
813	   and gives information about how RSVP processes those objects. The
814	   RSVP protocol itself is specified in reference [10].

816	References

818	   [1] S. Shenker and J. Wroclawski, "Network Element QoS Control
819	   Service Specification Template". Internet Draft, July 1996, <draft-
820	   ietf-intserv-svc-template-03.txt>

822	   [2] A. Demers, S. Keshav and S. Shenker, "Analysis and Simulation of
823	   a Fair Queueing Algorithm," in Internetworking: Research and
824	   Experience, Vol 1, No. 1., pp. 3-26.

826	   [3] L. Zhang, "Virtual Clock: A New Traffic Control Algorithm for
827	   Packet Switching Networks," in Proc. ACM SIGCOMM '90, pp. 19-29.

829	   [4] D. Verma, H. Zhang, and D. Ferrari, "Guaranteeing Delay Jitter
830	   Bounds in Packet Switching Networks," in Proc. Tricomm '91.

832	   [5] L. Georgiadis, R. Guerin, V. Peris, and K. N. Sivarajan,
833	   "Efficient Network QoS Provisioning Based on per Node Traffic
834	   Shaping," IBM Research Report No. RC-20064.

836	   [6] P. Goyal, S.S. Lam and H.M. Vin, "Determining End-to-End Delay
837	   Bounds in Heterogeneous Networks," in Proc. 5th Intl. Workshop on
838	   Network and Operating System Support for Digital Audio and Video,
839	   April 1995.

841	   [7] A.K.J. Parekh, A Generalized Processor Sharing Approach to Flow
842	   Control in Integrated Services Networks, MIT Laboratory for
843	   Information and Decision Systems, Report LIDS-TH-2089, February 1992.

845	   [8] S. Shenker and J. Wroclawski, "General Characterization
846	   Parameters for Integrated Service Network Elements", Internet Draft,
847	   July 1996, <draft-ietf-intserv-charac-02.txt>

849	   [9] J. Wroclawski, "Use of RSVP with IETF Integrated Services",
850	   Internet Draft, July 1996, <draft-ietf-intserv-rsvp-use-00.txt>

852	   [10] B. Braden, et al., "Resource Reservation Protocol (RSVP) -
853	   Version 1 Functional Specification", Internet Draft, July 1996,
854	   <draft-ietf-rsvp-spec-13.txt>

856	Authors' Addresses:

858	   Scott Shenker
859	   Xerox PARC
860	   3333 Coyote Hill Road
861	   Palo Alto, CA  94304-1314

863	   email: shenker@parc.xerox.com
864	   415-812-4840
865	   415-812-4471 (FAX)

867	   Craig Partridge
868	   BBN
869	   2370 Amherst St
870	   Palo Alto CA 94306

872	   email: craig@bbn.com

874	   Roch Guerin
875	   IBM T.J. Watson Research Center
876	   Yorktown Heights, NY 10598

878	   email: guerin@watson.ibm.com
879	   914-784-7038
880	   914-784-6318 (FAX)