idnits 2.17.1 

draft-ietf-intserv-guaranteed-svc-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-24) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2], [3], [4], [5], [6], [7],
     [TBA], [1]), which it shouldn't.  Please replace those with straight
     textual mentions of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 198 has weird spacing: '...55) and  a sig...'

  == Line 477 has weird spacing: '...   flow  and c...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (15 December 1995) is 10358 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'TBA' on line 634

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'


     Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                   Integrated Services WG
2	INTERNET-DRAFT                                   S. Shenker/C. Partridge
3	draft-ietf-intserv-guaranteed-svc-03.txt                       Xerox/BBN
4	                                                        15 December 1995
5	                                                        Expires: 5/15/96

7	             Specification of Guaranteed Quality of Service

9	Status of this Memo

11	   This document is an Internet-Draft.  Internet-Drafts are working
12	   documents of the Internet Engineering Task Force (IETF), its areas,
13	   and its working groups.  Note that other groups may also distribute
14	   working documents as Internet-Drafts.

16	   Internet-Drafts are draft documents valid for a maximum of six months
17	   and may be updated, replaced, or obsoleted by other documents at any
18	   time.  It is inappropriate to use Internet-Drafts as reference
19	   material or to cite them other than as ``work in progress.''

21	   To learn the current status of any Internet-Draft, please check the
22	   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
23	   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
24	   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
25	   ftp.isi.edu (US West Coast).

27	   This document is a product of the Integrated Services working group
28	   of the Internet Engineering Task Force.  Comments are solicited and
29	   should be addressed to the working group's mailing list at
30	   int-serv@isi.edu and/or the author(s).

32	   This draft reflects changes from the IETF meeting in Dallas.

34	Abstract

36	   This memo describes the network element behavior required to deliver
37	   guaranteed service in the Internet.  Guaranteed service provides firm
38	   (mathematically provable) bounds on end-to-end packet delays.  This
39	   specification follows the service specification template described in
40	   [1].

42	Introduction

44	   This document defines the requirements for network elements that
45	   support guaranteed service.  This memo is one of a series of
46	   documents that specify the network element behavior required to
47	   support various qualities of service in IP internetworks.  Services
48	   described in these documents are useful both in the global Internet
49	   and private IP networks.

51	   This document is based on the service specification template given in
52	   [1]. Please refer to that document for definitions and additional
53	   information about the specification of qualities of service within
54	   the IP protocol family.

56	End-to-End Behavior

58	   The end-to-end behavior provided by a series of service elements that
59	   conform to this document is an assured level of bandwidth that, when
60	   used by a policed flow, produces a delay bounded service with no
61	   queueing loss for all conforming datagrams (assuming no failure of
62	   network components or changes in routing during the life of the
63	   flow).

65	   The end-to-end behavior conforms to the fluid model (described below)
66	   in that the delivered delays do not exceed the fluid delays by more
67	   than the specified error bounds.  More precisely, the end-to-end
68	   delay bound is [(b-M)/R*(p-R)/(p-r)]+(M+Ctot)/R+Dtot for p>R, and
69	   (M+Ctot)/R+Dtot for p<=R, (where b, r, p, M, R, Ctot, and Dtot are
70	   defined later in this document).  Guaranteed service does not control
71	   the minimal delay of packets, merely the maximal delays.
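
As an illustration only (not part of the specification), the bound above
can be evaluated with a short Python sketch.  The function name, the
argument order, and the choice to pass Dtot in microseconds (the unit
this document uses for D) are assumptions made for the example; rates
are in bytes per second and sizes in bytes.  An infinite peak rate is
handled as the limiting case in which (p-R)/(p-r) tends to 1.

```python
import math


def end_to_end_delay_bound(b, r, p, M, R, Ctot, Dtot_us):
    """Return the end-to-end delay bound in seconds.

    b, M, Ctot are in bytes; r, p, R are in bytes per second.
    Dtot_us is in microseconds (an assumption of this sketch).
    """
    if R < r:
        raise ValueError("R must be greater than or equal to r")
    if p < r:
        raise ValueError("p must be greater than or equal to r")
    Dtot = Dtot_us / 1e6
    if p > R:
        # (p-R)/(p-r) tends to 1 as p grows without bound.
        ratio = 1.0 if math.isinf(p) else (p - R) / (p - r)
        return (b - M) / R * ratio + (M + Ctot) / R + Dtot
    # p <= R: the burst term vanishes.
    return (M + Ctot) / R + Dtot
```

For example, with b=10000, r=100000, p=200000, M=1000, R=150000,
Ctot=3000 and Dtot=2000 microseconds, the three terms are 0.03,
4000/150000 and 0.002 seconds.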

73	      NOTE: While the per-hop error terms needed to compute the end-to-
74	      end delays are exported by the service module (see Exported
75	      Information below), the mechanisms needed to collect per-hop
76	      bounds and make the end-to-end quantities Ctot and Dtot known to
77	      the applications are not described in this specification.  These
78	      functions, which can be provided by reservation setup protocols,
79	      routing protocols or by other network management functions, are
80	      outside the scope of this document.

82	   The end-to-end behavior (as characterized by Ctot and Dtot) provided
83	   along a path should be stable.  That is, it should not change as long
84	   as the end-to-end path does not change.

86	   This service is subject to admission control.

88	Motivation

90	   Guaranteed service guarantees that packets will arrive within the
91	   guaranteed delivery time and will not be discarded due to queue
92	   overflows, provided the flow traffic stays within its specified
93	   traffic parameters.  This service is intended for applications which
94	   need a firm guarantee that a packet will arrive no later than a
95	   certain time after it was transmitted by its source.  For example,
96	   some audio and video "play-back" applications are intolerant of any
97	   packet arriving after their play-back time.  Applications that have
98	   hard real-time requirements will also require guaranteed service.

100	   This service does not attempt to minimize the jitter (the difference
101	   between the minimal and maximal packet delays); it merely controls
102	   the maximal delay.  Because the guaranteed bound is a firm one, it
103	   must be large enough to cover extremely rare cases of long queueing
104	   delays.  Several studies have shown that the actual delay for the
105	   vast majority of packets can be far lower than the guaranteed delay.
106	   Therefore, authors of playback applications should note that packets
107	   will often arrive far earlier than the delivery deadline and will
108	   have to be buffered at the receiving system until it is time for the
109	   application to process them.

111	   This service represents one extreme end of delay control for
112	   networks.  Most other services providing delay control provide much
113	   weaker assurances about the resulting delays.  In order to provide
114	   this high level of assurance, guaranteed service is typically only
115	   useful if provided by every network element along the path.
116	   Moreover, as described in the Exported Information section, effective
117	   provision and use of the service requires that the set-up protocol
118	   used to request service provides service characterizations to
119	   intermediate routers and to the endpoints.

121	Network Element Data Handling Requirements

123	   The network element must ensure that the service approximates the
124	   "fluid model" of service.  The fluid model at service rate R is
125	   essentially the service that would be provided by a dedicated wire of
126	   bandwidth R between the source and receiver.  Thus, in the fluid
127	   model of service at a fixed rate R, the flow's service is completely
128	   independent of that of any other flow.

130	   The flow's level of service is characterized at each network element
131	   by a bandwidth (or service rate) R and a buffer size B.  R represents
132	   the share of the link's bandwidth the flow is entitled to and B
133	   represents the buffer space in the router that the flow may consume.
134	   The network element must ensure that its service matches the fluid
135	   model at that same rate to within a sharp error bound.

137	   The definition of guaranteed service relies on the result that the
138	   fluid delay of a flow obeying a token bucket (r,b) and being served
139	   by a line with bandwidth R is bounded by b/R as long as R is no less
140	   than r.  Guaranteed service with a service rate R, where now R is a
141	   share of bandwidth rather than the bandwidth of a dedicated line,
142	   approximates this behavior.

144	   More specifically, the network element must ensure that the delay of
145	   any packet be less than b/R+C/R+D, where C and D describe the maximal
146	   deviation away from the fluid model.  It is important to emphasize
147	   that C and D are maximums.  So, for instance, if an implementation
148	   has occasional gaps in service (perhaps due to processing routing
149	   updates), D needs to be large enough to account for the time a packet
150	   may lose during the gap in service.

152	   Links are not permitted to fragment packets as part of guaranteed
153	   service.  Packets larger than the MTU of the link must be treated as
154	   nonconformant, which means that they will be policed according to the
155	   rules described in the Policing section below.

157	Invocation Information

159	   Guaranteed service is invoked by specifying the traffic (TSpec) and
160	   the desired service (RSpec) to the network element.  A service
161	   request for an existing flow that has a new TSpec and/or RSpec should
162	   be treated as a new invocation, in the sense that admission control
163	   must be reapplied to the flow.  Flows that reduce their TSpec and/or
164	   their RSpec (i.e., their new TSpec/RSpec is strictly smaller than the
165	   old TSpec/RSpec according to the ordering rules described in the
166	   section on Ordering below) should never be denied service.

168	   The TSpec takes the form of a token bucket plus a peak rate (p), a
169	   minimum policed unit (m), and a maximum packet size (M).

171	   The token bucket has a bucket depth, b, and a bucket rate, r.  Both b
172	   and r must be positive.  The rate, r, is measured in bytes of IP
173	   datagrams per second, and can range from 1 byte per second to as
174	   large as 40 terabytes per second (or about what is believed to be the
175	   maximum theoretical bandwidth of a single strand of fiber).  Clearly,
176	   particularly for large bandwidths, only the first few digits are
177	   significant, and so the use of floating point representations
178	   accurate to at least 0.1% is encouraged.

180	   The bucket depth, b, is also measured in bytes and can range from 1
181	   byte to 250 gigabytes.  Again, floating point representations
182	   accurate to at least 0.1% are encouraged.

184	   The range of values is intentionally large to allow for future
185	   bandwidths.  The range is not intended to imply that a network
186	   element must support the entire range.

188	   The peak rate, p, is measured in bytes of IP datagrams per second and
189	   has the same range and suggested representation as the bucket rate.
190	   The peak rate is the maximum rate at which the source and any
191	   reshaping points (reshaping points are defined below) may inject
192	   bursts of traffic into the network.  More precisely, it is the
193	   requirement that for all time periods the amount of data sent cannot
194	   exceed M+pT where M is the maximum packet size and T is the length of
195	   the time period.  Furthermore, p must be greater than or equal to the
196	   token bucket rate, r.  If the peak rate is unknown or unspecified,
197	   then p is set to infinity, which in the IEEE floating point format
198	   corresponds to an exponent of all ones (255) and  a sign bit and
199	   mantissa of all zeroes.

201	   The minimum policed unit, m, is an integer measured in bytes.  All IP
202	   datagrams less than size m will be counted, when policed and tested
203	   for conformance to the TSpec, as being of size m.  The maximum packet
204	   size, M, is the biggest packet that will conform to the traffic
205	   specification; it is also measured in bytes.  The flow must be
206	   rejected if the requested maximum packet size is larger than the MTU
207	   of the link.  Both m and M must be positive, and m must be less than
208	   or equal to M.

210	   The RSpec is a rate R and a slack term S, where R must be greater
211	   than or equal to r and S must be nonnegative.  The RSpec rate can be
212	   bigger than the TSpec rate because higher rates will reduce queueing
213	   delay.  The slack term signifies the difference between the desired
214	   delay and the delay obtained by using a reservation level R.  This
215	   slack term can be utilized by the service element to reduce its
216	   resource reservation for this flow. When a service element chooses to
217	   utilize some of the slack in the RSpec, it must follow specific rules
218	   in updating the R and S fields of the RSpec; these rules are
219	   specified in the Ordering and Merging section.  If at the time of
220	   service invocation no slack is specified, the slack term, S, is set
221	   to zero.  No buffer specification is included in the RSpec because
222	   the service element is expected to derive the required buffer space
223	   to ensure no queueing loss from the token bucket and peak rate in the
224	   TSpec, the reserved rate and slack in the RSpec, combined with
225	   internal information about how the element manages its traffic.

227	   The TSpec can be represented by three floating point numbers in
228	   single-precision IEEE floating point format followed by two 32-bit
229	   integers in network byte order.  The first value is the rate (r), the
230	   second value is the bucket size (b), the third is the peak rate (p),
231	   the fourth is the minimum policed unit (m), and the fifth is the
232	   maximum packet size (M).
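
A minimal sketch of this layout, using Python's standard struct module,
is shown below.  The function names and the sanity checks are
assumptions made for illustration; only the field order, the field
types, and the network byte order come from the text above.

```python
import struct

# Three IEEE single-precision floats followed by two unsigned 32-bit
# integers, all in network (big-endian) byte order: r, b, p, m, M.
TSPEC_FORMAT = "!fffII"


def pack_tspec(r, b, p, m, M):
    """Encode a TSpec as 20 bytes."""
    if r <= 0 or b <= 0:
        raise ValueError("r and b must be positive")
    if p < r:
        raise ValueError("p must be greater than or equal to r")
    if not 0 < m <= M:
        raise ValueError("m and M must be positive, with m <= M")
    return struct.pack(TSPEC_FORMAT, r, b, p, m, M)


def unpack_tspec(data):
    """Decode 20 bytes back into the tuple (r, b, p, m, M)."""
    return struct.unpack(TSPEC_FORMAT, data)
```

An unknown peak rate can be passed as float("inf"); it is encoded with
an exponent field of all ones and a zero sign bit and mantissa.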

234	   The RSpec rate, R, and the slack term, S, can also be represented
235	   using single-precision IEEE floating point.

237	   For all IEEE floating point values, the sign bit must be zero. (All
238	   values must be positive).  Exponents less than 127 (i.e., 0) are
239	   prohibited.  Exponents greater than 162 (i.e., positive 35) are
240	   discouraged, except for specifying a peak rate of infinity.
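
These rules can be checked on the encoded bits.  The following sketch
is illustrative only: the function name and the three result strings
are assumptions, and the treatment of NaN (which the text above does
not address) as prohibited is a choice made for the example.

```python
import struct


def classify_float_field(value, allow_infinity=False):
    """Classify a value by its IEEE single-precision encoding.

    Returns "ok", "discouraged" or "prohibited".
    """
    bits = struct.unpack("!I", struct.pack("!f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent field
    mantissa = bits & 0x7FFFFF
    if sign != 0:
        return "prohibited"          # sign bit must be zero
    if exponent == 255 and mantissa != 0:
        return "prohibited"          # NaN: assumption of this sketch
    if exponent < 127:
        return "prohibited"          # unbiased exponent below 0
    if exponent > 162:
        if exponent == 255 and allow_infinity:
            return "ok"              # infinite peak rate
        return "discouraged"         # unbiased exponent above 35
    return "ok"
```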

242	Exported Information

244	   Each guaranteed service module must export at least the following
245	   information.  All of the parameters described below are
246	   characterization parameters.

248	   A network element's implementation of guaranteed service is
249	   characterized by two error terms, C and D, which represent how the
250	   element's implementation of the guaranteed service deviates from the
251	   fluid model.  These two parameters have an additive composition rule.

253	   If the composition function is applied along the entire path to
254	   compute the end-to-end sums of C and D (Ctot and Dtot) and the
255	   resulting values are then provided to the end nodes (presumably by
256	   the setup protocol), the end nodes can compute the maximal packet
257	   delays.  Moreover, if the partial sums (Csum and Dsum) from the most
258	   recent reshaping point (reshaping points are defined below)
259	   downstream towards receivers are handed to each network element then
260	   these network elements can compute the buffer allocations necessary
261	   to achieve no packet loss, as detailed in the section Guidelines for
262	   Implementors.  The proper use and provision of this service requires
263	   that the quantities Ctot and Dtot, and the quantities Csum and Dsum
264	   be computed.  Therefore, we assume that usage of guaranteed service
265	   will be primarily in contexts where these quantities are made
266	   available to end nodes and network elements.

268	   The error term C is measured in units of bytes.  An individual
269	   element can advertise a C value between 1 and 2**28 (a little over
270	   250 megabytes) and the total added over all elements can range as
271	   high as (2**32)-1.  Should the sum of the different elements' error
272	   terms exceed (2**32)-1, the end-to-end error term should be (2**32)-1.

274	   The error term D is measured in units of one microsecond.  An
275	   individual element can advertise a delay value between 1 and 2**28
276	   (about four and a half minutes) and the total delay added over all
277	   elements can range as high as (2**32)-1.  Should the sum of the
278	   different elements' delays exceed (2**32)-1, the end-to-end delay
279	   should be (2**32)-1.
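
The additive composition rule with its upper limit can be sketched as
follows.  The function and constant names are assumptions made for
illustration; the same routine applies to C (in bytes) and to D (in
microseconds).

```python
MAX_PER_ELEMENT = 2 ** 28        # largest value one element may advertise
MAX_COMPOSED = 2 ** 32 - 1       # ceiling for the composed sum


def compose_error_terms(terms):
    """Add per-element error terms, saturating at (2**32)-1."""
    total = 0
    for term in terms:
        if not 1 <= term <= MAX_PER_ELEMENT:
            raise ValueError("per-element term out of range")
        total = min(total + term, MAX_COMPOSED)
    return total
```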

281	   The guaranteed service is service_name 2.

283	   Error characterization parameter C is numbered 1 and parameter D is
284	   numbered 2.

286	   The end-to-end composed value (Ctot) for C is numbered 3 and the
287	   end-to-end composed value for D (Dtot) is numbered 4.

289	   The since-last-reshaping point composed value (Csum) for C is
290	   numbered 5 and the since-last-reshaping point composed value for D
291	   (Dsum) is numbered 6.

293	   No other exported data is required by this specification.

295	Policing

297	   Policing is done at the edge of the network, at all heterogeneous
298	   source branch points and at all source merge points.  A heterogeneous
299	   source branch point is a spot where the multicast distribution tree
300	   from a source branches to multiple distinct paths, and the TSpec's of
301	   the reservations on the various outgoing links are not all the same.
302	   Policing need only be done if the TSpec on the outgoing link is "less
303	   than" (in the sense described in the Ordering section) the TSpec
304	   reserved on the immediately upstream link.  A source merge point is
305	   where the multicast distribution trees from two different sources
306	   (sharing the same reservation) merge.  It is the responsibility of
307	   the invoker of the service (a setup protocol, local configuration
308	   tool, or similar mechanism) to identify points where policing is
309	   required.  Policing may be done at other points as well as those
310	   described above.

312	   The token bucket and peak rate parameters require that traffic must
313	   obey the rule that over all time periods, the amount of data sent
314	   cannot exceed M+min[pT, rT+b-M], where r and b are the token bucket
315	   parameters, M is the maximum packet size, and T is the length of the
316	   time period (note that when p is infinite this reduces to the
317	   standard token bucket requirement).  For the purposes of this
318	   accounting, links must count packets which are smaller than the
319	   minimal policing unit to be of size m.  Packets which arrive at an
320	   element and cause a violation of the M+min[pT, rT+b-M] bound are
321	   considered non-conformant.  Policing to conformance with this token
322	   bucket is done in two different ways.
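
One common way to test this bound packet by packet is a pair of token
buckets: one of rate r and depth b, and one of rate p and depth M,
since M+min[pT, rT+b-M] equals min[M+pT, rT+b].  The sketch below is
illustrative rather than normative.  The class and method names are
assumptions, as is the choice to start both buckets full and to leave
the buckets untouched when a packet is found non-conformant.

```python
import math


class Policer:
    """Dual token bucket conformance test for a TSpec (r, b, p, m, M)."""

    def __init__(self, r, b, p, m, M):
        self.r, self.b, self.p, self.m, self.M = r, b, p, m, M
        self.tokens = float(b)        # (r, b) bucket, starts full
        self.peak_tokens = float(M)   # (p, M) bucket, starts full
        self.last = 0.0               # time of the previous packet

    def conforms(self, now, size):
        """Return True if a packet of `size` bytes at time `now` conforms."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.b, self.tokens + self.r * elapsed)
        if math.isinf(self.p):
            # Infinite peak rate: the peak bucket never constrains.
            self.peak_tokens = float(self.M)
        else:
            self.peak_tokens = min(self.M, self.peak_tokens + self.p * elapsed)
        charged = max(size, self.m)   # small packets count as size m
        if size > self.M or charged > self.tokens or charged > self.peak_tokens:
            return False
        self.tokens -= charged
        self.peak_tokens -= charged
        return True
```

What is done with a packet for which conforms() returns False depends
on where the policer sits in the network.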

324	   At the edge of the network, non-conforming packets are treated as
325	   best-effort datagrams.  [If and when a marking ability becomes
326	   available, these non-conformant packets should be ''marked'' as being
327	   non-compliant and then treated as best effort packets at all
328	   subsequent routers.]

330	      NOTE: There may be situations outside the scope of this document,
331	      such as when a service module's implementation of guaranteed
332	      service is being used to implement traffic sharing rather than a
333	      quality of service, where the desired action is to discard non-
334	      conforming packets.  To allow for such uses, implementors should
335	      ensure that the action to be taken for non-conforming packets is
336	      configurable.

338	   Inside the network, this approach does not produce the desired
339	   results, because queueing effects will occasionally cause a flow's
340	   traffic that entered the network as conformant to be no longer
341	   conformant at some downstream network element.  Therefore, inside the
342	   network, service elements must reshape traffic before applying the
343	   token bucket test.  Reshaping entails delaying packets until they are
344	   within conformance of the TSpec.

346	   Reshaping is done by combining a buffer with a token bucket and peak
347	   rate regulator and buffering data until it can be sent in conformance
348	   with the token bucket and peak rate parameters.  (The token bucket
349	   regulator should start with its token bucket full of tokens).  Under
350	   guaranteed service, the amount of buffering required to reshape any
351	   conforming traffic back to its original token bucket shape is
352	   b+Csum+(Dsum*r), where Csum and Dsum are the sums of the parameters C
353	   and D between the last reshaping point and the current reshaping
354	   point.  Note that the above buffer requirement is an upper bound that
355	   can be significantly reduced if the cumulative latency [7] from the
356	   last reshaping point is known. More precisely, in the above formula
357	   Dsum can be replaced by Dsum - (cumulative latency). In addition, the
358	   knowledge of the peak rate at the reshapers can also be used to
359	   further reduce the buffer requirements.  A network element must
360	   provide the necessary buffers to ensure that conforming traffic is
361	   not lost at the reshaper.
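
The buffer formula above, including the optional reduction by the
cumulative latency, can be sketched as follows.  The function and
parameter names are assumptions made for illustration, and the sketch
takes Dsum and the cumulative latency in microseconds, matching the
unit used for D in this document.

```python
def reshaping_buffer_bytes(b, r, Csum, Dsum_us, cumulative_latency_us=0):
    """Return the reshaping buffer size in bytes: b + Csum + Dsum*r.

    b and Csum are in bytes and r is in bytes per second.  If the
    cumulative latency since the last reshaping point is known, it is
    subtracted from Dsum before multiplying by r.
    """
    effective_us = Dsum_us - cumulative_latency_us
    if effective_us < 0:
        raise ValueError("cumulative latency cannot exceed Dsum")
    return b + Csum + (effective_us / 1e6) * r
```

For instance, b=10000, Csum=3000, r=100000 and Dsum=20000 microseconds
give 10000+3000+2000 = 15000 bytes.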

363	   If a datagram arrives to discover the reshaping buffer is full, then
364	   the datagram is non-conforming.  Observe this means that a reshaper
365	   is effectively policing too.  As with a policer, the reshaper should
366	   relegate non-conforming datagrams to best effort.  [If marking is
367	   available, the non-conforming datagrams should be marked.]

369	      NOTE: As with policers, it should be possible to configure how
370	      reshapers handle non-conforming packets.

372	   Note that while the large buffer makes it appear that reshapers add
373	   considerable delay, this is not the case.  Given a valid TSpec that
374	   accurately describes the traffic, reshaping will cause little extra
375	   delay at the reshaping point.  However, if the TSpec is smaller than
376	   the actual traffic, reshaping will cause a large queue to develop at
377	   the reshaping point, which both causes substantial additional delays
378	   and forces some datagrams to be treated as non-conforming.  This
379	   scenario makes an unpleasant denial of service attack possible, in
380	   which a receiver who is successfully receiving a flow's traffic via
381	   best effort service is pre-empted by a new receiver who requests a
382	   reservation for the flow, but with an inadequate TSpec and RSpec.
383	   The flow's traffic will now be policed and possibly reshaped.  If the
384	   policing function was chosen to discard datagrams, the best-effort
385	   receiver would stop receiving traffic.  For this reason, in the
386	   normal case, policers are simply to mark packets as best effort.
387	   While this protects against denial of service, it is still true that
388	   the bad TSpec may cause queueing delays to increase.

390	      NOTE: To minimize problems of reordering datagrams, reshaping
391	      points may wish to forward a best-effort datagram from the front
392	      of the reshaping queue when a new datagram arrives and the
393	      reshaping buffer is full.

395	      Readers should also observe that reclassifying datagrams as best
396	      effort also makes support for elastic flows easier.  They can
397	      reserve a modest token bucket and when their traffic exceeds the
398	      token bucket, the excess traffic will be sent best effort.

400	   A related issue is that at all network elements, packets bigger than
401	   the MTU of the network element must be considered non-conformant and
402	   should be classified as best effort (and will then either be
403	   fragmented or dropped according to the element's handling of best
404	   effort traffic).  [Again, if marking is available, these reclassified
405	   packets should be marked.]

407	Ordering and Merging

409	   TSpec's are ordered according to the following rule: TSpec A is a
410	   substitute ("as good or better than") for TSpec B if (1) both the
411	   token rate r and bucket depth b for TSpec A are greater than or equal
412	   to those of TSpec B, (2) the peak rate p is at least as large in
413	   TSpec A as it is in TSpec B, (3) the minimum policed unit m is at
414	   least as small for TSpec A as it is for TSpec B, and (4) the maximum
415	   packet size M is at least as large for TSpec A as it is for TSpec B.

417	   A merged TSpec may be calculated over a set of TSpecs by taking the
418	   largest token bucket rate, largest bucket size, largest peak rate,
419	   smallest minimal policed unit, and largest maximum packet size across
420	   all members of the set.  This use of the word "merging" is similar to
421	   that in the RSVP protocol; a merged TSpec is one which is adequate to
422	   describe the traffic from any one of a number of flows.
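
The ordering and merging rules for TSpecs can be sketched as below.
The TSpec class and the function names are assumptions made for
illustration and are not defined by this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TSpec:
    r: float   # token bucket rate
    b: float   # bucket depth
    p: float   # peak rate
    m: int     # minimum policed unit
    M: int     # maximum packet size


def is_substitute(a, other):
    """True if TSpec a is "as good or better than" TSpec other."""
    return (a.r >= other.r and a.b >= other.b and a.p >= other.p
            and a.m <= other.m and a.M >= other.M)


def merge_tspecs(specs):
    """Merge a non-empty collection of TSpecs into one."""
    specs = list(specs)
    return TSpec(r=max(s.r for s in specs),
                 b=max(s.b for s in specs),
                 p=max(s.p for s in specs),
                 m=min(s.m for s in specs),
                 M=max(s.M for s in specs))
```

By construction, the merged TSpec is a substitute for every member of
the set it was computed from.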

424	   RSpecs are merged in a manner similar to TSpecs, i.e., a set of
425	   RSpecs is merged into a single RSpec by taking the largest rate R,
426	   and the smallest slack S.  More precisely, RSpec A is a substitute
427	   for RSpec B if the value of reserved service rate, R, in RSpec A is
428	   greater than or equal to the value in RSpec B, and the value of the
429	   slack, S, in RSpec A is smaller than or equal to that in RSpec B.

431	   Each network element receives a service request of the form (TSpec,
432	   RSpec), where the RSpec is of the form (Rin, Sin).  The network
433	   element processes this request and performs one of two actions:

435	    a. it accepts the request and returns a new Rspec of the form
436	       (Rout, Sout);
437	    b. it rejects the request.

439	   The processing rules for generating the new RSpec are governed by the
440	   delay constraint:

442	          Sout + b/Rout + Ctoti/Rout <= Sin + b/Rin + Ctoti/Rin,

444	   where Ctoti is the cumulative sum of the error terms, C, for all the
445	   network elements that are upstream of the current element, i.  In
446	   other words, this element consumes (Sin - Sout) of slack and can use
447	   it to reduce its reservation level, provided that the above
448	   inequality is satisfied.  Rin and Rout must also satisfy the
449	   constraint:

451	                             r <= Rout <= Rin.

453	   When several RSpec's, each with rate Rj, j=1,2..., are to be merged
454	   at a split point, the value of Rout is the maximum over all the rates
455	   Rj, and the value of Sout is the minimum over all the slack terms Sj.
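
The two constraints on (Rout, Sout) and the merging rule can be
checked with a sketch such as the following.  The function names are
assumptions, as is the choice to express the slack terms in seconds so
that they can be added directly to b/R and Ctoti/R (bytes divided by
bytes per second).

```python
def valid_rspec_update(b, r, Ctoti, Rin, Sin, Rout, Sout):
    """True if (Rout, Sout) is an allowed replacement for (Rin, Sin)."""
    if not r <= Rout <= Rin:
        return False
    if Sout < 0:
        return False   # slack must be nonnegative
    lhs = Sout + b / Rout + Ctoti / Rout
    rhs = Sin + b / Rin + Ctoti / Rin
    return lhs <= rhs


def merge_rspecs(rspecs):
    """Merge (R, S) pairs: largest rate, smallest slack."""
    rspecs = list(rspecs)
    return (max(R for R, _ in rspecs), min(S for _, S in rspecs))
```

For example, with b=10000, r=100000, Ctoti=2000 and an incoming RSpec
of (200000, 0.05), an element may return (150000, 0.02), but not
(100000, 0.0), because the latter would exceed the incoming delay
budget of 0.11 seconds.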

457	Guidelines for Implementors

459	   This section discusses a number of important implementation issues in
460	   no particular order.

462	   It is important to note that individual subnetworks are service
463	   elements and both routers and subnetworks must support the guaranteed
464	   service model to achieve guaranteed service.  Since subnetworks
465	   typically are not capable of negotiating service using IP-based
466	   protocols, as part of providing guaranteed service, routers will have
467	   to act as proxies for the subnetworks they are attached to.

469	   In some cases, this proxy service will be easy.  For instance, on
470	   a leased line, the proxy need simply ensure that the sum of all the
471	   flows' RSpec rates does not exceed the bandwidth of the line, and
472	   needs to advertise the serialization and transmission delays of the
473	   link as the values of C and D.

475	   In other cases, this proxy service will be complex.  In an ATM
476	   network, for example, it may require establishing an ATM VC for the
477	   flow  and computing the C and D terms for that VC.  Readers may
478	   observe that the token bucket and peak rate used by guaranteed
479	   service map directly to the Sustained Cell Rate, Burst Size, and Peak
480	   Cell Rate of ATM's Q.2931 QoS parameters for Variable Bit Rate
481	   traffic.

483	   The assurance that packets will not be lost is obtained by setting
484	   the router buffer space B to be equal to the token bucket b plus some
485	   error term (described below).

487	   Another issue related to subnetworks is that the TSpec's token bucket
488	   rates measure IP traffic and do not (and cannot) account for link
489	   level headers.  So the subnetwork service elements must adjust the
490	   rate and possibly the bucket size to account for adding link level
491	   headers.  Tunnels must also account for the additional IP headers
492	   that they add.

494	   For packet networks, a maximum header rate can usually be computed by
495	   dividing the rate and bucket sizes by the minimum policed unit.  For
496	   networks that do internal fragmentation, such as ATM, the computation
497	   may be more complex, since one must account for both per-fragment
498	   overhead and any wastage (padding bytes transmitted) due to
499	   mismatches between packet sizes and fragment sizes.  For instance, a
500	   conservative estimate of the additional data rate imposed by ATM AAL5
501	   plus ATM segmentation and reassembly is

503	                         ((r/48)*5)+((r/m)*(8+52))

505	   which represents the rate divided into 48-byte cells multiplied by
506	   the 5-byte ATM header, plus the maximum packet rate (r/m) multiplied
507	   by the cost of the 8-byte AAL5 header plus the maximum space that can
508	   be wasted by ATM segmentation of a packet (which is the 52 bytes
509	   wasted in a cell that contains one byte).  But this estimate is
510	   likely to be wildly high, especially if m is small, since ATM wastage
511	   is usually much less than 52 bytes.  (ATM implementors should be
512	   warned that the token bucket may also have to be scaled when setting
513	   the VC parameters for call setup and that this example does not
514	   account for overhead incurred by encapsulations such as those
515	   specified in RFC 1483.)
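
   The conservative ATM estimate above can be evaluated directly.  The
   following sketch simply transcribes the formula; the function name
   is an assumption, r is the token bucket rate in bytes per second,
   and m is the minimum policed unit in bytes.

```python
def atm_aal5_overhead_rate(r, m):
    """Conservative extra data rate (bytes/s) for AAL5 plus ATM SAR."""
    if r < 0 or m <= 0:
        raise ValueError("need r >= 0 and m > 0")
    # Rate divided into 48-byte cells, times the 5-byte ATM header.
    cell_header_rate = (r / 48) * 5
    # Maximum packet rate (r/m), times the 8-byte AAL5 header plus the
    # worst-case 52 bytes wasted by segmentation of one packet.
    per_packet_rate = (r / m) * (8 + 52)
    return cell_header_rate + per_packet_rate
```

   With r = 96000 and m = 120, the estimate is 10000 + 48000 = 58000
   bytes per second of overhead, which shows how strongly a small m
   inflates the result.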

517	   To ensure no loss, service elements will have to allocate some
518	   buffering for bursts.  If every hop implemented the fluid model
519	   perfectly, this buffering would simply be b (the token bucket size).
520	   However, as noted in the discussion of reshaping earlier,
521	   implementations are approximations and we expect that traffic will
522	   become more bursty as it goes through the network.  As with
523	   shaping, however, the amount of buffering required to handle the
524	   burstiness is bounded by b+Csum+Dsum*R.  If one accounts for the peak
525	   rate, this can be further reduced to

527	                  M + (b-M)(p-X)/(p-r) + (Csum/R + Dsum)X

529	   where X is set to r if (b-M)/(p-r) is less than Csum/R+Dsum and X is
530	   R if (b-M)/(p-r) is greater than or equal to Csum/R+Dsum and p>R;
531	   otherwise, X is set to p.  This reduction comes from the fact that
532	   the peak rate limits the rate at which the burst, b, can be placed in
533	   the network. As before, the buffer requirements can be lowered by
534	   subtracting from Dsum the propagation delay since the last reshaping
535	   point, if it is known. Conversely, if a non-zero slack term, Sout, is
536	   returned by the network element, the buffer requirements are
537	   increased by adding Sout to Dsum.
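
   The selection of X and the resulting buffer bound can be sketched as
   follows.  This is an illustration under stated assumptions: the
   function and argument names are hypothetical, sizes are in bytes,
   rates in bytes per second, Dsum in seconds, and the sketch requires
   p > r so that the division is defined.

```python
def buffer_bound_with_peak(b, M, p, r, R, csum, dsum):
    """Buffer bound M + (b-M)(p-X)/(p-r) + (Csum/R + Dsum)X."""
    if not (p > r > 0 and R >= r):
        raise ValueError("sketch assumes p > r > 0 and R >= r")
    burst_time = (b - M) / (p - r)
    delay_term = csum / R + dsum
    if burst_time < delay_term:
        x = r
    elif p > R:
        # burst_time >= delay_term and p > R
        x = R
    else:
        x = p
    return M + (b - M) * (p - x) / (p - r) + delay_term * x
```

   For b = 10000, M = 1000, p = 500000, r = 100000, R = 200000,
   Csum = 2000 and Dsum = 0.01, X is R and the bound is about 11750
   bytes, compared with 14000 bytes from b+Csum+Dsum*R.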

539	   While sending applications are encouraged to set the peak rate
540	   parameter and reshaping points are required to conform to it, it is
541	   always acceptable to ignore the peak rate for the purposes of
542	   computing buffer requirements and end-to-end delays.  The result is
543	   simply an overestimate of the buffering and delay.  As noted above,
544	   if the peak rate is unknown (and thus potentially infinite), the
545	   buffering required is b+Csum+Dsum*R.  The end-to-end delay without
546	   the peak rate is b/R+Ctot/R+Dtot.
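
   The two bounds that ignore the peak rate are simple enough to state
   as code.  The names below are assumptions chosen for the example;
   sizes are in bytes, rates in bytes per second, and delays in
   seconds.

```python
def buffer_bound_no_peak(b, R, csum, dsum):
    """Buffering required when the peak rate is ignored."""
    return b + csum + dsum * R


def delay_bound_no_peak(b, R, ctot, dtot):
    """End-to-end delay bound when the peak rate is ignored."""
    if R <= 0:
        raise ValueError("R must be positive")
    return b / R + ctot / R + dtot
```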

548	   The parameter D at each service element should be set to the maximum
549	   packet transfer delay (independent of bucket size) through the
550	   service element.  For instance, in a simple router, one might compute
551	   the worst case amount of time it may take for a datagram to get
552	   through the input interface to the processor, and how long it would
553	   take to get from the processor to the outbound interface (assuming
554	   the queueing schemes work correctly).  For an Ethernet, it might
555	   represent the worst case delay if the maximum number of collisions is
556	   experienced.

558	   The parameter C is the data backlog resulting from the vagaries of
559	   how a specific implementation deviates from a strict bit-by-bit
560	   service. So, for instance, for packetized weighted fair queueing, C
561	   is set to M.

563	   If a network element uses a certain amount of slack, Si, to reduce
564	   the amount of resources that it has reserved for a particular flow,
565	   i, the value Si should be stored at the network element.
566	   Subsequently, if reservation refreshes are received for flow i, the
567	   network element must use the same slack Si without any further
568	   computation. This guarantees consistency in the reservation process.
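
   One way to obtain this consistency is to record the slack when the
   reservation is first installed and to return the recorded value on
   every refresh.  The class below is a hypothetical sketch; the class
   name, method names, and the use of an opaque flow identifier as the
   key are assumptions, not part of this specification.

```python
class SlackTable:
    """Remembers the slack Si consumed for each flow i."""

    def __init__(self):
        self._slack = {}

    def slack_for(self, flow_id, compute):
        """Return the stored Si, computing it only on first use."""
        if flow_id not in self._slack:
            self._slack[flow_id] = compute()
        # Refreshes reuse the stored value without recomputation.
        return self._slack[flow_id]

    def release(self, flow_id):
        """Forget the slack when the reservation is torn down."""
        self._slack.pop(flow_id, None)
```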

570	   As an example of the use of the slack term, consider the case where
571	   the required end-to-end delay, Dreq, is larger than the maximum delay
572	   of the fluid flow system. The latter is obtained by setting R=r in
573	   the fluid delay formula, and is given by

575	                           b/r + Ctot/r + Dtot.

577	   In this case the slack term is

579	                     S = Dreq - (b/r + Ctot/r + Dtot).

581	   The slack term may be used by the network elements to adjust their
582	   local reservations, so that they can admit flows that would otherwise
583	   have been rejected. A service element at an intermediate network
584	   element that can internally differentiate between delay and rate
585	   guarantees can now take advantage of this information to lower the
586	   amount of resources allocated to this flow. For example, by taking an
587	   amount of slack s <= S, an RCSD scheduler [5] can increase the local
588	   delay bound, d, assigned to the flow, to d+s. Given an RSpec, (Rin,
589	   Sin), it would do so by setting Rout = Rin and Sout = Sin - s.
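
   The slack computation and the RCSD-style adjustment above can be
   sketched together.  The function names are hypothetical, delays are
   assumed to be in seconds, and the sketch assumes there is no slack
   when Dreq is smaller than the maximum fluid delay.

```python
def end_to_end_slack(d_req, b, r, ctot, dtot):
    """S = Dreq - (b/r + Ctot/r + Dtot), floored at zero."""
    if r <= 0:
        raise ValueError("r must be positive")
    fluid_max_delay = b / r + ctot / r + dtot
    # Assumption: a negative difference means there is no slack.
    return max(0.0, d_req - fluid_max_delay)


def take_slack_keep_rate(r_in, s_in, d_local, s):
    """Trade s <= Sin of slack for a larger local delay bound.

    Returns (Rout, Sout, new local delay bound) with Rout = Rin,
    Sout = Sin - s, and the local bound raised from d to d + s.
    """
    if not 0 <= s <= s_in:
        raise ValueError("need 0 <= s <= Sin")
    return r_in, s_in - s, d_local + s
```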

591	   Similarly, a network element using a WFQ scheduler can decrease its
592	   local reservation from Rin to Rout by using some of the slack in the
593	   RSpec. This can be accomplished by using the transformation rules
594	   given in the previous section, that ensure that the reduced
595	   reservation level will not increase the overall end-to-end delay.

597	Evaluation Criteria

599	   The scheduling algorithm and admission control algorithm of the
600	   element must ensure that the delay bounds are never violated.
601	   Furthermore, the element must ensure that misbehaving flows do not
602	   affect the service given to other flows.  Vendors are encouraged to
603	   formally prove that their implementation is an approximation of the
604	   fluid model.

606	Examples of Implementation

608	   Several algorithms and implementations exist that approximate the
609	   fluid model.  They include Weighted Fair Queueing (WFQ) [2], Jitter-
610	   EDD [4], Virtual Clock [3] and a scheme proposed by IBM [5].  A nice
611	   theoretical presentation that shows these schemes are part of a large
612	   class of algorithms can be found in [6].

614	Examples of Use

616	   Consider an application that is intolerant of any lost or late
617	   packets.  It uses the advertised values Ctot and Dtot and the TSpec
618	   of the flow, to compute the resulting delay bound from a service
619	   request with rate R. Assuming R < p, it then sets its playback point
620	   to [(b-M)/R*(p-R)/(p-r)]+(M+Ctot)/R+Dtot.
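
   The playback point computation above can be written out as a short
   function.  This is a sketch only; the function name is an
   assumption, sizes are in bytes, rates in bytes per second, and Dtot
   in seconds.  It covers only the case r <= R < p described in the
   text.

```python
def playback_point(b, M, p, r, R, ctot, dtot):
    """Delay bound [(b-M)/R*(p-R)/(p-r)] + (M+Ctot)/R + Dtot."""
    if not (0 < r <= R < p):
        raise ValueError("sketch covers only 0 < r <= R < p")
    burst_term = (b - M) / R * (p - R) / (p - r)
    return burst_term + (M + ctot) / R + dtot
```

   For b = 10000, M = 1000, p = 500000, r = 100000, R = 200000,
   Ctot = 2000 and Dtot = 0.01, the playback point is about 0.05875
   seconds.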

622	Security Considerations

624	   This memo discusses how this service could be abused to permit denial
625	   of service attacks.  The service, as defined, does not allow denial
626	   of service (although service may degrade under certain
627	   circumstances).

629	Acknowledgements

631	   The authors would like to gratefully acknowledge the help of the
632	   INT-SERV working group.  We would also like to expressly acknowledge
633	   the help of several people who helped us ensure the mathematics of
634	   this document were correct [TBA].

636	References

638	   [1] S. Shenker and J. Wroclawski, "Network Element Service
639	   Specification Template", Internet Draft, June 1995, <draft-ietf-
640	   intserv-svc-template-01.txt>

642	   [2] A. Demers, S. Keshav and S. Shenker, "Analysis and Simulation of
643	   a Fair Queueing Algorithm," in Internetworking: Research and
644	   Experience, Vol 1, No. 1., pp. 3-26.

646	   [3] L. Zhang, "Virtual Clock: A New Traffic Control Algorithm for
647	   Packet Switching Networks," in Proc. ACM SIGCOMM '90, pp. 19-29.

649	   [4] D. Verma, H. Zhang, and D. Ferrari, "Guaranteeing Delay Jitter
650	   Bounds in Packet Switching Networks," in Proc. Tricomm '91.

652	   [5] L. Georgiadis, R. Guerin, V. Peris, and K. N. Sivarajan,
653	   "Efficient Network QoS Provisioning Based on per Node Traffic
654	   Shaping," IBM Research Report No. RC-20064.

656	   [6] P. Goyal, S.S. Lam and H.M. Vin, "Determining End-to-End Delay
657	   Bounds in Heterogeneous Networks," in Proc. 5th Intl. Workshop on
658	   Network and Operating System Support for Digital Audio and Video,
659	   April 1995.

661	   [7] S. Shenker, "Specification of General Characterization
662	   Parameters", Internet Draft, November 1995, <draft-ietf-intserv-
663	   charac-01.txt>

665	Authors' Address:

667	   Scott Shenker
668	   Xerox PARC
669	   3333 Coyote Hill Road
670	   Palo Alto, CA  94304-1314
671	   shenker@parc.xerox.com
672	   415-812-4840
673	   415-812-4471 (FAX)

675	   Craig Partridge
676	   BBN
677	   2370 Amherst St
678	   Palo Alto CA 94306
679	   craig@bbn.com