idnits 2.17.1 draft-ietf-diffserv-arch-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. 
== No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 9 longer pages, the longest (page 1) being 59 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of lines with control characters in the document. ** The abstract seems to contain references ([DSFWK], [DSFIELD], [Baker]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. == There is 10 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 146 has weird spacing: '...ervices a pa...' == Line 258 has weird spacing: '...reement a se...' == Line 967 has weird spacing: '...ppendix inclu...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 1998) is 9478 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'Clark97' is defined on line 1290, but no explicit reference was found in the text == Unused Reference: 'Ellesson' is defined on line 1294, but no explicit reference was found in the text == Unused Reference: 'Ferguson' is defined on line 1303, but no explicit reference was found in the text == Unused Reference: 'Heinanen' is defined on line 1310, but no explicit reference was found in the text == Unused Reference: 'IntServ' is defined on line 1314, but no explicit reference was found in the text == Unused Reference: 'RSVP' is defined on line 1336, but no explicit reference was found in the text == Unused Reference: 'SIMA' is defined on line 1340, but no explicit reference was found in the text == Unused Reference: '2BIT' is defined on line 1344, but no explicit reference was found in the text == Unused Reference: 'Weiss' is defined on line 1356, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'AH' -- Possible downref: Non-RFC (?) normative reference: ref. 'ATM' -- Possible downref: Non-RFC (?) normative reference: ref. 'Baker' -- Possible downref: Non-RFC (?) normative reference: ref. 'DSFIELD' -- Possible downref: Non-RFC (?) normative reference: ref. 'DSFWK' -- Possible downref: Non-RFC (?) normative reference: ref. 'Clark97' -- Possible downref: Non-RFC (?) normative reference: ref. 'Ellesson' -- Possible downref: Non-RFC (?) normative reference: ref. 'ESP' -- Possible downref: Non-RFC (?) normative reference: ref. 'Ferguson' -- Possible downref: Non-RFC (?) normative reference: ref. 'FRELAY' -- Possible downref: Non-RFC (?) normative reference: ref. 'Heinanen' ** Downref: Normative reference to an Informational RFC: RFC 1633 (ref. 
'IntServ') -- Possible downref: Non-RFC (?) normative reference: ref. 'MPLSFWK' -- Possible downref: Non-RFC (?) normative reference: ref. 'PASTE' ** Obsolete normative reference: RFC 1349 (Obsoleted by RFC 2474) -- Possible downref: Non-RFC (?) normative reference: ref. 'SIMA' -- Possible downref: Non-RFC (?) normative reference: ref. '2BIT' -- Possible downref: Non-RFC (?) normative reference: ref. 'TR' -- Possible downref: Non-RFC (?) normative reference: ref. 'Weiss' Summary: 13 errors (**), 0 flaws (~~), 15 warnings (==), 19 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT David Black 3 Diffserv Working Group The Open Group 4 Expires: November 1998 Steven Blake 5 IBM Corporation 6 Mark Carlson 7 Redcape Software 8 Elwyn Davies 9 Nortel UK 10 Zheng Wang 11 Bell Labs Lucent Technologies 12 Walter Weiss 13 Lucent Technologies 15 May 1998 17 An Architecture for Differentiated Services 19 21 Status of This Memo 23 This document is an Internet-Draft. Internet-Drafts are working 24 documents of the Internet Engineering Task Force (IETF), its areas, 25 and its working groups. Note that other groups may also distribute 26 working documents as Internet-Drafts. 28 Internet-Drafts are draft documents valid for a maximum of six months 29 and may be updated, replaced, or obsoleted by other documents at any 30 time. It is inappropriate to use Internet-Drafts as reference 31 material or to cite them other than as "work in progress." 33 To view the entire list of current Internet-Drafts, please check the 34 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 35 Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern 36 Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific 37 Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast). 
39 Abstract 41 This document defines an architecture for implementing scalable 42 service differentiation in the Internet. This architecture achieves 43 scalability by aggregating traffic classification state which is 44 conveyed by means of IP-layer packet marking using the DS field 45 [DSFIELD]. Packets are classified and marked to receive a particular 46 per-hop forwarding behavior on routers along their path. 47 Sophisticated classification, policing, and shaping operations need 48 only be implemented at network boundaries or hosts. Network 49 resources are allocated to traffic streams by service provisioning 50 policies which govern how traffic is conditioned upon entry to a 51 differentiated services-capable network, and how that traffic is 53 Black, et. al. Expires: November 1998 [Page 1] 54 forwarded within that network. A wide variety of services can be 55 implemented on top of these building blocks. 57 This document should be read along with its companion documents, the 58 differentiated services framework [DSFWK], the definition of the 59 DS field [DSFIELD], and other documents which specify per-hop 60 behaviors, such as [Baker]. 62 1. Introduction 64 1.1 Overview 66 This document defines an architecture for implementing scalable 67 service differentiation in the Internet. "Service" is taken to 68 signify some significant characteristics of packet transmission 69 across a set of one or more paths within a network. These 70 characteristics may be specified in quantitative or statistical terms 71 of throughput, delay, jitter, and/or loss, or may otherwise be 72 specified in terms of some relative priority of access to network 73 resources. Service differentiation is desired to accommodate 74 heterogeneous application requirements and user expectations, and to 75 permit differentiated pricing of Internet service.
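As a purely illustrative aside (not part of this specification): assuming, per the companion [DSFIELD] draft, that the DS codepoint occupies the six most significant bits of the IPv4 TOS octet or IPv6 Traffic Class octet, marking and inspecting the DS field might be sketched as follows. Python is used only for illustration; none of the names are normative.

```python
# Hypothetical sketch: DS codepoint assumed to occupy the upper six
# bits of the IPv4 TOS / IPv6 Traffic Class octet, per [DSFIELD].

def set_codepoint(tos_octet: int, dscp: int) -> int:
    """Overwrite the DS codepoint, preserving the low-order two bits."""
    return ((dscp & 0x3F) << 2) | (tos_octet & 0x03)

def get_codepoint(tos_octet: int) -> int:
    """Extract the six-bit DS codepoint from the octet."""
    return tos_octet >> 2

# Example: mark an octet with codepoint 0b101110.
octet = set_codepoint(0x00, 0b101110)
assert get_codepoint(octet) == 0b101110
```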
77 This architecture is composed of a number of functional elements 78 implemented in network nodes, including a small set of well-defined 79 per-hop forwarding behaviors, and traffic conditioning functions 80 including classification, metering, marking, shaping, and policing. 81 This architecture achieves scalability by implementing complex 82 conditioning functions only at network edge nodes, and by applying 83 per-hop behaviors to aggregates of traffic which have been 84 appropriately marked using the DS field in the IPv4 or IPv6 headers 85 [DSFIELD]. Per-hop behaviors are defined to permit a reasonably 86 granular means of allocating buffer and bandwidth resources among 87 competing traffic streams. Per-application flow or per-customer 88 forwarding state need not be maintained within the core of the 89 network. Service provisioning and traffic conditioning policies are 90 sufficiently decoupled from the forwarding behaviors within the 91 network interior to permit a wide variety of service behaviors to be 92 implemented, with room for future expansion. 94 Section 1.2 is a glossary of terms used within this document. 95 Section 1.3 lists requirements for this architecture, and Section 1.4 96 provides a brief comparison to other approaches for service 97 differentiation. Section 2 discusses the components of the 98 architecture in detail. Section 3 proposes requirements for per-hop 99 behavior specifications. Section 4 discusses interoperability issues 100 with networks which do not implement differentiated services as 101 defined in this document and [DSFIELD]. Section 5 discusses issues 102 with multicast traffic (this section is currently left for future 103 study). Section 6 addresses security and tunnel considerations.
106 This document should be read along with its companion documents, the 107 differentiated services framework [DSFWK], the definition of the DS 108 field [DSFIELD], and other documents which specify per-hop behaviors, 109 such as [Baker]. It has been heavily influenced by the thoughtful 110 proposals of previous authors [Clark97, Ellesson, Ferguson, Heinanen, 111 SIMA, 2BIT, Weiss]. 113 1.2 Terminology 115 This section gives a general conceptual overview of the terms used 116 in this document. Some of these terms are more precisely defined in 117 later sections of this document. The choice of terms and definitions 118 was influenced by [MPLSFWK]. 120 Behavior Aggregate (BA) a DS behavior aggregate. 122 BA classifier a classifier that selects packets based 123 only on the contents of the DS field. Such 124 classifiers are used in DS interior nodes, 125 and are typically used for policing at a DS 126 ingress node. 128 Boundary a link connecting the edge nodes of two 129 domains. 131 Classifier a logical element of traffic conditioning 132 that selects packets based on the content 133 of packet headers according to defined 134 rules. 136 Customer DS domain a DS domain that has an SLA in place with 137 another directly attached DS domain (the 138 provider DS domain) governing the rules by 139 which traffic from the customer DS domain 140 will be serviced within the provider DS 141 domain. A single DS domain may be both a 142 customer DS domain and a provider DS domain 143 for different directions of traffic at the 144 same time. 146 Differentiated Services a paradigm for providing quality-of-service 147 (DS) (QoS) in the Internet by employing a small, 148 well-defined set of building blocks from 149 which a variety of services may be built. 151 DS behavior aggregate a stream of packets that have the same DS 152 codepoint. 154 DS field the IPv4 TOS octet or IPv6 Traffic Class 155 octet when interpreted according to 156 [DSFIELD].
159 DS capable able to support differentiated services 160 functions and behaviors as defined in 161 [DSFIELD], this document, and other 162 documents. 164 DS codepoint a specific bit-pattern of the DS field. 166 DS edge node a DS node that connects one DS domain to a 167 node either in another DS domain or in a 168 domain that is not DS capable. 170 DS egress node a DS edge node in its role in handling 171 traffic as it leaves a DS domain. 173 DS destination host a DS host that acts as a DS egress node. 175 DS domain a contiguous set of nodes which operate 176 with a common set of service provisioning 177 policies and PHB definitions. 179 DS host a host computer that can perform certain 180 traffic conditioning functions and 181 therefore acts as a special DS edge node. 183 DS ingress node a DS edge node in its role in handling 184 traffic as it enters a DS domain. 186 DS interior node a DS node that is not a DS edge node. 188 DS node a DS capable node. 190 DS region a set of contiguous DS domains which can 191 offer differentiated services over paths 192 across those DS domains. 194 DS source host a DS host that acts as a DS ingress node. 196 Legacy node a node which implements IPv4 Precedence as 197 defined in [RFC791] but which is otherwise 198 not DS capable. 200 Marker a logical element of traffic conditioning 201 that sets the DS codepoint in the DS field 202 based on defined rules. 204 MF Classifier a classifier which selects packets based on 205 the content of some arbitrary number of 206 header fields; typically some combination 207 of source address, destination address, 208 protocol ID, source port and destination 209 port. 212 Mechanism a specific algorithm or operation (e.g., 213 queueing discipline) that is implemented in 214 a node to realize a set of one or more per- 215 hop behaviors.
217 Meter a logical element of traffic conditioning 218 that measures the properties (e.g., rate) 219 of a packet stream selected by a 220 classifier. 222 Microflow a single instance of an application-to- 223 application flow of packets which is 224 identified by source address, source port, 225 destination address, destination port and 226 protocol ID. 228 Per-Hop-Behavior (PHB) the externally observable forwarding 229 behavior applied at a DS capable node to a 230 DS behavior aggregate. 232 PHB group a set of one or more PHBs that can only be 233 meaningfully specified and implemented 234 simultaneously, due to a common constraint 235 applying to all PHBs in the set such as a 236 packet scheduling or discard policy. 238 Policing the process of applying traffic 239 conditioning functions such as marking or 240 discarding to a traffic stream in 241 accordance with the state of a 242 corresponding meter. 244 Provider DS domain a DS domain that has an SLA in place with 245 another directly attached DS domain (the 246 customer DS domain) governing the rules by 247 which traffic from the customer DS domain 248 will be serviced within the provider DS 249 domain. A single DS domain may be both a 250 customer DS domain and a provider DS domain 251 for different directions of traffic at the 252 same time. 254 Service the overall treatment of a defined subset 255 of a customer's traffic within a DS domain 256 or end-to-end. 258 Service Level Agreement a service contract between a customer and a 259 (SLA) service provider that specifies the details 260 of a TCA and the corresponding service 261 behavior a customer should receive. A 262 customer may be a user organization or 263 another DS domain. 266 Service Provisioning a policy which defines how traffic 267 Policy conditioners are configured on DS edge 268 nodes and how traffic streams are mapped to 269 DS behavior aggregates to achieve a range 270 of service behaviors.
272 Shaper a logical element of traffic conditioning 273 that delays packets within a traffic stream 274 to cause it to conform to some defined 275 traffic properties. 277 Traffic conditioner an entity that performs traffic 278 conditioning and which may contain 279 classifiers, markers, meters, and shapers. 281 Traffic conditioning control functions performed to enforce 282 rules specified in a TCA and to prepare 283 traffic for differentiated services, 284 including classifying, metering, marking, 285 policing, and shaping. 287 Traffic Conditioning an agreement specifying classifier rules 288 Agreement (TCA) and the corresponding traffic profiles and 289 metering, marking, policing and/or shaping 290 rules which are to apply to the traffic 291 streams selected by the classifier. 293 Traffic profile a description of the expected properties 294 of a traffic stream such as rate and burst 295 size. 297 Traffic stream an administratively significant set of one 298 or more microflows which traverse a path 299 segment. A traffic stream may consist of 300 the set of active microflows which are 301 selected by a particular classifier. 303 1.3 Requirements 305 The history of the Internet has been one of continuous growth in the number 306 of hosts, the number and variety of applications, and the capacity of 307 the network infrastructure, and this growth is expected to continue 308 for the foreseeable future. A scalable architecture for service 309 differentiation must be able to accommodate this continued growth. 311 The following requirements were identified and are addressed in this 312 architecture: 314 o must accommodate a wide variety of service behaviors and 315 provisioning policies, extending end-to-end or within a particular 316 (set of) network(s),
319 o must allow decoupling of the service behavior from the particular 320 application in use, 322 o must work with existing applications (assuming suitable deployment 323 of traffic conditioners), 325 o must decouple traffic conditioning and service provisioning 326 functions from forwarding behaviors implemented within the core 327 network routers, 329 o must not depend on hop-by-hop application signaling, 331 o must require only a small set of forwarding behaviors whose 332 implementation complexity does not dominate the cost of a network 333 device, and which will not introduce bottlenecks for future high- 334 speed system implementations, 336 o must avoid per-microflow or per-customer state within core network 337 routers, 339 o must utilize only aggregated classification state within the 340 network core, 342 o must permit simple packet classification implementations in core 343 network routers (BA classifier), 345 o must permit reasonable interoperability with non-compliant network 346 nodes, 348 o must accommodate incremental deployment. 350 1.4 Comparisons with Other Approaches 352 The differentiated services architecture specified in this document 353 can be contrasted with other existing models of traffic management 354 and service differentiation. We classify these alternative models 355 into the following categories: relative priority, virtual circuit, 356 Integrated Services/RSVP, and service marking. 358 Implementations of the relative priority model include IPv4 359 Precedence marking as defined in [RFC791], 802.5 Token Ring priority 360 [TR], and 802.1p priority [802.1p]. In this model the application, 361 host, or proxy node selects a relative priority or "precedence" for a 362 packet (e.g., delay or discard priority), and the network nodes along 363 the transit path apply the appropriate priority forwarding behavior 364 corresponding to the priority value within the packet's header.
Our 365 architecture can be considered as a refinement to this model, since 366 we more clearly specify the role and importance of edge nodes and 367 traffic conditioners, and since our per-hop behavior model permits 368 more general forwarding behaviors than relative delay or discard 369 priority. 372 Implementations of the virtual circuit model include Frame Relay, 373 ATM, and MPLS [FRELAY, ATM, PASTE]. In this model path forwarding 374 state and traffic management or QoS state are established for traffic 375 streams on each hop along a path. Traffic aggregates of varying 376 granularity are associated with a virtual circuit, and packets/cells 377 within each virtual circuit are marked with a forwarding label that 378 is used to look up the next hop, the per-hop forwarding behavior, and 379 the replacement label at each hop. This model permits finer 380 granularity resource allocation to traffic streams, but the amount 381 of forwarding state scales linearly with the number of edges of the 382 network in the best case (assuming multipoint-to-point virtual 383 circuits), and it scales with the square of the number of edges in 384 the worst case, when edge-to-edge traffic streams with provisioned 385 resources are employed. 387 The Integrated Services/RSVP model relies upon traditional datagram 388 forwarding in the default case, but allows sources and receivers to 389 exchange signaling messages which establish classification and 390 forwarding state on each node along the path between them [IntServ, 391 RSVP]. In the absence of state aggregation, the amount of state on 392 each node scales in proportion to the ratio of the link rate to the 393 average reservation size (in bps), multiplied by some fraction of the 394 link rate which is "reservable". This model also requires 395 application support for the RSVP signaling protocol. 397 An example of a service marking model is IPv4 TOS as defined in 398 [RFC1349].
In this example each packet is marked with a request for 399 a "type of service", which may include "minimize delay", "maximize 400 throughput", "maximize reliability", or "minimize cost". Network 401 nodes may select routing paths or forwarding behaviors which are 402 suitably provisioned to satisfy the service request. This model is 403 subtly different from our architecture. The defined TOS markings are 404 very generic and do not span the range of possible service semantics. 405 Furthermore, the service request is associated with each individual 406 packet, whereas some service semantics may depend on the aggregate 407 forwarding behavior of a sequence of packets. The service marking 408 model does not easily accommodate growth in the number and range of 409 future services, and involves configuration of the "TOS->forwarding 410 behavior" association in each core network router. 412 2. Differentiated Services Architectural Model 414 The differentiated services architecture is based on a simple model 415 where traffic entering a network is conditioned at the edges of the 416 network, and assigned to different behavior aggregates. Each 417 behavior aggregate is identified with a single DS codepoint. Within 418 the core of the network, packets are forwarded according to the per- 419 hop behavior associated with the DS codepoint. In this section, we 420 discuss the key components in a differentiated services region, 421 traffic conditioning functions, and how differentiated services are 422 achieved through the combination of traffic conditioning and PHB- 425 based forwarding. 427 2.1 Differentiated Services Regions 429 A differentiated services region (DS Region) is a set of contiguous 430 DS domains, where each DS domain consists of a set of edge nodes and 431 interior nodes.
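The core forwarding rule just stated, in which the per-hop behavior is selected from the DS codepoint alone, can be sketched as a simple table lookup. This is an illustrative sketch only; the codepoints, the PHB names, and the default fallback are assumptions of the sketch, not of this architecture.

```python
# Hypothetical PHB table for a DS interior node: the DS codepoint is
# the only lookup key, and unknown codepoints receive a default
# behavior. Codepoint/PHB pairings below are assumed for illustration.
DEFAULT_PHB = "default"

phb_table = {
    0b101110: "low-delay",   # assumed pairing, not normative
    0b001010: "assured",     # assumed pairing, not normative
}

def select_phb(dscp: int) -> str:
    """BA classification: choose a forwarding behavior from the
    codepoint alone, with no per-flow or per-customer state."""
    return phb_table.get(dscp, DEFAULT_PHB)

assert select_phb(0b101110) == "low-delay"
assert select_phb(0b111111) == DEFAULT_PHB
```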
433 2.1.1 DS Domain 435 A DS domain is a contiguous set of DS nodes which operate with a 436 common service provisioning policy and set of PHB group definitions. 437 A DS domain has a well-defined boundary consisting of DS edge nodes 438 which condition ingress traffic and ensure that packets which transit 439 the domain are only marked using one of the PHB groups supported in 440 the domain. All nodes inside the DS domain select the forwarding 441 behavior for packets based solely on the DS codepoint as defined for 442 the PHB groups supported in the domain. Inclusion of non-DS capable 443 nodes within a DS domain may result in unpredictable performance and 444 may impede the ability to satisfy SLAs. 446 A DS domain normally consists of one or more networks under the same 447 administration, for example, an organization's intranet or an ISP. 448 Multiple DS domains may be inter-connected through mutual agreements 449 to form a DS region. DS domains in a DS region may implement 450 different PHB groups. However, to permit services which span across 451 the domains, the peering DS domains must each establish a peering SLA 452 which includes a Traffic Conditioning Agreement (TCA) which specifies 453 how transit traffic from one DS domain to another DS domain is 454 conditioned at the boundary of the two DS domains. 456 It is possible that several DS domains within a DS region may adopt a 457 common service provisioning policy and PHB group definitions, thus 458 eliminating the need for traffic conditioning between those DS 459 domains. In such cases, those DS domains are effectively under a 460 single administration and may be considered as a single DS domain. 462 The administration of the domain is responsible for ensuring that 463 adequate resources are provisioned and/or reserved to support the 464 SLAs offered by the domain. 466 2.1.2 DS Edge Nodes and Interior Nodes 468 A DS domain consists of DS edge nodes and DS interior nodes. 
While 469 DS edge nodes connect the DS domain to other DS or non-DS domains, DS 470 interior nodes only connect to other DS interior or edge nodes within 471 the DS domain. 473 Both DS edge nodes and interior nodes must be able to forward packets 474 based on the DS codepoint as defined by the PHB groups supported in 475 the domain; otherwise unpredictable behavior may result. In addition, 478 DS edge nodes must be able to perform traffic conditioning functions 479 as described by the TCA between their DS domain and the peering 480 domain which they connect to. 482 Interior nodes may be able to perform limited traffic conditioning 483 functions such as DS codepoint mutation. 485 A host within a DS domain may act as a DS edge node for traffic to 486 and from applications running on that host. If a host is embedded in 487 a DS domain and does not act as an edge node, then the host's first- 488 hop router acts as the DS edge node for the host's traffic. 490 2.1.3 DS Ingress Node and Egress Node 492 DS edge nodes may act as both a DS ingress node and a DS egress 493 node. Traffic enters a DS domain at a DS ingress node and leaves a 494 DS domain at a DS egress node. A DS ingress node is responsible for 495 ensuring that the traffic entering the DS domain conforms to the TCA 496 between it and the other domain which the ingress node is connected 497 to. A DS egress node may perform traffic conditioning functions on 498 traffic forwarded to the peering domain, depending on the details of 499 the TCA between two domains. 501 2.2 Traffic Conditioning 503 Traffic conditioning functions are performed by DS edge nodes in a DS 504 domain to ensure that the traffic entering a DS domain conforms to 505 the rules specified in the TCA, in accordance with the domain's 506 service provisioning policy, and to prepare the traffic for the PHB- 507 based forwarding treatment in the interior routers.
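As an illustrative aside, conformance to a TCA profile is commonly checked with a token-bucket meter of the kind used in the profile example of Sec. 2.2.2.1. The following is a hypothetical sketch, with rate r in tokens per second and burst size b in packets (rather than bytes) for brevity.

```python
class TokenBucketMeter:
    """Hypothetical token-bucket meter: rate r (tokens/s), burst b."""

    def __init__(self, r: float, b: float):
        self.r, self.b = r, b
        self.tokens = b           # the bucket starts full
        self.last = 0.0           # arrival time of the previous packet

    def in_profile(self, now: float) -> bool:
        """Return True if a one-token packet arriving at `now` conforms."""
        self.tokens = min(self.b, self.tokens + (now - self.last) * self.r)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False              # out-of-profile: mark, shape, or drop

meter = TokenBucketMeter(r=10.0, b=2.0)
assert meter.in_profile(0.0) and meter.in_profile(0.0)  # burst of 2 conforms
assert not meter.in_profile(0.0)                        # third back-to-back packet exceeds b
```

A real meter would measure bytes against the TCA's profile; this sketch only shows the accounting structure.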
509 2.2.1 General Architecture of Traffic Conditioners 511 A traffic conditioner may contain the following elements: classifier, 512 meter, marker, and shaper. The classifier and the meter select the 513 packets within a traffic stream and measure the stream against a 514 traffic profile. The marker and shaper perform control actions on 515 the packets depending on whether the traffic stream is within its 516 associated profile. 518 A packet stream normally passes to a classifier first, and the 519 matched packets are measured by a meter against the profile as 520 defined in the TCA. The packets within the profile may leave the 521 traffic conditioner or may be marked by the marker. The packets that 522 are out-of-profile may be either marked or shaped according to the 523 rules specified in the TCA. Note that discard policing can be 524 performed by a specially configured shaper (see Sec. 2.2.3.4). When 525 packets leave the traffic conditioner of a DS ingress node, the DS 526 field of each packet must be set to one of the DS codepoints defined by 527 the PHB groups supported in the DS domain. 529 Fig. 1 shows the block diagram of a traffic conditioner. Note that a 530 traffic conditioner may not necessarily contain all four elements. 531 For example, packets may pass from the classifier directly to the 532 marker or shaper (null meter). 534 +-------+ 535 -->| |----> 536 +-------+ +-------+ / +-------+ 537 | | | |/ marker 538 packets -----> | |------>| |--------------------> 539 | | | |\ 540 +-------+ +-------+ \ +-------+ 541 classifier meter -->| |----> 542 +-------+ 543 shaper 545 Fig. 1: Logical View of a Traffic Conditioner 547 2.2.2 Traffic Conditioning Agreement (TCA) 549 Differentiated services are extended across a DS domain boundary by 550 establishing an SLA between the customer and provider DS domains. The 551 SLA includes a traffic conditioning agreement which usually specifies 552 traffic profiles and actions applied to in-profile and out-of-profile 553 packets.
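The logical flow of Fig. 1 (classifier, then meter, then marker or shaper) might be sketched as follows. This is an illustrative composition only; the packet representation, field names, codepoints, and the stand-in meter are assumptions of the sketch.

```python
# Hypothetical sketch of the Fig. 1 pipeline. Field names and
# codepoints are illustrative assumptions, not normative values.

SELECT = 0b101110   # codepoint this BA classifier selects (assumed)
DEMOTE = 0b000000   # codepoint for out-of-profile packets (assumed)

def meter_in_profile(pkt: dict) -> bool:
    """Stand-in meter; a real meter would measure the stream's rate
    and burst against the profile specified in the TCA."""
    return pkt.get("length", 0) <= 1500

def condition(pkt: dict) -> dict:
    """Classifier -> meter -> marker. Marking is the out-of-profile
    action chosen here; shaping and dropping are the other actions
    named in the text."""
    if pkt["dscp"] == SELECT:           # BA classifier: select by DS field
        if not meter_in_profile(pkt):   # meter: check against profile
            pkt["dscp"] = DEMOTE        # marker: demote out-of-profile
    return pkt

assert condition({"dscp": SELECT, "length": 9000})["dscp"] == DEMOTE
assert condition({"dscp": SELECT, "length": 100})["dscp"] == SELECT
```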
555 2.2.2.1 Traffic Profiles 557 A traffic profile specifies rules for classifying and measuring a 558 traffic stream. It identifies which packets are eligible and gives rules 559 for determining whether a particular packet is in-profile or out-of- 560 profile. For example, a profile based on a token bucket may look like: 562 codepoint=X, use token-bucket r, b 564 The above profile indicates that all packets in the behavior 565 aggregate with DS codepoint X should be measured against a token 566 bucket meter with rate r and burst size b. In this example out-of- 567 profile packets are those packets in the behavior aggregate which 568 arrive when insufficient tokens are available in the bucket. 569 Different conditioning actions may be applied to the in-profile 570 packets and out-of-profile packets, or different accounting actions 571 may be triggered. 573 2.2.2.2 Actions Applied to In-Profile and Out-of-Profile Packets 575 In-profile packets may be allowed to enter the DS domain without 576 further conditioning as they conform to the TCA; or, alternatively, 577 their DS field may be marked with a new DS codepoint. The latter 578 happens when the DS field is set to a non-Default value for the first 579 time [DSFIELD], or when the packets enter a DS domain that uses a 580 different PHB group for this traffic stream, so the DS codepoint has 581 to be mapped to the new PHB group. 583 The actions applied to out-of-profile packets may include delaying the 584 packets until they are in-profile (shaping), discarding the packets, 585 marking the DS field to a particular codepoint, or triggering some 586 accounting action. 588 2.2.3 Components of a Traffic Conditioner 590 2.2.3.1 Classifiers 592 Packet classifiers select packets in a traffic stream based on the 593 content of some portion of the packet header.
The classification may
be based on the DS field only (Behavior Aggregate Classification), or
on any combination of one or several fields in the packet header,
such as source address, destination address, DS field, protocol ID,
and transport-layer header fields such as source and destination port
numbers (Multi-Field Classification). Classifiers are used to steer
packets matching some specified rule to another element of the
traffic conditioner for further processing. Classifiers must be
configured by some management procedure in accordance with the
appropriate TCA.

The classifier should authenticate the information which it uses to
classify the packet (see Sec. 6).

Note that in the event of upstream packet fragmentation, multi-field
classifiers which examine the contents of transport-layer header
fields may incorrectly classify packet fragments subsequent to the
first. A possible solution to this problem is to maintain
fragmentation state; however, this is not a general solution due to
the possibility of upstream fragment re-ordering or divergent routing
paths.

2.2.3.2 Meters

Traffic meters measure the traffic properties of the set of packets
selected by a classifier against a traffic profile specified in the
TCA. A meter indicates to other conditioning functions whether each
individual packet is in- or out-of-profile.

A null meter identifies all packets as in-profile. Such a meter may
be used when the traffic profile does not specify conforming rate or
burst parameters.

2.2.3.3 Markers

Packet markers set the DS field of a packet to a particular
codepoint, adding the marked packet to a particular DS behavior
aggregate.
The marker may be configured to mark all packets which
are steered to it with a single codepoint, or may be configured to
mark a packet with one of a set of codepoints within a PHB group
according to the state of a meter.

2.2.3.4 Shapers

Shapers delay some or all of the packets in a traffic stream in order
to bring the stream into compliance with its associated traffic
profile. A shaper usually has a finite-size buffer, and packets may
be discarded if there is not enough buffer space to hold the delayed
packets. Note that discard policers can be implemented as a special
case of a shaper by setting the shaper buffer size to zero (or a few)
packets.

2.2.4 Location of Traffic Conditioners

Traffic conditioners may be located within a customer DS domain and
at the boundary of a DS domain. Traffic conditioners may also be
located in nodes in a non-DS domain.

2.2.4.1 Traffic Conditioners within a Customer DS Domain

Traffic sources and nodes within a customer DS domain may perform
traffic conditioning functions. Packets originating from the
customer DS domain and crossing a boundary may have their DS field
marked by the traffic sources or by intermediate routers before
leaving the customer DS domain.

For example, suppose that a customer domain has a policy that the
CEO's packets should have higher priority. The CEO's host may mark
the DS field of all outgoing packets with a DS codepoint that
indicates higher priority. Alternatively, the first-hop router
directly connected to the CEO's host may classify the traffic and
mark the CEO's packets with the correct DS codepoint.

There are some advantages to marking the DS field close to the
traffic source. First, a traffic source can more easily take an
application's preferences into account when deciding which packets
should receive better forwarding treatment.
Also, classification of
packets is much simpler before the traffic has been aggregated with
packets from other sources, since the number of classification rules
which need to be applied within a single node is reduced.

Since packet marking may be distributed across different nodes, the
customer DS domain is responsible for ensuring that the aggregated
traffic towards its provider DS domain conforms to the appropriate
TCA. Additional allocation mechanisms such as bandwidth brokers or
RSVP may be used to dynamically allocate resources for a particular
DS behavior aggregate within the customer's network. The edge node
of the customer DS domain should also monitor conformance to the
TCA, and triage packets as necessary.

2.2.4.2 Traffic Conditioners at the Boundary of a DS Domain

Traffic streams may be marked and otherwise conditioned on either end
of a boundary link (the DS egress node of the customer DS domain or
the DS ingress node of the provider DS domain). The TCA between the
domains should specify which domain has responsibility for mapping
traffic streams to DS behavior aggregates and conditioning those
aggregates in conformance with the TCA. However, a DS ingress node
must assume that the incoming traffic may not conform to the TCA and
must be prepared to enforce the TCA in accordance with local policy.

There is an advantage to performing complex conditioning operations
in the customer DS domain, since it is then no longer necessary to
divulge the local classification and service provisioning rules to
the provider DS domain. In this circumstance the provider domain may
only need to re-mark or police incoming behavior aggregates to
enforce the TCA. However, more sophisticated services which are
path- or source-dependent may require multi-field classification in
the provider's ingress nodes.
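The metering step that a conditioning node applies when enforcing a
TCA can be sketched with the token bucket profile of Sec. 2.2.2.1
("codepoint=X, use token-bucket r, b"). This is an illustrative
sketch, not a normative algorithm; the class name and the units
(bytes for tokens, seconds for time) are assumptions:

```python
class TokenBucketMeter:
    """Token bucket with rate r (bytes/s) and burst size b (bytes)."""
    def __init__(self, rate, burst):
        self.rate = rate        # r: tokens added per second
        self.burst = burst      # b: bucket depth
        self.tokens = burst     # bucket starts full
        self.last = 0.0         # time of last update

    def measure(self, arrival_time, length):
        """Return True if the packet is in-profile, False otherwise."""
        # Refill at rate r, capped at the bucket depth b.
        self.tokens = min(self.burst,
                          self.tokens +
                          (arrival_time - self.last) * self.rate)
        self.last = arrival_time
        if self.tokens >= length:
            self.tokens -= length
            return True
        return False            # insufficient tokens: out-of-profile
```

An out-of-profile result would then trigger the TCA-specified action
(re-marking, shaping, or discard).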
If a DS ingress node is connected to a non-DS domain, the DS ingress
node must be able to perform all traffic conditioning functions on
the incoming traffic.

2.2.4.3 Traffic Conditioners in non-DS Domains

Traffic sources or intermediate nodes in a non-DS domain may employ
traffic conditioners to pre-mark traffic before it reaches the
ingress of a provider DS domain.

2.3 Per-Hop Behaviors

A per-hop behavior (PHB) is a description of the externally
observable forwarding behavior of a DS node applied to a particular
DS behavior aggregate. "Forwarding behavior" is a general concept in
this context. For example, in the event that only one behavior
aggregate occupies a link, the observable forwarding behavior (i.e.,
loss, delay, jitter) will usually depend only on the relative loading
of the link (assuming the behavior uses a work-conserving scheduling
discipline). Useful behavioral distinctions are only observed when
multiple behavior aggregates compete for buffer and bandwidth
resources on a node. The PHB is the means by which a node allocates
resources to behavior aggregates, and it is on top of this basic
hop-by-hop resource allocation mechanism that useful differentiated
services may be constructed.

The simplest example of a PHB is one which guarantees a minimal
bandwidth allocation of X% of a link (over some reasonable time
interval) to a behavior aggregate. This PHB can be fairly easily
measured under a variety of competing traffic conditions. A slightly
more complex PHB would guarantee a minimal bandwidth allocation of X%
of a link, with proportional fair sharing of any excess link
capacity. Another simple example, taken from [DSFIELD], is the
Expedited Forwarding PHB.
This PHB provides negligible loss, delay,
and delay jitter (similar to that observed by a single packet
traversing an otherwise idle router) for a behavior aggregate which
is the multiplex of multiple peak-rate regulated traffic streams,
under the constraint that the load of the behavior aggregate is a
small fraction of the link capacity. This last constraint is a
consequence of queueing physics: a multiplex of peak-rate regulated
traffic streams may still exhibit arrival burstiness, and the
resulting delay and jitter will only be negligible under the
circumstance where the relative load of the aggregated traffic is
small, even when there is no competing traffic from other behavior
aggregates. In general, the observable behavior of a PHB may depend
on certain constraints on the traffic characteristics of the
associated behavior aggregate, or on the characteristics of other
behavior aggregates.

PHBs may be specified in terms of their resource (e.g., buffer,
bandwidth) priority relative to other PHBs, or in terms of their
relative observable traffic characteristics (e.g., delay, loss)
[Baker]. These PHBs should be specified as a group (PHB group) for
consistency. The priority relationship within a PHB group will tend
to be hierarchical, and the associated DS codepoints should be
assigned in increasing order of relative priority for clarity of
interpretation. The priority relationship between PHBs in the group
may be absolute (e.g., absolute discard priority) or may be less
rigid (e.g., higher probability of loss). A single PHB defined in
isolation is a degenerate form of a PHB group.

PHBs are implemented in nodes by means of some buffer management and
packet scheduling mechanisms. PHBs should be defined in terms of
behavior characteristics relevant to service provisioning policies,
and not in terms of particular implementation mechanisms.
In general, a variety of implementation mechanisms may be suitable
for implementing a particular PHB group. Furthermore, it is likely
that more than one PHB group may be implemented on a node and
utilized within a domain. PHB groups should be defined such that the
proper resource allocation between groups can be inferred, and
integrated mechanisms can be implemented which can simultaneously
support two or more groups.

2.4 Network Resource Allocation

The implementation, configuration, operation and administration of
the supported PHB groups in the nodes of a DS domain should
effectively partition the resources of those nodes and the inter-node
links between the traffic aggregates, in accordance with the domain's
service provisioning policy. Traffic conditioners control the usage
of these resources through the administrative control of TCAs and
possibly through operational feedback from the nodes and traffic
conditioners in the domain.

The configuration of and interaction between the traffic conditioners
and the interior nodes should be managed by the administrative
control of the domain and may require operational control through
protocols and a control entity. There is a wide range of possible
control models [DSFWK]. The precise nature and implementation of the
interaction between these components is outside the scope of this
architecture. However, scalability requires that the control of the
domain does not require micro-management of the network resources.
The most scalable control model would operate nodes in open-loop in
the operational timeframe, and would only require administrative-
timescale management as SLAs are varied.
This simple model may be
unsuitable in some circumstances, and some automated but relatively
long time-constant operational control (minutes rather than seconds)
may be desirable to balance the utilization of the network against
the recent load profile.

3. Per-Hop Behavior Definition Requirements

In order for a per-hop behavior (PHB) group to be considered for
standardization, a detailed definition of the behavior should be
provided as a basis for implementation consistency. This section
provides a template for defining a new PHB group. Before a PHB group
is considered for standardization it should satisfy the PHB
definition requirements in this section, to preserve the integrity of
this architecture.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

3.1. A PHB definition MUST NOT require inspection or modification of
any part of the packet other than the DS field.

3.2. The definition of each newly proposed PHB group MUST include an
overview of the behavior and the purpose of the behavior being
proposed. The overview MUST include a statement of the problem or
problems for which the PHB group is targeted. The overview MUST
include the basic concepts behind the PHB group. These concepts
SHOULD include, but are not restricted to, queueing behavior, discard
behavior, and output link selection behavior. Lastly, the overview
MUST specify the method by which the PHB group solves the problem or
problems specified in the problem statement.

Any configuration or management issues which affect the basic PHB
definition MUST be specified in the overview of the behavior. The
actual details of the management and configuration of PHB groups in
routers or hosts MUST be addressed in a separate, parallel document.

3.3.
A PHB group definition MUST indicate whether the PHB group
consists of one or more codepoints. In the event that multiple
codepoints are specified, the interactions between the codepoints
within the PHB group, and the constraints that must be respected
globally across all the codepoints within the PHB group, MUST be
clearly explained in the description of the PHB group. As an
example, the definition MUST specify whether packet reordering is
likely within a microflow whose packets are marked with two or more
codepoints within the group.

3.4. A PHB group may be standardized for local use within a domain
in order to provide some domain-specific functionality or domain-
specific services. In this event, the PHB definition is useful for
providing vendors with a consistent definition of the PHB group. The
PHB definition can also provide semantics for PHB translation and
service mappings with peer domains which do not support the PHB
group. However, any PHB group which is defined for local use MUST be
considered as an informational standard. In contrast, a PHB group
which is proposed for general use will follow a stricter
standardization process. Therefore all proposed PHB definitions MUST
specifically state whether they are to be considered for general use
or local use.

It is recognized that PHB groups can be designed with the intent of
providing host-to-host, WAN edge-to-WAN edge, or domain edge-to-
domain edge services. Use of the term "end-to-end" in a PHB
definition MUST be interpreted to mean "host-to-host".

Other PHB groups may be defined and deployed locally within domains,
for experimental or operational purposes. There is no requirement
that these PHB groups be publicly documented, but they SHOULD utilize
DS codepoints from one of the EXP/LU pools as defined in [DSFIELD].

3.5.
It may be possible or appropriate for a packet marked with a
codepoint within a PHB group to be re-marked to another codepoint
within that group, either within a domain or across two cooperating
domains. Typically there are three reasons for PHB group mutability:

1. The codepoints of the PHB group are collectively intended to
   carry state about the network.
2. Changes in the network state require promotion or demotion of
   traffic marked with a codepoint within the PHB group.
3. A PHB group is not implemented on both sides of a domain
   boundary, so all codepoints of the PHB group have to be mapped to
   some other PHB or PHB group at the boundary.

In contrast, it may also be necessary for specific PHB groups to be
preserved within a domain and/or across multiple domains. Typically
this is because the PHB groups carry some host-to-host, WAN edge-to-
WAN edge, or domain edge-to-domain edge semantics which are difficult
to duplicate when the PHB group is mapped to a different PHB group.
Further, these semantics may also be difficult to duplicate if packet
markings are promoted or demoted within the same PHB group.

A PHB definition MUST clearly state whether packets marked with a
codepoint within a PHB group MAY or SHOULD be promoted, demoted (to
another codepoint within the group), or preserved within a domain. A
PHB definition MUST clearly state whether packets marked with a
codepoint within a PHB group MAY or SHOULD be promoted, demoted, or
preserved across multiple, cooperating domains. A PHB definition
MUST clearly state whether codepoints within a PHB group MAY or
SHOULD be mapped to a different PHB group.

If it is desirable for a PHB group to be changed, the definition
SHOULD clearly state the circumstances under which a change is
desirable.
If it is undesirable for a PHB group to be changed, the
definition MUST clearly state what the risks are when a PHB group is
modified. A PHB definition may include constraints on actions that
change the PHB group. These constraints may be specified as actions
the router SHOULD or MUST perform.

3.6. The PHB definition MUST also include a section defining the
implications of tunneling on the PHB group. This section should
specify the implications for the PHB group of a newly created outer
header when the original PHB group of the inner header is
encapsulated in a tunnel. This section should also discuss what
possible changes should be applied to the inner header at the egress
of the tunnel, when both the PHB groups of the inner header and the
outer header are accessible.

3.7. The process of defining PHB groups is incremental in nature.
When new PHB groups are defined, their known interactions with
previously defined PHB groups MUST be documented. When a new PHB
group is created, it can be entirely new in scope or it can be an
extension to an existing PHB group. If the PHB group is entirely
independent of some or all of the existing PHB definitions, a section
MUST be included in the PHB definition which details how the new PHB
group co-exists with those PHB groups already defined. For example,
this section might indicate the possibility of packet re-ordering
within a microflow whose packets are marked with codepoints from two
separate PHB groups. If concurrent operation of two (or more)
different PHB groups in the same router is impossible or detrimental,
this MUST be stated. If the concurrent operation of two (or more)
different PHB groups requires some specific behaviors by the router
when traffic using these different PHB groups is in the router at the
same time, these behaviors MUST be stated.
If the proposed PHB group is an extension to an existing PHB group, a
section MUST be included in the PHB group definition which details
how this extension inter-operates with the behavior being extended.
Further, if the extension alters or more narrowly defines the
existing behavior in some way, this MUST also be clearly specified in
the PHB definition.

3.8. Each PHB definition MUST include a section specifying minimal
conformance to the PHB group. This conformance section is intended
to provide a means for specifying the details of a behavior while
allowing for implementation variation to the extent permitted by the
PHB definition. This conformance section can take the form of rules,
tables, pseudo-code or tests.

3.9. A PHB definition MUST include a section detailing the security
implications of the behavior. This section should include a
discussion of the mutability of the inner header's PHB group at the
egress of a tunnel. Further, this section should also discuss how
the proposed PHB group could be used in denial-of-service attacks,
reduction-of-service-contract attacks, and service contract violation
attacks. Lastly, this section should discuss the means for detecting
such attacks as they are relevant to the proposed behavior.

3.10. It is strongly RECOMMENDED that an appendix be provided for
each PHB definition that considers the implications of the proposed
behavior on current and potential services. These services could
include, but are not restricted to, user-specific, device-specific,
domain-specific or end-to-end services. It is also strongly
RECOMMENDED that the appendix include a section describing how the
services are verified by users, devices, and/or domains.

3.11.
If the PHB definition is targeted for local use within a
domain, it is RECOMMENDED that the appendix include a description of
how the PHB group is mapped to existing general use PHB groups as
well as other local use PHB groups.

3.12. It is RECOMMENDED that an appendix be provided for each PHB
definition which considers the impact of the proposed new PHB groups
on existing higher-layer protocols. Under some circumstances PHB
definitions may allow for possible changes to higher-layer protocols
which may increase or decrease the utility of the proposed PHB group.

4. Interoperability with Non-Differentiated Services-Compliant Nodes

We define a non-differentiated services-capable node (non-DS-capable
node) as a node which does not interpret the DS field as specified in
[DSFIELD] and/or does not implement some or all of the standardized
PHBs. This may be due to the capabilities or the configuration of
the node. We distinguish such a node from one which does not
implement differentiated forwarding behaviors which can be selected
by the value of the IPv4 TOS byte or the IPv6 Traffic Class byte. We
define a legacy node as one which implements IPv4 Precedence as
defined in [RFC791], but which is otherwise non-DS-capable.

Differentiated services depend on the resource allocation mechanisms
provided by per-hop behavior implementations in nodes. The quality
or statistical assurance level of a service may break down in the
event that traffic transits a non-DS-capable node or a non-DS-capable
domain.

We will examine two separate cases. The first case concerns the use
of non-DS-capable nodes within a DS domain. Note that PHB forwarding
is primarily useful for allocating scarce node and link resources in
a controlled manner.
On high-speed, lightly loaded links, the
worst-case packet delay, jitter, and loss may be negligible, and the
use of a non-DS-capable node on the upstream end of such a link may
not result in service degradation. In more realistic circumstances,
the lack of PHB forwarding in a node may make it impossible to offer
low-delay, low-loss, or provisioned bandwidth services across paths
which traverse the node. However, use of a legacy node may be an
acceptable alternative, assuming that the DS domain restricts itself
to using only the precedence-compatible PHBs defined in [Baker], and
assuming that the particular precedence implementation results in
forwarding behaviors which are compatible with the services offered
along paths which traverse that node.

The second case concerns the behavior of services which traverse
non-DS-capable domains. We assume for the sake of argument that a
non-DS-capable domain does not deploy traffic conditioning functions
on domain edge nodes; therefore, even in the event that the domain
consists of legacy or DS-capable interior nodes, the lack of traffic
enforcement at the edges will limit the ability to consistently
deliver some types of services across the domain. A DS domain and a
non-DS-capable domain may negotiate an agreement which governs how
egress traffic from the DS domain should be marked before entry into
the non-DS-capable domain. This agreement might be monitored for
compliance by traffic sampling instead of by rigorous traffic
conditioning. Alternatively, where there is knowledge that the non-
DS-capable domain consists of legacy nodes, the upstream DS domain
may opportunistically re-mark differentiated services traffic to one
or more IPv4 precedence values.
Where there is no knowledge of the
traffic management capabilities of the domain, and no agreement is in
place, a DS domain egress node may choose to re-mark the DS field to
zero, under the assumption that the non-DS-capable domain will treat
the traffic uniformly with best-effort service.

In the event that a non-DS-capable domain peers with a DS domain,
traffic flowing from the non-DS-capable domain should be conditioned
at the DS ingress node of the DS domain according to the appropriate
SLA or policy.

5. Multicast Considerations

For future study.

6. Security and Tunneling Considerations

This section addresses security issues raised by the introduction of
differentiated services, primarily the potential for denial-of-
service attacks, and the related potential for theft of service by
unauthorized traffic (Section 6.1). In addition, the operation of
differentiated services in the presence of IPsec and its interaction
with IPsec are also discussed (Section 6.2), as well as auditing
requirements (Section 6.3). This section considers issues introduced
by the use of both IPsec and non-IPsec tunnels.

6.1 Theft and Denial of Service

The primary goal of differentiated services is to allow different
levels of service to be provided for traffic streams on a common
network infrastructure. A variety of resource management techniques
may be used to achieve this, but the end result will be that some
packets receive different (e.g., better) service than others. The
mapping of network traffic to the specific behaviors that result in
different (e.g., better or worse) service is indicated primarily by
the DS field, and hence an adversary may be able to obtain better
service by modifying the DS field to values indicating behaviors used
for enhanced services, or by injecting packets with DS fields set to
such values.
Taken to its limits, this theft of service becomes a
denial-of-service attack when the modified or injected traffic
depletes the resources available to forward it and other traffic
streams. The defense against such theft- and denial-of-service
attacks consists of a combination of edge policing and security of
the network infrastructure within a DS domain.

As described in Section 2.1, DS ingress nodes must ensure that all
traffic entering a DS domain has DS field values that are acceptable
to that domain's service provisioning policy. This makes the ingress
nodes the first line of defense against theft-of-service and denial-
of-service attacks based on modified DS field values (e.g., values to
which the traffic is not entitled). An important instance of an
ingress node is that any traffic-originating node in a DS domain is
the ingress node for that traffic, and must ensure that the traffic
carries acceptable DS field values.

A domain's service provisioning policy may require the ingress nodes
to change the DS field values on some entering packets (e.g., an
ingress router may set the DS field values of a customer's traffic in
accordance with the appropriate SLA). Ingress nodes should police
all other inbound traffic to ensure that the DS field values are
acceptable; packets found to have unacceptable values must either be
discarded or must have their DS fields modified to acceptable values
before being forwarded. For example, an ingress node receiving
traffic from a domain with which no enhanced service agreement exists
may reset the DS field to DE(fault) service [DSFIELD].
A service
provisioning policy may require traffic authentication to validate
the use of some DS field values (e.g., those corresponding to
enhanced services), and such authentication may be performed by
technical means (e.g., IPsec) and/or non-technical means (e.g., the
inbound link is known to be connected to exactly one customer site).

An inter-domain agreement may reduce or eliminate the need for
ingress node traffic policing by making the upstream domain partly or
completely responsible for ensuring that traffic has DS field values
acceptable to the downstream domain. In this case, the ingress node
may still perform redundant acceptability checks to reduce the
dependence on the upstream domain (e.g., such checks can prevent
theft-of-service attacks from propagating across the domain
boundary). If an acceptability check fails because the upstream
domain is not fulfilling its responsibilities, that failure is an
auditable event; the generated audit log entry should include the
date/time the packet was received, the source and destination IP
addresses, and the DS field value that caused the failure. In
practice, the limited gains from such checks need to be weighed
against their potential performance impact in determining what, if
any, checks to perform under these circumstances.

Interior nodes in a DS domain may rely on the DS field to associate
differentiated services traffic with the behaviors used to implement
enhanced services. Any node doing so depends on the correct
operation of the DS domain to prevent the arrival of traffic with
unacceptable DS field values. Robustness concerns dictate that the
arrival of packets with unacceptable DS field values must not cause
the failure (e.g., crash) of network nodes.
Interior nodes are not
responsible for enforcing the service provisioning policy (or
individual SLAs) and hence are not required to check DS field values
for acceptability. Interior nodes may perform some acceptability
checks on DS field values (e.g., checking for DS field values that
are never used for traffic on a specific link, never used with a
source/destination address outside a specific range, etc.) to improve
security and robustness (e.g., resistance to theft-of-service attacks
based on DS field modifications). Any detected failure of such an
acceptability check is an auditable event, and the generated audit
log entry should include the date/time the packet was received, the
source and destination IP addresses, and the DS field value that
caused the failure. In practice, the limited gains from such checks
need to be weighed against their potential performance impact in
determining what, if any, checks to perform at interior nodes.

Any link that cannot be adequately secured against modification of DS
field values or traffic injection by adversaries should be treated as
a boundary link (and hence any arriving traffic on that link is
treated as if it were entering the domain at an ingress node). Local
security policy provides the definition of "adequately secured," and
such a definition may include a determination that the risks and
consequences of DS field modification and/or traffic injection do not
justify any additional security measures for a link. Link security
can be enhanced via physical access controls and/or software means
such as tunnels that ensure packet integrity.
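The ingress-node behavior described in this section can be sketched
as follows: police DS field values, remark unacceptable ones to the
Default codepoint rather than discarding, and record an auditable
event containing the prescribed fields. This is illustrative only;
the function and field names are assumptions, and the addresses in
the test are documentation examples:

```python
DEFAULT_DSCP = 0   # DE(fault) service codepoint [DSFIELD]

def police_ds_field(dscp, acceptable, audit_log, received_at, src, dst):
    """Return the codepoint to forward with; log policing failures."""
    if dscp in acceptable:
        return dscp
    # Auditable event: date/time the packet was received, the source
    # and destination IP addresses, and the offending DS field value.
    audit_log.append((received_at, src, dst, dscp))
    # Local policy here chooses remarking over discard.
    return DEFAULT_DSCP
```

A node could equally well discard the packet instead of remarking it;
the choice is a matter of the domain's service provisioning policy.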
6.2 IPsec and Tunneling Interactions

The IPsec protocol, as defined in [ESP, AH], does not include the IP
header's DS field in any of its cryptographic calculations (in the
case of tunnel mode, it is the outer IP header's DS field that is not
included). Hence modification of the DS field by a network node has
no effect on IPsec's end-to-end security, because it cannot cause any
IPsec integrity check to fail. As a consequence, IPsec does not
provide any defense against an adversary's modification of the DS
field (i.e., a man-in-the-middle attack), as the adversary's
modification will likewise have no effect on IPsec's end-to-end
security. In some environments, the ability to modify the DS field
without affecting IPsec integrity checks may constitute a covert
channel; if it is necessary to eliminate such a channel or reduce its
bandwidth, the DS domains should be configured so that the required
processing (e.g., setting all DS fields on sensitive traffic to a
single value) can be performed at DS egress nodes where traffic exits
higher-security domains.

IPsec's tunnel mode provides security for the encapsulated IP
header's DS field. A tunnel mode IPsec packet contains two IP
headers: an outer header supplied by the tunnel ingress node and an
encapsulated inner header supplied by the original source of the
packet. When an IPsec tunnel is hosted (in whole or in part) on a
differentiated services network, the intermediate network nodes
operate on the DS field in the outer header. At the tunnel egress
node, IPsec processing includes stripping the outer header and
forwarding the packet (if required) using the inner header.
Since the inner IP header has not been processed by a DS ingress
node, the tunnel egress node is the DS ingress node for traffic
exiting the tunnel, and hence must carry out the corresponding
responsibilities (see Section 6.1).  If the IPsec processing includes
a sufficiently strong cryptographic integrity check of the
encapsulated packet (where sufficiency is determined by local
security policy), the tunnel egress node can safely assume that the
DS field in the inner header has the same value as it had at the
tunnel ingress node.  If the tunnel ingress node is in the same DS
domain as the tunnel egress node, the tunnel egress node can safely
treat a packet passing such an integrity check as if it had arrived
from another node within the same DS domain and hence omit the DS
ingress node policing that would otherwise be required.  An important
consequence is that otherwise insecure internal links within DS
domains can be secured by a sufficiently strong IPsec tunnel.

This analysis and its implications apply to any tunneling protocol
that performs integrity checks, but the level of assurance of the
inner header's DS field depends on the strength of the integrity
check performed by the tunneling protocol.  In the absence of
sufficient assurance for a tunnel that may transit nodes outside the
current DS domain (or is otherwise vulnerable), the encapsulated
packet must be treated as if it had arrived at a DS ingress node from
outside the domain.

IPsec currently specifies that the inner header's DS field must not
be changed by IPsec decapsulation processing at the tunnel egress
node.  This ensures that an adversary's modifications to the DS field
cannot be used to launch theft- or denial-of-service attacks across
an IPsec tunnel endpoint, as any such modifications will be discarded
at the tunnel endpoint.
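The tunnel-egress decision above reduces to a small amount of logic.
The following Python sketch is purely illustrative (the function,
attribute names, and return strings are hypothetical, not drawn from
any IPsec implementation or from this architecture):

```python
from types import SimpleNamespace

def handle_decapsulated_packet(pkt, tunnel, local_domain):
    """Decide how a DS node treats traffic exiting an IPsec tunnel.

    pkt.integrity_ok       - True if the tunnel's cryptographic integrity
                             check passed and local security policy deems
                             it sufficiently strong (illustrative names).
    tunnel.ingress_domain  - DS domain of the tunnel ingress node.
    """
    if pkt.integrity_ok and tunnel.ingress_domain == local_domain:
        # Inner DS field is trusted to be unchanged since the tunnel
        # ingress node, which sits in this same DS domain: treat the
        # packet as interior traffic and omit ingress policing.
        return "forward_as_interior"
    # Otherwise the tunnel egress node is the DS ingress node for this
    # traffic and must classify and condition it accordingly.
    return "apply_ingress_policing"

# Usage with stand-in packet/tunnel records:
pkt = SimpleNamespace(integrity_ok=True)
tun = SimpleNamespace(ingress_domain="example.net")
result_same = handle_decapsulated_packet(pkt, tun, "example.net")
result_other = handle_decapsulated_packet(pkt, tun, "example.org")
```

The "forward_as_interior" branch captures the consequence noted in
the text: a sufficiently strong IPsec tunnel can stand in for link
security on otherwise insecure internal links.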
Note: the following paragraph requires coordination with and approval
by the Security Area of the IETF, and may result in the need for
brief modifications of the appropriate security RFCs.

A tunnel egress node in a DS domain may modify the DS field in an
inner IP header based on the DS field value in the outer header,
including copying part or all of the outer DS field to the inner DS
field.  For a tunnel contained entirely within a single DS domain and
for which the links are adequately secured against modifications of
the outer DS field, the only limits on modifications are those
imposed by the domain's service provisioning policy.  Otherwise, the
tunnel egress node performing such modifications is acting as a DS
ingress node for traffic exiting the tunnel, and must carry out the
responsibilities of an ingress node, including ensuring that the
resulting DS field values are acceptable (see Section 6.1).

If the tunnel enters the DS domain at a node different from the
tunnel egress node, the tunnel egress node may depend on the upstream
DS ingress node having ensured the acceptability of the outer DS
field value.  Even in this case, there are some acceptability checks
that can only be performed by the tunnel egress node (e.g., a
consistency check between the inner and outer DS field values for an
encrypted tunnel).  Any detected failure of such a check is an
auditable event, and the generated audit log entry should include the
date/time the packet was received, the source and destination IP
addresses, and the DS field value that was unacceptable.  The
requirements in this paragraph apply to any future use of the
currently unused (CU) bits in the IPv4 TOS byte and the IPv6 Traffic
Class byte [DSFIELD].

6.3 Auditing

Not all systems that support differentiated services will implement
auditing.
However, if differentiated services support is incorporated into a
system that supports auditing, then the differentiated services
implementation must also support auditing and must allow a system
administrator to enable or disable auditing for differentiated
services.  For the most part, the granularity of auditing is a local
matter.  However, several auditable events are identified in this
document, and for each of these events a minimum set of information
that should be included in an audit log is defined.  Additional
information also may be included in the audit log for each of these
events, and additional events, not explicitly called out in this
specification, also may result in audit log entries.  There is no
requirement for the receiver to transmit any message to the purported
sender in response to the detection of an auditable event, because of
the potential to induce denial of service via such action.

7. Acknowledgements

The authors would like to acknowledge the following individuals for
their helpful comments and suggestions: Kathleen Nichols, Brian
Carpenter, Konstantinos Dovrolis, Shivkumar Kalyana, Wu-chang Feng,
Marty Borden, Yoram Bernet, Ronald Bonica, James Binder, and Borje
Ohlman.

8. References

[802.1p]   ISO/IEC Final CD 15802-3 Information technology -
           Telecommunications and information exchange between
           systems - Local and metropolitan area networks - Common
           specifications - Part 3: Media Access Control (MAC)
           bridges, (current draft available as IEEE P802.1D/D15).

[AH]       S. Kent and R. Atkinson, "IP Authentication Header",
           Internet Draft, May 1998.

[ATM]      ATM Traffic Management Specification Version 4.0,
           April 1996.

[Baker]    F. Baker, S. Brim, T. Li, F. Kastenholz, S. Jagannath,
           and J.
Renwick, "IP Precedence in Differentiated Services Using the Assured
           Service", Internet Draft, April 1998.

[DSFIELD]  K. Nichols and S. Blake, "Definition of the Differentiated
           Services Field (DS Byte) in the IPv4 and IPv6 Headers",
           Internet Draft, May 1998.

[DSFWK]    Differentiated Services Framework Document (work in
           preparation).

[Clark97]  D. Clark and J. Wroclawski, "An Approach to Service
           Allocation in the Internet", Internet Draft, July 1997.

[Ellesson] E. Ellesson and S. Blake, "A Proposal for the Format and
           Semantics of the TOS Byte and Traffic Class Byte in IPv4
           and IPv6", Internet Draft, November 1997.

[ESP]      S. Kent and R. Atkinson, "IP Encapsulating Security
           Payload", Internet Draft, May 1998.

[Ferguson] P. Ferguson, "Simple Differential Services: IP TOS and
           Precedence, Delay Indication, and Drop Preference",
           Internet Draft, April 1998.

[FRELAY]   ANSI T1S1, "DSSI Core Aspects of Frame Relay", March 1990.

[Heinanen] J. Heinanen, "Use of the IPv4 TOS Octet to Support
           Differentiated Services", Internet Draft, November 1997.

[IntServ]  R. Braden, D. Clark, and S. Shenker, "Integrated Services
           in the Internet Architecture: An Overview", Internet RFC
           1633, July 1994.

[MPLSFWK]  R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow,
           and A. Viswanathan, "A Framework for Multiprotocol Label
           Switching", Internet Draft, November 1997.

[PASTE]    T. Li and Y. Rekhter, "Provider Architecture for
           Differentiated Services and Traffic Engineering (PASTE)",
           Internet Draft, January 1998.

[RFC791]   Information Sciences Institute, "Internet Protocol",
           Internet RFC 791, September 1981.

[RFC1349]  P. Almquist, "Type of Service in the Internet Protocol
           Suite", Internet RFC 1349, July 1992.

[RFC2119]  S.
Bradner, "Key words for use in RFCs to Indicate Requirement Levels",
           Internet RFC 2119, March 1997.

[RSVP]     R. Braden, et al., "Resource ReSerVation Protocol (RSVP)
           -- Version 1 Functional Specification", Internet RFC 2205,
           September 1997.

[SIMA]     K. Kilkki, "Simple Integrated Media Access (SIMA)",
           Internet Draft, June 1997.

[2BIT]     K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit
           Differentiated Services Architecture for the Internet",
           Internet Draft, November 1997.

[TR]       ISO/IEC 8802-5 Information technology - Telecommunications
           and information exchange between systems - Local and
           metropolitan area networks - Common specifications - Part
           5: Token Ring Access Method and Physical Layer
           Specifications, (also ANSI/IEEE Std 802.5-1995), 1995.

Authors' Addresses

   David Black
   The Open Group Research Institute
   Eleven Cambridge Center
   Cambridge, MA  02142
   Phone:  +1-617-621-7347
   E-mail: d.black@opengroup.org

   Steven Blake
   IBM Corporation
   800 Park Offices Drive
   Research Triangle Park, NC  27709
   Phone:  +1-919-254-2030
   E-mail: slblake@raleigh.ibm.com

   Mark A. Carlson
   Redcape Software, Inc.
   2990 Center Green Court South
   Boulder, CO  80301
   Phone:  +1-303-448-0048 x115
   E-mail: mac@redcape.com

   Elwyn Davies
   Nortel UK
   London Road
   Harlow, Essex CM17 9NA, UK
   Phone:  +44-1279-405498
   E-mail: elwynd@nortel.co.uk

   Zheng Wang
   Bell Labs Lucent Tech
   101 Crawfords Corner Road
   Holmdel, NJ  07733
   E-mail: zhwang@bell-labs.com

   Walter Weiss
   Lucent Technologies
   300 Baker Avenue, Suite 100
   Concord, MA  01742-2168
   E-mail: wweiss@lucent.com