idnits 2.17.1 draft-ietf-diffserv-arch-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. 
== No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 9 longer pages, the longest (page 1) being 59 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of lines with control characters in the document. ** The abstract seems to contain references ([DSFWK], [DSFIELD], [Baker]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. == There is 10 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 146 has weird spacing: '...ervices a pa...' == Line 258 has weird spacing: '...reement a se...' == Line 967 has weird spacing: '...ppendix inclu...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 1998) is 9478 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'Clark97' is defined on line 1290, but no explicit reference was found in the text == Unused Reference: 'Ellesson' is defined on line 1294, but no explicit reference was found in the text == Unused Reference: 'Ferguson' is defined on line 1303, but no explicit reference was found in the text == Unused Reference: 'Heinanen' is defined on line 1310, but no explicit reference was found in the text == Unused Reference: 'IntServ' is defined on line 1314, but no explicit reference was found in the text == Unused Reference: 'RSVP' is defined on line 1336, but no explicit reference was found in the text == Unused Reference: 'SIMA' is defined on line 1340, but no explicit reference was found in the text == Unused Reference: '2BIT' is defined on line 1344, but no explicit reference was found in the text == Unused Reference: 'Weiss' is defined on line 1356, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'AH' -- Possible downref: Non-RFC (?) normative reference: ref. 'ATM' -- Possible downref: Non-RFC (?) normative reference: ref. 'Baker' -- Possible downref: Non-RFC (?) normative reference: ref. 'DSFIELD' -- Possible downref: Non-RFC (?) normative reference: ref. 'DSFWK' -- Possible downref: Non-RFC (?) normative reference: ref. 'Clark97' -- Possible downref: Non-RFC (?) normative reference: ref. 'Ellesson' -- Possible downref: Non-RFC (?) normative reference: ref. 'ESP' -- Possible downref: Non-RFC (?) normative reference: ref. 'Ferguson' -- Possible downref: Non-RFC (?) normative reference: ref. 'FRELAY' -- Possible downref: Non-RFC (?) normative reference: ref. 'Heinanen' ** Downref: Normative reference to an Informational RFC: RFC 1633 (ref. 
'IntServ') -- Possible downref: Non-RFC (?) normative reference: ref. 'MPLSFWK' -- Possible downref: Non-RFC (?) normative reference: ref. 'PASTE' ** Obsolete normative reference: RFC 1349 (Obsoleted by RFC 2474) -- Possible downref: Non-RFC (?) normative reference: ref. 'SIMA' -- Possible downref: Non-RFC (?) normative reference: ref. '2BIT' -- Possible downref: Non-RFC (?) normative reference: ref. 'TR' -- Possible downref: Non-RFC (?) normative reference: ref. 'Weiss' Summary: 13 errors (**), 0 flaws (~~), 15 warnings (==), 19 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT David Black 3 Diffserv Working Group The Open Group 4 Expires: November 1998 Steven Blake 5 IBM Corporation 6 Mark Carlson 7 Redcape Software 8 Elwyn Davies 9 Nortel UK 10 Zheng Wang 11 Bell Labs Lucent Technologies 12 Walter Weiss 13 Lucent Technologies 15 May 1998 17 An Architecture for Differentiated Services 19 21 Status of This Memo 23 This document is an Internet-Draft. Internet-Drafts are working 24 documents of the Internet Engineering Task Force (IETF), its areas, 25 and its working groups. Note that other groups may also distribute 26 working documents as Internet-Drafts. 28 Internet-Drafts are draft documents valid for a maximum of six months 29 and may be updated, replaced, or obsoleted by other documents at any 30 time. It is inappropriate to use Internet-Drafts as reference 31 material or to cite them other than as "work in progress." 33 To view the entire list of current Internet-Drafts, please check the 34 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 35 Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern 36 Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific 37 Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast). 
39 Abstract 41 This document defines an architecture for implementing scalable 42 service differentiation in the Internet. This architecture achieves 43 scalability by aggregating traffic classification state which is 44 conveyed by means of IP-layer packet marking using the DS field 45 [DSFIELD]. Packets are classified and marked to receive a particular 46 per-hop forwarding behavior on routers along their path. 47 Sophisticated classification, policing, and shaping operations need 48 only be implemented at network boundaries or hosts. Network 49 resources are allocated to traffic streams by service provisioning 50 policies which govern how traffic is conditioned upon entry to a 51 differentiated services-capable network, and how that traffic is 53 Black, et. al. Expires: November 1998 [Page 1] 54 forwarded within that network. A wide variety of services can be 55 implemented on top of these building blocks. 57 This document should be read along with its companion documents, the 58 differentiated services framework [DSFWK], the definition of the 59 DS field [DSFIELD], and other documents which specify per-hop 60 behaviors, such as [Baker]. 62 1. Introduction 64 1.1 Overview 66 This document defines an architecture for implementing scalable 67 service differentiation in the Internet. "Service" is taken to 68 signify some significant characteristics of packet transmission 69 across a set of one or more paths within a network. These 70 characteristics may be specified in quantitative or statistical terms 71 of throughput, delay, jitter, and/or loss, or may otherwise be 72 specified in terms of some relative priority of access to network 73 resources. Service differentiation is desired to accommodate 74 heterogeneous application requirements and user expectations, and to 75 permit differentiated pricing of Internet service.
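As a purely illustrative aside (not part of this specification): assuming, per the companion [DSFIELD] draft, that the DS codepoint occupies the six most significant bits of the IPv4 TOS octet or IPv6 Traffic Class octet, marking and inspecting the DS field might be sketched as follows. Python is used only for illustration; none of the names are normative.

```python
# Hypothetical sketch: DS codepoint assumed to occupy the upper six
# bits of the IPv4 TOS / IPv6 Traffic Class octet, per [DSFIELD].

def set_codepoint(tos_octet: int, dscp: int) -> int:
    """Overwrite the DS codepoint, preserving the low-order two bits."""
    return ((dscp & 0x3F) << 2) | (tos_octet & 0x03)

def get_codepoint(tos_octet: int) -> int:
    """Extract the six-bit DS codepoint from the octet."""
    return tos_octet >> 2

# Example: mark an octet with codepoint 0b101110.
octet = set_codepoint(0x00, 0b101110)
assert get_codepoint(octet) == 0b101110
```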
77 This architecture is composed of a number of functional elements 78 implemented in network nodes, including a small set of well-defined 79 per-hop forwarding behaviors, and traffic conditioning functions 80 including classification, metering, marking, shaping, and policing. 81 This architecture achieves scalability by implementing complex 82 conditioning functions only at network edge nodes, and by applying 83 per-hop behaviors to aggregates of traffic which have been 84 appropriately marked using the DS field in the IPv4 or IPv6 headers 85 [DSFIELD]. Per-hop behaviors are defined to permit a reasonably 86 granular means of allocating buffer and bandwidth resources among 87 competing traffic streams. Per-application flow or per-customer 88 forwarding state need not be maintained within the core of the 89 network. Service provisioning and traffic conditioning policies are 90 sufficiently decoupled from the forwarding behaviors within the 91 network interior to permit a wide variety of service behaviors to be 92 implemented, with room for future expansion. 94 Section 1.2 is a glossary of terms used within this document. 95 Section 1.3 lists requirements for this architecture, and Section 1.4 96 provides a brief comparison to other approaches for service 97 differentiation. Section 2 discusses the components of the 98 architecture in detail. Section 3 proposes requirements for per-hop 99 behavior specifications. Section 4 discusses interoperability issues 100 with networks which do not implement differentiated services as 101 defined in this document and [DSFIELD]. Section 5 discusses issues 102 with multicast traffic (this section is currently left for future 103 study). Section 6 addresses security and tunnel considerations.
106 This document should be read along with its companion documents, the 107 differentiated services framework [DSFWK], the definition of the DS 108 field [DSFIELD], and other documents which specify per-hop behaviors, 109 such as [Baker]. It has been heavily influenced by the thoughtful 110 proposals of previous authors [Clark97, Ellesson, Ferguson, Heinanen, 111 SIMA, 2BIT, Weiss]. 113 1.2 Terminology 115 This section gives a general conceptual overview of the terms used 116 in this document. Some of these terms are more precisely defined in 117 later sections of this document. The choice of terms and definitions 118 was influenced by [MPLSFWK]. 120 Behavior Aggregate (BA) a DS behavior aggregate. 122 BA classifier a classifier that selects packets based 123 only on the contents of the DS field. Such 124 classifiers are used in DS interior nodes, 125 and are typically used for policing at a DS 126 ingress node. 128 Boundary a link connecting the edge nodes of two 129 domains. 131 Classifier a logical element of traffic conditioning 132 that selects packets based on the content 133 of packet headers according to defined 134 rules. 136 Customer DS domain a DS domain that has an SLA in place with 137 another directly attached DS domain (the 138 provider DS domain) governing the rules by 139 which traffic from the customer DS domain 140 will be serviced within the provider DS 141 domain. A single DS domain may be both a 142 customer DS domain and a provider DS domain 143 for different directions of traffic at the 144 same time. 146 Differentiated Services a paradigm for providing quality-of-service 147 (DS) (QoS) in the Internet by employing a small, 148 well-defined set of building blocks from 149 which a variety of services may be built. 151 DS behavior aggregate a stream of packets that have the same DS 152 codepoint. 154 DS field the IPv4 TOS octet or IPv6 Traffic Class 155 octet when interpreted according to 156 [DSFIELD].
159 DS capable able to support differentiated services 160 functions and behaviors as defined in 161 [DSFIELD], this document, and other 162 documents. 164 DS codepoint a specific bit-pattern of the DS field. 166 DS edge node a DS node that connects one DS domain to a 167 node either in another DS domain or in a 168 domain that is not DS capable. 170 DS egress node a DS edge node in its role in handling 171 traffic as it leaves a DS domain. 173 DS destination host a DS host that acts as a DS egress node. 175 DS domain a contiguous set of nodes which operate 176 with a common set of service provisioning 177 policies and PHB definitions. 179 DS host a host computer that can perform certain 180 traffic conditioning functions and 181 therefore acts as a special DS edge node. 183 DS ingress node a DS edge node in its role in handling 184 traffic as it enters a DS domain. 186 DS interior node a DS node that is not a DS edge node. 188 DS node a DS capable node. 190 DS region a set of contiguous DS domains which can 191 offer differentiated services over paths 192 across those DS domains. 194 DS source host a DS host that acts as a DS ingress node. 196 Legacy node a node which implements IPv4 Precedence as 197 defined in [RFC791] but which is otherwise 198 not DS capable. 200 Marker a logical element of traffic conditioning 201 that sets the DS codepoint in the DS field 202 based on defined rules. 204 MF Classifier a classifier which selects packets based on 205 the content of some arbitrary number of 206 header fields; typically some combination 207 of source address, destination address, 208 protocol ID, source port and destination 209 port. 212 Mechanism a specific algorithm or operation (e.g., 213 queueing discipline) that is implemented in 214 a node to realize a set of one or more per- 215 hop behaviors.
217 Meter a logical element of traffic conditioning 218 that measures the properties (e.g., rate) 219 of a packet stream selected by a 220 classifier. 222 Microflow a single instance of an application-to- 223 application flow of packets which is 224 identified by source address, source port, 225 destination address, destination port and 226 protocol ID. 228 Per-Hop-Behavior (PHB) the externally observable forwarding 229 behavior applied at a DS capable node to a 230 DS behavior aggregate. 232 PHB group a set of one or more PHBs that can only be 233 meaningfully specified and implemented 234 simultaneously, due to a common constraint 235 applying to all PHBs in the set such as a 236 packet scheduling or discard policy. 238 Policing the process of applying traffic 239 conditioning functions such as marking or 240 discarding to a traffic stream in 241 accordance with the state of a 242 corresponding meter. 244 Provider DS domain a DS domain that has an SLA in place with 245 another directly attached DS domain (the 246 customer DS domain) governing the rules by 247 which traffic from the customer DS domain 248 will be serviced within the provider DS 249 domain. A single DS domain may be both a 250 customer DS domain and a provider DS domain 251 for different directions of traffic at the 252 same time. 254 Service the overall treatment of a defined subset 255 of a customer's traffic within a DS domain 256 or end-to-end. 258 Service Level Agreement a service contract between a customer and a 259 (SLA) service provider that specifies the details 260 of a TCA and the corresponding service 261 behavior a customer should receive. A 262 customer may be a user organization or 263 another DS domain. 266 Service Provisioning a policy which defines how traffic 267 Policy conditioners are configured on DS edge 268 nodes and how traffic streams are mapped to 269 DS behavior aggregates to achieve a range 270 of service behaviors.
272 Shaper a logical element of traffic conditioning 273 that delays packets within a traffic stream 274 to cause it to conform to some defined 275 traffic properties. 277 Traffic conditioner an entity that performs traffic 278 conditioning and which may contain 279 classifiers, markers, meters, and shapers. 281 Traffic conditioning control functions performed to enforce 282 rules specified in a TCA and to prepare 283 traffic for differentiated services, 284 including classifying, metering, marking, 285 policing, and shaping. 287 Traffic Conditioning an agreement specifying classifier rules 288 Agreement (TCA) and the corresponding traffic profiles and 289 metering, marking, policing and/or shaping 290 rules which are to apply to the traffic 291 streams selected by the classifier. 293 Traffic profile a description of the expected properties 294 of a traffic stream such as rate and burst 295 size. 297 Traffic stream an administratively significant set of one 298 or more microflows which traverse a path 299 segment. A traffic stream may consist of 300 the set of active microflows which are 301 selected by a particular classifier. 303 1.3 Requirements 305 The history of the Internet has been one of continuous growth in the number 306 of hosts, the number and variety of applications, and the capacity of 307 the network infrastructure, and this growth is expected to continue 308 for the foreseeable future. A scalable architecture for service 309 differentiation must be able to accommodate this continued growth. 311 The following requirements were identified and are addressed in this 312 architecture: 314 o must accommodate a wide variety of service behaviors and 315 provisioning policies, extending end-to-end or within a particular 316 (set of) network(s),
319 o must allow decoupling of the service behavior from the particular 320 application in use, 322 o must work with existing applications (assuming suitable deployment 323 of traffic conditioners), 325 o must decouple traffic conditioning and service provisioning 326 functions from forwarding behaviors implemented within the core 327 network routers, 329 o must not depend on hop-by-hop application signaling, 331 o must require only a small set of forwarding behaviors whose 332 implementation complexity does not dominate the cost of a network 333 device, and which will not introduce bottlenecks for future high- 334 speed system implementations, 336 o must avoid per-microflow or per-customer state within core network 337 routers, 339 o must utilize only aggregated classification state within the 340 network core, 342 o must permit simple packet classification implementations in core 343 network routers (BA classifier), 345 o must permit reasonable interoperability with non-compliant network 346 nodes, 348 o must accommodate incremental deployment. 350 1.4 Comparisons with Other Approaches 352 The differentiated services architecture specified in this document 353 can be contrasted with other existing models of traffic management 354 and service differentiation. We classify these alternative models 355 into the following categories: relative priority, virtual circuit, 356 Integrated Services/RSVP, and service marking. 358 Implementations of the relative priority model include IPv4 359 Precedence marking as defined in [RFC791], 802.5 Token Ring priority 360 [TR], and 802.1p priority [802.1p]. In this model the application, 361 host, or proxy node selects a relative priority or "precedence" for a 362 packet (e.g., delay or discard priority), and the network nodes along 363 the transit path apply the appropriate priority forwarding behavior 364 corresponding to the priority value within the packet's header.
Our 365 architecture can be considered as a refinement to this model, since 366 we more clearly specify the role and importance of edge nodes and 367 traffic conditioners, and since our per-hop behavior model permits 368 more general forwarding behaviors than relative delay or discard 369 priority. 372 Implementations of the virtual circuit model include Frame Relay, 373 ATM, and MPLS [FRELAY, ATM, PASTE]. In this model path forwarding 374 state and traffic management or QoS state are established for traffic 375 streams on each hop along a path. Traffic aggregates of varying 376 granularity are associated with a virtual circuit, and packets/cells 377 within each virtual circuit are marked with a forwarding label that 378 is used to look up the next hop, the per-hop forwarding behavior, and 379 the replacement label at each hop. This model permits finer 380 granularity resource allocation to traffic streams, but the amount 381 of forwarding state scales linearly with the number of edges of the 382 network in the best case (assuming multipoint-to-point virtual 383 circuits), and it scales with the square of the number of edges in 384 the worst case, when edge-to-edge traffic streams with provisioned 385 resources are employed. 387 The Integrated Services/RSVP model relies upon traditional datagram 388 forwarding in the default case, but allows sources and receivers to 389 exchange signaling messages which establish classification and 390 forwarding state on each node along the path between them [IntServ, 391 RSVP]. In the absence of state aggregation, the amount of state on 392 each node scales in proportion to the ratio of the link rate to the 393 average reservation size (in bps), multiplied by some fraction of the 394 link rate which is "reservable". This model also requires 395 application support for the RSVP signaling protocol. 397 An example of a service marking model is IPv4 TOS as defined in 398 [RFC1349].
In this example each packet is marked with a request for 399 a "type of service", which may include "minimize delay", "maximize 400 throughput", "maximize reliability", or "minimize cost". Network 401 nodes may select routing paths or forwarding behaviors which are 402 suitably provisioned to satisfy the service request. This model is 403 subtly different from our architecture. The defined TOS markings are 404 very generic and do not span the range of possible service semantics. 405 Furthermore, the service request is associated with each individual 406 packet, whereas some service semantics may depend on the aggregate 407 forwarding behavior of a sequence of packets. The service marking 408 model does not easily accommodate growth in the number and range of 409 future services, and involves configuration of the "TOS->forwarding 410 behavior" association in each core network router. 412 2. Differentiated Services Architectural Model 414 The differentiated services architecture is based on a simple model 415 where traffic entering a network is conditioned at the edges of the 416 network, and assigned to different behavior aggregates. Each 417 behavior aggregate is identified with a single DS codepoint. Within 418 the core of the network, packets are forwarded according to the per- 419 hop behavior associated with the DS codepoint. In this section, we 420 discuss the key components in a differentiated services region, 421 traffic conditioning functions, and how differentiated services are 422 achieved through the combination of traffic conditioning and PHB- 425 based forwarding. 427 2.1 Differentiated Services Regions 429 A differentiated services region (DS Region) is a set of contiguous 430 DS domains, where each DS domain consists of a set of edge nodes and 431 interior nodes.
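The core forwarding rule just stated, in which the per-hop behavior is selected from the DS codepoint alone, can be sketched as a simple table lookup. This is an illustrative sketch only; the codepoints, the PHB names, and the default fallback are assumptions of the sketch, not of this architecture.

```python
# Hypothetical PHB table for a DS interior node: the DS codepoint is
# the only lookup key, and unknown codepoints receive a default
# behavior. Codepoint/PHB pairings below are assumed for illustration.
DEFAULT_PHB = "default"

phb_table = {
    0b101110: "low-delay",   # assumed pairing, not normative
    0b001010: "assured",     # assumed pairing, not normative
}

def select_phb(dscp: int) -> str:
    """BA classification: choose a forwarding behavior from the
    codepoint alone, with no per-flow or per-customer state."""
    return phb_table.get(dscp, DEFAULT_PHB)

assert select_phb(0b101110) == "low-delay"
assert select_phb(0b111111) == DEFAULT_PHB
```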
433 2.1.1 DS Domain 435 A DS domain is a contiguous set of DS nodes which operate with a 436 common service provisioning policy and set of PHB group definitions. 437 A DS domain has a well-defined boundary consisting of DS edge nodes 438 which condition ingress traffic and ensure that packets which transit 439 the domain are only marked using one of the PHB groups supported in 440 the domain. All nodes inside the DS domain select the forwarding 441 behavior for packets based solely on the DS codepoint as defined for 442 the PHB groups supported in the domain. Inclusion of non-DS capable 443 nodes within a DS domain may result in unpredictable performance and 444 may impede the ability to satisfy SLAs. 446 A DS domain normally consists of one or more networks under the same 447 administration, for example, an organization's intranet or an ISP. 448 Multiple DS domains may be inter-connected through mutual agreements 449 to form a DS region. DS domains in a DS region may implement 450 different PHB groups. However, to permit services which span across 451 the domains, the peering DS domains must each establish a peering SLA 452 which includes a Traffic Conditioning Agreement (TCA) which specifies 453 how transit traffic from one DS domain to another DS domain is 454 conditioned at the boundary of the two DS domains. 456 It is possible that several DS domains within a DS region may adopt a 457 common service provisioning policy and PHB group definitions, thus 458 eliminating the need for traffic conditioning between those DS 459 domains. In such cases, those DS domains are effectively under a 460 single administration and may be considered as a single DS domain. 462 The administration of the domain is responsible for ensuring that 463 adequate resources are provisioned and/or reserved to support the 464 SLAs offered by the domain. 466 2.1.2 DS Edge Nodes and Interior Nodes 468 A DS domain consists of DS edge nodes and DS interior nodes. 
While 469 DS edge nodes connect the DS domain to other DS or non-DS domains, DS 470 interior nodes only connect to other DS interior or edge nodes within 471 the DS domain. 473 Both DS edge nodes and interior nodes must be able to forward packets 474 based on the DS codepoint as defined by the PHB groups supported in 475 the domain; otherwise unpredictable behavior may result. In addition, 478 DS edge nodes must be able to perform traffic conditioning functions 479 as described by the TCA between their DS domain and the peering 480 domain which they connect to. 482 Interior nodes may be able to perform limited traffic conditioning 483 functions such as DS codepoint mutation. 485 A host within a DS domain may act as a DS edge node for traffic to 486 and from applications running on that host. If a host is embedded in 487 a DS domain and does not act as an edge node, then the host's first- 488 hop router acts as the DS edge node for the host's traffic. 490 2.1.3 DS Ingress Node and Egress Node 492 DS edge nodes may act as both a DS ingress node and a DS egress 493 node. Traffic enters a DS domain at a DS ingress node and leaves a 494 DS domain at a DS egress node. A DS ingress node is responsible for 495 ensuring that the traffic entering the DS domain conforms to the TCA 496 between it and the other domain which the ingress node is connected 497 to. A DS egress node may perform traffic conditioning functions on 498 traffic forwarded to the peering domain, depending on the details of 499 the TCA between two domains. 501 2.2 Traffic Conditioning 503 Traffic conditioning functions are performed by DS edge nodes in a DS 504 domain to ensure that the traffic entering a DS domain conforms to 505 the rules specified in the TCA, in accordance with the domain's 506 service provisioning policy, and to prepare the traffic for the PHB- 507 based forwarding treatment in the interior routers.
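As an illustrative aside, conformance to a TCA profile is commonly checked with a token-bucket meter of the kind used in the profile example of Sec. 2.2.2.1. The following is a hypothetical sketch, with rate r in tokens per second and burst size b in packets (rather than bytes) for brevity.

```python
class TokenBucketMeter:
    """Hypothetical token-bucket meter: rate r (tokens/s), burst b."""

    def __init__(self, r: float, b: float):
        self.r, self.b = r, b
        self.tokens = b           # the bucket starts full
        self.last = 0.0           # arrival time of the previous packet

    def in_profile(self, now: float) -> bool:
        """Return True if a one-token packet arriving at `now` conforms."""
        self.tokens = min(self.b, self.tokens + (now - self.last) * self.r)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False              # out-of-profile: mark, shape, or drop

meter = TokenBucketMeter(r=10.0, b=2.0)
assert meter.in_profile(0.0) and meter.in_profile(0.0)  # burst of 2 conforms
assert not meter.in_profile(0.0)                        # third back-to-back packet exceeds b
```

A real meter would measure bytes against the TCA's profile; this sketch only shows the accounting structure.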
509 2.2.1 General Architecture of Traffic Conditioners 511 A traffic conditioner may contain the following elements: classifier, 512 meter, marker, and shaper. The classifier and the meter select the 513 packets within a traffic stream and measure the stream against a 514 traffic profile. The marker and shaper perform control actions on 515 the packets depending on whether the traffic stream is within its 516 associated profile. 518 A packet stream normally passes to a classifier first, and the 519 matched packets are measured by a meter against the profile as 520 defined in the TCA. The packets within the profile may leave the 521 traffic conditioner or may be marked by the marker. The packets that 522 are out-of-profile may be either marked or shaped according to the 523 rules specified in the TCA. Note that discard policing can be 524 performed by a specially configured shaper (see Sec. 2.2.3.4). When 525 packets leave the traffic conditioner of a DS ingress node, the DS 526 field of each packet must be set to one of the DS codepoints defined by 527 the PHB groups supported in the DS domain. 529 Fig. 1 shows the block diagram of a traffic conditioner. Note that a 530 traffic conditioner may not necessarily contain all four elements. 531 For example, packets may pass from the classifier directly to the 532 marker or shaper (null meter). 534 +-------+ 535 -->| |----> 536 +-------+ +-------+ / +-------+ 537 | | | |/ marker 538 packets -----> | |------>| |--------------------> 539 | | | |\ 540 +-------+ +-------+ \ +-------+ 541 classifier meter -->| |----> 542 +-------+ 543 shaper 545 Fig. 1: Logical View of a Traffic Conditioner 547 2.2.2 Traffic Conditioning Agreement (TCA) 549 Differentiated services are extended across a DS domain boundary by 550 establishing an SLA between the customer and provider DS domains. The 551 SLA includes a traffic conditioning agreement which usually specifies 552 traffic profiles and actions applied to in-profile and out-of-profile 553 packets.
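The logical flow of Fig. 1 (classifier, then meter, then marker or shaper) might be sketched as follows. This is an illustrative composition only; the packet representation, field names, codepoints, and the stand-in meter are assumptions of the sketch.

```python
# Hypothetical sketch of the Fig. 1 pipeline. Field names and
# codepoints are illustrative assumptions, not normative values.

SELECT = 0b101110   # codepoint this BA classifier selects (assumed)
DEMOTE = 0b000000   # codepoint for out-of-profile packets (assumed)

def meter_in_profile(pkt: dict) -> bool:
    """Stand-in meter; a real meter would measure the stream's rate
    and burst against the profile specified in the TCA."""
    return pkt.get("length", 0) <= 1500

def condition(pkt: dict) -> dict:
    """Classifier -> meter -> marker. Marking is the out-of-profile
    action chosen here; shaping and dropping are the other actions
    named in the text."""
    if pkt["dscp"] == SELECT:           # BA classifier: select by DS field
        if not meter_in_profile(pkt):   # meter: check against profile
            pkt["dscp"] = DEMOTE        # marker: demote out-of-profile
    return pkt

assert condition({"dscp": SELECT, "length": 9000})["dscp"] == DEMOTE
assert condition({"dscp": SELECT, "length": 100})["dscp"] == SELECT
```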
555 2.2.2.1 Traffic Profiles 557 A traffic profile specifies rules for classifying and measuring a 558 traffic stream. It identifies which packets are eligible and gives rules 559 for determining whether a particular packet is in-profile or out-of- 560 profile. For example, a profile based on a token bucket may look like: 562 codepoint=X, use token-bucket r, b 564 The above profile indicates that all packets in the behavior 565 aggregate with DS codepoint X should be measured against a token 566 bucket meter with rate r and burst size b. In this example out-of- 567 profile packets are those packets in the behavior aggregate which 568 arrive when insufficient tokens are available in the bucket. 569 Different conditioning actions may be applied to the in-profile 570 packets and out-of-profile packets, or different accounting actions 571 may be triggered. 573 2.2.2.2 Actions Applied to In-Profile and Out-of-Profile Packets 575 In-profile packets may be allowed to enter the DS domain without 576 further conditioning as they conform to the TCA; or, alternatively, 577 their DS field may be marked with a new DS codepoint. The latter 578 happens when the DS field is set to a non-Default value for the first 579 time [DSFIELD], or when the packets enter a DS domain that uses a 580 different PHB group for this traffic stream, so the DS codepoint has 581 to be mapped to the new PHB group. 583 The actions applied to out-of-profile packets may include delaying the 584 packets until they are in-profile (shaping), discarding the packets, 585 marking the DS field to a particular codepoint, or triggering some 586 accounting action. 588 2.2.3 Components of a Traffic Conditioner 590 2.2.3.1 Classifiers 592 Packet classifiers select packets in a traffic stream based on the 593 content of some portion of the packet header.
The classification may
be based on the DS field only (Behavior Aggregate Classification), or
on any combination of one or several fields in the packet header,
such as source address, destination address, DS field, protocol ID,
and transport-layer header fields such as source and destination port
numbers (Multi-Field Classification). Classifiers are used to steer
packets matching some specified rule to another element of the
traffic conditioner for further processing. Classifiers must be
configured by some management procedure in accordance with the
appropriate TCA.

The classifier should authenticate the information which it uses to
classify the packet (see Sec. 6).

Note that in the event of upstream packet fragmentation, multi-field
classifiers which examine the contents of transport-layer header
fields may incorrectly classify packet fragments subsequent to the
first. A possible solution to this problem is to maintain
fragmentation state; however, this is not a general solution due to
the possibility of upstream fragment re-ordering or divergent routing
paths.

2.2.3.2 Meters

Traffic meters measure the traffic properties of the set of packets
selected by a classifier against a traffic profile specified in the
TCA. A meter indicates to other conditioning functions whether each
individual packet is in- or out-of-profile.

A null meter identifies all packets as in-profile. Such a meter may
be used when the traffic profile does not specify conforming rate or
burst parameters.

2.2.3.3 Markers

Packet markers set the DS field of a packet to a particular
codepoint, adding the marked packet to a particular DS behavior
aggregate.
The marker may be configured to mark all packets which
are steered to it with a single codepoint, or may be configured to
mark a packet with one of a set of codepoints within a PHB group
according to the state of a meter.

2.2.3.4 Shapers

Shapers delay some or all of the packets in a traffic stream in order
to bring the stream into compliance with its associated traffic
profile. A shaper usually has a finite-size buffer, and packets may
be discarded if there is not enough buffer space to hold the delayed
packets. Note that discard policers can be implemented as a special
case of a shaper by setting the shaper buffer size to zero (or a few)
packets.

2.2.4 Location of Traffic Conditioners

Traffic conditioners may be located within a customer DS domain and
at the boundary of a DS domain. Traffic conditioners may also be
located in nodes in a non-DS domain.

2.2.4.1 Traffic Conditioners within a Customer DS Domain

Traffic sources and nodes within a customer DS domain may perform
traffic conditioning functions. Packets originating from the
customer DS domain and crossing a boundary may have their DS field
marked by the traffic sources or by intermediate routers before
leaving the customer DS domain.

For example, suppose that a customer domain has a policy that the
CEO's packets should have higher priority. The CEO's host may mark
the DS field of all outgoing packets with a DS codepoint that
indicates higher priority. Alternatively, the first-hop router
directly connected to the CEO's host may classify the traffic and
mark the CEO's packets with the correct DS codepoint.

There are some advantages to marking the DS field close to the
traffic source. First, a traffic source can more easily take an
application's preferences into account when deciding which packets
should receive better forwarding treatment.
Also, classification of
packets is much simpler before the traffic has been aggregated with
packets from other sources, since the number of classification rules
which need to be applied within a single node is reduced.

Since packet marking may be distributed across different nodes, the
customer DS domain is responsible for ensuring that the aggregated
traffic towards its provider DS domain conforms to the appropriate
TCA. Additional allocation mechanisms such as bandwidth brokers or
RSVP may be used to dynamically allocate resources for a particular
DS behavior aggregate within the customer's network. The edge node
of the customer DS domain should also monitor conformance to the
TCA, and triage packets as necessary.

2.2.4.2 Traffic Conditioners at the Boundary of a DS Domain

Traffic streams may be marked and otherwise conditioned on either end
of a boundary link (the DS egress node of the customer DS domain or
the DS ingress node of the provider DS domain). The TCA between the
domains should specify which domain has responsibility for mapping
traffic streams to DS behavior aggregates and conditioning those
aggregates in conformance with the TCA. However, a DS ingress node
must assume that the incoming traffic may not conform to the TCA and
must be prepared to enforce the TCA in accordance with local policy.

There is an advantage to performing complex conditioning operations
in the customer DS domain, since it is then no longer necessary to
divulge the local classification and service provisioning rules to
the provider DS domain. In this circumstance the provider domain may
only need to re-mark or police incoming behavior aggregates to
enforce the TCA. However, more sophisticated services which are
path- or source-dependent may require multi-field classification in
the provider's ingress nodes.
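The metering step that a conditioning node applies when enforcing a
TCA can be sketched with the token bucket profile of Sec. 2.2.2.1
("codepoint=X, use token-bucket r, b"). This is an illustrative
sketch, not a normative algorithm; the class name and the units
(bytes for tokens, seconds for time) are assumptions:

```python
class TokenBucketMeter:
    """Token bucket with rate r (bytes/s) and burst size b (bytes)."""
    def __init__(self, rate, burst):
        self.rate = rate        # r: tokens added per second
        self.burst = burst      # b: bucket depth
        self.tokens = burst     # bucket starts full
        self.last = 0.0         # time of last update

    def measure(self, arrival_time, length):
        """Return True if the packet is in-profile, False otherwise."""
        # Refill at rate r, capped at the bucket depth b.
        self.tokens = min(self.burst,
                          self.tokens +
                          (arrival_time - self.last) * self.rate)
        self.last = arrival_time
        if self.tokens >= length:
            self.tokens -= length
            return True
        return False            # insufficient tokens: out-of-profile
```

An out-of-profile result would then trigger the TCA-specified action
(re-marking, shaping, or discard).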
If a DS ingress node is connected to a non-DS domain, the DS ingress
node must be able to perform all traffic conditioning functions on
the incoming traffic.

2.2.4.3 Traffic Conditioners in non-DS Domains

Traffic sources or intermediate nodes in a non-DS domain may employ
traffic conditioners to pre-mark traffic before it reaches the
ingress of a provider DS domain.

2.3 Per-Hop Behaviors

A per-hop behavior (PHB) is a description of the externally
observable forwarding behavior of a DS node applied to a particular
DS behavior aggregate. "Forwarding behavior" is a general concept in
this context. For example, in the event that only one behavior
aggregate occupies a link, the observable forwarding behavior (i.e.,
loss, delay, jitter) will usually depend only on the relative loading
of the link (assuming the behavior uses a work-conserving scheduling
discipline). Useful behavioral distinctions are only observed when
multiple behavior aggregates compete for buffer and bandwidth
resources on a node. The PHB is the means by which a node allocates
resources to behavior aggregates, and it is on top of this basic
hop-by-hop resource allocation mechanism that useful differentiated
services may be constructed.

The simplest example of a PHB is one which guarantees a minimal
bandwidth allocation of X% of a link (over some reasonable time
interval) to a behavior aggregate. This PHB can be fairly easily
measured under a variety of competing traffic conditions. A slightly
more complex PHB would guarantee a minimal bandwidth allocation of X%
of a link, with proportional fair sharing of any excess link
capacity. Another simple example, taken from [DSFIELD], is the
Expedited Forwarding PHB.
This PHB provides negligible loss, delay,
and delay jitter (similar to that observed by a single packet
traversing an otherwise idle router) for a behavior aggregate which
is the multiplex of multiple peak-rate regulated traffic streams,
under the constraint that the load of the behavior aggregate is a
small fraction of the link capacity. This last constraint is a
consequence of queueing physics: a multiplex of peak-rate regulated
traffic streams may still exhibit arrival burstiness, and the
resulting delay and jitter will only be negligible under the
circumstance where the relative load of the aggregated traffic is
small, even when there is no competing traffic from other behavior
aggregates. In general, the observable behavior of a PHB may depend
on certain constraints on the traffic characteristics of the
associated behavior aggregate, or on the characteristics of other
behavior aggregates.

PHBs may be specified in terms of their resource (e.g., buffer,
bandwidth) priority relative to other PHBs, or in terms of their
relative observable traffic characteristics (e.g., delay, loss)
[Baker]. These PHBs should be specified as a group (PHB group) for
consistency. The priority relationship within a PHB group will tend
to be hierarchical, and the associated DS codepoints should be
assigned in increasing order of relative priority for clarity of
interpretation. The priority relationship between PHBs in the group
may be absolute (e.g., absolute discard priority) or may be less
rigid (e.g., higher probability of loss). A single PHB defined in
isolation is a degenerate form of a PHB group.

PHBs are implemented in nodes by means of some buffer management and
packet scheduling mechanisms. PHBs should be defined in terms of
behavior characteristics relevant to service provisioning policies,
and not in terms of particular implementation mechanisms.
In general, a variety of implementation mechanisms may be suitable
for implementing a particular PHB group. Furthermore, it is likely
that more than one PHB group may be implemented on a node and
utilized within a domain. PHB groups should be defined such that the
proper resource allocation between groups can be inferred, and
integrated mechanisms can be implemented which can simultaneously
support two or more groups.

2.4 Network Resource Allocation

The implementation, configuration, operation and administration of
the supported PHB groups in the nodes of a DS domain should
effectively partition the resources of those nodes and the inter-node
links between the traffic aggregates, in accordance with the domain's
service provisioning policy. Traffic conditioners control the usage
of these resources through the administrative control of TCAs and
possibly through operational feedback from the nodes and traffic
conditioners in the domain.

The configuration of and interaction between the traffic conditioners
and the interior nodes should be managed by the administrative
control of the domain and may require operational control through
protocols and a control entity. There is a wide range of possible
control models [DSFWK]. The precise nature and implementation of the
interaction between these components is outside the scope of this
architecture. However, scalability requires that the control of the
domain does not require micro-management of the network resources.
The most scalable control model would operate nodes in open-loop in
the operational timeframe, and would only require administrative-
timescale management as SLAs are varied.
This simple model may be
unsuitable in some circumstances, and some automated but relatively
long time-constant operational control (minutes rather than seconds)
may be desirable to balance the utilization of the network against
the recent load profile.

3. Per-Hop Behavior Definition Requirements

In order for a per-hop behavior (PHB) group to be considered for
standardization, a detailed definition of the behavior should be
provided as a basis for implementation consistency. This section
provides a template for defining a new PHB group. Before a PHB group
is considered for standardization it should satisfy the PHB
definition requirements in this section, to preserve the integrity of
this architecture.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

3.1. A PHB definition MUST NOT require inspection or modification of
any part of the packet other than the DS field.

3.2. The definition of each newly proposed PHB group MUST include an
overview of the behavior and the purpose of the behavior being
proposed. The overview MUST include a statement of the problem or
problems for which the PHB group is targeted. The overview MUST
include the basic concepts behind the PHB group. These concepts
SHOULD include, but are not restricted to, queueing behavior, discard
behavior, and output link selection behavior. Lastly, the overview
MUST specify the method by which the PHB group solves the problem or
problems specified in the problem statement.

Any configuration or management issues which affect the basic PHB
definition MUST be specified in the overview of the behavior. The
actual details of the management and configuration of PHB groups in
routers or hosts MUST be addressed in a separate, parallel document.

3.3.
A PHB group definition MUST indicate whether the PHB group
consists of one or more codepoints. In the event that multiple
codepoints are specified, the interactions between the codepoints
within the PHB group, and the constraints that must be respected
globally across all the codepoints within the PHB group, MUST be
clearly explained in the description of the PHB group. As an
example, the definition MUST specify whether packet reordering is
likely within a microflow whose packets are marked with two or more
codepoints within the group.

3.4. A PHB group may be standardized for local use within a domain
in order to provide some domain-specific functionality or domain-
specific services. In this event, the PHB definition is useful for
providing vendors with a consistent definition of the PHB group. The
PHB definition can also provide semantics for PHB translation and
service mappings with peer domains which do not support the PHB
group. However, any PHB group which is defined for local use MUST be
considered as an informational standard. In contrast, a PHB group
which is proposed for general use will follow a stricter
standardization process. Therefore all proposed PHB definitions MUST
specifically state whether they are to be considered for general use
or local use.

It is recognized that PHB groups can be designed with the intent of
providing host-to-host, WAN edge-to-WAN edge, or domain edge-to-
domain edge services. Use of the term "end-to-end" in a PHB
definition MUST be interpreted to mean "host-to-host".

Other PHB groups may be defined and deployed locally within domains,
for experimental or operational purposes. There is no requirement
that these PHB groups be publicly documented, but they SHOULD utilize
DS codepoints from one of the EXP/LU pools as defined in [DSFIELD].

3.5.
It may be possible or appropriate for a packet marked with a
codepoint within a PHB group to be re-marked to another codepoint
within that group, either within a domain or across two cooperating
domains. Typically there are three reasons for PHB group mutability:

1. The codepoints of the PHB group are collectively intended to
   carry state about the network.
2. Changes in the network state require promotion or demotion of
   traffic marked with a codepoint within the PHB group.
3. A PHB group is not implemented on both sides of a domain
   boundary, so all codepoints of the PHB group have to be mapped to
   some other PHB or PHB group at the boundary.

In contrast, it may also be necessary for specific PHB groups to be
preserved within a domain and/or across multiple domains. Typically
this is because the PHB groups carry some host-to-host, WAN edge-to-
WAN edge, or domain edge-to-domain edge semantics which are difficult
to duplicate when the PHB group is mapped to a different PHB group.
Further, these semantics may also be difficult to duplicate if packet
markings are promoted or demoted within the same PHB group.

A PHB definition MUST clearly state whether packets marked with a
codepoint within a PHB group MAY or SHOULD be promoted, demoted (to
another codepoint within the group), or preserved within a domain. A
PHB definition MUST clearly state whether packets marked with a
codepoint within a PHB group MAY or SHOULD be promoted, demoted, or
preserved across multiple, cooperating domains. A PHB definition
MUST clearly state whether codepoints within a PHB group MAY or
SHOULD be mapped to a different PHB group.

If it is desirable for a PHB group to be changed, the definition
SHOULD clearly state the circumstances under which a change is
desirable.
If it is undesirable for a PHB group to be changed, the
definition MUST clearly state what the risks are when a PHB group is
modified. A PHB definition may include constraints on actions that
change the PHB group. These constraints may be specified as actions
the router SHOULD or MUST perform.

3.6. The PHB definition MUST also include a section defining the
implications of tunneling on the PHB group. This section should
specify the implications for the PHB group of a newly created outer
header when the original PHB group of the inner header is
encapsulated in a tunnel. This section should also discuss what
possible changes should be applied to the inner header at the egress
of the tunnel, when both the PHB groups of the inner header and the
outer header are accessible.

3.7. The process of defining PHB groups is incremental in nature.
When new PHB groups are defined, their known interactions with
previously defined PHB groups MUST be documented. When a new PHB
group is created, it can be entirely new in scope or it can be an
extension to an existing PHB group. If the PHB group is entirely
independent of some or all of the existing PHB definitions, a section
MUST be included in the PHB definition which details how the new PHB
group co-exists with those PHB groups already defined. For example,
this section might indicate the possibility of packet re-ordering
within a microflow whose packets are marked with codepoints from two
separate PHB groups. If concurrent operation of two (or more)
different PHB groups in the same router is impossible or detrimental,
this MUST be stated. If the concurrent operation of two (or more)
different PHB groups requires some specific behaviors by the router
when traffic using these different PHB groups is in the router at the
same time, these behaviors MUST be stated.
If the proposed PHB group is an extension to an existing PHB group, a
section MUST be included in the PHB group definition which details
how this extension inter-operates with the behavior being extended.
Further, if the extension alters or more narrowly defines the
existing behavior in some way, this MUST also be clearly specified in
the PHB definition.

3.8. Each PHB definition MUST include a section specifying minimal
conformance to the PHB group. This conformance section is intended
to provide a means for specifying the details of a behavior while
allowing for implementation variation to the extent permitted by the
PHB definition. This conformance section can take the form of rules,
tables, pseudo-code or tests.

3.9. A PHB definition MUST include a section detailing the security
implications of the behavior. This section should include a
discussion of the mutability of the inner header's PHB group at the
egress of a tunnel. Further, this section should also discuss how
the proposed PHB group could be used in denial-of-service attacks,
reduction-of-service-contract attacks, and service contract violation
attacks. Lastly, this section should discuss the means for detecting
such attacks as they are relevant to the proposed behavior.

3.10. It is strongly RECOMMENDED that an appendix be provided for
each PHB definition that considers the implications of the proposed
behavior on current and potential services. These services could
include, but are not restricted to, user-specific, device-specific,
domain-specific or end-to-end services. It is also strongly
RECOMMENDED that the appendix include a section describing how the
services are verified by users, devices, and/or domains.

3.11.
If the PHB definition is targeted for local use within a
domain, it is RECOMMENDED that the appendix include a description of
how the PHB group is mapped to existing general use PHB groups as
well as other local use PHB groups.

3.12. It is RECOMMENDED that an appendix be provided for each PHB
definition which considers the impact of the proposed new PHB groups
on existing higher-layer protocols. Under some circumstances PHB
definitions may allow for possible changes to higher-layer protocols
which may increase or decrease the utility of the proposed PHB group.

4. Interoperability with Non-Differentiated Services-Compliant Nodes

We define a non-differentiated services-capable node (non-DS-capable
node) as a node which does not interpret the DS field as specified in
[DSFIELD] and/or does not implement some or all of the standardized
PHBs. This may be due to the capabilities or the configuration of
the node. We distinguish such a node from one which does not
implement differentiated forwarding behaviors which can be selected
by the value of the IPv4 TOS byte or the IPv6 Traffic Class byte. We
define a legacy node as one which implements IPv4 Precedence as
defined in [RFC791], but which is otherwise non-DS-capable.

Differentiated services depend on the resource allocation mechanisms
provided by per-hop behavior implementations in nodes. The quality
or statistical assurance level of a service may break down in the
event that traffic transits a non-DS-capable node or a non-DS-capable
domain.

We will examine two separate cases. The first case concerns the use
of non-DS-capable nodes within a DS domain. Note that PHB forwarding
is primarily useful for allocating scarce node and link resources in
a controlled manner.
On high-speed, lightly loaded links, the
worst-case packet delay, jitter, and loss may be negligible, and the
use of a non-DS-capable node on the upstream end of such a link may
not result in service degradation. In more realistic circumstances,
the lack of PHB forwarding in a node may make it impossible to offer
low-delay, low-loss, or provisioned bandwidth services across paths
which traverse the node. However, use of a legacy node may be an
acceptable alternative, assuming that the DS domain restricts itself
to using only the precedence-compatible PHBs defined in [Baker], and
assuming that the particular precedence implementation results in
forwarding behaviors which are compatible with the services offered
along paths which traverse that node.

The second case concerns the behavior of services which traverse
non-DS-capable domains. We assume for the sake of argument that a
non-DS-capable domain does not deploy traffic conditioning functions
on domain edge nodes; therefore, even in the event that the domain
consists of legacy or DS-capable interior nodes, the lack of traffic
enforcement at the edges will limit the ability to consistently
deliver some types of services across the domain. A DS domain and a
non-DS-capable domain may negotiate an agreement which governs how
egress traffic from the DS domain should be marked before entry into
the non-DS-capable domain. This agreement might be monitored for
compliance by traffic sampling instead of by rigorous traffic
conditioning. Alternatively, where there is knowledge that the non-
DS-capable domain consists of legacy nodes, the upstream DS domain
may opportunistically re-mark differentiated services traffic to one
or more IPv4 precedence values.
Where there is no knowledge of the
traffic management capabilities of the domain, and no agreement is in
place, a DS domain egress node may choose to re-mark the DS field to
zero, under the assumption that the non-DS-capable domain will treat
the traffic uniformly with best-effort service.

In the event that a non-DS-capable domain peers with a DS domain,
traffic flowing from the non-DS-capable domain should be conditioned
at the DS ingress node of the DS domain according to the appropriate
SLA or policy.

5. Multicast Considerations

For future study.

6. Security and Tunneling Considerations

This section addresses security issues raised by the introduction of
differentiated services, primarily the potential for denial-of-
service attacks, and the related potential for theft of service by
unauthorized traffic (Section 6.1). In addition, the operation of
differentiated services in the presence of IPsec and its interaction
with IPsec are also discussed (Section 6.2), as well as auditing
requirements (Section 6.3). This section considers issues introduced
by the use of both IPsec and non-IPsec tunnels.

6.1 Theft and Denial of Service

The primary goal of differentiated services is to allow different
levels of service to be provided for traffic streams on a common
network infrastructure. A variety of resource management techniques
may be used to achieve this, but the end result will be that some
packets receive different (e.g., better) service than others. The
mapping of network traffic to the specific behaviors that result in
different (e.g., better or worse) service is indicated primarily by
the DS field, and hence an adversary may be able to obtain better
service by modifying the DS field to values indicating behaviors used
for enhanced services, or by injecting packets with DS fields set to
such values.
Taken to its limits, this theft of service becomes a
denial-of-service attack when the modified or injected traffic
depletes the resources available to forward it and other traffic
streams. The defense against such theft- and denial-of-service
attacks consists of a combination of edge policing and security of
the network infrastructure within a DS domain.

As described in Section 2.1, DS ingress nodes must ensure that all
traffic entering a DS domain has DS field values that are acceptable
to that domain's service provisioning policy. This makes the ingress
nodes the first line of defense against theft-of-service and denial-
of-service attacks based on modified DS field values (e.g., values to
which the traffic is not entitled). An important instance of an
ingress node is that any traffic-originating node in a DS domain is
the ingress node for that traffic, and must ensure that the traffic
carries acceptable DS field values.

A domain's service provisioning policy may require the ingress nodes
to change the DS field values on some entering packets (e.g., an
ingress router may set the DS field values of a customer's traffic in
accordance with the appropriate SLA). Ingress nodes should police
all other inbound traffic to ensure that the DS field values are
acceptable; packets found to have unacceptable values must either be
discarded or must have their DS fields modified to acceptable values
before being forwarded. For example, an ingress node receiving
traffic from a domain with which no enhanced service agreement exists
may reset the DS field to DE(fault) service [DSFIELD].
A service
provisioning policy may require traffic authentication to validate
the use of some DS field values (e.g., those corresponding to
enhanced services), and such authentication may be performed by
technical means (e.g., IPsec) and/or non-technical means (e.g., the
inbound link is known to be connected to exactly one customer site).

An inter-domain agreement may reduce or eliminate the need for
ingress node traffic policing by making the upstream domain partly or
completely responsible for ensuring that traffic has DS field values
acceptable to the downstream domain. In this case, the ingress node
may still perform redundant acceptability checks to reduce the
dependence on the upstream domain (e.g., such checks can prevent
theft-of-service attacks from propagating across the domain
boundary). If an acceptability check fails because the upstream
domain is not fulfilling its responsibilities, that failure is an
auditable event; the generated audit log entry should include the
date/time the packet was received, the source and destination IP
addresses, and the DS field value that caused the failure. In
practice, the limited gains from such checks need to be weighed
against their potential performance impact in determining what, if
any, checks to perform under these circumstances.

Interior nodes in a DS domain may rely on the DS field to associate
differentiated services traffic with the behaviors used to implement
enhanced services. Any node doing so depends on the correct
operation of the DS domain to prevent the arrival of traffic with
unacceptable DS field values. Robustness concerns dictate that the
arrival of packets with unacceptable DS field values must not cause
the failure (e.g., crash) of network nodes.
Interior nodes are not
responsible for enforcing the service provisioning policy (or
individual SLAs) and hence are not required to check DS field values
for acceptability. Interior nodes may perform some acceptability
checks on DS field values (e.g., checking for DS field values that
are never used for traffic on a specific link, never used with a
source/destination address outside a specific range, etc.) to improve
security and robustness (e.g., resistance to theft-of-service attacks
based on DS field modifications). Any detected failure of such an
acceptability check is an auditable event, and the generated audit
log entry should include the date/time the packet was received, the
source and destination IP addresses, and the DS field value that
caused the failure. In practice, the limited gains from such checks
need to be weighed against their potential performance impact in
determining what, if any, checks to perform at interior nodes.

Any link that cannot be adequately secured against modification of DS
field values or traffic injection by adversaries should be treated as
a boundary link (and hence any arriving traffic on that link is
treated as if it were entering the domain at an ingress node). Local
security policy provides the definition of "adequately secured," and
such a definition may include a determination that the risks and
consequences of DS field modification and/or traffic injection do not
justify any additional security measures for a link. Link security
can be enhanced via physical access controls and/or software means
such as tunnels that ensure packet integrity.
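The ingress-node behavior described in this section can be sketched
as follows: police DS field values, remark unacceptable ones to the
Default codepoint rather than discarding, and record an auditable
event containing the prescribed fields. This is illustrative only;
the function and field names are assumptions, and the addresses in
the test are documentation examples:

```python
DEFAULT_DSCP = 0   # DE(fault) service codepoint [DSFIELD]

def police_ds_field(dscp, acceptable, audit_log, received_at, src, dst):
    """Return the codepoint to forward with; log policing failures."""
    if dscp in acceptable:
        return dscp
    # Auditable event: date/time the packet was received, the source
    # and destination IP addresses, and the offending DS field value.
    audit_log.append((received_at, src, dst, dscp))
    # Local policy here chooses remarking over discard.
    return DEFAULT_DSCP
```

A node could equally well discard the packet instead of remarking it;
the choice is a matter of the domain's service provisioning policy.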
6.2 IPsec and Tunneling Interactions

The IPsec protocol, as defined in [ESP, AH], does not include the IP
header's DS field in any of its cryptographic calculations (in the
case of tunnel mode, it is the outer IP header's DS field that is not
included). Hence modification of the DS field by a network node has
no effect on IPsec's end-to-end security, because it cannot cause any
IPsec integrity check to fail. As a consequence, IPsec does not
provide any defense against an adversary's modification of the DS
field (i.e., a man-in-the-middle attack), as the adversary's
modification will likewise have no effect on IPsec's end-to-end
security. In some environments, the ability to modify the DS field
without affecting IPsec integrity checks may constitute a covert
channel; if it is necessary to eliminate such a channel or reduce its
bandwidth, the DS domains should be configured so that the required
processing (e.g., setting all DS fields on sensitive traffic to a
single value) can be performed at DS egress nodes where traffic exits
higher-security domains.

IPsec's tunnel mode provides security for the encapsulated IP
header's DS field. A tunnel mode IPsec packet contains two IP
headers: an outer header supplied by the tunnel ingress node and an
encapsulated inner header supplied by the original source of the
packet. When an IPsec tunnel is hosted (in whole or in part) on a
differentiated services network, the intermediate network nodes
operate on the DS field in the outer header. At the tunnel egress
node, IPsec processing includes stripping the outer header and
forwarding the packet (if required) using the inner header.
Since the inner IP header has not been processed by a DS ingress
node, the tunnel egress node is the DS ingress node for traffic
exiting the tunnel, and hence must carry out the corresponding
responsibilities (see Section 6.1).  If the IPsec processing includes
a sufficiently strong cryptographic integrity check of the
encapsulated packet (where sufficiency is determined by local
security policy), the tunnel egress node can safely assume that the
DS field in the inner header has the same value as it had at the
tunnel ingress node.  If the tunnel ingress node is in the same DS
domain as the tunnel egress node, the tunnel egress node can safely
treat a packet passing such an integrity check as if it had arrived
from another node within the same DS domain and hence omit the DS
ingress node policing that would otherwise be required.  An important
consequence is that otherwise insecure internal links within DS
domains can be secured by a sufficiently strong IPsec tunnel.

This analysis and its implications apply to any tunneling protocol
that performs integrity checks, but the level of assurance of the
inner header's DS field depends on the strength of the integrity
check performed by the tunneling protocol.  In the absence of
sufficient assurance for a tunnel that may transit nodes outside the
current DS domain (or is otherwise vulnerable), the encapsulated
packet must be treated as if it had arrived at a DS ingress node from
outside the domain.

IPsec currently specifies that the inner header's DS field must not
be changed by IPsec decapsulation processing at the tunnel egress
node.  This ensures that an adversary's modifications to the DS field
cannot be used to launch theft- or denial-of-service attacks across
an IPsec tunnel endpoint, as any such modifications will be discarded
at the tunnel endpoint.
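The tunnel-egress decision above reduces to a small amount of logic.
The following Python sketch is purely illustrative (the function,
attribute names, and return strings are hypothetical, not drawn from
any IPsec implementation or from this architecture):

```python
from types import SimpleNamespace

def handle_decapsulated_packet(pkt, tunnel, local_domain):
    """Decide how a DS node treats traffic exiting an IPsec tunnel.

    pkt.integrity_ok       - True if the tunnel's cryptographic integrity
                             check passed and local security policy deems
                             it sufficiently strong (illustrative names).
    tunnel.ingress_domain  - DS domain of the tunnel ingress node.
    """
    if pkt.integrity_ok and tunnel.ingress_domain == local_domain:
        # Inner DS field is trusted to be unchanged since the tunnel
        # ingress node, which sits in this same DS domain: treat the
        # packet as interior traffic and omit ingress policing.
        return "forward_as_interior"
    # Otherwise the tunnel egress node is the DS ingress node for this
    # traffic and must classify and condition it accordingly.
    return "apply_ingress_policing"

# Usage with stand-in packet/tunnel records:
pkt = SimpleNamespace(integrity_ok=True)
tun = SimpleNamespace(ingress_domain="example.net")
result_same = handle_decapsulated_packet(pkt, tun, "example.net")
result_other = handle_decapsulated_packet(pkt, tun, "example.org")
```

The "forward_as_interior" branch captures the consequence noted in
the text: a sufficiently strong IPsec tunnel can stand in for link
security on otherwise insecure internal links.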
Note: the following paragraph requires coordination with and approval
by the Security Area of the IETF, and may result in the need for
brief modifications of the appropriate security RFCs.

A tunnel egress node in a DS domain may modify the DS field in an
inner IP header based on the DS field value in the outer header,
including copying part or all of the outer DS field to the inner DS
field.  For a tunnel contained entirely within a single DS domain and
for which the links are adequately secured against modifications of
the outer DS field, the only limits on modifications are those
imposed by the domain's service provisioning policy.  Otherwise, the
tunnel egress node performing such modifications is acting as a DS
ingress node for traffic exiting the tunnel, and must carry out the
responsibilities of an ingress node, including ensuring that the
resulting DS field values are acceptable (see Section 6.1).

If the tunnel enters the DS domain at a node different from the
tunnel egress node, the tunnel egress node may depend on the upstream
DS ingress node having ensured the acceptability of the outer DS
field value.  Even in this case, there are some acceptability checks
that can only be performed by the tunnel egress node (e.g., a
consistency check between the inner and outer DS field values for an
encrypted tunnel).  Any detected failure of such a check is an
auditable event, and the generated audit log entry should include the
date/time the packet was received, the source and destination IP
addresses, and the DS field value that was unacceptable.  The
requirements in this paragraph apply to any future use of the
currently unused (CU) bits in the IPv4 TOS byte and the IPv6 Traffic
Class byte [DSFIELD].

6.3 Auditing

Not all systems that support differentiated services will implement
auditing.
However, if differentiated services support is incorporated into a
system that supports auditing, then the differentiated services
implementation must also support auditing and must allow a system
administrator to enable or disable auditing for differentiated
services.  For the most part, the granularity of auditing is a local
matter.  However, several auditable events are identified in this
document, and for each of these events a minimum set of information
that should be included in an audit log is defined.  Additional
information also may be included in the audit log for each of these
events, and additional events, not explicitly called out in this
specification, also may result in audit log entries.  There is no
requirement for the receiver to transmit any message to the purported
sender in response to the detection of an auditable event, because of
the potential to induce denial of service via such action.

7. Acknowledgements

The authors would like to acknowledge the following individuals for
their helpful comments and suggestions: Kathleen Nichols, Brian
Carpenter, Konstantinos Dovrolis, Shivkumar Kalyana, Wu-chang Feng,
Marty Borden, Yoram Bernet, Ronald Bonica, James Binder, and Borje
Ohlman.

8. References

[802.1p]   ISO/IEC Final CD 15802-3 Information technology -
           Telecommunications and information exchange between
           systems - Local and metropolitan area networks - Common
           specifications - Part 3: Media Access Control (MAC)
           bridges, (current draft available as IEEE P802.1D/D15).

[AH]       S. Kent and R. Atkinson, "IP Authentication Header",
           Internet Draft, May 1998.

[ATM]      ATM Traffic Management Specification Version 4.0,
           April 1996.

[Baker]    F. Baker, S. Brim, T. Li, F. Kastenholz, S. Jagannath,
           and J.
Renwick, "IP Precedence in Differentiated Services Using the Assured
           Service", Internet Draft, April 1998.

[DSFIELD]  K. Nichols and S. Blake, "Definition of the Differentiated
           Services Field (DS Byte) in the IPv4 and IPv6 Headers",
           Internet Draft, May 1998.

[DSFWK]    Differentiated Services Framework Document (work in
           preparation).

[Clark97]  D. Clark and J. Wroclawski, "An Approach to Service
           Allocation in the Internet", Internet Draft, July 1997.

[Ellesson] E. Ellesson and S. Blake, "A Proposal for the Format and
           Semantics of the TOS Byte and Traffic Class Byte in IPv4
           and IPv6", Internet Draft, November 1997.

[ESP]      S. Kent and R. Atkinson, "IP Encapsulating Security
           Payload", Internet Draft, May 1998.

[Ferguson] P. Ferguson, "Simple Differential Services: IP TOS and
           Precedence, Delay Indication, and Drop Preference",
           Internet Draft, April 1998.

[FRELAY]   ANSI T1S1, "DSSI Core Aspects of Frame Relay", March 1990.

[Heinanen] J. Heinanen, "Use of the IPv4 TOS Octet to Support
           Differentiated Services", Internet Draft, November 1997.

[IntServ]  R. Braden, D. Clark, and S. Shenker, "Integrated Services
           in the Internet Architecture: An Overview", Internet RFC
           1633, July 1994.

[MPLSFWK]  R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow,
           and A. Viswanathan, "A Framework for Multiprotocol Label
           Switching", Internet Draft, November 1997.

[PASTE]    T. Li and Y. Rekhter, "Provider Architecture for
           Differentiated Services and Traffic Engineering (PASTE)",
           Internet Draft, January 1998.

[RFC791]   Information Sciences Institute, "Internet Protocol",
           Internet RFC 791, September 1981.

[RFC1349]  P. Almquist, "Type of Service in the Internet Protocol
           Suite", Internet RFC 1349, July 1992.

[RFC2119]  S.
Bradner, "Key words for use in RFCs to Indicate Requirement Levels",
           Internet RFC 2119, March 1997.

[RSVP]     R. Braden, et al., "Resource ReSerVation Protocol (RSVP)
           -- Version 1 Functional Specification", Internet RFC 2205,
           September 1997.

[SIMA]     K. Kilkki, "Simple Integrated Media Access (SIMA)",
           Internet Draft, June 1997.

[2BIT]     K. Nichols, V. Jacobson, and L. Zhang, "A Two-bit
           Differentiated Services Architecture for the Internet",
           Internet Draft, November 1997.

[TR]       ISO/IEC 8802-5 Information technology - Telecommunications
           and information exchange between systems - Local and
           metropolitan area networks - Common specifications - Part
           5: Token Ring Access Method and Physical Layer
           Specifications, (also ANSI/IEEE Std 802.5-1995), 1995.

Authors' Addresses

   David Black
   The Open Group Research Institute
   Eleven Cambridge Center
   Cambridge, MA  02142
   Phone:  +1-617-621-7347
   E-mail: d.black@opengroup.org

   Steven Blake
   IBM Corporation
   800 Park Offices Drive
   Research Triangle Park, NC  27709
   Phone:  +1-919-254-2030
   E-mail: slblake@raleigh.ibm.com

   Mark A. Carlson
   Redcape Software, Inc.
   2990 Center Green Court South
   Boulder, CO  80301
   Phone:  +1-303-448-0048 x115
   E-mail: mac@redcape.com

   Elwyn Davies
   Nortel UK
   London Road
   Harlow, Essex CM17 9NA, UK
   Phone:  +44-1279-405498
   E-mail: elwynd@nortel.co.uk

   Zheng Wang
   Bell Labs Lucent Tech
   101 Crawfords Corner Road
   Holmdel, NJ  07733
   E-mail: zhwang@bell-labs.com

   Walter Weiss
   Lucent Technologies
   300 Baker Avenue, Suite 100
   Concord, MA  01742-2168
   E-mail: wweiss@lucent.com