Internet Engineering Task Force                R. Guerin/S. Blake/S. Herzog
INTERNET DRAFT                                           IBM/IBM/IPHighway
                                                           21 November 1997

                   Aggregating RSVP-based QoS Requests
                      draft-guerin-aggreg-rsvp-00.txt

Status of This Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months, and may be updated, replaced, or obsoleted by other
   documents at any time.  It is not appropriate to use Internet-Drafts
   as reference material, or to cite them other than as a ``working
   draft'' or ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the internet-drafts
   Shadow Directories on ds.internic.net (US East Coast),
   nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
   munnari.oz.au (Pacific Rim).

Abstract

   This document describes issues and approaches related to the
   aggregation of QoS requests when RSVP [BZB+97] is the protocol used
   to convey such requests.  Aggregation is an important component of
   scalable QoS solutions, especially in the core of the backbone,
   where the sheer number of flows mandates some form of aggregation.
   However, aggregation needs to be provided without impacting the
   ability to provide end-to-end QoS guarantees to individual flows.
   In this document, we review some of the main goals of aggregation
   and describe possible solutions that do not preclude support for
   end-to-end QoS guarantees.
   Those solutions are targeted at unicast flows, as we expect them to
   represent a large fraction of the flows requesting reservations,
   and hence to be the main contributors to potential scalability
   problems with RSVP.

Contents

   Status of This Memo

   Abstract

   1. Introduction

   2. Sample Scenario and Requirements for QoS Aggregation

   3. Data Path Aggregation
      3.1. Tunnel Based Aggregation
      3.2. TOS Field Based Aggregation

   4. Control Path Aggregation
      4.1. Tunnel Based Aggregation
           4.1.1. Setting of Aggregate Reservations
      4.2. TOS Field Based Aggregation
           4.2.1. Ingress-Egress Aggregation: Method 1
           4.2.2. Ingress-Egress Aggregation: Method 2
           4.2.3. Setting of Aggregate Reservations

   5. Conclusion and Recommendations

   A. Router Alert Options for Concealing ``Individual'' PATH Messages
      A.1. IPv4 Syntax
      A.2. IPv6 Syntax

1. Introduction

   As described in [Bra97], there are several facets to the support of
   QoS in the Internet.  The aspect of QoS aggregation with RSVP falls
   primarily in the area of ``Control Model'' and, to some extent,
   ``Scope'', as they are identified in [Bra97].  Specifically, the
   focus of QoS aggregation is on both the granularity of QoS
   guarantees and their extent, i.e., from where to where.

   In RSVP, the granularity of a QoS request is determined through
   filters that specify destination address and port number, as well
   as source address and port number in some instances (see [BZB+97]
   for details).  This corresponds to a very fine granularity of QoS
   guarantees, i.e., per flow, and while this provides end-users with
   accurate control, it can also translate into a substantial overhead
   for the network.  This is particularly true for backbone links,
   where the sheer number of flows (an OC-48 link can carry some
   37,500 64 kbps flows) can introduce a scalability problem.
   Similarly, the scope of RSVP QoS requests is end-to-end, i.e., from
   application to application, and while this again provides end-users
   with maximum control, it can also impose substantial overhead.  For
   example, a network administrator may want to reserve a certain
   amount of bandwidth to interconnect two sites across the network of
   an ISP.  This is not readily feasible under the current RSVP
   specifications, which require that reservations be set up and
   managed between all pairs of end-users in the two sites.  A
   possible alternative is to establish an RSVP ``tunnel'' between the
   two sites, and we discuss this option, but it has the disadvantage
   of additional encapsulation overhead and processing.

   As a result, the issue of QoS aggregation in the context of RSVP
   has two major components.  The first is an extension to RSVP to
   support ``aggregate'' QoS requests, i.e., requests made on behalf
   of a set of flows rather than individual flows.  For example, the
   set of flows to which an aggregate request applies could correspond
   to the traffic between a given source subnet and a given
   destination subnet.  Support for such aggregate requests is not
   available in the current RSVP specifications, and would require the
   definition of new filter specifications.
   One possible example is the CIDR prefix based filters suggested in
   [Boy97].  The introduction of such extensions is certainly key to
   increasing the applicability of RSVP as a generic reservation
   protocol, but in this document we instead focus on the second and
   more immediate aspect of QoS aggregation for RSVP.

   Specifically, we consider the problem of aggregating a large number
   of individual RSVP requests to improve scalability, e.g., on
   backbone links, without precluding support for individual QoS
   guarantees where feasible, e.g., on low speed links and local
   networks.  In other words, the focus of QoS aggregation in this
   document is to provide the means for ensuring individual end-to-end
   QoS guarantees, without requiring that awareness of individual
   flows be maintained on each and every segment of their path.  This
   is an important issue, as the need to maintain and update a large
   number of individual RSVP flow states has often been mentioned as a
   major obstacle to the widespread deployment of RSVP.  The goals of
   this document are, therefore, to review and address the potential
   scalability problems that have been identified with the current
   RSVP specifications, and to propose possible solutions.

   The rest of this document is structured as follows.  In Section 2,
   we first describe a sample scenario illustrating the constraints
   and aspects of QoS aggregation with RSVP.  In Sections 3 and 4, we
   identify specific goals when supporting QoS aggregation for RSVP,
   and propose possible aggregation solutions to achieve them.

2. Sample Scenario and Requirements for QoS Aggregation

   Consider the network topology of Figure 1.  It consists of three
   separate ASs, with the two edge ASs (AS1 and AS3) corresponding to
   local ASs and the middle one (AS2) representing a backbone
   interconnecting the two.  For the purpose of our discussion on QoS
   aggregation, we assume that scalability is of concern only in the
   backbone AS2, i.e., AS1 and AS3 are capable of maintaining RSVP
   state information for all the individual flows that originate and
   terminate in them.  Furthermore, and without loss of generality, we
   focus on RSVP flows between AS1 and AS3 that cross AS2.  In that
   context, QoS aggregation is of concern only for AS2.

          AS1                   AS2                    AS3
        ________          ________________           _________
       /        \        /                \         /         \
      /          \      /                  \       /           \
     |   Access   |    |      Backbone      |    |   Access    |
     |   Network  |----|      Network       |----|   Network   |
     |            |    |                    |    |             |
      \          /      \                  /       \           /
       \________/        \________________/         \_________/

              Figure 1: Sample Network Configuration

   Aggregation of individual RSVP flows through AS2 must satisfy a
   number of requirements, which we briefly review.

   R1  AS2 should not have to maintain awareness of individual RSVP
       flows between AS1 and AS3.  Instead, AS2 should be able to map
       individual RSVP flows onto a few internal service ``classes''.

   R2  AS2 should ensure that it satisfies the QoS requirements of
       individual RSVP flows, e.g., the resources allocated to a
       service class in AS2 should be at least equal to the aggregate
       resources required by all the individual flows mapped onto it.

   R3  Isolation between flows should be maintained in AS2, i.e., even
       when flows are aggregated into a common service class, the
       excess traffic of one flow should not affect the performance
       guarantees of another flow.
   R4  Aggregation in AS2 should not prevent support for individual
       flow reservations in AS1 and AS3.

   Requirement R1 is the core scalability requirement expressed by
   AS2.  It basically states that, because QoS support within AS2 is
   provided through much coarser mechanisms than the control and
   allocation of resources to individual RSVP flows, of which there
   could be far too many, it is necessary for individual RSVP flows to
   be mapped onto one of the internal class-based mechanisms supported
   by AS2.  Coarser class-based mechanisms are usually mandated by the
   speed of the backbone links, where the time available for making
   packet forwarding and scheduling decisions is often not sufficient
   to accommodate per flow operations.  In addition to this constraint
   on forwarding and scheduling decisions, there is a similar
   limitation on the amount of control information that a backbone
   node is capable of maintaining and updating.  Specifically,
   maintaining path and reservation control blocks for individual
   flows may not be practical in AS2.

   Requirements R2 and R3 specify properties that the mapping of
   individual RSVP flows onto the coarser ``classes'' of AS2 has to
   satisfy.  First and foremost, requirement R2 expresses the need for
   some coupling between the resources (bandwidth and buffer) and
   level of service (priority) assigned to a class in AS2, and the
   aggregation of the individual RSVP flows mapped onto that class.
   For example, this means that the amount of bandwidth assigned to a
   class should be sufficient to accommodate the traffic of all the
   RSVP flows mapped onto it.  This must remain true even as flows
   modify their reservations.  Furthermore, requirement R2 also points
   to the fact that service classes in AS2 must be defined so as to
   ensure they can meet the QoS guarantees of any individual flow
   mapped onto them.  This typically means that flows mapped onto the
   same service class must exhibit some level of homogeneity in their
   QoS requirements, or that the service class is dimensioned to meet
   the most stringent QoS requirements of the individual flows mapped
   onto it.

   Requirement R3 is a direct result of the aggregation of individual
   flows.  The QoS guarantees provided to an individual RSVP flow are
   limited to its conformant packets, i.e., packets that comply with
   the advertised TSpec of the flow.  Checking compliance with a flow
   TSpec is readily achieved when per flow awareness is maintained,
   but this ability is lost after flows have been aggregated.  In
   particular, violation of an aggregate TSpec (the ``sum'' of the
   individual TSpec's) can be caused by a single non-conformant flow,
   but can impact the QoS guarantees experienced by all the flows that
   have been aggregated.  As a result, some mechanism is needed to
   identify non-conformant packets even after flows have been merged.
   One possible approach is to use a ``tagging'' capability as
   suggested in [CW97].

   Requirement R4 expresses the important constraint that satisfying
   scalability in AS2 should not come at the expense of functionality
   in AS1 and AS3.  Specifically, the aggregation of control and data
   path information in AS2 should be reversible, so that reservations
   in AS1 and AS3 can default back to individual flows after crossing
   AS2.
   In other words, the hiding of individual flow information in AS2
   should not prevent reservations at a finer level of granularity in
   AS1 and AS3, so that end-to-end RSVP reservations can be supported.
   This means that, for RSVP flows, AS2 should essentially behave as a
   single RSVP ``node''.  Reservations of resources within a node are
   transparent to RSVP, and should not affect end-to-end operation.

   In the next sections, we qualify how the above requirements
   translate into specific goals to support aggregation, and also
   describe possible approaches to satisfy these requirements.

3. Data Path Aggregation

   On the data path, the main issue is the classification of data
   packets to determine the level of QoS they are eligible to receive.
   Performing this classification on the basis of bit patterns that
   are specific to individual flows, i.e., source and destination
   addresses and port numbers, may not scale.  Specifically, storing
   all the patterns corresponding to individual flows holding a
   reservation, and extracting the corresponding patterns from all
   incoming packets, can represent a substantial per packet processing
   overhead.  As a result, the goal of an aggregation solution is to
   map all the bit patterns used to classify individual flows with
   reservations onto a much smaller number of patterns.  There are
   several possible approaches to achieve such a mapping.

3.1. Tunnel Based Aggregation

   A first solution is to rely on RSVP tunnels.  In other words, an
   RSVP tunnel is created between any two ingress and egress points
   for which there exists at least one RSVP flow across AS2.  At the
   ingress, packets (data and control) belonging to the corresponding
   RSVP flows are encapsulated in IP packets with an IP destination
   address identifying the egress point from AS2.  The egress point is
   then responsible for the reverse decapsulation process before
   forwarding packets towards their next hop.  As a result of
   encapsulation, routers on the path in AS2 only require a single
   entry to classify packets from all the associated RSVP flows.  The
   main disadvantages of this solution are the data and processing
   overheads associated with encapsulation, as well as the need for
   close synchronization with routing.  Specifically, the mapping of
   individual RSVP flows onto a given egress point from AS2 depends on
   routing information, and route changes need to be closely monitored
   to determine if and when they affect this mapping.

   In addition to these disadvantages, tunnels *alone* do not easily
   address the above requirement R3 concerning flow isolation.  This
   is because, after encapsulation, conformant packets from one flow
   cannot be distinguished from non-conformant packets of another
   flow.  As a result, it is necessary to discriminate between
   conformant and non-conformant packets at the ingress point of a
   tunnel, e.g., send non-conformant packets as regular
   (non-encapsulated) packets through AS2.  While this satisfies
   requirement R3, it does so at the cost of potentially unnecessary
   penalization of RSVP flows, e.g., out-of-order delivery, even in
   the absence of congestion in AS2.
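   As an illustration, the following minimal sketch (in C) shows the
   encapsulation step at the ingress of such a tunnel, assuming
   IPv4-in-IPv4 encapsulation in the style of RFC 2003.  The function
   and parameter names are ours, and checksum computation is omitted.

      #include <string.h>
      #include <stdint.h>
      #include <arpa/inet.h>    /* htons() */
      #include <netinet/ip.h>   /* struct ip: BSD-style IPv4 header */

      /* Prepend an outer IPv4 header addressed from the tunnel
       * ingress to the tunnel egress.  Routers within AS2 classify
       * and route on this single outer header, regardless of which
       * RSVP flow the inner packet belongs to. */
      size_t tunnel_encap(const uint8_t *inner, size_t inner_len,
                          uint8_t *out, uint32_t ingress_addr,
                          uint32_t egress_addr)
      {
          struct ip outer;

          memset(&outer, 0, sizeof(outer));
          outer.ip_v   = 4;
          outer.ip_hl  = 5;    /* 20-byte header, no options       */
          outer.ip_len = htons((uint16_t)(sizeof(outer) + inner_len));
          outer.ip_ttl = 64;
          outer.ip_p   = 4;    /* IP-in-IP (RFC 2003)              */
          outer.ip_src.s_addr = ingress_addr;  /* tunnel ingress   */
          outer.ip_dst.s_addr = egress_addr;   /* egress out of AS2 */

          memcpy(out, &outer, sizeof(outer));
          memcpy(out + sizeof(outer), inner, inner_len);
          return sizeof(outer) + inner_len;
      }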
3.2. TOS Field Based Aggregation

   A number of other approaches to aggregation have been brought
   forward [CW97, BV97, Hei97, Kil97], with several of them
   [CW97, Hei97, Kil97] proposing the use of fewer bits in the IP
   header for classification purposes.  In particular, it has been
   suggested to use the TOS octet [Pos81] to specify different service
   classes as well as drop precedence.  From the point of view of
   aggregation of RSVP flows, this means that RSVP data packets are
   assigned a value for the TOS field in their IP header that is a
   function of both their service class, e.g., Controlled Load
   [Wro97a] or Guaranteed Service [SPG97], and whether the packet is
   conformant or not.

   Specifically, several TOS bits (the exact number is TBD and a
   function of the number of distinct service classes that are deemed
   necessary) are used to specify service classes.  Data packets from
   RSVP flows entering AS2 then have the TOS field in their IP header
   set accordingly, to reflect the service class they have requested.
   As a result, routers in AS2 can classify packets using TOS bit
   patterns instead of full filters.  Conceptually, on a link at a
   router in AS2, each service class is identified through its
   assigned TOS bit pattern and mapped onto a separate transmission
   queue that has been allocated sufficient resources (bandwidth and
   buffers) to satisfy the QoS requirements of the aggregation of
   flows it carries.

   In addition to the TOS bits identifying the service class to which
   the data packets of RSVP flows belong, one more bit from the TOS
   field is needed to indicate the conformance of packets from each
   flow to their corresponding TSpec.  This explicit indication of
   non-conformant packets is key to enforcing flow isolation
   (requirement R3) and ensuring that the QoS guarantees of individual
   flows are met in spite of their aggregation into a common class.
   For example, as has been suggested, the conformance bit can be used
   by routers in AS2 to implement a ``drop precedence'' policy to
   preferentially discard non-conformant packets in case of
   congestion.

   In general, ensuring that the aggregate resources allocated to each
   service class are adequate to satisfy the QoS guarantees of
   individual RSVP flows, i.e., requirements R2 and R3, requires
   coupling to the RSVP control path, and this aspect is discussed in
   the next section.  However, before addressing this issue, it should
   be noted that the above approach offers a number of benefits over
   those afforded by the previous tunneling solution.  First, it
   avoids the overhead of encapsulation.  Second, the ingress and
   egress processing required is minimal, i.e., an update of the TOS
   field in the IP header (note that this could even be performed in
   the end-stations themselves).  Third, it does not require any
   interactions with routing above and beyond what is normally
   required by RSVP.  In other words, aggregation is supported in a
   manner that is essentially transparent to RSVP.
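   As an illustration, the following minimal sketch (in C) shows the
   TOS update performed where per flow state is still available, e.g.,
   at the ingress of AS2 or in the end-station.  The bit layout is
   purely illustrative, since the actual allocation of TOS bits is
   left TBD above.

      /* Illustrative TOS layout only -- the actual bit allocation
       * is TBD.  Two bits encode the service class and one bit
       * marks non-conformant (excess) packets. */
      #define TOS_CLASS_MASK   0x06  /* bits 1-2: service class   */
      #define TOS_CLASS_CL     0x02  /* Controlled Load           */
      #define TOS_CLASS_GS     0x04  /* Guaranteed Service        */
      #define TOS_OUT_PROFILE  0x01  /* bit 0: packet exceeds the */
                                     /* TSpec of its flow         */

      /* 'tos' points at the TOS octet of the IPv4 header;
       * 'conformant' is the verdict of the per flow policer.  The
       * IP header checksum must be recomputed afterwards (not
       * shown). */
      void mark_tos(unsigned char *tos, unsigned char svc_class,
                    int conformant)
      {
          *tos &= (unsigned char)~(TOS_CLASS_MASK | TOS_OUT_PROFILE);
          *tos |= svc_class;
          if (!conformant)
              *tos |= TOS_OUT_PROFILE;  /* drop precedence target */
      }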
4. Control Path Aggregation

   The aggregation of control information associated with individual
   RSVP flows is just as important for scalability as its data path
   counterpart.  Specifically, maintaining PATH and RESV states for
   individual RSVP flows can represent a substantial burden in
   backbone routers, which need to support a large number of flows.
   The goal of QoS aggregation is then to eliminate, or at least
   minimize, the amount of per flow control information that needs to
   be maintained.  As with data path aggregation, this needs to be
   done while maintaining the QoS guarantees requested by individual
   flows.  In particular, the resources allocated to a set of
   aggregated flows must reflect the ``sum'' of the reservation
   requests conveyed in individual RESV messages.  How this, and the
   requirements identified earlier (R2, R3, and R4), are met varies
   according to the data path aggregation method used.  Note that
   aggregation also offers the opportunity for greater efficiency
   because of the potential benefits of statistical multiplexing.
   However, some care must be applied to avoid under-provisioning of
   resources in the backbone, e.g., aggregation may affect measurement
   based call admission rules in backbone routers.

4.1. Tunnel Based Aggregation

   When QoS aggregation is achieved through the use of an RSVP tunnel,
   all RSVP control messages for individual flows are encapsulated
   and, therefore, not seen by any of the intermediate routers (in
   AS2).  However, because those messages are carried across the
   tunnel, after decapsulation at the egress router they will be
   forwarded as usual, so that reservations for individual RSVP flows
   can still be established on the rest of the path, i.e., in AS1 and
   AS3.  Note that because individual PATH messages are encapsulated,
   their ADSPEC is not updated as they cross the backbone.  At the
   egress router of the tunnel, updating the ADSPEC in the PATH
   messages of individual flows is carried out using the corresponding
   ADSPEC fields from the PATH messages of the tunnel itself.
   Specifically, hop count, path latency, service specific quantities
   such as the Guaranteed Service error terms, etc., are all updated
   as if the ADSPEC values for the tunnel were those of a single
   ``node'' (AS2 is considered as one node).

   As far as the tunnel is concerned, its establishment is the
   responsibility of the ingress and egress routers at its end-points,
   which generate new RSVP control messages with their addresses as
   the source and destination addresses (1).  In addition, the traffic
   specification (TSpec) and reservation levels (FLOWSPEC) specified
   in these messages need to adequately reflect the requirements of
   the flows aggregated into the tunnel.  In particular, the type of
   service used for a tunnel should match that of the flows being
   aggregated on the tunnel, e.g., Controlled Load flows should be
   aggregated onto a Controlled Load tunnel.

   ----------------------------
   1. Note that a possible alternative is to use a layer 2, e.g., ATM,
   tunnel, which would then be set up using the available layer 2
   signalling.

4.1.1. Setting of Aggregate Reservations

   In the case of a Controlled Load tunnel, the aggregate TSpec used
   in the PATH messages for the tunnel needs to be selected so as to
   accommodate the TSpec's of all the flows it aggregates.  A natural
   selection is to choose the sum of the TSpec's of all the individual
   flows being aggregated (see [Wro97b] for a discussion of how
   TSpec's are to be summed).  Similarly, the TSpec specified in the
   FLOWSPEC of the RESV messages for the tunnel should again be chosen
   to ensure that the aggregated flows receive a level of service
   consistent with their individual requests.  One option is to again
   select the sum of the individual FLOWSPEC's, although as mentioned
   above, the potential benefits of statistical multiplexing may allow
   a lower reservation level.
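   As an illustration, one plausible summation is sketched below (in
   C): token rates, bucket depths, and peak rates add, while the
   minimum policed unit is the smallest, and the maximum packet size
   the largest, of the component flows.  The structure mirrors the
   Int-Serv token bucket TSpec; see [Wro97b] for the normative
   summation rules.

      /* Int-Serv token-bucket TSpec (cf. [Wro97b]). */
      struct tspec {
          double   r;  /* token bucket rate, bytes/s   */
          double   b;  /* token bucket depth, bytes    */
          double   p;  /* peak rate, bytes/s           */
          unsigned m;  /* minimum policed unit, bytes  */
          unsigned M;  /* maximum packet size, bytes   */
      };

      /* Sum the TSpec's of 'n' individual flows (n >= 1) to obtain
       * the aggregate TSpec advertised in the tunnel's own PATH
       * message. */
      struct tspec tspec_sum(const struct tspec *flow, int n)
      {
          struct tspec sum = flow[0];
          int i;

          for (i = 1; i < n; i++) {
              sum.r += flow[i].r;
              sum.b += flow[i].b;
              sum.p += flow[i].p;
              if (flow[i].m < sum.m) sum.m = flow[i].m;
              if (flow[i].M > sum.M) sum.M = flow[i].M;
          }
          return sum;
      }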
   Irrespective of the aggregate reservation level specified,
   satisfying the QoS guarantees of individual flows is also
   predicated on the ``proper'' handling of excess traffic, i.e.,
   packets from each flow that do not conform to their individual
   TSpec.  Specifically, excess traffic MUST NOT be forwarded onto the
   RSVP tunnel, unless some form of explicit identification of excess
   traffic is provided.  For example, this could be achieved through
   the use of a bit from the TOS field in the IP header of packets, as
   suggested in Section 3.2.

   The case of a Guaranteed Service tunnel is somewhat more involved.
   There are two issues that need to be addressed.  The first is the
   update of the ADSPEC in the PATH messages of individual flows at
   the egress router (see [SPG97] for details on the use of the
   ADSPEC).  The second is the selection of an appropriate TSpec and
   RSpec for the tunnel, so that the delay bounds of all individual
   flows can be guaranteed.  The handling of these two issues is not
   independent, and there are many possible solutions.  In this
   document, we outline only one of several alternatives.

   The update of the ADSPEC can be done as described before, using the
   ADSPEC values of the tunnel.  The determination of appropriate
   TSpec and RSpec values for the tunnel essentially follows the
   method described in [RG97].  Specifically, the TSpec used for the
   tunnel needs to be at least the ``sum'' of the TSpec's of the
   individual flows.  Similarly, the reserved rate R of the RSpec is
   determined using eqs. (6) and (7) of [RG97], with the only
   difference that the individual delay bounds used in eq. (7) are
   only for the portion of the flows' paths that coincides with the
   tunnel.  This partial delay bound for individual flows is readily
   computed from the TSpec of individual flows, their RSpec, and the
   error terms for the portion of their path that corresponds to the
   tunnel.

   It should be pointed out that, as mentioned in [RG97], the
   resulting aggregate reservation rate for the tunnel can be either
   smaller or larger than the sum of the individual reservation rates.
   Another point worth noting concerns the possible use of the slack
   term, in particular when individual flows specify a non-zero slack
   and a reservation rate R equal to their token rate r, i.e., they
   could tolerate a higher delay but cannot ask for a lower rate.  In
   the case of a tunnel, the slack could be used to increase the
   individual delay bound for that flow used in eq. (7), provided that
   the sum of the token rates of individual flows remains smaller than
   or equal to the aggregate reservation rate.
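   To make the delay computation concrete, the sketch below (in C)
   evaluates the Guaranteed Service delay bound of [SPG97] for a
   single flow, with the C and D error terms restricted to the tunnel
   portion of its path.  This is only the per flow partial delay bound
   used as an input to eq. (7) of [RG97]; the selection of the
   aggregate rate itself follows [RG97].

      /* Partial delay bound (seconds) over the tunnel for one flow,
       * per the Guaranteed Service formula of [SPG97]: b, M and
       * c_tun in bytes; r, p and R in bytes/s; d_tun and the result
       * in seconds.  GS requires R >= r. */
      double gs_partial_delay(double b, double M, double r, double p,
                              double R, double c_tun, double d_tun)
      {
          if (p > R)  /* burst drains while the peak rate exceeds R */
              return (b - M) / R * (p - R) / (p - r)
                     + (M + c_tun) / R + d_tun;
          else        /* R >= p: only the packet-level terms remain */
              return (M + c_tun) / R + d_tun;
      }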
4.2. TOS Field Based Aggregation

   The case of TOS based QoS aggregation is different from that of a
   tunnel, because the egress point associated with a particular flow
   is not identified *a priori* at the ingress router.  This has the
   advantage of eliminating the need for an ingress router to
   continuously interact with routing to monitor possible changes in
   egress routers and in the mapping of individual flows into tunnels.
   However, this means that some other mechanisms are needed to ensure
   that the appropriate amount of resources is reserved for RSVP flows
   between the associated ingress and egress routers.

   There are many possible approaches that one can follow, and in this
   document we describe two, which we feel represent reasonable
   trade-offs between simplicity and minimization of backbone
   overhead.  Other alternatives are clearly possible.  In both
   approaches, as in the tunneling case, a key goal is to avoid, or at
   least minimize, awareness and/or processing of individual flows in
   the backbone.  Satisfying this goal has several requirements and
   implications:

   -  Disable processing of (most) individual RSVP messages in the
      backbone, while still allowing their identification when they
      arrive at egress or ingress routers.

   -  Identify transparently, i.e., without relying on interactions
      with routing, the egress routers corresponding to individual
      flows entering the backbone at an ingress router.

   -  Reserve the appropriate amount of resources on backbone links to
      satisfy the requirements of individual flows routed over them.

   -  Properly update the RSVP PATH messages of individual flows at
      egress routers.

   In what follows, we describe two possible approaches to achieving
   those goals.

4.2.1. Ingress-Egress Aggregation: Method 1

   Next, we describe a first approach, and the steps performed at
   ingress and egress routers to both identify each other and ensure
   proper aggregation of flows and allocation of resources between
   them.

   -  For new flows, the ingress router starts forwarding
      ``individual'' PATH messages carrying a Policy Object containing
      its IP address.  Receipt of those individual PATH messages
      provides the associated egress routers with the identity of the
      ingress router for the flow.  The individual PATH messages are
      initially processed by the backbone routers, and reach the
      egress router with updated ADSPEC information.

   -  Upon receiving an ``individual'' PATH message with a Policy
      Object specifying a new ingress router, the egress router logs
      the association between the flow and the ingress router and
      forwards the PATH message.

   -  Upon receipt of a RESV message (2) for a flow, the egress router
      forwards the ``individual'' RESV message with a Policy Object
      specifying its IP address.  The Policy Object will eventually be
      delivered to the ingress router, and inform it of the identity
      of the egress router associated with the flow.

      ----------------------------
      2. Alternatively, the egress router could generate a ``fake''
      RESV message, e.g., with a near zero reservation, immediately
      after receiving the first PATH message.  This has the benefit of
      making the ingress aware of the egress sooner.

   -  Upon receipt of a RESV message identifying a new egress, or when
      the ingress router deems there are sufficient flows to a given
      egress to consider aggregating them, it starts sending PATH
      messages destined to this egress and representing the
      aggregation of all flows destined to it.  At the same time, the
      ingress router starts sending the PATH messages corresponding to
      individual flows in a format that ``hides'' them from backbone
      routers (more on this below), but not from the egress router.

   -  Upon receipt of a PATH message destined to itself, the egress
      router sends a RESV message with an aggregate reservation for
      all the flows it has logged as coming from the associated
      ingress router.  At the same time, it starts sending RESV
      messages for the individual flows directly to the ingress
      router.
      This ensures that they will not be processed by backbone
      routers, and any existing reservations for individual flows in
      the backbone will time out.  Note that to lower the potential
      for call admission failure, the egress router may want to
      progressively increase the reservation level in its aggregate
      RESV message.  This may give it a better chance of recapturing
      bandwidth as it is being released, when the reservation states
      of individual flows time out.

   -  Upon receipt of ``hidden'' PATH messages for individual flows,
      the egress router changes them back to ``standard'' PATH
      messages and updates them with the ADSPEC information from the
      PATH message originated by the associated ingress router, before
      forwarding them downstream.

   -  Upon receipt of RESV messages for individual flows from a known
      egress router, the ingress router simply forwards them upstream.

   The above steps ensure that ingress and egress routers become aware
   of each other without having to directly query routing, and also
   ultimately remove awareness of individual flows from backbone
   routers.  However, it is still necessary to describe how route
   changes within the backbone are handled.  This is tightly coupled
   to the approach used to ``hide'' RSVP PATH messages in the
   backbone, and we therefore describe this next.

   In the case of tunnels, individual RSVP messages were ``hidden'' on
   backbone links because they were encapsulated within another IP
   header.  As a result, backbone routers would forward them as
   regular IP packets.  Furthermore, because the destination address
   in the encapsulating IP header was that of the egress (ingress)
   router, decapsulation would be performed and ensure proper
   identification and processing of the RSVP messages.  Such a
   solution is not applicable in the case of TOS based aggregation,
   because of the decoupling from routing, i.e., the identity of the
   egress or ingress router, even if known, cannot be used to ensure
   delivery of RSVP messages.

   There are several possible options to overcome those problems while
   avoiding processing of RSVP messages from individual flows in the
   backbone.  Processing of RSVP (PATH) messages from individual flows
   in the backbone can be avoided simply by hiding the information
   used to trigger RSVP processing, i.e., by turning the router alert
   option [Kat97, KAPJ97] off at the ingress router.  The problem is
   then that without the router alert option on, the egress router
   will also fail to identify, and therefore to intercept and process,
   those PATH messages.

   There are several possible solutions to this problem.  One is to
   use some other bit pattern in the IP header that egress routers can
   use to identify RSVP PATH messages from individual flows.  For
   example, a TOS bit combination could be assigned to indicate
   ``aggregated control information.''  Routers responsible for
   de-aggregating control information, e.g., egress routers, would
   then intercept such packets, while other routers (backbone routers)
   would ignore them.  Another option is to require that egress
   routers examine the protocol number of all arriving packets, even
   when the router alert option is not set.  This may, however, impose
   a significant performance penalty.  A third option is to keep the
   router alert option set, but to use a different protocol number
   inside the backbone.  Backbone routers would still intercept RSVP
   PATH messages from individual flows, but would not need to process
   them any further, i.e., upon identifying the new protocol number
   they would simply forward the packet on.  A last option is to
   define a *new* router alert option for ``Unaggregated RSVP''
   messages, which would be silently ignored by backbone routers, but
   recognized by access (ingress/egress) routers.
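   As an illustration, the forwarding decision under this last option
   can be sketched as follows (in C).  The option value chosen for
   ``Unaggregated RSVP'' is purely illustrative; an actual value would
   have to be assigned from the currently reserved range (see
   Appendix A).

      #define RAO_EVERY_ROUTER  0  /* existing value: examine pkt */
      #define RAO_UNAGG_RSVP    1  /* new value, illustrative only */

      enum rao_verdict { RAO_FORWARD, RAO_INTERCEPT };

      /* Disposition of a packet carrying a router alert option.
       * 'is_deaggregator' is configured only on routers that
       * de-aggregate control information, e.g., egress routers. */
      enum rao_verdict rao_disposition(unsigned value,
                                       int is_deaggregator)
      {
          if (value == RAO_EVERY_ROUTER)
              return RAO_INTERCEPT;  /* normal RSVP processing     */
          if (value == RAO_UNAGG_RSVP && is_deaggregator)
              return RAO_INTERCEPT;  /* ``unhide'' at the egress   */
          return RAO_FORWARD;  /* unrecognized values are silently */
                               /* ignored, per [Kat97]             */
      }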
   This last alternative (see Appendix A for additional details)
   appears to provide a reasonable trade-off that ensures the required
   functionality at egress routers while keeping the backbone overhead
   reasonable.

   Assuming that one of the above mechanisms is used, PATH messages
   for individual flows are now automatically delivered directly from
   ingress routers to the appropriate egress routers.  However, note
   that these PATH messages are not processed at any of the backbone
   routers they traverse.  The main implication for the egress is that
   the ADSPEC field of the PATH messages has not been updated to
   reflect the characteristics of the backbone path they have
   traversed.  As a result, they cannot be readily propagated forward
   by the egress router, unless the information needed to properly
   update their ADSPEC is *already* available at the egress router.
   This is one of the motivations for the above choice of initially
   sending individual PATH messages into the backbone, as this enables
   the egress to first acquire the necessary information to update the
   ADSPEC of ``hidden'' PATH messages.  However, this approach does
   not address the problem in the case of route changes in the
   backbone.

   Route changes in the backbone result in ``hidden'' PATH messages
   being delivered to a *new* egress, without being preceded by
   corresponding ``clear'' PATH messages.  As a result, the new egress
   does not have the necessary information to update the ADSPEC of the
   ``hidden'' PATH messages it starts receiving.  Hence, those
   messages cannot be propagated forward.  In order to address this
   problem, the ingress router needs to become aware of the route
   change.  The simplest approach is to rely on RSVP soft states.
   Basically, the ingress router will detect that it has stopped
   receiving RESV messages from the old egress router (at least for
   the flows affected by the route change).  It can then use that
   information as a trigger to start forwarding the PATH messages of
   those flows again as *regular* RSVP PATH messages.  As a result,
   they will be processed by intermediate backbone routers, and we are
   back to the initial case described above.

4.2.2. Ingress-Egress Aggregation: Method 2

   In this section, we describe a second alternative, which is mostly
   a variation on the general method described in the previous
   section.  The main motivation for this variation is to avoid *all*
   processing of individual RSVP flows in the backbone.  This is
   desirable as even the limited processing of individual RSVP flows
   required from backbone routers by Method 1 can represent a
   substantial processing load when flows are of short duration.  In
   addition, this second method can avoid reliance on Policy Objects.

   The main difference with the previous method is that the PATH
   messages from individual flows are never sent directly into the
   backbone.  Instead, they are always forwarded as ``hidden''.
   The main issue is then to determine how to inform the ingress
   router of the identity of the egress router associated with each
   individual flow, without relying on explicit queries to routing.
   We describe next the different steps involved in addressing this
   issue.

   -  Upon receipt of a new PATH message, the ingress router forwards
      it as ``hidden'' into the backbone.

   -  On receipt of a hidden PATH message for a new flow, the egress
      router immediately notifies the ingress router of its existence
      (the identity of the ingress is carried in the PHOP of the PATH
      message).  This notification can take several forms.  One
      possibility is for the egress router to generate a PATH_ERR
      message (with some appropriate new error code) directly destined
      to the ingress router.  Another possibility is for the egress
      router to generate a ``fake'' RESV message with a near-zero
      reservation (FLOWSPEC).  Note that, as discussed earlier,
      ``hidden'' PATH messages cannot be forwarded until the
      information needed to update their ADSPEC is available (more on
      this below).

   -  On receipt of a ``fake'' RESV or a PATH_ERR from a new egress,
      the ingress proceeds to send a ``regular'' aggregate PATH
      message to that egress.

   -  On receipt of an aggregate PATH message (destined to itself),
      the egress now has the information necessary to update the
      ADSPEC of the individual PATH message and can start forwarding
      it.  The main disadvantage here is the latency incurred in
      forwarding the individual PATH message.  However, this latency
      is typically only incurred by the first flow from a given
      ingress.  The egress can use the ADSPEC from the aggregate PATH
      message to update and immediately forward the PATH messages of
      subsequent flows from that ingress.

   -  On receipt of a new RESV message for an individual flow, the
      egress sends a RESV message associated with the aggregate PATH
      from the corresponding ingress (or updates an existing RESV).
      The individual RESV messages are then forwarded directly to the
      ingress router.

   As mentioned earlier, the method embodied in the above steps avoids
   any processing of individual flows in the backbone.  The cost is an
   increased latency in propagating the first PATH message of the
   first flow from the associated ingress.

4.2.3. Setting of Aggregate Reservations

   Proper selection of appropriate aggregate reservation levels
   requires some care, especially for Guaranteed Service flows.  For
   Controlled Load flows, it is only necessary that, in backbone
   routers, the queue assigned to Controlled Load traffic be allocated
   the proper service rate.  Since rate is an additive quantity,
   aggregate reservations can be based on the sum of the FLOWSPECs of
   individual flows.  The situation is again more complex for
   Guaranteed Service flows.

   The main difference with the tunnel-based case is that, on any link
   in the backbone, the overall aggregation of packets/flows with the
   same TOS value (corresponding to the Guaranteed Service) is not
   known to either the ingress or egress routers associated with the
   individual RSVP flows whose route through the backbone includes
   that link.  As a result, the egress router cannot use the approach
   of Section 4.1.1 to determine an appropriate aggregate service rate
   that will ensure that all individual delay bounds are met.

   In order to support aggregated Guaranteed Service flows in this
   setting, it is necessary to change the ``node model'' used to
   represent the backbone.  Specifically, an approach similar to the
   one used in the ISSLL drafts to account for ATM networks can be
   used.  It amounts to representing the backbone as a delay-only
   node.  In other words, the backbone only contributes to the D error
   term of the ADSPEC and not to the C term.  The main difference with
   an ATM network is that, contrary to ATM switches, individual
   backbone routers will update the ADSPEC in PATH messages.  In order
   to ensure a behavior consistent with that of a delay-only node,
   each individual router needs to update only the D error term of the
   ADSPEC of the PATH messages it processes.  The implication of this
   behavior is that the scheduling and call admission support for
   Guaranteed Service flows in backbone routers will be based on
   ensuring a fixed delay upper bound for the TOS queue assigned to
   Guaranteed Service packets.  This delay upper bound will then be
   the quantity used to update the D error term in the ADSPEC field of
   PATH messages.
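   As an illustration, the resulting PATH processing at a backbone
   router can be sketched as follows (in C).  The structure and field
   names are ours; in the ADSPEC of [SPG97], the C term is expressed
   in bytes and the D term in microseconds.

      /* Guaranteed Service error terms carried in the ADSPEC. */
      struct gs_adspec {
          unsigned long ctot;  /* rate-dependent C term, bytes     */
          unsigned long dtot;  /* rate-independent D term, usec    */
      };

      /* Backbone router behaving as part of a delay-only node: only
       * the D term advances, by the fixed delay bound enforced on
       * the Guaranteed Service TOS queue of the outgoing link.  The
       * C term is deliberately left unchanged. */
      void delay_only_update(struct gs_adspec *ads,
                             unsigned long gs_queue_bound_usec)
      {
          ads->dtot += gs_queue_bound_usec;
      }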
5. Conclusion and Recommendations

   In this draft, we have outlined issues and proposed possible
   approaches to allow the aggregation of individual RSVP flows,
   without precluding support for individual reservations where
   feasible.  This can enable delivery of the end-to-end and per flow
   QoS guarantees supported by RSVP and the Int-Serv services, while
   avoiding possible scalability limitations.

   As a result of this exercise, several requirements emerged to
   support the different aggregation methods that were discussed.
   These requirements are summarized below:

   -  Allocation of one bit from the TOS field of the IP header to
      specify in-profile and out-of-profile packets.

   -  Allocation of one bit pattern from the TOS field that can be
      mapped to the Controlled Load service, and of at least one bit
      pattern from the TOS field that can be mapped to the Guaranteed
      Service (two would be preferable, to provide some granularity in
      the delay bounds for Guaranteed Service flows).

   -  Support for a mechanism to selectively ``hide'' RSVP control
      messages.  Specifically, the preferred mechanism is the
      introduction of a new Router Alert option, which can be
      selectively recognized or ignored by routers.

References

   [Boy97]  J. Boyle.  RSVP extensions for CIDR aggregated data flows,
            (draft-rsvp-cidr-ext-00.txt).  Internet draft (work in
            progress), Internet Engineering Task Force, February 1997.

   [Bra97]  S. Bradner.  Internet protocol quality of service problem
            statement, (draft-bradner-qos-problem-00.txt).  Internet
            draft (work in progress), Internet Engineering Task Force,
            September 1997.

   [BV97]   S. Berson and S. Vincent.  A ``classy'' approach to
            aggregation for integrated services,
            (draft-berson-classy-approach-00.txt).  Internet draft
            (work in progress), Internet Engineering Task Force, March
            1997.

   [BZB+97] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin.
            Resource reSerVation Protocol (RSVP) version 1, functional
            specification.  RFC 2205 (proposed standard), Internet
            Engineering Task Force, September 1997.

   [CW97]   D. Clark and J. Wroclawski.  An approach to service
            allocation in the Internet,
            (draft-clark-diff-svc-alloc-00.txt).
            Internet draft (work in progress), Internet Engineering
            Task Force, July 1997.

   [Hei97]  J. Heinanen.  Use of the IPv4 TOS octet to support
            differential services,
            (draft-heinanen-diff-tos-octet-00.txt).  Internet draft
            (work in progress), Internet Engineering Task Force,
            October 1997.

   [KAPJ97] D. Katz, R. Atkinson, C. Partridge, and A. Jackson.  IP
            router alert option,
            (draft-ietf-ipngwg-ipv6-router-alert-03.txt).  Internet
            draft (work in progress), Internet Engineering Task Force,
            July 1997.

   [Kat97]  D. Katz.  IP router alert option.  RFC 2113 (proposed
            standard), Internet Engineering Task Force, February 1997.

   [Kil97]  K. Kilkki.  Simple integrated media access (SIMA),
            (draft-kalevi-simple-media-access-01.txt).  Internet draft
            (work in progress), Internet Engineering Task Force, June
            1997.

   [Pos81]  J. Postel.  Internet protocol.  RFC 791 (standard),
            Internet Engineering Task Force, September 1981.

   [RG97]   S. Rampal and R. Guerin.  Flow grouping for reducing
            reservation requirements for Guaranteed Delay service,
            (draft-rampal-flow-delay-service-01.txt).  Internet draft
            (work in progress), Internet Engineering Task Force, July
            1997.

   [SPG97]  S. Shenker, C. Partridge, and R. Guerin.  Specification of
            guaranteed quality of service.  RFC 2212 (proposed
            standard), Internet Engineering Task Force, September
            1997.

   [Wro97a] J. Wroclawski.  Specification of the controlled-load
            network element service.  RFC 2211 (proposed standard),
            Internet Engineering Task Force, September 1997.

   [Wro97b] J. Wroclawski.  The use of RSVP with IETF integrated
            services.  RFC 2210 (proposed standard), Internet
            Engineering Task Force, September 1997.

A. Router Alert Options for Concealing ``Individual'' PATH Messages

   As discussed in Section 4.2, the scalability of RSVP is improved
   when using TOS field based aggregation if the PATH messages from
   individual applications are concealed from the interior routers in
   the backbone.  PATH messages are addressed either to a destination
   host or to a multicast group, and are transmitted with the IP
   router alert option as defined in [Kat97] or [KAPJ97].  This allows
   routers along their transit path to intercept the packets for RSVP
   processing.  To prevent the backbone routers from intercepting and
   processing the PATH messages from individual applications, while
   allowing the aggregating egress routers to recognize and intercept
   them, a new router alert option value may be used.

   The syntax of the IPv4 router alert option is defined as follows
   [Kat97]:

A.1. IPv4 Syntax

   The Router Alert option has the following format:

      +--------+--------+--------+--------+
      |10010100|00000100|  2 octet value  |
      +--------+--------+--------+--------+

      Type:
        Copied flag:   1 (all fragments must carry the option)
        Option class:  0 (control)
        Option number: 20 (decimal)

      Length: 4

      Value:  A two octet code with the following values:
              0       - Router shall examine packet
              1-65535 - Reserved

   The specification states that ``Unrecognized value fields shall be
   silently ignored''.
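   As an illustration, the 4-octet option is trivially constructed as
   follows (in C); a new option value, e.g., for ``Unaggregated
   RSVP'', would have to be assigned from the reserved range.

      /* Build the IPv4 Router Alert option of [Kat97] into opt[].
       * The first octet, binary 10010100 (0x94), encodes copied
       * flag 1, option class 0 and option number 20. */
      void build_rao(unsigned char opt[4], unsigned short value)
      {
          opt[0] = 0x94;                 /* type                  */
          opt[1] = 4;                    /* length                */
          opt[2] = (value >> 8) & 0xff;  /* value, network order  */
          opt[3] = value & 0xff;         /* 0 = examine packet    */
      }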
   The syntax of the IPv6 router alert option is defined as follows
   [KAPJ97]:

A.2. IPv6 Syntax

   The router alert option has the following format:

      +--------+--------+--------+--------+
      |00| TBD | Len= 2 | Value (2 octets)|
      +--------+--------+--------+--------+

   ``TBD'' is the Hop-by-Hop Option Type number (to be allocated by
   the IANA).

   Nodes not recognizing this option type SHOULD skip over this option
   and continue processing the header.  This option MUST NOT change en
   route.  There MUST only be one option of this type, regardless of
   value, per Hop-by-Hop header.

      Value:  A 2 octet code in network byte order with the following
              values:

              0        Datagram contains ICMPv6 Group Membership
                       message.
              1        Datagram contains RSVP message.
              2        Datagram contains an Active Networks message
                       [ANEP97].
              3-65535  Reserved to IANA for future use.

   New value fields must be registered with the IANA.

   This specification states that ``Unrecognized value fields MUST be
   silently ignored and the processing of the header continued''.

   There are two alternatives which would satisfy the requirement to
   ``hide'' application PATH messages (when necessary) from the
   backbone routers:

   -  Define a 2 octet router alert option value for both IPv4 and
      IPv6 which signifies that the datagram contains an
      ``Unaggregated RSVP Message''.  The router should silently
      ignore this router alert option and continue to forward the
      packet unless specifically configured to recognize and intercept
      it.

   -  Define a 2 octet router alert option value for both IPv4 and
      IPv6 which signifies that the router should ``Ignore by
      Default''.  The router should silently ignore this router alert
      option and continue to forward the packet unless specifically
      configured to recognize and intercept it.

   PATH messages from individual applications would be transmitted by
   the aggregating ingress router using either router alert option
   value (whichever is defined) whenever it employs TOS field based
   aggregation to a particular egress router.  Aggregated PATH
   messages to that router would be transmitted with the default
   router alert option value used for RSVP.  The backbone routers
   would be configured to ignore router alert options using this new
   option value.  The aggregating egress routers would be configured
   to intercept packets transmitted with the new router alert option
   value.

Authors' Addresses

   Roch Guerin
   IBM T.J. Watson Research Center
   P.O. Box 704
   Yorktown Heights, NY 10598
   Phone: +1 914 784-7038
   Fax:   +1 914 784-6205
   Email: guerin@watson.ibm.com

   Steven Blake
   E95/664
   IBM Corporation
   800 Park Offices Drive
   Research Triangle Park, NC 27709
   Phone: +1 919 254-2030
   Fax:   +1 919 254-5483
   Email: slblake@raleigh.ibm.com

   Shai Herzog
   IPHighway
   Email: herzog@iphighway.com