Draft                                                     Fred Baker
                                                     Carol Iturralde
                                                Francois Le Faucheur
                                                         Bruce Davie
                                                       Cisco Systems

RSVP Reservation Aggregation                          September 1999

         Aggregation of RSVP for IPv4 and IPv6 Reservations
                  draft-ietf-issll-rsvp-aggr-00.txt

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet Drafts as reference material or to cite them other than as a "work in progress". Comments should be made to the authors and the rsvp@isi.edu list.

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright (C) The Internet Society (1999). All Rights Reserved

Abstract

A key problem in the design of RSVP version 1 is, as noted in its applicability statement, that it lacks facilities for aggregation of individual reserved sessions into a common class. The use of such aggregation is required for scalability.

This document describes the use of a single RSVP reservation to aggregate other RSVP reservations across a transit routing region, in a manner conceptually similar to the use of Virtual Paths in an ATM network. It proposes a way to dynamically create the aggregate reservation, classify the traffic for which the aggregate reservation applies, determine how much bandwidth is needed to achieve the requirement, and recover the bandwidth when the sub-reservations are no longer required. It also contains recommendations concerning algorithms and policies for predictive reservations.

1. Introduction

A key problem in the design of RSVP version 1 [RSVP] is, as noted in its applicability statement, that it lacks facilities for aggregation of individual reserved sessions into a common class. The use of such aggregation is recommended in [CSZ], and required for scalability.

The problem of aggregation may be addressed in a variety of ways.
For example, it may sometimes be sufficient simply to mark reserved traffic with a suitable DSCP (e.g. EF), thus enabling aggregation of scheduling and classification state. It may also be desirable to install one or more aggregate reservations from ingress to egress of an "aggregation region" (defined below) where each aggregate reservation carries similarly marked packets from a large number of flows. This is to provide high levels of assurance that the end-to-end requirements of reserved flows will be met, while at the same time enabling reservation state to be aggregated.

Throughout, we will talk about "Aggregator" and "Deaggregator", referring to the routers at the ingress and egress edges of an aggregation region. Exactly how a router determines whether it should perform the role of aggregator or deaggregator is described below.

We will refer to the individual reserved sessions (the sessions we are attempting to aggregate) as "end-to-end" reservations ("E2E" for short), and to their respective Path/Resv messages as E2E Path/Resv messages. We refer to the larger reservation (that which represents many E2E reservations) as an "aggregate" reservation, and to its respective Path/Resv messages as "aggregate Path/Resv messages".

1.1. Problem Statement: Aggregation Of E2E Reservations

The problem of many small reservations has been extensively discussed, and may be summarized in the observation that each reservation requires a non-trivial amount of message exchange, computation, and memory resources in each router along the way. It would be nice to reduce this to a more manageable level where the load is heaviest and aggregation is possible.

Aggregation, however, brings its own challenges.
In particular, it reduces the level of isolation between individual flows, implying that one flow may suffer delay from the bursts of another. Synchronization of bursts from different flows may occur. However, there is evidence [CSZ] to suggest that aggregation of flows has no negative effect on the mean delay of the flows, and actually leads to a reduction of delay in the "tail" of the delay distribution (e.g. 99th percentile delay) for the flows. These benefits of aggregation to some extent offset the loss of strict isolation.

1.2. Proposed Solution

The solution we propose involves the aggregation of several E2E reservations that cross an "aggregation region" and share common ingress and egress routers into one larger reservation from ingress to egress. We define an "aggregation region" as a contiguous set of systems capable of performing RSVP aggregation (as defined below) along any possible route through this contiguous set.

Communication interfaces fall into two categories with respect to an aggregation region; they are "exterior" to an aggregation region, or they are "interior" to it. Routers that have at least one interface in the region fall into one of three categories with respect to a given RSVP session; they aggregate, they deaggregate, or they are between an aggregator and a deaggregator.

Aggregation depends on being able to hide E2E RSVP messages from RSVP-capable routers inside the aggregation region. To achieve this end, the IP Protocol Number in the E2E reservation's Path, PathTear, and ResvConf messages is changed from RSVP (46) to RSVP-E2E-IGNORE (a new value, to be assigned) upon entering the aggregation region, and restored to RSVP at the deaggregator point.
These messages are ignored (no state is stored and the message is forwarded as a normal IP datagram) by each router within the aggregation region whenever they are forwarded to an interior interface. Since the deaggregating router perceives the previous RSVP hop on such messages to be the aggregating router, Resv and other messages do not require this modification; they are unicast from RSVP hop to RSVP hop anyway.

The token buckets (SENDER_TSPECs and FLOWSPECS) of E2E reservations are summed into the corresponding information elements in aggregate Path and Resv messages. Aggregate Path messages are sent from the aggregator to the deaggregator(s) using RSVP's normal IP Protocol Number. Aggregate Resv messages are sent back from the deaggregator to the aggregator, thus establishing an aggregate reservation on behalf of the set of E2E flows that use this aggregator and deaggregator. There may be several such aggregate reservations between the same two routers, representing different classes of traffic; the aggregate reservation is therefore for the traffic marked with a particular DSCP.

1.3. Definitions

We define an "aggregation region" as a set of RSVP-capable routers for which E2E RSVP messages arriving on an exterior interface of one router in the set would traverse one or more interior interfaces (of this and possibly of other routers in the set) before finally traversing an exterior interface.

Such an E2E RSVP message is said to have crossed the aggregation region.

We define the "aggregating" router for this E2E flow as the first router that processes the E2E Path message as it enters the aggregation region (i.e., the one which forwards the message from an exterior interface to an interior interface).

We define the "deaggregating" router for this E2E flow as the last router to process the E2E Path as it leaves the aggregation region (i.e., the one which forwards the message from an interior interface to an exterior interface).

We define an "interior" router for this E2E flow as any router in the aggregation region which receives this message on an interior interface and forwards it to another interior interface. Interior routers perform neither aggregation nor deaggregation for this flow.

Note that by these definitions a single router with a mix of interior and exterior interfaces may have the capability to act as an aggregator on some E2E flows, a deaggregator on other E2E flows, and an interior router on yet other flows.

1.4. Detailed Aspects of Proposed Solution

A number of issues jump to mind in considering this model.

1.4.1. Traffic Classification Within The Aggregation Region

One of the reasons that RSVP Version 1 did not identify a way to aggregate sessions was that there was not a clear way to classify the aggregate. With the development of the Differentiated Services architecture, this is at least partially resolved; traffic of a particular class can be marked with a given DSCP and so classified. We presume this model.

We presume that on each link en route, a queue, WDM color, or similar management component is set aside for all aggregated traffic of the same class, and that sufficient bandwidth is made available to carry the traffic that has been assigned to it. This bandwidth may be adjusted based on the total amount of aggregated reservation traffic assigned to the same class.

There are numerous options for exactly which Diff-serv PHBs might be used for different classes of traffic as it crosses the aggregation region.
This is the "service mapping" problem described in [ISDS], and is applicable to situations broader than those described in this document. Arguments can be made for using either EF or one or more AF PHBs for aggregated traffic.

Independent of which PHB is used, care needs to be taken in an environment where provisioned Diff-Serv and aggregated RSVP are used in the same network, to ensure that the total offered load for a single PHB does not exceed the link capacity allocated to that PHB. One solution to this is to reserve one of the four AF classes strictly for the aggregated reservation traffic while using other AF classes for provisioned Diff-Serv.

Inside the aggregation region, some RSVP reservation state is maintained per aggregate reservation, while a single classification and scheduling state (e.g., a DSCP used for classifying traffic) is maintained per aggregate reservation class (rather than per aggregate reservation). For example, if Guaranteed Service is represented by the EF DSCP throughout the aggregation region, there may be a reservation for each aggregator/deaggregator pair in each router, but only the EF DSCP need be inspected at each interior interface, and only a single queue is used for all EF traffic.

1.4.2. Deaggregator Determination

The first question is "How do we know which aggregate reservation a particular E2E flow should aggregate into?" To know that, we must know three things: its aggregating router, its deaggregating router, and (assuming DSCPs are used to differentiate among various reservations between the same two routers), the relevant DSCP.

Determination of the aggregator is trivial: we know that an E2E flow has arrived at an aggregator when its Path message arrives at a router on an exterior interface and must be forwarded on an interior interface.
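
The interface-based rule above, together with the definitions in Section 1.3, can be sketched as a small classification function. This is only an illustration: the names `Role` and `classify_role` are hypothetical and do not come from this draft, and the sketch assumes the router already knows whether each interface is interior or exterior to the aggregation region.

```python
from enum import Enum


class Role(Enum):
    """Role a router plays for one E2E Path message (hypothetical names)."""
    AGGREGATOR = "aggregator"
    DEAGGREGATOR = "deaggregator"
    INTERIOR = "interior"
    NONE = "none"  # exterior in, exterior out: no aggregation role


def classify_role(in_is_interior: bool, out_is_interior: bool) -> Role:
    """Derive the role from the arrival and forwarding interface types."""
    if not in_is_interior and out_is_interior:
        return Role.AGGREGATOR      # message enters the region here
    if in_is_interior and not out_is_interior:
        return Role.DEAGGREGATOR    # message leaves the region here
    if in_is_interior and out_is_interior:
        return Role.INTERIOR        # message stays inside the region
    return Role.NONE
```

Because the role is computed per message and per outgoing interface, the same router can be an aggregator for one flow and a deaggregator or interior router for another.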

Determining the DSCP is equally easy, or at least it is in concept. The DSCP is chosen for an aggregate reservation based on some policy, which may take into account such factors as the intserv service class requested for the flow. (Some details in the exact point at which the DSCP can be determined are discussed below.)

Determination of the deaggregator is more involved. If an SPF routing protocol, such as OSPF or IS-IS, is in use, and if it has been extended to advertise information on Deaggregation roles, it can tell us the set of routers from which the deaggregator will be chosen. In principle, if the aggregator and deaggregator are in the same area, then the identity of the deaggregator could be determined from the link state database. However, this approach would not work in multi-area environments or for distance vector protocols.

One method for Deaggregator determination is manual configuration. With this method the network operator would configure the Aggregator and the Deaggregator with the necessary information.

Another method allows automatic Deaggregator determination and corresponding Aggregator notification. When the E2E RSVP Path message transits from an interior interface to an exterior interface, the deaggregating router must advise the aggregating router of the correlation between itself and the flow. This has the nice attribute of not being specific to the routing protocol. It also has the property of automatically adjusting to route changes. For instance, if because of a topology change, another Deaggregator is now on the shortest path, this method will automatically identify the new Deaggregator and swap to it.

1.4.3. Size of Aggregate Reservations

A range of options exists for determining the size of the aggregate reservation, presenting a tradeoff between simplicity and scalability. Simplistically, the size of the aggregate reservation needs to be greater than or equal to the sum of the bandwidth of the E2E reservations it aggregates, and its burst capacity must be greater than or equal to the sum of their burst capacities. However, if followed religiously, this leads us to change the bandwidth of the aggregate reservation each time an underlying E2E reservation changes, which loses one of the key benefits of aggregation, the reduction of message processing cost in the aggregation region.

We assume, therefore, that there is some policy, not defined in this specification (although sample policies which have the necessary characteristics are suggested). This policy maintains the amount of bandwidth required on a given aggregate reservation by taking account of the sum of the bandwidths of its underlying E2E reservations, while endeavoring to change it infrequently. This may require some level of trend analysis. If there is a significant probability that in the next interval of time the current aggregate reservation will be exhausted, the router must predict the necessary bandwidth and request it. If the router has a significant amount of bandwidth reserved but has very little probability of using it, the policy may be to predict the amount of bandwidth required and release the excess.

This policy is likely to benefit from introduction of some hysteresis (i.e. ensure that the trigger condition for aggregate reservation size increase is sufficiently different from the trigger condition for aggregate reservation size decrease) to avoid oscillation in stable conditions.
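
One possible shape for such a policy is sketched below. It is not a policy defined by this draft: the class name `AggregateSizer`, the step-based rounding, and the `shrink_margin` parameter are all assumptions made for illustration. The sketch tracks bandwidth only (burst capacity could be handled the same way), grows the aggregate as soon as the summed E2E demand exceeds it, and shrinks it only when the unused portion exceeds a margin, so the two triggers differ as the hysteresis remark suggests.

```python
import math
from dataclasses import dataclass


@dataclass
class AggregateSizer:
    """Hypothetical sizing policy for one aggregate reservation."""
    step: float            # resize granularity (assumed unit: bytes/s)
    shrink_margin: float   # unused bandwidth tolerated before shrinking;
                           # should exceed `step` so a shrink always reduces
    reserved: float = 0.0  # bandwidth currently requested for the aggregate

    def _target(self, demand: float) -> float:
        # Next multiple of `step` strictly above the demand (some headroom).
        return (math.floor(demand / self.step) + 1) * self.step

    def update(self, e2e_rates) -> bool:
        """Return True if the aggregate Resv must be re-signalled."""
        demand = sum(e2e_rates)
        grow = demand > self.reserved
        shrink = demand < self.reserved - self.shrink_margin
        if grow or shrink:
            self.reserved = self._target(demand)
            return True
        return False
```

With `step=100` and `shrink_margin=200`, E2E reservations can come and go within the headroom without any aggregate signalling; only a sustained drop of more than the margin releases bandwidth.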

Clearly, the definition and operation of such policies are as much business issues as they are technical, and are out of the scope of this document.

1.4.4. Intra-domain Routes

RSVP directly handles route changes, in that reservations follow the routes that their data follow. This follows from the property that Path messages contain the same IP source and destination address as the data flow for which a reservation is to be established. However, since we are now making aggregate reservations by sending a Path message from an aggregating to a deaggregating router, the reserved (E2E) data packets no longer carry the same IP addresses as the relevant (aggregate) Path message. The issue becomes one of making sure that data packets for reserved flows follow the same path as the Path message that established Path state for the aggregate reservation. Several approaches are viable.

First, the data may be tunneled from aggregator to deaggregator, using technologies such as IP-in-IP tunnels, GRE tunnels, MPLS label-switched paths, and so on. These each have particular advantages, especially MPLS, which allows traffic engineering. They each also have some cost in link overhead and configuration complexity.

If data is not tunneled, then we are depending on a characteristic of IP best metric routing, which is that if the route from A to Z includes the path from H to L, and the best metric route was chosen all along the way, then the best metric route was chosen from H to L. Therefore, an aggregate path message which crosses a given aggregator and deaggregator will of necessity use the best path between them.

If this is a single path, the problem is solved.
If it is a multi-path route, and the paths are of equal cost, then we are forced to determine, perhaps by measurement, what proportion of the traffic for a given E2E reservation is passing along each of the paths, and assure ourselves of sufficient bandwidth for the present use. A simple, though inelegant, way of doing this is to reserve the total capacity of the aggregate route down each path.

For this reason, we believe it is advantageous to use one of the above-mentioned tunneling mechanisms in cases where multiple equal-cost paths may exist.

1.4.5. Inter-domain Routes

The case of inter-domain routes differs somewhat from the intra-domain case just described. Specifically, best-path considerations do not apply, as routing is by a combination of routing policy and shortest AS path rather than simple best metric.

In the case of inter-domain routes, data traffic belonging to different E2E sessions (but the same aggregate session) may not enter an aggregation region via the same aggregator interface, and/or may not leave via the same deaggregator interface. It is possible that we could identify this occurrence in some central system which sees the reservation information for both of the apparent sessions, but it is not clear that we could determine a priori how much traffic went one way or the other apart from measurement.

We simply note that this problem can occur and needs to be allowed for in the implementation. We recommend that each such E2E reservation be summed into its appropriate aggregate reservation, even though this involves over-reservation.

1.4.6. Reservations for Multicast Sessions

Aggregating reservations for multicast sessions is significantly more complex than for unicast sessions.
The first challenge is to construct a multicast tree for distribution of the aggregate Path messages which follows the same path as will be followed by the data packets for which the aggregate reservation is to be made. This is complicated by the fact that the path taken by a data packet may depend on many factors such as its source address, the choice of shared trees or source-specific trees, and the location of a rendezvous point for the tree.

Once the problem of distributing aggregate Path messages is solved, there are considerable problems in determining the correct amount of resources to reserve at each link along the multicast tree. Because of the amount of heterogeneity that may exist in an aggregate multicast reservation, it appears that it would be necessary to retain information about individual E2E reservations within the aggregation region to allocate resources correctly. Thus, we may end up with a complex set of procedures for forming aggregate reservations that do not actually reduce the amount of stored state significantly for multicast sessions. [BERSON] describes possible ways to reduce this state by using measurement-based admission control.

As noted above, there are several aspects to RSVP state, and our approach for unicast aggregates all forms of state: classification, scheduling, and reservation state. One possible approach to multicast is to focus only on aggregation of classification and scheduling state, which are arguably the most important because of their impact on the fast path. That approach is the one described in the current draft.

1.4.7. Multi-level Aggregation

Ideally, an aggregation scheme should be able to accommodate recursive aggregation, with aggregate reservations being themselves aggregated.
Multi-level aggregation can be accomplished using the procedures described here and a simple extension to the protocol number swapping process.

We can consider E2E RSVP reservations to be at aggregation level 0. When we aggregate these reservations, we produce reservations at aggregation level 1. In general, level n reservations may be aggregated to form reservations at level n+1.

When an aggregating router receives an E2E Path, it swaps the protocol number from RSVP to RSVP-E2E-IGNORE. In addition, it should write the aggregation level (1, in this case) in the 2 byte field that is present (and currently unused) in the router alert option. In general, a router which aggregates reservations at level n to create reservations at level n+1 will write the number n+1 in the router alert field. A router which deaggregates level n+1 reservations will examine all messages with IP protocol number RSVP-E2E-IGNORE but will process the message and swap the protocol number back to RSVP only in the case where the router alert field carries the number n+1. For any other value, the message is forwarded unchanged. Interior routers ignore all messages with IP protocol number RSVP-E2E-IGNORE. Note that only a few bits of the 2 byte field in the option would be needed, given the likely number of levels of aggregation.

1.4.8. Reliability Issues

There are a variety of issues that arise in the context of aggregation that would benefit from some form of explicit acknowledgment mechanism for RSVP messages. For example, it is possible to configure a set of routers such that an E2E Path of protocol type RSVP-E2E-IGNORE would be effectively "black-holed", if it never reached a router which was appropriately configured to act as a deaggregator.
It could then travel all the way to its destination where it would probably be ignored due to its non-standard protocol number. This situation is not easy to detect. The aggregator can be sure this problem has not occurred if an aggregate PathErr message is received from the deaggregator (as described in detail below). It can also be sure there is no problem if an E2E Resv is received. However, the fact that neither of these events has happened may only mean that no receiver wishes to reserve resources for this session, or that an RSVP message loss occurred, or it may mean that the Path was black-holed. However, if a neighbor-to-neighbor acknowledgment mechanism existed, the aggregator would expect to receive an acknowledgment of the E2E Path from the deaggregator, and would interpret the lack of a response as an indication that a problem of configuration existed. It could then refrain from aggregating this particular session. We note that such a reliability mechanism has been proposed for RSVP in [REFRESH] and propose that it be used here.

2. Elements of Procedure

To implement aggregation, we define a number of elements of procedure.

2.1. Receipt of E2E Path Message By Aggregating Router

The very first event is the arrival of the E2E Path message at an exterior interface of an aggregator. Standard RSVP procedures [RSVP] are followed for this, including onto what set of interfaces the message should be forwarded. These interfaces comprise zero or more exterior interfaces and zero or more interior interfaces. (If the number of interior interfaces is zero, the router is not acting as an aggregator for this E2E flow.)

Service on exterior interfaces is handled as defined in [RSVP].

Service on interior interfaces is complicated by the fact that the message needs to be included in some aggregate reservation, but at this point it is not known which one, because the deaggregator is not known. Therefore, the E2E Path message is forwarded on the interior interface(s) using the IP Protocol number RSVP-E2E-IGNORE, but in every other respect identically to the way it would be sent by an RSVP router that was not performing aggregation.

2.2. Handling Of E2E Path Message By Interior Routers

At this point, the E2E Path message traverses zero or more interior routers. Interior routers receive the E2E Path message on an interior interface and forward it on another interior interface. The Router Alert IP Option alerts interior routers to check internally, but they find that the IP Protocol is RSVP-E2E-IGNORE and the next hop interface is interior. As such, they simply forward it as a normal IP datagram.

2.3. Receipt of E2E Path Message By Deaggregating Router

The E2E Path message finally arrives at a deaggregating router, which receives it on an interior interface and forwards it on an exterior interface. Again, the Router Alert IP Option alerts it to intercept the message, but this time the IP Protocol is RSVP-E2E-IGNORE and the next hop interface is an exterior interface.

At this point, the deaggregating router associates the flow with an aggregate reservation. This selection is done on the basis of policy, and may take into account not only the aggregating router (whose IP Address may be found in the RSVP Hop Object) but other information about the flow. If no such aggregate reservation exists and the router is so configured, it may generate a PathErr with code NEW-AGGREGATE-NEEDED back to the aggregating router.
This should not result in any reservation being taken down, but may
result in the aggregating router initiating the necessary aggregate
Path message, as described in the following section.

The deaggregating router changes the E2E Path message's IP Protocol
from RSVP-E2E-IGNORE to RSVP, updates the ADSPEC of the E2E Path
using information accumulated by the aggregate Path ADSPEC (if an
aggregate Path has been received), and forwards the E2E Path
message towards its intended destination. To enable correct
updating of the ADSPEC, a deaggregating router may wait for the
arrival of an aggregate Path before forwarding the E2E Path.

2.4. Initiation of New Aggregate Path Message By Aggregating Router

The aggregating router is responsible for taking account of the
SENDER_TSPEC information from individual E2E Path messages in
constructing the SENDER_TSPEC of the aggregate Path message it
sends to its deaggregating router. The aggregating router may learn
that an E2E session is associated with a given deaggregator when
one of two events occurs: it receives a PathErr message with the
error code NEW-AGGREGATE-NEEDED from the deaggregator, or it
receives an E2E Resv message from the deaggregator. In the latter
case, the Resv contains a DCLASS object [DCLASS] indicating which
DSCP the deaggregator believes that the E2E flow belongs in. In the
former case, the aggregator must make its own determination of a
suitable DSCP based on the information in the E2E Path message(s)
being aggregated and using locally available policy information.
The identity of the deaggregator itself is found in either the
ERROR SPECIFICATION of the PathErr message or the RSVP HOP object
of the E2E Resv.
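
The two ways in which the aggregator learns of a deaggregator can be
sketched as follows. This is an illustration only: the message
classes, their field names, and the local_policy_dscp callback are
hypothetical stand-ins for parsed RSVP messages and local policy, and
none of them is defined by this document.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass
class PathErr:
    """Hypothetical parsed PathErr message."""
    error_code: str        # e.g. "NEW-AGGREGATE-NEEDED"
    error_spec_node: str   # node address from the ERROR SPECIFICATION
    session_id: str        # E2E session the error refers to


@dataclass
class E2EResv:
    """Hypothetical parsed E2E Resv message."""
    rsvp_hop: str          # address from the RSVP HOP object
    dclass_dscp: int       # DSCP carried in the DCLASS object
    session_id: str


def learn_deaggregator(
    msg: Union[PathErr, E2EResv],
    local_policy_dscp: Callable[[str], int],
) -> Tuple[str, int]:
    """Return (deaggregator address, DSCP) for the E2E session."""
    if isinstance(msg, E2EResv):
        # E2E Resv: the deaggregator states its DSCP in DCLASS.
        return msg.rsvp_hop, msg.dclass_dscp
    if msg.error_code == "NEW-AGGREGATE-NEEDED":
        # PathErr: the aggregator chooses a DSCP by local policy.
        return msg.error_spec_node, local_policy_dscp(msg.session_id)
    raise ValueError(f"unexpected PathErr code: {msg.error_code}")
```

In both branches the result identifies the aggregate session (a
deaggregator address and a DSCP) into which the E2E session's
SENDER_TSPEC would be accumulated.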

On receipt of either message, if no corresponding aggregate path
state exists from the aggregator to the deaggregator for a session
with the appropriate DSCP, and the aggregator is configured to do
so, the aggregator should generate an aggregate Path message for
the aggregate reservation. The destination address of the aggregate
Path message is the address of the deaggregating router, and the
message is sent with IP protocol number RSVP.

2.5. Handling of E2E Resv Message by Deaggregating Router

Having sent the E2E Path message on toward the destination, the
deaggregator must now expect to receive an E2E Resv for the
session. On receipt, its responsibility is to ensure that there is
sufficient bandwidth reserved within the aggregation region to
support the new E2E reservation, and if there is, then to forward
the E2E Resv to the aggregating router.

If there is insufficient bandwidth reserved, it should follow the
normal RSVP procedures [RSVP] for a reservation being placed with
insufficient bandwidth to support the reservation. It may also
immediately attempt to increase the aggregate reservation that is
supplying bandwidth by increasing the size of the flowspec that it
includes in the aggregate Resv that it sends upstream. However,
this may not be sufficient to increase the size of the aggregate
reservation, because RSVP routers take the minimum of the Sender
TSpec and Receiver TSpec when installing a reservation, and thus
the installed aggregate reservation may be limited by the size of
the sender TSpec. The likelihood of this situation can be reduced
by a sufficiently large choice of TSpec by the aggregator.

When sufficient bandwidth is available, it may simply send the E2E
Resv message with IP Protocol RSVP to the aggregating router.
This message should, in addition to other data, contain the DCLASS
object to indicate which DSCP the deaggregating router expects the
aggregator to use. The choice of DSCP may be made based on a
combination of information in the received E2E Resv and local
policy. An example policy might dictate a certain DSCP for
Guaranteed Service and another DSCP for Controlled Load. The
deaggregator will also add the token bucket from the FLOWSPEC
object into its internal accounting of how much of that
reservation is in use.

2.6. Initiation of New Aggregate Resv Message By Deaggregating Router

Upon receiving an E2E Resv message on an exterior interface, and
having determined the appropriate DSCP for the session, the
deaggregator looks for corresponding path state for a session with
the chosen DSCP. If aggregate Path state exists, but no aggregate
Resv state exists, the deaggregator creates an aggregate Resv and
sets its initial request to a value not smaller than the
requirement of the E2E reservation it is supporting.

If no aggregate Path state exists for the appropriate DSCP, this
may be because the aggregator has not yet responded to the arrival
of the E2E Resv sent in the preceding step. To avoid deadlock while
waiting for a response, it would be desirable to use the
acknowledgment mechanisms described in [REFRESH].

Once the deaggregator has established the aggregate Path state, it
sends an aggregate Resv message toward the aggregator (i.e., to the
previous hop), using the AGGREGATED-RSVP session and filter
specifications. Since the DSCP is in the SESSION object, the DCLASS
is unnecessary. The message should be reliably delivered using the
mechanisms in [REFRESH] or, alternatively, the CONFIRM object may
be used, to assure that the aggregate Resv does indeed arrive and
is granted.
This enables the deaggregator to determine that the requested
bandwidth is available to allocate to the E2E flows it supports.

2.7. Handling of Aggregate Resv Message by Interior Routers

The aggregate Resv message is handled in essentially the same way
as defined in [RSVP]. The SESSION object contains the address of
the deaggregating router (or the group address for the session in
the case of multicast) and the DSCP that has been chosen for the
session. The FILTER_SPEC object identifies the aggregating router.
Interior routers perform admission control and resource allocation
as usual and send the aggregate Resv on towards the aggregator.

2.8. Handling of E2E Resv Message by Aggregating Router

The E2E Resv message is the final confirmation to the aggregating
router that a proportion of a given aggregate's bandwidth has been
reserved. At this point, it should ensure that the E2E reservation
is associated with an appropriate aggregate, that the aggregator
and deaggregator expectations synchronize, and that all things are
in place. In particular, it needs to ensure that the DCLASS carried
in the E2E Resv matches the DSCP for an aggregate session to that
deaggregator; if not, it needs to create a new aggregate Path for
the appropriate DSCP and send it to the deaggregator. It should
also ensure that the SENDER_TSPEC from the E2E Path message has
been accumulated into the appropriate aggregate Path message. Under
normal circumstances, the E2E Resv is the only way the aggregator
will be informed of this association. It should now forward the E2E
Resv to its previous hop, following normal RSVP processing rules
[RSVP].

2.9. Removal of E2E Reservation

E2E reservations are removed in the usual way via PathTear,
ResvTear, timeout, or as the result of an error condition.
When they are removed, their FLOWSPEC information must also be
removed from the allocated portion of the aggregate reservation.
This same bandwidth may be re-used for other traffic in the near
future. When E2E Path messages are removed, their SENDER_TSPEC
information must also be removed from the aggregate Path.

2.10. Removal of Aggregate Reservation

Should an aggregate reservation go away (presumably due to a
configuration change, route change, or policy event), the E2E
reservations it supports are no longer active. They must be treated
accordingly.

2.11. Handling of Data On Reserved E2E Flow by Aggregating Router

Before it is established that a given E2E flow is part of a given
aggregate, the flow's data should be treated as traffic without a
reservation by whatever policies prevail for such traffic.
Generally, this will mean being given the same forwarding behavior
as non-essential traffic. However, upon establishing that the flow
belongs to a given aggregate, the aggregating router is responsible
for marking any related traffic with the correct DSCP and
forwarding it in the manner appropriate to traffic on that
reservation. This may imply forwarding it to a given IP next hop,
or piping it down a given link layer circuit, tunnel, or MPLS label
switched path.

The aggregator is responsible for performing per-reservation
policing on the E2E flows that it is aggregating. The aggregator
performs metering of traffic belonging to each reservation to
assess compliance with the token bucket for the corresponding E2E
reservation. Packets which are assessed in compliance are forwarded
as mentioned above. Packets which are assessed out of compliance
must be either dropped or marked to a different DSCP. The detailed
policing behavior is an aspect of the service mapping described in
[ISDS].

2.12. Procedures for Multicast Sessions

Because of the difficulties of aggregating multicast sessions
described above, we focus on the aggregation of scheduling and
classification state in the multicast case. The main difference
between the multicast and unicast cases is that rather than sending
an aggregate Path message to the unicast address of a single
deaggregating router, in the multicast case we send the "aggregate"
Path message to the same group address as the E2E session. This
ensures that the aggregate Path message follows the same route as
the E2E Path. This difference between unicast and multicast is
reflected in the SESSION objects defined below. A consequence of
this approach is that we continue to have reservation state per
multicast session inside the aggregation region.

A further challenge arises in multicast sessions with heterogeneous
receivers. Consider an interior router which must forward packets
for a multicast session on two interfaces, but has only received a
reservation request on one of those interfaces. It receives packets
marked with the DSCP chosen for the aggregate reservation. When
sending them out the interface which has no installed reservation,
it has the following options:

a) remark those packets to best effort before sending them out
   the interface;

b) send the packets out the interface with the DSCP chosen for
   the aggregate reservation.

The first approach suffers from the drawback that it requires MF
classification at an interior router in order to recognize the
flows whose packets must be demoted. The second approach requires
over-reservation of resources on the interface on which no
reservation was received.
In the absence of such over-reservation, the packets sent with the
"wrong" DSCP would be able to degrade the service experienced by
packets using that DSCP legitimately.

To make MF classification acceptable in an interior router, it may
be possible to treat the case of heterogeneous flows as an
exception. That is, an interior router only needs to be able to
recognize those individual microflows that have heterogeneous
resource needs on the outbound interfaces of this router.

3. Protocol Elements

3.1. IP Protocol RSVP-E2E-IGNORE

This specification presumes the assignment of a protocol type
RSVP-E2E-IGNORE, whose number is at this point TBD. This is used
only on messages which require a router alert (Path, PathErr, and
ResvConf), and signifies that the message must be treated one way
when copied to an interior interface, and another way when copied
to an exterior interface.

3.2. Path Error Code

A PathErr code NEW-AGGREGATE-NEEDED is presumed. This value does
not signify that a fatal error has occurred, but that an action is
required of the aggregating router to avoid an error condition in
the near future.

3.3. SESSION Object

The SESSION object contains two values: the IP Address of the
aggregate session destination, and the DSCP that it will use on the
E2E data the reservation contains. For unicast sessions, the
session destination address is the address of the deaggregating
router. For multicast sessions, the session destination is the
multicast address of the E2E session (or sessions) being
aggregated. The inclusion of the DSCP in the session allows for
multiple sessions toward the same address to be distinguished by
their DSCP and queued separately. It also provides the means for
aggregating scheduling and classification state.
In the case where a session uses a pair of PHBs (e.g. AF11 and
AF12), the DSCP used should represent the numerically smallest PHB
(e.g. AF11). This follows the same naming convention described in
[BRIM].

Session types are defined for IPv4 and IPv6 addresses.

o  IP4 SESSION object: Class = SESSION,
   C-Type = RSVP-AGGREGATE-IP4

   +-------------+-------------+-------------+-------------+
   |            IPv4 Session Address (4 bytes)             |
   +-------------+-------------+-------------+-------------+
   | /////////// |    Flags    |  /////////  |    DSCP     |
   +-------------+-------------+-------------+-------------+

o  IP6 SESSION object: Class = SESSION,
   C-Type = RSVP-AGGREGATE-IP6

   +-------------+-------------+-------------+-------------+
   |                                                       |
   +                                                       +
   |                                                       |
   +            IPv6 Session Address (16 bytes)            +
   |                                                       |
   +                                                       +
   |                                                       |
   +-------------+-------------+-------------+-------------+
   | /////////// |    Flags    |  /////////  |    DSCP     |
   +-------------+-------------+-------------+-------------+

3.4. SENDER_TEMPLATE Object

The SENDER_TEMPLATE object identifies the aggregating router for
the aggregate reservation.

o  IP4 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
   C-Type = RSVP-AGGREGATE-IP4

   +-------------+-------------+-------------+-------------+
   |           IPv4 Aggregator Address (4 bytes)           |
   +-------------+-------------+-------------+-------------+

o  IP6 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
   C-Type = RSVP-AGGREGATE-IP6

   +-------------+-------------+-------------+-------------+
   |                                                       |
   +                                                       +
   |                                                       |
   +          IPv6 Aggregator Address (16 bytes)           +
   |                                                       |
   +                                                       +
   |                                                       |
   +-------------+-------------+-------------+-------------+

3.5. FILTER_SPEC Object

The FILTER_SPEC object identifies the aggregating router for the
aggregate reservation, and is syntactically identical to the
SENDER_TEMPLATE object.
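
The object layouts in Sections 3.3 through 3.5 can be sketched in
Python as follows. This is a hedged illustration that packs only the
object bodies shown in the diagrams: the RSVP object header is omitted
because the C-Type values are not assigned here, the function names
are hypothetical, each of the four columns in the last word is read as
one byte, and sending the two reserved ("////") bytes as zero is an
assumption.

```python
import ipaddress
import struct


def pack_aggregate_session(dest: str, dscp: int, flags: int = 0) -> bytes:
    """Pack a SESSION body: address, reserved, Flags, reserved, DSCP."""
    addr = ipaddress.ip_address(dest)  # accepts IPv4 or IPv6 text form
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0..63)")
    if not 0 <= flags <= 255:
        raise ValueError("Flags must fit in one byte")
    # Reserved bytes are set to zero (assumption, see lead-in).
    return addr.packed + struct.pack("!BBBB", 0, flags, 0, dscp)


def pack_aggregate_sender_template(aggregator: str) -> bytes:
    """Pack a SENDER_TEMPLATE or FILTER_SPEC body: the aggregator address."""
    return ipaddress.ip_address(aggregator).packed


# AF11 is DSCP 10 and AF12 is DSCP 12, so a session using the
# AF11/AF12 pair is named by the smaller value, AF11.
AF11 = 10
body = pack_aggregate_session("192.0.2.1", AF11)
```

With these assumptions the IPv4 SESSION body is 8 bytes and the IPv6
SESSION body is 20 bytes; the same helper serves both address families
because the address length follows from the parsed address.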

4. Policies and Algorithms For Predictive Management Of Blocks Of
   Bandwidth

The exact policies used in determining how much bandwidth should be
allocated to an aggregate reservation at any given time are beyond
the scope of this document, and may be proprietary to the service
provider in question. However, here we explore some of the issues
and suggest approaches.

In short, the ideal condition is that the aggregate reservation
always has enough resources to allocate to any E2E reservation that
requires its support, and never takes too much. This is simply
stated but difficult to achieve. Factors that come into account
include significant times in the diurnal cycle: one may find that a
large number of people start placing calls at 8:00 AM, even though
the hour from 7:00 to 8:00 is dead calm. They also include recent
history: if more people have been placing calls recently than have
been finishing them, a prediction of the necessary bandwidth a few
moments hence may call for more bandwidth than is currently
allocated. Likewise, at the end of a busy period, we may find that
the trend calls for declining reservation amounts.

We recommend a policy along the following lines. At any given time,
one should expect that the amount of bandwidth required for the
aggregate reservation is the larger of the following:

(a) a requirement known a priori, such as from history of the
    diurnal cycle at a particular week day and time of day, and

(b) the trend line over recent history, with 90 or 99%
    statistical confidence.

We further expect that changes to that aggregate reservation would
be made no more often than every few minutes, and ideally perhaps
on larger granularity such as fifteen minute intervals or hourly.
The finer the granularity, the greater the level of signaling
required, while the coarser the granularity, the greater the chance
for error, and the need to recover from that error.

In general, we expect that the aggregate reservation will not ever
add up to exactly the sum of the reservations it supports, but
rather will be an integer multiple of some block reservation size,
which exceeds that value.

5. Security Considerations

Numerous security issues pertain to this document; for example, the
loss of an aggregate reservation to an aggressor causes many calls
to operate unreserved, and the reservation of a great excess of
bandwidth may result in a denial of service. However, these issues
are not confined to this extension: RSVP itself has them. We
believe that the security mechanisms in RSVP address these issues
as well.

6. IANA Considerations

Beyond allocating an IP Protocol, a PathErr code, and an RSVP
Addressing object "type", there are no IANA issues in this
document. We do not define an object that will itself require
assignment by IANA.

7. Acknowledgments

The authors acknowledge that published documents and discussion
with several people, notably John Wroclawski, Steve Berson, and
Andreas Terzis materially contributed to this draft. The design
derives directly from an internet draft by Roch Guerin [GUERIN] and
from Steve Berson's drafts on the subject. It is also influenced by
the design in the diff-edge draft by Bernet et al [BERNET] and by
the RSVP tunnels draft [TERZIS].

8. References

[CSZ]
   Clark, D., S. Shenker, and L. Zhang, "Supporting Real-Time
   Applications in an Integrated Services Packet Network:
   Architecture and Mechanism," in Proc. SIGCOMM'92, September
   1992.

[IP]
   RFC 791, "Internet Protocol". J. Postel. Sep-01-1981.

[HOSTREQ]
   RFC 1122, "Requirements for Internet hosts - communication
   layers". R.T. Braden. Oct-01-1989.

[FRAMEWORK]
   Nichols, "Differentiated Services Operational Model and
   Definitions", 02/11/1998, draft-nichols-dsopdef-00.txt

[PRINCIPLES]
   RFC 1958, "Architectural Principles of the Internet". B.
   Carpenter. June 1996.

[ASSURED]
   Clark and Wroclawski, "An Approach to Service Allocation in the
   Internet", 08/04/1997, draft-clark-diff-svc-alloc-00.txt

[BROKER]
   Nichols and Zhang, "A Two-bit Differentiated Services
   Architecture for the Internet", 12/23/1997,
   draft-nichols-diff-svc-arch-01.txt

[BERSON]
   Berson and Vincent. "Aggregation of Internet Integrated
   Services State". draft-berson-rsvp-aggregation-00.txt, August
   1998

[BRIM]
   Brim and Carpenter. "Per Hop Behavior Identification Codes".
   draft-brim-diffserv-phbid-00.txt, April 1999.

[ISDS]
   Bernet et al. "Integrated Services Operation Over Diffserv
   Networks". draft-ietf-issll-diffserv-rsvp-03.txt, Sept. 1999.

[GUERIN]
   Guerin, R., Blake, S. and Herzog, S., "Aggregating RSVP based
   QoS Requests", Internet Draft, draft-guerin-aggreg-rsvp-00.txt,
   November 1997.

[RSVP]
   Braden, R., Zhang, L., Berson, S., Herzog, S. and Jamin, S.,
   "Resource Reservation Protocol (RSVP) Version 1 Functional
   Specification", RFC 2205, September 1997.

[BERNET]
   Bernet, Y., Durham, D., and F. Reichmeyer, "Requirements of
   Diff-serv Boundary Routers", Internet Draft,
   draft-bernet-diffedge-01.txt, November, 1998.

[REFRESH]
   Berger, L., Gan, D., and G. Swallow, "RSVP Refresh Reduction
   Extensions", Internet Draft,
   draft-berger-rsvp-refresh-reduct-02.txt, May 1999.

[TERZIS]
   Terzis, A., Krawczyk, J., Wroclawski, J., and L. Zhang, "RSVP
   Operation Over IP Tunnels", Internet Draft,
   draft-ietf-rsvp-tunnel-04.txt, May 1999.

[DCLASS]
   Bernet, Y., "Usage and Format of the DCLASS Object With RSVP
   Signaling", Internet Draft, draft-bernet-dclass-01.txt, June
   1999.

9. Authors' Addresses

Fred Baker
Cisco Systems
519 Lado Drive
Santa Barbara, California 93111
Phone: (408) 526-4257
Email: fred@cisco.com

Carol Iturralde
Cisco Systems
250 Apollo Drive
Chelmsford, MA 01824 USA
Phone: 978-244-8532
Email: cei@cisco.com

Francois Le Faucheur
Cisco Systems
291, rue Albert Caquot
06560 Valbonne, France
Phone: +33.1.6918 6266
Email: flefauch@cisco.com

Bruce Davie
Cisco Systems
250 Apollo Drive
Chelmsford, MA 01824 USA
Phone: 978-244-8921
Email: bdavie@cisco.com

10. Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain
it or assist in its implementation may be prepared, copied,
published and distributed, in whole or in part, without restriction
of any kind, provided that the above copyright notice and this
paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such
as by removing the copyright notice or references to the Internet
Society or other Internet organizations, except as needed for the
purpose of developing Internet standards in which case the
procedures for copyrights defined in the Internet Standards process
must be followed, or as required to translate it into languages
other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on
an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.