Internet Engineering Task Force                     Raj Yavatkar, Intel
INTERNET-DRAFT                                   Don Hoffman, Teledesic
                                                Yoram Bernet, Microsoft
                                                      Fred Baker, Cisco
                                        Michael Speer, Sun Microsystems

                                                             March 1998

                    SBM (Subnet Bandwidth Manager):
A Protocol for RSVP-based Admission Control over IEEE 802-style networks

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

SBM (Subnet Bandwidth Manager)                              March, 1998

Abstract

This document describes a signaling method and protocol for RSVP-based
admission control over IEEE 802-style LANs. The protocol is designed
to work both with the current generation of IEEE 802 LANs and with the
recent work completed by the IEEE 802.1 committee.

1. Introduction

New extensions to the Internet architecture and service models have
been defined for an integrated services Internet [RFC-1633, RFC-2205,
RFC-2210] so that applications can request specific qualities or
levels of service from an internetwork in addition to the current IP
best-effort service. These extensions include RSVP, a resource
reservation setup protocol, and definition of new service classes to
be supported by Integrated Services routers. RSVP and service class
definitions are largely independent of the underlying networking
technologies, so it is necessary to define the mapping of RSVP and
Integrated Services specifications onto specific subnetwork
technologies. For example, a definition of service mappings and
reservation setup protocols is needed for specific link-layer
technologies such as shared and switched IEEE-802-style LAN
technologies.

This document defines SBM, a signaling protocol for RSVP-based
admission control over IEEE 802-style networks. SBM provides a method
for mapping an internet-level setup protocol such as RSVP onto IEEE
802-style networks. In particular, it describes the operation of
RSVP-enabled hosts/routers and link layer devices (switches, bridges)
to support reservation of LAN resources for RSVP-enabled data flows. A
framework for providing Integrated Services over shared and switched
IEEE-802-style LAN technologies and a definition of service mappings
have been described in separate documents [Ghanwani98, Seaman97].

2. Goals and Assumptions

The SBM (Subnet Bandwidth Manager) protocol and its use for admission
control and bandwidth management in IEEE 802 level-2 networks are
based on the following architectural goals and assumptions:

I.    Even though the current trend is towards increased use of
      switched LAN topologies consisting of newer switches that
      support the priority queuing mechanisms specified by IEEE
      802.1p, we assume that the LAN technologies will continue to be
      a mix of legacy shared/switched LAN segments and newer switched
      segments based on the IEEE 802.1p specification. Therefore, we
      specify a signaling protocol that manages bandwidth over both
      legacy and newer LAN topologies and takes advantage of the
      additional functionality (such as explicit support for
      different traffic classes or integrated service classes) as it
      becomes available in the new generation of switches, hubs, or
      bridges. As a result, the SBM protocol would allow for a range
      of LAN bandwidth management solutions that vary from one that
      exercises purely administrative control (over the amount of
      bandwidth consumed by RSVP-enabled traffic flows) to one that
      requires cooperation (and enforcement) from all the end-systems
      or switches in an IEEE 802 LAN.

II.   This document specifies only a signaling method and protocol
      for LAN-based admission control over RSVP flows. We do not
      define here any traffic control mechanisms for the link layer;
      the protocol is designed to use any such mechanisms defined by
      IEEE 802. In addition, we assume that the Layer 3 end-systems
      (e.g., a host or a router) will exercise traffic control by
      policing Integrated Services traffic flows to ensure that each
      flow stays within its traffic specifications stipulated in an
      earlier reservation request submitted for admission control.
      This then allows a system using SBM admission control combined
      with per-flow shaping at end-systems and IEEE-defined traffic
      control at link layer to realize some approximation of
      Controlled Load (and even Guaranteed) services over IEEE
      802-style LANs.

III.  In the absence of any link-layer traffic control or priority
      queuing mechanisms in the underlying LAN (such as a shared LAN
      segment), the SBM-based admission control mechanism only limits
      the total amount of traffic load imposed by RSVP-enabled flows
      on a shared LAN. In such an environment, no traffic flow
      separation mechanism exists to protect the RSVP-enabled flows
      from the best-effort traffic on the same shared media and that
      raises the question of the utility of such a mechanism outside
      a topology consisting only of 802.1p-compliant switches.
      However, we assume that the SBM-based admission control
      mechanism will still serve a useful purpose in a legacy, shared
      LAN topology for two reasons. First, assuming that all the
      nodes that generate Integrated Services traffic flows utilize
      the SBM-based admission control procedure to request
      reservation of resources before sending any traffic, the
      mechanism will restrict the total amount of traffic generated
      by Integrated Services flows within the bounds desired by a LAN
      administrator. Second, the best-effort traffic generated by the
      TCP/IP-based traffic sources is generally rate-adaptive (using
      a TCP-style "slow start" congestion avoidance mechanism or a
      feedback-based rate adaptation mechanism used by audio/video
      streams based on RTP/RTCP protocols) and adapts to stay within
      the available network bandwidth. Thus, the combination of
      admission control and rate adaptation should avoid persistent
      traffic congestion.
      This does not, however, guarantee that
      non-Integrated-Services traffic will not interfere with the
      Integrated Services traffic in the absence of traffic control
      support in the underlying LAN infrastructure.

3. Organization of the rest of this document

The rest of this document provides a detailed description of the
SBM-based admission control procedure(s) for IEEE 802 LAN
technologies. The document is organized as follows:

* Section 4 first defines the various terms used in the document
  and then provides an overview of the admission control procedure
  with an example of its application to a sample network.

* Section 5 describes the rules for processing and forwarding PATH
  (and PATH_TEAR) messages at DSBMs (Designated Subnet Bandwidth
  Managers), SBMs, and DSBM clients.

* Section 6 addresses the inter-operability issues when a DSBM may
  operate in the absence of RSVP signaling at Layer 3 or when
  another signaling protocol (such as SNMP) is used to reserve
  resources on a LAN segment.

* Appendix A describes the details of the DSBM election algorithm
  used for electing a designated SBM on a LAN segment when more
  than one SBM is present. It also describes how DSBM clients
  discover the presence of a DSBM on a managed segment.

* Appendix B specifies the formats of SBM-specific messages used
  and the formats of new RSVP objects needed for the SBM operation.

4. Overview

4.1. Definitions

- Link Layer or Layer 2 or L2: We refer to data-link layer
  technologies such as IEEE 802.3/Ethernet as L2 or layer 2.

- Link Layer Domain or Layer 2 domain or L2 domain: a set of nodes
  and links interconnected without passing through an L3 forwarding
  function. One or more IP subnets can be overlaid on an L2 domain.

- Layer 2 or L2 devices: We refer to devices that only implement
  Layer 2 functionality as Layer 2 or L2 devices. These include
  802.1D bridges or switches.

- Internetwork Layer or Layer 3 or L3: Layer 3 of the ISO 7-layer
  model. This document is primarily concerned with networks that
  use the Internet Protocol (IP) at this layer.

- Layer 3 Device or L3 Device or End-Station: these include hosts
  and routers that use L3 and higher layer protocols or application
  programs that need to make resource reservations.

- Segment: An L2 physical segment that is shared by one or more
  senders. Examples of segments include (a) a shared Ethernet or
  Token-Ring wire resolving contention for media access using CSMA
  or token passing ("shared L2 segment"), (b) a half duplex link
  between two stations or switches, (c) one direction of a switched
  full-duplex link.

- Managed segment: A managed segment is a segment with a DSBM
  present and responsible for exercising admission control over
  requests for resource reservation. A managed segment includes
  those interconnected parts of a shared LAN that are not separated
  by DSBMs.

- Traffic Class: An aggregation of data flows which are given
  similar service within a switched network.

- User_priority: User_priority is a value associated with the
  transmission and reception of all frames in the IEEE 802 service
  model: it is supplied by the sender that is using the MAC
  service. It is provided along with the data to a receiver using
  the MAC service. It may or may not be actually carried over the
  network: Token-Ring/802.5 carries this value (encoded in its FC
  octet), basic Ethernet/802.3 does not, 802.12 may or may not
  depending on the frame format in use.
  802.1p defines a consistent way to carry this value over the
  bridged network on Ethernet, Token Ring, Demand-Priority, FDDI or
  other MAC-layer media using an extended frame format. The usage of
  user_priority is fully described in section 2.5 of 802.1D
  [IEEE8021D] and 802.1p [IEEE8021P] "Support of the Internal Layer
  Service by Specific MAC Procedures".

- Subnet: used in this memo to indicate a group of L3 devices
  sharing a common L3 network address prefix along with the set of
  segments making up the L2 domain in which they are located.

- Bridge/Switch: a layer 2 forwarding device as defined by IEEE
  802.1D. The terms bridge and switch are used synonymously in this
  document.

- DSBM: Designated SBM (DSBM) is a protocol entity that resides in
  an L2 or L3 device and manages resources on an L2 segment. At most
  one DSBM exists for each L2 segment.

- SBM: the SBM is a protocol entity that resides in an L2 or L3
  device and is capable of managing resources on a segment. However,
  only a DSBM manages the resources for a managed segment. When
  more than one SBM exists on a segment, one of the SBMs is elected
  to be the DSBM.

- Extended segment: An extended segment includes those parts of a
  network which are members of the same IP subnet and therefore are
  not separated by any layer 3 devices. Several managed segments,
  interconnected by layer 2 devices, constitute an extended
  segment.

- Managed L2 domain: An L2 domain consisting of managed segments is
  referred to as a managed L2 domain to distinguish it from an L2
  domain with no DSBMs present for exercising admission control
  over resources at segments in the L2 domain.

- DSBM clients: These are entities that transmit traffic onto a
  managed segment and use the services of a DSBM for the managed
  segment for admission control over a LAN segment.
  Only the layer 3 or higher layer entities on L3 devices such as
  hosts and routers are expected to send traffic that requires
  resource reservations, and, therefore, DSBM clients are L3
  entities.

- SBM transparent devices: An "SBM transparent" device is unaware of
  SBMs or DSBMs (though it may or may not be RSVP aware) and,
  therefore, does not participate in the SBM-based admission
  control procedure over a managed segment. Such a device uses
  standard forwarding rules appropriate for the device and is
  transparent with respect to SBM. An example of such an L2 device
  is a legacy switch that does not participate in resource
  reservation.

- Layer 3 and layer 2 addresses: We refer to layer 3 addresses of
  L3/L2 devices as "L3 addresses" and layer 2 addresses as "L2
  addresses". This convention will be used in the rest of the
  document to distinguish between layer 3 and layer 2 addresses used
  to refer to RSVP next hop (NHOP) and previous hop (PHOP) devices.
  For example, in conventional RSVP message processing, the RSVP_HOP
  object in a PATH message carries the L3 address of the previous
  hop device. We will refer to the address contained in the
  RSVP_HOP object as the RSVP_HOP_L3 address and the corresponding
  MAC address of the previous hop device will be referred to as the
  RSVP_HOP_L2 address.

4.2. Overview of the SBM-based Admission Control Procedure

A protocol entity called "Designated SBM" (DSBM) exists for each
managed segment and is responsible for admission control over the
resource reservation requests originating from the DSBM clients in
that segment. Given a segment, one or more SBMs may exist on the
segment. For example, many SBM-capable devices may be attached to a
shared L2 segment whereas two SBM-capable switches may share a
half-duplex switched segment.
In that case, a single DSBM is elected for the segment. The procedure
for dynamically electing the DSBM is described in Appendix A. The
only other approved method for specifying a DSBM for a managed
segment is static configuration at SBM-capable devices.

The presence of a DSBM makes the segment a "managed segment".
Sometimes, two or more L2 segments may be interconnected by SBM
transparent devices. In that case, a single DSBM will manage the
resources for those segments treating the collection of such segments
as a single managed segment for the purpose of admission control.

4.2.1. Basic Algorithm

Figure 1 - An Example of a Managed Segment.

    +-------+    +-----+     +------+    +-----+    +--------+
    |Router |    | Host|     | DSBM |    | Host|    | Router |
    |  R2   |    |  C  |     +------+    |  B  |    |   R3   |
    +-------+    +-----+       /         +-----+    +--------+
        |           |         /             |           |
        |           |        /              |           |
==============================================================LAN
        |                                   |
        |                                   |
    +------+                            +-------+
    | Host |                            | Router|
    |  A   |                            |   R1  |
    +------+                            +-------+

Figure 1 shows an example of a managed segment in an L2 domain that
interconnects a set of hosts and routers. For the purpose of this
discussion, we ignore the actual physical topology of the L2 domain
(assume it is a shared L2 segment and a single managed segment
represents the entire L2 domain). A single SBM device is designated
to be the DSBM for the managed segment. We will provide examples of
operation of the DSBM over switched and shared segments later in the
document.

The basic DSBM-based admission control procedure works as follows:

1. DSBM Initialization: As part of its initial configuration, the
   DSBM obtains information such as the limits on the fraction of
   available resources that can be reserved on each managed segment
   under its control. For instance, bandwidth is one such resource.
   Even though methods such as auto-negotiation of link speeds and
   knowledge of link topology allow discovery of link capacity, the
   configuration may be necessary to limit the fraction of link
   capacity that can be reserved on a link. Configuration is likely
   to be static with the current L2/L3 devices. Future work may
   allow for dynamic discovery of this information. This document
   does not specify the configuration mechanism.

2. DSBM Client Initialization: For each interface attached, a DSBM
   client determines whether a DSBM exists on the interface. The
   procedure for discovering and verifying the existence of the DSBM
   for an attached segment is described in Appendix A. If the client
   itself is capable of serving as the DSBM on the segment, it may
   choose to participate in the election to become the DSBM. At the
   start, a DSBM client first verifies that a DSBM exists in its L2
   domain so that it can communicate with the DSBM for admission
   control purposes.

   In the case of a full-duplex segment, an election may not be
   necessary as the SBM at each end will typically act as the DSBM
   for outgoing traffic in each direction.

3. DSBM-based Admission Control: To request reservation of resources
   (e.g., LAN bandwidth in an L2 domain), DSBM clients (RSVP-capable
   L3 devices such as hosts and routers) take the following steps:

   a) When a DSBM client sends or forwards an RSVP PATH message over
      an interface attached to a managed segment, it sends the PATH
      message to the segment's DSBM instead of sending it to the RSVP
      session destination address (as is done in conventional RSVP
      processing). After processing (and possibly updating an
      ADSPEC), the DSBM will forward the PATH message toward its
      destination address.
      As part of its processing, the DSBM builds and maintains a PATH
      state for the session and notes the previous L2/L3 hop that
      sent it the PATH message.

      Let us consider the managed segment in Figure 1. Assume that a
      sender to an RSVP session (session address specifies the IP
      address of host A on the managed segment in Figure 1) resides
      outside the L2 domain of the managed segment and sends a PATH
      message that arrives at router R1 which is on the path towards
      host A.

      The DSBM client on router R1 forwards the PATH message from the
      sender to the DSBM. The DSBM processes the PATH message and
      forwards the PATH message towards the RSVP receiver (detailed
      message processing and forwarding rules are described in
      Section 5). In the process, the DSBM builds the PATH state,
      remembers the router R1 (its L2 and L3 addresses) as the
      previous hop for the session, puts its own L2 and L3 addresses
      in the PHOP objects (see explanation later), and effectively
      inserts itself as an intermediate node between the sender (or
      R1 in Figure 1) and the receiver (host A) on the managed
      segment.

   b) When an application on host A wishes to make a reservation for
      the RSVP session, host A follows the standard RSVP message
      processing rules and sends an RSVP RESV message to the previous
      hop L2/L3 address (the DSBM's address) obtained from the PHOP
      object(s) in the previously received PATH message.

   c) The DSBM processes the RSVP RESV message based on the bandwidth
      available and returns a RESV_ERR message to the requester (host
      A) if the request cannot be granted. If sufficient resources
      are available and the reservation request is granted, the DSBM
      forwards the RESV message towards the PHOP(s) based on its
      local PATH state for the session.
      The DSBM merges reservation requests for the same session as
      and when possible using rules similar to those used in
      conventional RSVP processing.

   d) If the L2 domain contains more than one managed segment, the
      requester (host A) and the forwarder (router R1) may be
      separated by more than one managed segment. In that case, the
      original PATH message would propagate through many DSBMs (one
      for each managed segment on the path from R1 to A) setting up
      PATH state at each DSBM. Therefore, the RESV message would
      propagate hop-by-hop in reverse through the intermediate DSBMs
      and eventually reach the original forwarder (router R1) on the
      L2 domain if admission control at all DSBMs succeeds.

4.2.2. Enhancements to the conventional RSVP operation

The addition of a DSBM for admission control over managed segments
results in some additions to the RSVP message processing rules at a
DSBM client. The following list motivates and summarizes the
additions; a detailed description of the message processing and
forwarding rules at (D)SBMs and DSBM clients is provided in
Section 5:

- Normal RSVP forwarding rules apply at a DSBM client when it is
  not forwarding an outgoing PATH message over a managed segment.
  However, outgoing PATH messages on a managed segment are sent to
  the DSBM for the corresponding managed segment (Section 5.2
  describes how the PATH messages are sent to the DSBM on a managed
  segment).

- In conventional RSVP processing over point-to-point links, RSVP
  nodes (hosts/routers) use the RSVP_HOP object (NHOP and PHOP info)
  to keep track of the next hop (downstream node in the path of data
  packets in a traffic flow) and the previous hop (upstream node
  with respect to the data flow) on the path between a sender and a
  receiver.
  Routers along the path of a PATH message forward the message
  towards the destination address based on the L3 routing (packet
  forwarding) tables.

  For example, consider the L2 domain in Figure 1. Assume that both
  the sender (some host X) and the receiver (some host Y) in an RSVP
  session reside outside the L2 domain shown in the figure, but
  PATH messages from the sender to its receiver pass through the
  routers in the L2 domain using it as a transit subnet. Assume
  that the PATH message from the sender X arrives at the router R1.
  R1 uses its local routing information to decide which next hop
  router (either router R2 or router R3) to use to forward the PATH
  message towards host Y. However, when the path traverses a
  managed L2 domain, we require the PATH and RESV messages to go
  through a DSBM for each managed segment. Such an L2 domain may
  span many managed segments (and DSBMs) and, typically, SBM
  protocol entities on L2 devices (such as a switch) will serve as
  the DSBMs for the managed segments in a switched topology. When R1
  forwards the PATH message to the DSBM (an L2 device), the DSBM
  may not have the L3 routing information necessary to select the
  egress router (between R2 and R3) before forwarding the PATH
  message. To ensure correct operation and routing of RSVP messages,
  we must provide additional forwarding information to DSBMs.

  For this purpose, we introduce new RSVP objects called LAN_NHOP
  address objects that keep track of the next L3 hop as the PATH
  message traverses an L2 domain between two L3 entities (RSVP PHOP
  and NHOP nodes).

- When a DSBM client (a host or a router acting as the originator
  of a PATH message) sends out a PATH message to the DSBM, it must
  include LAN_NHOP information in the message.
  In the case of a unicast destination, the LAN_NHOP address
  specifies the destination address (if the destination is local to
  its L2 domain) or the address of the next hop router towards the
  destination. In our example of an RSVP session involving the
  sender X and receiver Y with the L2 domain in Figure 1 acting as
  the transit subnet, R1 is the ingress node that receives the PATH
  message. R1 first determines that R2 is the next hop router (or
  the egress node in the L2 domain for the session address) and then
  inserts a LAN_NHOP object that specifies R2's IP address. When a
  DSBM receives a PATH message, it can now look at the address in
  the LAN_NHOP object and forward the PATH message towards the
  egress node after processing the PATH message. However, we expect
  the L2 devices (such as switches) to act as DSBMs on the path
  within the L2 domain and it may not be reasonable to expect these
  devices to have an ARP capability to determine the MAC address (we
  call it L2ADDR for Layer 2 address) corresponding to the IP
  address in the LAN_NHOP object.

  Therefore, we require that the LAN_NHOP information (generated by
  the L3 device) include both the IP address (LAN_NHOP_L3 address)
  and the corresponding MAC address (LAN_NHOP_L2 address) for the
  next L3 hop over the L2 domain. The LAN_NHOP_L3 address is used
  by SBM protocol entities on L3 devices to forward the PATH
  message towards its destination whereas the L2 address is used by
  the SBM protocol entities on L2 devices to determine how to
  forward the PATH message towards the L3 NHOP (egress point from
  the L2 domain). The exact format of the LAN_NHOP information and
  relevant objects is described later in Appendix B.
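
  As an illustration of the LAN_NHOP information just described, the
  following Python sketch shows how an ingress L3 device might fill
  in both addresses and how an SBM entity might choose the address
  it forwards on. The names LanNhop, build_lan_nhop and
  forwarding_address, the neighbor table, and the example addresses
  are assumptions made for this sketch only; the actual object
  formats are those given in Appendix B.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class LanNhop:
    """Hypothetical in-memory view of the LAN_NHOP information."""
    l3_addr: str  # LAN_NHOP_L3 address: IP address of the next L3 hop
    l2_addr: str  # LAN_NHOP_L2 address: MAC address of that same hop


def build_lan_nhop(dest_ip: str, local_hosts: Set[str],
                   next_hop_router: str,
                   neighbor_macs: Dict[str, str]) -> LanNhop:
    """Run at the L3 DSBM client for a unicast session.

    The client, unlike an L2 DSBM, is assumed to be able to map IP
    addresses to MAC addresses (modeled here by neighbor_macs).
    """
    # A destination local to the L2 domain is itself the next L3 hop;
    # otherwise the next L3 hop is the egress router toward it.
    l3 = dest_ip if dest_ip in local_hosts else next_hop_router
    return LanNhop(l3_addr=l3, l2_addr=neighbor_macs[l3])


def forwarding_address(nhop: LanNhop, device_layer: str) -> str:
    """Pick the address an SBM entity uses to forward the PATH message."""
    if device_layer == "L3":
        return nhop.l3_addr
    if device_layer == "L2":
        return nhop.l2_addr
    raise ValueError("device_layer must be 'L2' or 'L3'")


# R1 forwarding a PATH message for a destination outside the L2 domain,
# with R2 chosen as the egress router (example addresses only).
macs = {"192.0.2.2": "00:00:5e:00:53:02", "192.0.2.10": "00:00:5e:00:53:0a"}
nhop = build_lan_nhop("198.51.100.7", {"192.0.2.10"}, "192.0.2.2", macs)
print(nhop, forwarding_address(nhop, "L2"))
```

  Carrying both addresses in one record reflects the reasoning
  above: the L3 device does the address resolution once, so that an
  L2 device on the path never has to.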
   - When a DSBM receives an RSVP PATH message, it processes the PATH
     message according to the PATH processing rules described in the
     RSVP specification. In particular, the DSBM retrieves the IP
     address of the previous hop from the RSVP_HOP object in the PATH
     message and stores the PHOP address in its PATH state. It then
     forwards the PATH message with the PHOP (RSVP_HOP) object
     modified to reflect its own IP address (RSVP_HOP_L3 address).
     Thus, the DSBM inserts itself as an intermediate hop in the chain
     of nodes in the path between two L3 nodes across the L2 domain.

   - The PATH state in a DSBM is used for forwarding subsequent RESV
     messages as per the standard RSVP message processing rules. When
     the DSBM receives a RESV message, it processes the message and
     forwards it to the appropriate PHOP(s) based on its PATH state.

   - Because a DSBM inserts itself as a hop between two RSVP nodes in
     the path of an RSVP flow, all RSVP related messages (such as
     PATH, PATH_TEAR, RESV, RESV_CONFM, RESV_TEAR, and RESV_ERR) now
     flow through the DSBM. In particular, a PATH_TEAR message is
     routed through exactly the same intermediate DSBM(s) as its
     corresponding PATH message, and the local PATH state is first
     cleaned up at each intermediate hop before the PATH_TEAR message
     gets forwarded.

   - So far, we have described how the PATH message propagates through
     the L2 domain establishing PATH state at each DSBM along the
     managed segments in the path. The layer 2 address (LAN_NHOP_L2
     address) in the LAN_NHOP object should be used by the L2 devices
     along the path to decide how to forward the PATH message toward
     the next L3 hop.
     Such devices will apply the standard IEEE 802.1D forwarding rules
     (e.g., send it on a single port based on its filtering database,
     or flood it on all ports active in the spanning tree if the L2
     address does not appear in the filtering database) to the
     LAN_NHOP_L2 address as are applied normally to data packets
     destined to the address.

     In the conventional RSVP message processing, the PATH state
     established along the nodes on a path is used to route the RESV
     message from a receiver to a sender in an RSVP session. As each
     intermediate node builds the path state, it remembers the
     previous hop (stores the PHOP IP address available in the
     RSVP_HOP object of an incoming message) that sent it the PATH
     message and, when the RESV message arrives, the intermediate node
     simply uses the stored PHOP address to forward the RESV after
     processing it successfully.

     In our case, we expect the SBM entities residing at L2 devices to
     act as DSBMs (and, therefore, intermediate RSVP hops in an L2
     domain) along the path between a sender (PHOP) and receiver
     (NHOP). Thus, when a RESV message arrives at a DSBM, it must use
     the stored PHOP IP address to forward the RESV message to its
     previous hop. However, it may not be reasonable to expect the L2
     devices to have an ARP cache or the ARP capability to map the
     PHOP IP address to its corresponding L2 address before forwarding
     the RESV message.

     To obviate the need for such address mapping at L2 devices, we
     use an RSVP_HOP_L2 object in the PATH message. The RSVP_HOP_L2
     object includes the Layer 2 address (L2ADDR) of the previous hop
     and complements the L3 address information included in the
     RSVP_HOP object (RSVP_HOP_L3 address).
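
     The role of the stored previous-hop addresses can be sketched in
     a few lines of Python. This is only an illustration under stated
     assumptions: the class names, the use of a session string as the
     state key (real PATH state is kept per session and sender), and
     the sample addresses are invented for the example and are not
     part of the protocol definition.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Hop:
    l3_addr: str  # carried in RSVP_HOP (RSVP_HOP_L3 address)
    l2_addr: str  # carried in RSVP_HOP_L2 (L2ADDR)


@dataclass(frozen=True)
class PathMsg:
    session: str  # simplified session key (assumption)
    phop: Hop     # RSVP_HOP plus RSVP_HOP_L2 of the previous hop


class Dsbm:
    """Hypothetical DSBM that records PHOP addresses from PATH messages."""

    def __init__(self, own: Hop) -> None:
        self.own = own
        self.path_state: Dict[str, Hop] = {}

    def on_path(self, msg: PathMsg) -> PathMsg:
        # Remember both addresses of the previous hop ...
        self.path_state[msg.session] = msg.phop
        # ... then forward the PATH with our own addresses as PHOP.
        return PathMsg(session=msg.session, phop=self.own)

    def resv_next_hop(self, session: str) -> Hop:
        # The RESV goes to the stored PHOP; the stored L2 address
        # means no ARP lookup is needed on the L2 device.
        return self.path_state[session]


# Example with made-up documentation addresses.
s1 = Dsbm(Hop("192.0.2.10", "00:00:5e:00:53:10"))
incoming = PathMsg("session-1", Hop("192.0.2.1", "00:00:5e:00:53:01"))
outgoing = s1.on_path(incoming)
resv_target = s1.resv_next_hop("session-1")
```

     Keeping the L2 address next to the L3 address in the PATH state
     is what lets the RESV be sent back without any address
     resolution at the switch.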
     When an L3 device constructs and forwards a PATH message over a
     managed segment, it includes its IP address (IP address of the
     interface over which PATH is sent) in the RSVP_HOP object and
     adds an RSVP_HOP_L2 object that includes the corresponding L2
     address for the interface. When a device in the L2 domain
     receives such a PATH message, it remembers the addresses in the
     RSVP_HOP and RSVP_HOP_L2 objects in its PATH state and then
     overwrites the RSVP_HOP and RSVP_HOP_L2 objects with its own
     addresses before forwarding the PATH message over a managed
     segment.

     The exact format of the RSVP_HOP_L2 object is specified in
     Appendix B.

   - When an RSVP session address is a multicast address and a SBM,
     DSBM, and DSBM clients share the same L2 segment (a shared
     segment), it is possible for a SBM or a DSBM client to receive
     one or more copies of a PATH message that it forwarded earlier
     when a DSBM on the same wire forwards it (see Section 5.8 for an
     example of such a case). To facilitate detection of such loops,
     we use a new RSVP object called the LAN_LOOPBACK object. DSBM
     clients or SBMs (but not the DSBMs reflecting a PATH message onto
     the interface over which it arrived earlier) must overwrite (or
     add, if the PATH message does NOT already include a LAN_LOOPBACK
     object) the LAN_LOOPBACK object in the PATH message with their
     own unicast IP address.

     Now, a SBM or a DSBM client can easily detect and discard the
     duplicates by checking the contents of the LAN_LOOPBACK object (a
     duplicate PATH message will list a device's own interface address
     in the LAN_LOOPBACK object). Appendix B specifies the exact
     format of the LAN_LOOPBACK object.

   - The model proposed by the Integrated Services working group
     requires isolation of traffic flows from each other during their
     transit across a network.
     The motivation for traffic flow separation is to provide
     Integrated Services flows protection from misbehaving flows and
     other best-effort traffic that share the same path. The basic
     IEEE 802.3/Ethernet networks do not provide any notion of traffic
     classes to discriminate among different flows that request
     different services. However, IEEE 802.1p defines a way for
     switches to differentiate among several "user_priority" values
     encoded in packets representing different traffic classes (see
     [IEEE802Q, IEEE8021p] for further details). The user_priority
     values can be encoded either in native LAN packets (e.g., in IEEE
     802.5's FC octet) or by using an encapsulation above the MAC
     layer (e.g., in the case of Ethernet, the user_priority value
     assigned to each packet will be carried in the frame header using
     the new, extended frame format defined by IEEE 802.1Q
     [IEEE8021Q]). IEEE, however, makes no recommendations about how a
     sender or network should use the user_priority values. An
     accompanying document makes recommendations on the usage of the
     user_priority values (see [Seaman97] for details).

     Under the Integrated Services model, L3 (or higher) entities that
     transmit traffic flows onto an L2 segment should perform per-flow
     policing to ensure that the flows do not exceed their traffic
     specification as specified during admission control. In addition,
     L3 devices may label the frames in such flows with a
     user_priority value to identify their service class.

     For the purpose of this discussion, we will refer to the
     user_priority value carried in the extended frame header as a
     "traffic class" of a packet. Under the ISSLL model, the L3
     entities that send traffic and that use the SBM protocol may not
     select the traffic class of outgoing packets.
     Instead, once a sender sends a PATH message, downstream DSBMs
     will insert a new traffic class object (TCLASS object) in the
     PATH message that travels to the next L3 device (L3 NHOP for the
     PATH message). To some extent, the TCLASS object contents are
     treated like the ADSPEC object in the RSVP PATH messages. The L3
     device that receives the PATH message must remove and store the
     TCLASS object as part of its PATH state for the session. Later,
     when the same L3 device needs to forward an RSVP RESV message
     towards the original sender, it must include the TCLASS object in
     the RESV message. When the RESV message arrives at the original
     sender, it must pass the user_priority value in the TCLASS object
     to its local packet classifier (traffic control) so that
     subsequent, outgoing data packets for this RSVP flow will have
     the user_priority value included in the extended MAC header.

     The format of the TCLASS object is specified in Appendix B. Note
     that TCLASS and other SBM-specific objects are carried in an RSVP
     message in addition to all the other, normal RSVP objects per RFC
     2205.

     In summary, use of TCLASS objects requires the following
     additions to the conventional RSVP message processing at DSBMs,
     SBMs, and DSBM clients:

     * When a DSBM receives a PATH message over a managed segment and
       the PATH message does not include a TCLASS object, the DSBM
       adds a TCLASS object to the PATH message before forwarding it.
       The DSBM comes up with the appropriate user_priority value for
       the TCLASS object according to some internal mapping of the
       service classes. One possible set of internal mappings is
       proposed as an example in an accompanying document [Seaman97].
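
       The rule in the preceding item can be pictured with a small
       Python sketch. The service names and the user_priority numbers
       in the mapping are placeholders invented for this example; they
       are not the values recommended in [Seaman97], and the function
       name is likewise hypothetical.

```python
from typing import Dict, Optional

# Placeholder mapping from service class to user_priority.
# These numbers are illustrative only (assumption).
EXAMPLE_SERVICE_MAPPING: Dict[str, int] = {
    "guaranteed": 5,
    "controlled_load": 4,
    "best_effort": 0,
}


def tclass_on_path(existing_tclass: Optional[int],
                   service: str,
                   mapping: Dict[str, int]) -> int:
    """Return the user_priority a DSBM would carry in the PATH message.

    If the incoming PATH message has no TCLASS object
    (existing_tclass is None), derive one from the DSBM's internal
    mapping; otherwise keep the value that is already present.
    """
    if existing_tclass is None:
        return mapping[service]
    return existing_tclass


added = tclass_on_path(None, "controlled_load", EXAMPLE_SERVICE_MAPPING)
kept = tclass_on_path(2, "controlled_load", EXAMPLE_SERVICE_MAPPING)
```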
     * When a SBM or DSBM receives a PATH or RESV message with a
       TCLASS object over a managed segment in an L2 domain and needs
       to forward it over a managed segment in the same L2 domain, it
       will typically forward the message without changing the
       contents of the TCLASS object. However, if the DSBM/SBM cannot
       support the service class represented by the user_priority
       value specified by the TCLASS object in the PATH message, it
       may change the priority value in the TCLASS to a semantically
       "lower" service value to reflect its capability.

       [NOTE: An accompanying document that defines the int-serv
       mappings over IEEE 802 networks [Seaman97] provides a precise
       definition of user_priority values and describes how the
       user_priority values are compared to determine the "lower" of
       two values or the "lowest" among all the user_priority values.]

     * When a DSBM receives a RESV message with a TCLASS object, it
       may use the traffic class information (in addition to the usual
       flowspec information in the RSVP message) for its own admission
       control for the managed segment.

       Note that this document does not specify the actual algorithm
       or policy used for admission control. At one extreme, a DSBM
       may use the per-flow reservation request as specified by the
       flowspec for fine-grain admission control. At the other
       extreme, a DSBM may only consider the traffic class information
       for very coarse-grain admission control based on some static
       allocation of link capacity for each traffic class. Any
       combination of the options represented by these two extremes
       may also be used.

     * When a DSBM client (residing at an L3 device such as a host or
       an edge router) receives the TCLASS object in a PATH message
       that it accepts over an interface, it should store the TCLASS
       object as part of its PATH state for the interface.
       Later, when the client forwards a RESV message for the same
       session on the interface, the client must include the TCLASS
       object (unchanged from what was received in the previous PATH
       message) in the RESV message it forwards over the interface.

     * When a DSBM client receives a TCLASS object in an incoming RESV
       message over a managed segment and local admission control
       succeeds for the session for the outgoing interface over the
       managed segment, the client must pass the user_priority value
       in the TCLASS object to its local packet classifier. This will
       ensure that the data packets in the admitted RSVP flow that are
       subsequently forwarded over the outgoing interface will contain
       the appropriate value encoded in their frame header.

     * When an L3 device receives a PATH or RESV message over a
       managed segment in one L2 domain and it needs to forward the
       PATH/RESV message over an interface outside that domain, the L3
       device must remove the TCLASS object (along with LAN_NHOP,
       RSVP_HOP_L2, and LAN_LOOPBACK objects in the case of the PATH
       message) before forwarding the PATH/RESV message. If the
       outgoing interface is on a separate L2 domain, these objects
       may be regenerated according to the processing rules applicable
       to that interface.

5. Detailed Message Processing Rules

5.1. Additional Notes on Terminology

   * An L2 device may have several interfaces with attached segments
     that are part of the same L2 domain. A switch in an L2 domain is
     an example of such a device. A device which has several
     interfaces may contain a SBM protocol entity that acts in
     different capacities on each interface. For example, a SBM
     protocol entity could act as a SBM on interface A, and act as a
     DSBM on interface B.
   * A SBM protocol entity on a layer 3 device can be a DSBM client,
     a SBM, a DSBM, or none of the above (SBM transparent).
     Non-transparent L3 devices can implement any combination of these
     roles simultaneously. DSBM clients always reside at L3 devices.

   * A SBM protocol entity residing at a layer 2 device can be a SBM,
     a DSBM, or none of the above (SBM transparent). A layer 2 device
     will never host a DSBM client.

5.2. Use Of Reserved IP Multicast Addresses

   As stated earlier, we require that the DSBM clients forward the
   RSVP PATH messages to their DSBMs in an L2 domain before they reach
   the next L3 hop in the path. RSVP PATH messages are addressed,
   according to RFC-2205, to their destination address (which can be
   either an IP unicast or multicast address). When an L2 device hosts
   a DSBM, a simple-to-implement mechanism must be provided for the
   device to capture an incoming PATH message and hand it over to the
   local DSBM agent without requiring the L2 device to snoop for L3
   RSVP messages.

   In addition, DSBM clients need to know how to address SBM messages
   to the DSBM. For ease of operation and to allow dynamic DSBM-client
   binding, it should be possible to easily detect and address the
   existing DSBM on a managed segment.

   To facilitate dynamic DSBM-client binding as well as to enable easy
   detection and capture of PATH messages at L2 devices, we require
   that a DSBM be addressed using a logical address rather than a
   physical address. We make use of reserved IP multicast address(es)
   for the purpose of communication with a DSBM. In particular, we
   require that when a DSBM client or a SBM forwards a PATH message
   over a managed segment, it is addressed to a reserved IP multicast
   address.
   Thus, a DSBM on an L2 device needs to be configured in a way to
   make it easy to intercept the PATH message and forward it to the
   local SBM protocol entity. For example, this may involve simply
   adding a static entry in the device's filtering database (FDB) for
   the corresponding MAC multicast address to ensure the PATH messages
   get intercepted and are not forwarded further without the DSBM
   intervention.

   Similarly, a DSBM always sends the PATH messages over a managed
   segment using a reserved IP multicast address and, thus, the SBMs
   or DSBM clients on the managed segments must simply be configured
   to intercept messages addressed to the reserved multicast address
   on the appropriate interfaces to easily receive PATH messages.

   RSVP RESV messages continue to be unicast to the previous hop
   address stored as part of the PATH state at each intermediate hop.

   We define use of two reserved IP multicast addresses. We call these
   the "AllSBMAddress" and the "DSBMLogicalAddress". These are chosen
   from the range of local multicast addresses, such that:

   * They are not passed through layer 3 devices.

   * They are passed transparently through layer 2 devices which are
     SBM transparent.

   * They are configured in the permanent database of layer 2 devices
     which host SBMs or DSBMs, such that they are directed to the SBM
     management entity in these devices. This obviates the need for
     these devices to explicitly snoop for SBM related control
     packets.

   * The two reserved addresses are 224.0.0.16 (DSBMLogicalAddress)
     and 224.0.0.17 (AllSBMAddress).
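
   For reference, the MAC multicast addresses that a static FDB entry
   would need to match can be derived from the two IP group addresses.
   The Python sketch below assumes the standard IP-multicast-to-
   Ethernet mapping (the prefix 01:00:5e followed by the low-order 23
   bits of the group address); the function name is made up for this
   example.

```python
import ipaddress

DSBM_LOGICAL_ADDRESS = "224.0.0.16"
ALL_SBM_ADDRESS = "224.0.0.17"


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address.

    Assumes the standard mapping: the fixed prefix 01:00:5e, one zero
    bit, then the low-order 23 bits of the IP group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    # Emit the six octets from most to least significant.
    return ":".join(f"{(mac >> shift) & 0xFF:02x}"
                    for shift in range(40, -1, -8))


dsbm_logical_mac = multicast_mac(DSBM_LOGICAL_ADDRESS)
all_sbm_mac = multicast_mac(ALL_SBM_ADDRESS)
```

   Under that assumption the two results are 01:00:5e:00:00:10 and
   01:00:5e:00:00:11 respectively.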
   These addresses are used as described in the following table:

   Type    DSBMLogicalAddress         AllSBMAddress

   DSBM    * Sends PATH messages      * Monitors this address to detect
   Client    to this address            the presence of a DSBM
                                      * Monitors this address to
                                        receive PATH messages
                                        forwarded by the DSBM

   SBM     * Sends PATH messages      * Monitors and sends on this
             to this address            address to participate in
                                        election of the DSBM
                                      * Monitors this address to
                                        receive PATH messages
                                        forwarded by the DSBM

   DSBM    * Monitors this address    * Monitors and sends on this
             for PATH messages          address to participate in
             directed to it             election of the DSBM
                                      * Sends PATH messages to this
                                        address

   The L2 or MAC addresses corresponding to IP multicast addresses are
   computed algorithmically using a reserved L2 address block (the
   IANA block 00:00:5e; with the group bit set, the high-order 24 bits
   of the resulting multicast MAC addresses are 01:00:5e). The
   Assigned Numbers RFC [RFC-1700] gives additional details.

5.3. Layer 3 to Layer 2 Address Mapping

   As stated earlier, DSBMs or DSBM clients residing at an L3 device
   must include a LAN_NHOP_L2 address in the LAN_NHOP information so
   that L2 devices along the path of a PATH message do not need to
   separately determine the mapping between the LAN_NHOP_L3 address in
   the LAN_NHOP object and its corresponding L2 address (for example,
   using ARP).

   For the purpose of such mapping at L3 devices, we assume a mapping
   function called "map_addr" that performs the necessary mapping:

      L2ADDR object = map_addr(L3Addr)

   We do not specify how the function is implemented; the
   implementation may simply involve access to the local ARP cache
   entry or may require performing an ARP function. The function
   returns an L2ADDR object that need not be interpreted by an L3
   device and can be treated as an opaque object. The format of the
   L2ADDR object is specified in Appendix B.

5.4. Raw vs. UDP Encapsulation

   We assume that the DSBMs, DSBM clients, and SBMs use only raw IP
   for encapsulating RSVP messages that are forwarded onto an L2
   domain. Thus, when a SBM protocol entity on an L3 device forwards
   an RSVP message onto an L2 segment, it will only use raw IP
   encapsulation.

5.5. The Forwarding Rules

   The message processing and forwarding rules will be described in
   the context of the sample network illustrated in Figure 2.

   Figure 2 - A sample network or L2 domain consisting of switched and
   shared L2 segments

   ..........
            .
   +------+ .  +------+ seg A +------+ seg C +------+ seg D +------+
   |  H1  |____|  R1  |_______|  S1  |_______|  S2  |_______|  H2  |
   |      | .  |      |       |      |       |      |       |      |
   +------+ .  +------+       +------+       +------+       +------+
            .                    |              /
    1.0.0.0 .                    |             /
            .                    |___         /
            .              seg B |           / seg E
   ..........                    |          /
                 2.0.0.0         |         /
                               +-----------+
                               |    S3     |
                               |           |
                               +-----------+
                                     |
                                     |
                                     |
                                     |
                              seg F  |            .................
                   ------------------------------ .
                   |          |          |        .
                +------+   +------+   +------+    .   +------+
                |  H3  |   |  H4  |   |  R2  |________|  H5  |
                |      |   |      |   |      |    .   |      |
                +------+   +------+   +------+    .   +------+
                                                  .
                                                  . 3.0.0.0
                                                  .................

   Figure 2 illustrates a sample network topology consisting of three
   IP subnets (1.0.0.0, 2.0.0.0, and 3.0.0.0) interconnected using two
   routers. The subnet 2.0.0.0 is an example of an L2 domain
   consisting of switches, hosts, and routers interconnected using
   switched segments and a shared L2 segment.
   The sample network contains the following devices:

      Device    Type                SBM Type

      H1, H5    Host (layer 3)      SBM Transparent
      H2-H4     Host (layer 3)      DSBM Client
      R1        Router (layer 3)    SBM
      R2        Router (layer 3)    DSBM for segment F
      S1        Switch (layer 2)    DSBM for segments A, B
      S2        Switch (layer 2)    DSBM for segments C, D, E
      S3        Switch (layer 2)    SBM

   The following paragraphs describe the rules that each of these
   devices should use to forward PATH messages (the rules apply to
   PATH_TEAR messages as well). They are described in the context of
   the general network illustrated above. While the examples do not
   address every scenario, they do address most of the interesting
   scenarios. Exceptions can be discussed separately.

   The forwarding rules are applied to received PATH messages (routers
   and switches) or originating PATH messages (hosts), as follows:

   1. Determine the interface(s) on which to forward the PATH message
      using standard forwarding rules:

      * If there is a LAN_LOOPBACK object in the PATH message, and it
        carries the address of this device, silently discard the
        message. (See the section below on "Additional notes on
        forwarding a PATH message onto a managed segment".)

      * Layer 3 devices use the RSVP session address and perform a
        routing lookup to determine the forwarding interface(s).

      * Layer 2 devices use the LAN_NHOP_L2 address in the LAN_NHOP
        information and MAC forwarding tables to determine the
        forwarding interface(s). (See the section below on "Additional
        notes on forwarding a PATH message onto a managed segment".)

   2. For each forwarding interface:

      * If the device is a layer 3 device, determine whether the
        interface is on a managed segment (a segment managed by a
        DSBM), based on the presence or absence of I_AM_DSBM messages.
        If the interface is not on a managed segment, strip out
        RSVP_HOP_L2, LAN_NHOP, LAN_LOOPBACK, and TCLASS objects (if
        present), and forward to the unicast or multicast destination.

        (Note that the RSVP Class Numbers for these new objects are
        chosen so that if an RSVP message includes these objects, the
        nodes that are RSVP-aware, but do not participate in the SBM
        protocol, will ignore and silently discard such objects.)

      * If the device is a layer 2 device or it is a layer 3 device
        *and* the interface is on a managed segment, proceed to rule
        #3.

   3. Forward the PATH message onto the managed segment:

      * If the device is a layer 3 device, insert LAN_NHOP address
        objects, a LAN_LOOPBACK, and an RSVP_HOP_L2 object into the
        PATH message. The LAN_NHOP objects carry the LAN_NHOP_L3 and
        LAN_NHOP_L2 addresses of the next layer 3 hop. The RSVP_HOP_L2
        object carries the device's own L2 address, and the
        LAN_LOOPBACK object contains the IP address of the outgoing
        interface.

        An L3 device should use the map_addr() function described
        earlier to obtain an L2 address corresponding to an IP
        address.

      * If the device hosts the DSBM for the segment to which the
        forwarding interface is attached, do the following:

        - Retrieve the PHOP information from the standard RSVP HOP
          object in the PATH message, and store it. This will be used
          to route RESV messages back through the L2 network. If the
          PATH message arrived over a managed segment, it will also
          contain the RSVP_HOP_L2 object; then retrieve and store also
          the previous hop's L2 address in the PATH state.

        - Copy the IP address of the forwarding interface (layer 2
          devices must also have IP addresses) into the standard RSVP
          HOP object and the L2 address of the forwarding interface
          into the RSVP_HOP_L2 object.
        - If the PATH message received does not contain the TCLASS
          object, insert a TCLASS object. The user_priority value
          inserted in the TCLASS object is based on service mappings
          internal to the device that are configured according to the
          guidelines listed in [Seaman97]. If the message already
          contains the TCLASS object, the user_priority value may be
          changed based again on the service mappings internal to the
          device.

      * If the device is a layer 3 device and hosts a SBM for the
        segment to which the forwarding interface is attached, it *is
        required* to retrieve and store the PHOP info.

        If the device is a layer 2 device and hosts a SBM for the
        segment to which the forwarding interface is attached, it is
        *not* required to retrieve and store the PHOP info. If it does
        not do so, the SBM must leave the standard RSVP HOP object and
        the RSVP_HOP_L2 objects in the PATH message intact and it will
        not receive RESV messages.

        If the SBM on an L2 device chooses to overwrite the RSVP HOP
        and RSVP_HOP_L2 objects with the IP and L2 addresses of its
        forwarding interface, it will receive RESV messages. In this
        case, it must store the PHOP address info received in the
        standard RSVP_HOP field and RSVP_HOP_L2 objects of the
        incident PATH message.

        In both the cases mentioned above (L2 or L3 devices), the SBM
        must forward the TCLASS object in the received PATH message
        unchanged.

      * Copy the IP address of the forwarding interface into the
        LAN_LOOPBACK object, unless the SBM protocol entity is a DSBM
        reflecting a PATH message back onto the incident interface.
        (See the section below on "Additional notes on forwarding a
        PATH message onto a managed segment").
      * If the SBM protocol entity is the DSBM for the segment to
        which the forwarding interface is attached, it must send the
        PATH message to the AllSBMAddress.

      * If the SBM protocol entity is a SBM or a DSBM Client on the
        segment to which the forwarding interface is attached, it must
        send the PATH message to the DSBMLogicalAddress.

5.6.1. Additional notes on forwarding a PATH message onto a managed
       segment

   Rule #1 states that normal IEEE 802.1D forwarding rules should be
   used to determine the interfaces on which the PATH message should
   be forwarded. In the case of data packets, standard forwarding
   rules at an L2 device dictate that the packet should not be
   forwarded on the interface from which it was received. However, in
   the case of a DSBM that receives a PATH message over a managed
   segment, the following exceptions apply:

   E1. If the address in the LAN_NHOP object is a unicast address,
       consult the filtering database (FDB) to determine whether the
       destination address is listed on the same interface over which
       the message was received. If yes, follow the rule on
       "reflecting a PATH message back onto an interface" described
       below; otherwise, proceed with the rest of the message
       processing as usual.

   E2. If there are members of the multicast group address (specified
       by the addresses in the LAN_NHOP object) on the segment from
       which the message was received, the message should be
       forwarded back onto the interface from which it was received,
       following the rule on "reflecting a PATH message back onto an
       interface" described below.

   *** Reflecting a PATH message back onto an interface ***

   Under the circumstances described above, when a DSBM reflects the
   PATH message back onto an interface over which it was received, it
   must address it using the AllSBMAddress.
   Since it is possible for a DSBM to reflect a PATH message back onto
   the interface from which it was received, precautions must be taken
   to avoid looping these messages indefinitely. The LAN_LOOPBACK
   object addresses this issue. All SBM protocol entities (except
   DSBMs reflecting a PATH message) overwrite the LAN_LOOPBACK object
   in the PATH message with the IP address of the outgoing interface.
   DSBMs which are reflecting a PATH message leave the LAN_LOOPBACK
   object unchanged. Thus, SBM protocol entities will always be able
   to recognize a reflected multicast message by the presence of their
   own address in the LAN_LOOPBACK object. These messages should be
   silently discarded.

5.7. Applying the Rules -- Unicast Session

   Let's see how the rules are applied in the general network
   illustrated previously (see Figure 2).

   Assume that H1 is sending a PATH for a unicast session for which H5
   is the receiver. The following PATH message is composed by H1:

      RSVP Contents
      RSVP session IP address   IP address of H5 (3.0.0.35)
      Sender Template           IP address of H1 (1.0.0.11)
      PHOP                      IP address of H1 (1.0.0.11)
      RSVP_HOP_L2               n/a (H1 is not sending onto a managed
                                segment)
      LAN_NHOP                  n/a (H1 is not sending onto a managed
                                segment)
      LAN_LOOPBACK              n/a (H1 is not sending onto a managed
                                segment)

      IP Header
      Source address            IP address of H1 (1.0.0.11)
      Destn address             IP addr of H5 (3.0.0.35, assuming raw
                                mode & router alert)

      MAC Header
      Destn address             The L2 addr corresponding to R1
                                (determined by map_addr() and routing
                                tables at H1)

   Since H1 is not sending onto a managed segment, the PATH message is
   composed and forwarded according to standard RSVP processing rules.
   Upon receipt of the PATH message, R1 composes and forwards a PATH
   message as follows:

      RSVP Contents
      RSVP session IP address   IP address of H5
      Sender Template           IP address of H1
      PHOP                      IP address of R1 (2.0.0.1)
                                (seed the return path for RESV
                                messages)
      RSVP_HOP_L2               L2 address of R1
      LAN_NHOP                  LAN_NHOP_L3 (2.0.0.2) and
                                LAN_NHOP_L2 address of R2 (L2ADDR)
                                (this is the next layer 3 hop)
      LAN_LOOPBACK              IP address of R1 (2.0.0.1)

      IP Header
      Source address            IP address of H1
      Destn address             DSBMLogical IP address (224.0.0.16)

      MAC Header
      Destn address             DSBMLogical MAC address

   * R1 does a routing lookup on the RSVP session address, to
     determine the IP address of the next layer 3 hop, R2.

   * It determines that R2 is accessible via seg A and that seg A is
     managed by a DSBM, S1.

   * Therefore, it concludes that it is sending onto a managed
     segment, and composes LAN_NHOP objects to carry the layer 3 and
     layer 2 next hop addresses. To compose the LAN_NHOP L2ADDR
     object, it invokes the L3-to-L2 address mapping function
     ("map_addr") to find out the MAC address for the next hop L3
     device, and then inserts a LAN_NHOP_L2ADDR object (that carries
     the MAC address) in the message.

   * Since R1 is not the DSBM for seg A, it sends the PATH message to
     the DSBMLogicalAddress.
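
   The choice of IP destination address made by R1 in the last step
   follows the addressing rules in step 3 of the forwarding rules. A
   minimal Python sketch of that choice is shown here; the function
   name and the role strings are assumptions made for the example,
   while the two group addresses are the ones reserved in Section 5.2.

```python
DSBM_LOGICAL_ADDRESS = "224.0.0.16"
ALL_SBM_ADDRESS = "224.0.0.17"


def path_destination(role: str, segment_is_managed: bool,
                     session_addr: str) -> str:
    """Pick the IP destination address for a forwarded PATH message.

    role is the sender's role on the outgoing segment:
    "DSBM", "SBM" or "DSBM_CLIENT" (labels invented for this sketch).
    """
    if not segment_is_managed:
        # Standard RSVP: address the message to the session address.
        return session_addr
    if role == "DSBM":
        return ALL_SBM_ADDRESS
    if role in ("SBM", "DSBM_CLIENT"):
        return DSBM_LOGICAL_ADDRESS
    raise ValueError(f"unknown role: {role}")


# R1 is a SBM on seg A, which is managed by the DSBM S1.
r1_dest = path_destination("SBM", True, "3.0.0.35")
# S1 is the DSBM for seg B.
s1_dest = path_destination("DSBM", True, "3.0.0.35")
```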
   Upon receipt of the PATH message, S1 composes and forwards a PATH
   message as follows:

      RSVP Contents
      RSVP session IP address   IP address of H5
      Sender Template           IP address of H1
      PHOP                      IP addr of S1 (seed the return path
                                for RESV messages)
      RSVP_HOP_L2               L2 address of S1
      LAN_NHOP                  LAN_NHOP_L3 (IP) and LAN_NHOP_L2
                                address of R2
                                (layer 2 devices do not modify the
                                LAN_NHOP)
      LAN_LOOPBACK              IP addr of S1

      IP Header
      Source address            IP address of H1
      Destn address             AllSBMIPaddr (224.0.0.17, since S1 is
                                the DSBM for seg B)

      MAC Header
      Destn address             All SBM MAC address (since S1 is the
                                DSBM for seg B)

   * S1 looks at the LAN_NHOP address information to determine the L2
     address towards which it should forward the PATH message.

   * From the bridge forwarding tables, it determines that the L2
     address is reachable via seg B.

   * S1 inserts the RSVP_HOP_L2 object and overwrites the RSVP HOP
     object (PHOP) with its own addresses.

   * Since S1 is the DSBM for seg B, it addresses the PATH message to
     the AllSBMAddress.

   Upon receipt of the PATH message, S3 composes and forwards a PATH
   message as follows:

      RSVP Contents
      RSVP session IP addr      IP address of H5
      Sender Template           IP address of H1
      PHOP                      IP addr of S3 (seed the return path
                                for RESV messages)
      RSVP_HOP_L2               L2 address of S3
      LAN_NHOP                  LAN_NHOP_L3 (IP) and
                                LAN_NHOP_L2 (MAC) address of R2
                                (L2 devices don't modify LAN_NHOP)
      LAN_LOOPBACK              IP address of S3

      IP Header
      Source address            IP address of H1
      Destn address             DSBMLogical IP addr (since S3 is not
                                the DSBM for seg F)

      MAC Header
      Destn address             DSBMLogical MAC address

   * S3 looks at the LAN_NHOP address information to determine the L2
     address towards which it should forward the PATH message.

   *  From the bridge forwarding tables, it determines that the L2
      address is reachable via segment F.

   *  It has discovered that R2 is the DSBM for segment F. It
      therefore sends the PATH message to the DSBMLogicalAddress.

   *  Note that S3 may or may not choose to overwrite the PHOP
      objects with its own IP and L2 addresses. If it does so, it
      will receive RESV messages. In this case, it must also store
      the PHOP info received in the incident PATH message so that
      it is able to forward the RESV messages on the correct path.

   Upon receipt of the PATH message, R2 composes and forwards a PATH
   message as follows:

      RSVP Contents
         RSVP session IP addr      IP address of H5
         Sender Template           IP address of H1
         PHOP                      IP addr of R2 (seed the return path
                                   for RESV messages)
         RSVP_HOP_L2               Removed by R2 (R2 is not sending
                                   onto a managed segment)
         LAN_NHOP                  Removed by R2 (R2 is not sending
                                   onto a managed segment)

      IP Header
         Source address            IP address of H1
         Destn address             IP address of H5, the RSVP session
                                   address

      MAC Header
         Destn address             L2 addr corresponding to H5, the
                                   next layer 3 hop

   *  R2 does a routing lookup on the RSVP session address, to
      determine the IP address of the next layer 3 hop, H5.

   *  It determines that H5 is accessible via a segment for which
      there is no DSBM (not a managed segment).

   *  Therefore, it removes the LAN_NHOP and RSVP_HOP_L2 objects
      and places the RSVP session address in the destination
      address of the IP header. It places the L2 address of the
      next layer 3 hop into the destination address of the MAC
      header and forwards the PATH message to H5.

5.8. Applying the Rules - Multicast Session

   The rules described above also apply to multicast (m/c) sessions.
   For the purpose of this discussion, it is assumed that layer 2
   devices track multicast group membership on each port
   individually. Layer 2 devices which do not do so will merely
   generate extra multicast traffic. This is the case for L2 devices
   which do not implement multicast filtering or GARP/GMRP capability.

   Assume that H1 is sending a PATH for an m/c session for which H3
   and H5 are the receivers. The rules are applied as they are in the
   unicast case described previously, until the PATH message reaches
   R2, with the following exception. The RSVP session address and the
   LAN_NHOP carry the destination m/c addresses rather than the
   unicast addresses carried in the unicast example.

   Now let's look at the processing applied by R2 upon receipt of the
   PATH message. Recall that R2 is the DSBM for segment F. Therefore,
   S3 will have forwarded its PATH message to the DSBMLogicalAddress,
   to be picked up by R2. The PATH message will not have been seen by
   H3 (one of the m/c receivers), since it monitors only the
   AllSBMAddress, not the DSBMLogicalAddress, for incoming PATH
   messages. We rely on R2 to reflect the PATH message back onto
   seg F, and to forward it to H5.
   R2 forwards the following PATH message onto seg F:

      RSVP Contents
         RSVP session addr         m/c session address
         Sender Template           IP address of H1
         PHOP                      IP addr of R2 (seed the return path
                                   for RESV messages)
         RSVP_HOP_L2               L2 addr of R2
         LAN_NHOP                  m/c session address and
                                   corresponding L2 address
         LAN_LOOPBACK              IP addr of S3 (DSBMs reflecting a
                                   PATH message don't modify this
                                   object)

      IP Header
         Source address            IP address of H1
         Destn address             AllSBMIP address (since R2 is the
                                   DSBM for seg F)

      MAC Header
         Destn address             AllSBMMAC address (since R2 is the
                                   DSBM for seg F)

   Since H3 is monitoring the AllSBMAddress, it will receive the
   PATH message reflected by R2. Note that R2 violated the standard
   forwarding rules here by sending an incoming message back onto the
   interface from which it was received. It protected against loops
   by leaving S3's address in the LAN_LOOPBACK object unchanged.

   R2 forwards the following PATH message on to H5:

      RSVP Contents
         RSVP session addr         m/c session address
         Sender Template           IP address of H1
         PHOP                      IP addr of R2 (seed the return path
                                   for RESV messages)
         RSVP_HOP_L2               Removed by R2 (R2 is not sending
                                   onto a managed segment)
         LAN_NHOP                  Removed by R2 (R2 is not sending
                                   onto a managed segment)
         LAN_LOOPBACK              Removed by R2 (R2 is not sending
                                   onto a managed segment)

      IP Header
         Source address            IP address of H1
         Destn address             m/c session address

      MAC Header
         Destn address             MAC addr corresponding to the m/c
                                   session address

   *  R2 determines that there is an m/c receiver accessible via a
      segment for which there is no DSBM. Therefore, it removes the
      LAN_NHOP and RSVP_HOP_L2 objects and places the RSVP session
      address in the destination address of the IP header.
      It places the corresponding L2 address into the destination
      address of the MAC header and multicasts the message towards
      H5.

5.9. Merging Traffic Class objects

   When a DSBM client receives TCLASS objects from different senders
   (different PATH messages) in the same RSVP session and needs to
   combine them for sending back a single RESV message (as in a
   wild-card style reservation), the DSBM client should use the
   semantically "lowest" user_priority value among the values
   received in TCLASS objects of the PATH messages.

   Similarly, when a SBM or DSBM needs to merge RESVs from different
   next hops at a merge point, it should merge the TCLASS values in
   the incoming RESVs to the semantically "lowest" user_priority
   value among those received.

   [NOTE: As stated earlier, an accompanying document that defines
   the int-serv mappings over IEEE 802 networks [Seaman97] provides a
   precise definition of user_priority values and describes how the
   priority values are compared to determine the semantically "lower"
   of two values or the semantically "lowest" among all the
   user_priority values.]

5.10. Operation of SBM Transparent Devices

   We previously defined SBM Transparent devices. Since no SBM
   transparent devices were illustrated in the example provided, we
   describe their operation in the following paragraphs.

   SBM transparent devices are unaware of the entire SBM/DSBM
   protocol. They do not intercept messages addressed to either of
   the SBM related local group addresses (the DSBMLogicalAddress and
   the AllSBMAddress), but instead pass them through. As a result,
   they do not divide the DSBM election scope, they do not explicitly
   participate in routing of PATH or RESV messages, and they do not
   participate in admission control.
   They are entirely transparent with respect to SBM operation.

   According to the definitions provided, physical segments
   interconnected by SBM transparent devices are considered a single
   managed segment. Therefore, DSBMs must perform admission control on
   such managed segments, with limited knowledge of the segment's
   topology. In this case, the network administrator should configure
   the DSBM for each managed segment, with some reasonable
   approximation of the segment's capacity. A conservative policy
   would configure the DSBM for the lowest capacity route through the
   managed segment. A liberal policy would configure the DSBM for the
   highest capacity route through the managed segment. A network
   administrator will likely choose some value between the two, based
   on the level of guarantee required and some knowledge of likely
   traffic patterns.

   This document does not specify the configuration mechanism or the
   choice of a policy.

5.11. Operation of SBMs Which are NOT DSBMs

   In the example illustrated, S3 hosts a SBM, but the SBM on S3 did
   not win the election to act as DSBM on any segment. One might ask
   what purpose such a SBM protocol entity serves. Such SBMs actually
   provide two useful functions. First, the additional SBMs remain
   passive in the background for fault tolerance. They listen to the
   periodic announcements from the current DSBM for the managed
   segment (Appendix A describes this in more detail) and step in to
   elect a new DSBM when the current DSBM fails or ceases to be
   operational for some reason. Second, such SBMs also provide the
   important service of dividing the election scope and reducing the
   size and complexity of managed segments. For example, consider the
   sample topology in Figure 3 again.
   The device S3 contains an SBM that is not a DSBM for any of the
   segments B, E, or F attached to it. However, if the SBM protocol
   entity on S3 were not present, segments B and F would not be
   separate segments from the point of view of the SBM protocol.
   Instead, they would constitute a single managed segment, managed by
   a single DSBM. Because the SBM entity on S3 divides the election
   scope, seg B and seg F are each managed by separate DSBMs. Each of
   these segments has a trivial topology and a well defined capacity.
   As a result, the DSBMs for these segments do not need to perform
   admission control based on approximations (as would be the case if
   S3 were SBM transparent).

   Note that SBM protocol entities which are not DSBMs are not
   required to overwrite the PHOP in incident PATH messages with
   their own address. This is because it is not necessary for RESV
   messages to be routed through these devices. RESV messages are
   only required to be routed through the correct sequence of DSBMs.
   SBMs may not process RESV messages that do pass through them,
   other than to forward them towards their destination address,
   using standard forwarding rules.

   SBM protocol entities which are not DSBMs are required to
   overwrite the address in the LAN_LOOPBACK object with their own
   address, in order to avoid looping multicast messages. However, no
   state need be stored.

6. Inter-Operability Considerations

   There are a few interesting inter-operability issues related to
   the deployment of a DSBM-based admission control method in an
   environment consisting of network nodes with and without RSVP
   capability. In the following, we list some of these scenarios and
   explain how SBM-aware clients and nodes can operate in those
   scenarios:

6.1. An L2 domain with no RSVP capability.

   It is possible to envisage L2 domains that do not use RSVP
   signaling for requesting resource reservations, but instead use
   some other mechanism (e.g., SNMP or static configuration) to
   reserve bandwidth at a particular network device such as a router.
   In that case, the question is how a DSBM-based admission control
   method works and interoperates with the non-RSVP mechanism. The
   SBM-based method does not attempt to provide an admission control
   solution for such an environment. The SBM-based approach is part
   of an end-to-end signaling approach to establish resource
   reservations and does not attempt to provide a solution for the
   SNMP-based configuration scenario.

   As stated earlier, the SBM-based approach can, however, co-exist
   with any other, non-RSVP bandwidth allocation mechanism as long as
   resources being reserved are either partitioned statically between
   the different mechanisms or are resolved dynamically through a
   common bandwidth allocator so that there is no over-commitment of
   the same resource.

6.2. An L2 domain with SBM-transparent L2 Devices.

   This scenario has been addressed earlier in the document. The
   SBM-based method is designed to operate in such an environment.
   When SBM-transparent L2 devices interconnect SBM-aware devices,
   the resulting managed segment is a combination of one or more
   physical segments and the DSBM for the managed segment may not be
   as efficient in allocating resources as it would if all L2 devices
   were SBM-aware.

6.3. An L2 domain on which some RSVP-based senders are not DSBM
     clients.

   All senders that are sourcing RSVP-based traffic flows onto a
   managed segment MUST be SBM-aware and participate in the SBM
   protocol.
   Use of the standard, non-SBM version of RSVP may result in
   over-allocation of resources, as such use bypasses the resource
   management function of the DSBM. All other senders (i.e., senders
   that are not sending streams subject to RSVP admission control)
   should be elastic applications that send traffic of lower priority
   than the RSVP traffic, and use TCP-like congestion avoidance
   mechanisms.

   All DSBMs, SBMs, or DSBM clients on a managed segment (a segment
   with a currently active DSBM) must not accept PATH messages from
   senders that are not SBM-aware. PATH messages from such devices
   can be easily detected by SBMs and DSBM clients as they would not
   be multicast to the AllSBMAddress (in case of SBMs and DSBM
   clients) or the DSBMLogicalAddress (in case of DSBMs).

6.4. A non-SBM router that interconnects two DSBM-managed L2
     domains.

   Multicast SBM messages (e.g., election and PATH messages) have
   local scope and are not intended to pass between the two domains.
   A correctly configured non-SBM router will not pass such messages
   between the domains. A broken router implementation that does so
   may cause incorrect operation of the SBM protocol and consequent
   over- or under-allocation of resources.

6.5. Interoperability with RSVP clients that use UDP encapsulation
     and are not capable of receiving/sending RSVP messages using
     RAW_IP

   This document stipulates that DSBMs, DSBM clients, and SBMs use
   only raw IP for encapsulating RSVP messages that are forwarded
   onto a L2 domain. RFC-2205 (the RSVP Proposed Standard) includes
   support for both raw IP and UDP encapsulation. Thus, a RSVP node
   using only the UDP encapsulation will not be able to interoperate
   with the DSBM unless the DSBM accepts and supports UDP
   encapsulated RSVP messages.

7. Guidelines for Implementors

   In the following, we provide guidelines for implementors on
   different aspects of the implementation of the SBM-based admission
   control procedure including suggestions for DSBM initialization,
   etc.

7.1. DSBM Initialization

   As stated earlier, DSBM initialization includes configuration of
   maximum bandwidth that can be reserved on a managed segment under
   its control. We suggest the following guideline.

   In the case of a managed segment consisting of L2 devices
   interconnected by a single shared segment, DSBM entities on such
   devices should assume the bandwidth of the interface as the total
   link bandwidth. In the case of a DSBM located in a L2 switch, it
   might additionally need to be configured with an estimate of the
   device's switching capacity if that is less than the link
   bandwidth, and possibly with some estimate of the buffering
   resources of the switch (see [Ghanwani98] for the architectural
   model assumed for L2 switches). Given the total link bandwidth,
   the DSBM may be further configured to limit the maximum amount of
   bandwidth for RSVP-enabled flows to ensure spare capacity for
   best-effort traffic.

7.2. Operation of DSBMs in Different L2 Topologies

   Depending on a L2 topology, a DSBM may be called upon to manage
   resources for one or more segments and the implementors must bear
   in mind efficiency implications of the use of DSBM in different L2
   topologies. Trivial L2 topologies consist of a single "physical
   segment". In this case, the 'managed segment' is equivalent to a
   single segment. Complex L2 topologies may consist of a number of
   'physical segments', separated by SBM-transparent L2 switches.
   Admission control on such an L2 extended segment can be performed
   from a single pool of resources, similar to a single shared
   segment, from the point of view of a single DSBM.

   This configuration compromises the efficiency with which the DSBM
   can allocate resources. This is because the single DSBM is
   required to make admission control decisions for all reservation
   requests within the L2 topology, with no knowledge of the actual
   physical segments affected by the reservation.

   We can realize improvements in the efficiency of resource
   allocation by subdividing the complex segment into a number of
   managed segments, each managed by its own DSBM. In this case, each
   DSBM manages a managed segment having a relatively simple
   topology. Since managed segments are simpler, the DSBM can be
   configured with a more accurate estimate of the resources
   available for all reservations in the managed segment. In the
   ultimate configuration, each physical segment is a managed segment
   and is managed by its own DSBM. We make no assumption about the
   number of managed segments but state, simply, that in complex L2
   topologies, the efficiency of resource allocation improves as the
   granularity of managed segments increases.

8. Security Considerations

   The message formatting and usage rules described in this note
   raise some security issues, but they are no different than the
   ones raised by the use of RSVP and Integrated Services: the need
   to control and authenticate access to enhanced qualities of
   service. This requirement is discussed further in [RFC-2205],
   [RFC-2211], and [RFC-2212]. [Baker97] describes the mechanism used
   to protect the integrity of RSVP messages carrying the information
   described here.
   A SBM implementation should satisfy these requirements and provide
   the suggested mechanisms just as though it were a conventional
   RSVP implementation and also protect the additional, SBM-specific
   objects in a message.

   In addition, it is also necessary to authenticate DSBM candidates
   during the election process, and a mechanism based on a shared
   secret among the DSBM candidates may be used. The mechanism
   defined in [Baker97] should be used.

9. References

   [RFC-2205] R. Braden, L. Zhang, S. Berson, S. Herzog, S. Jamin,
   "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional
   Specification", RFC-2205, September 1997.

   [Baker97] F. Baker, "RSVP Cryptographic Authentication",
   draft-ietf-rsvp-md5-05.txt, August 1997.

   [RFC-2206] F. Baker, J. Krawczyk, "RSVP Management Information
   Base", RFC 2206, September 1997.

   [RFC-2211] J. Wroclawski, "Specification of the Controlled-Load
   Network Element Service", RFC-2211, September 1997.

   [RFC-2212] S. Shenker, C. Partridge, R. Guerin, "Specification of
   Guaranteed Quality of Service", RFC-2212, September 1997.

   [RFC-2215] S. Shenker, J. Wroclawski, "General Characterization
   Parameters for Integrated Service Network Elements", RFC-2215,
   September 1997.

   [RFC-2210] J. Wroclawski, "The Use of RSVP with IETF Integrated
   Services", RFC 2210, September 1997.

   [RFC-2213] F. Baker, J. Krawczyk, "Integrated Services Management
   Information Base", RFC 2213, September 1997.

   [Ghanwani98] A. Ghanwani, W. Pace, V. Srinivasan, A. Smith,
   M. Seaman, "A Framework for Providing Integrated Services Over
   Shared and Switched LAN Technologies", Internet Draft, March 1998.

   [Seaman97] M. Seaman, A. Smith, E. Crawley, "Integrated Service
   Mappings on IEEE 802 Networks", Internet Draft, November 1997.

   [IEEE802Q] "IEEE Standards for Local and Metropolitan Area
   Networks: Virtual Bridged Local Area Networks", Draft Standard
   P802.1Q/D9, February 20, 1998.

   [IEEEP8021p] "Information technology - Telecommunications and
   information exchange between systems - Local and metropolitan area
   networks - Common specifications - Part 3: Media Access Control
   (MAC) Bridges: Revision (Incorporating IEEE P802.1p: Traffic Class
   Expediting and Dynamic Multicast Filtering)", ISO/IEC Final CD
   15802-3 IEEE P802.1D/D15, November 24, 1997.

   [IEEE8021D] "MAC Bridges", ISO/IEC 10038, ANSI/IEEE Std
   802.1D-1993.

APPENDIX A
DSBM Election Algorithm

A.1. Introduction

   To simplify the rest of this discussion, we will assume that there
   is a single DSBM for the entire L2 domain (i.e., assume a shared
   L2 segment for the entire L2 domain). Later, we will discuss how a
   DSBM is elected for a half-duplex or full-duplex switched segment.

   To allow for quick recovery from the failure of a DSBM, we assume
   that additional SBMs may be active in a L2 domain for fault
   tolerance. When more than one SBM is active in a L2 domain, the
   SBMs use an election algorithm to elect a DSBM for the L2 domain.
   After the DSBM is elected and is operational, other SBMs remain
   passive in the background to step in to elect a new DSBM when
   necessary. The protocol for electing and discovering DSBM is
   called the "DSBM election protocol" and is described in the rest
   of this Appendix.

A.1.1. How a DSBM Client Detects a Managed Segment

   Once elected, a DSBM periodically multicasts an I_AM_DSBM message
   on the AllSBMAddress to indicate its presence. The message is sent
   every period (e.g., every 5 seconds) according to the
   DSBMRefreshInterval timer value (a configuration parameter).
   Absence of such a message over a certain time interval (called
   "DSBMDeadInterval"; another configuration parameter typically set
   to a multiple of RefreshInterval) indicates that the DSBM has
   failed or terminated and triggers another round of the DSBM
   election. The DSBM clients always listen for periodic DSBM
   advertisements. The advertisement includes the unicast IP address
   of the DSBM (DSBMAddress) and DSBM clients send their PATH/RESV
   (or other) messages to the DSBM. When a DSBM client detects the
   failure of a DSBM, it waits for a subsequent I_AM_DSBM
   advertisement before resuming any communication with the DSBM.
   During the period when a DSBM is not present, a DSBM client may
   forward outgoing PATH messages using the standard RSVP forwarding
   rules.

   The exact message formats and addresses used for communication
   with (and among) SBM(s) are described in Appendix B.

A.2. Overview of the DSBM Election Procedure

   When a SBM first starts up, it listens for incoming DSBM
   advertisements for some period to check whether a DSBM already
   exists in its L2 domain. If one already exists (and no new
   election is in progress), the new SBM stays quiet in the
   background until an election of DSBM is necessary. All messages
   related to the DSBM election and DSBM advertisements are always
   sent to the AllSBMAddress.

   If no DSBM exists, the SBM initiates the election of a DSBM by
   sending out a DSBM_WILLING message that lists its IP address as a
   candidate DSBM and its "SBM priority". Each SBM is assigned a
   priority to determine its relative precedence. When more than one
   SBM candidate exists, the SBM priority determines who gets to be
   the DSBM based on the relative priority of candidates.
   If there is a tie based on the priority value, the tie is broken
   using the IP addresses of tied candidates (one with the higher IP
   address in the lexicographic order wins). The details of the
   election protocol start in Section A.4.

A.2.1. Summary of the Election Algorithm

   For the purpose of the algorithm, a SBM is in one of the four
   states (Idle, DetectDSBM, ElectDSBM, I_AM_DSBM).

   A SBM (call it X) starts up in the DetectDSBM state and waits for
   a ListenInterval for incoming I_AM_DSBM (DSBM advertisement) or
   DSBM_WILLING messages. If an I_AM_DSBM advertisement is received
   during this state, the SBM notes the current DSBM (its IP address
   and priority) and enters the Idle state. If a DSBM_WILLING message
   is received from another SBM (call it Y) during this state, then X
   enters the ElectDSBM state. Before entering the new state, X first
   checks to see whether it itself is a better candidate than Y and,
   if so, sends out a DSBM_WILLING message and then enters the
   ElectDSBM state.

   When a SBM (call it X) enters the ElectDSBM state, it sets a timer
   (called ElectionIntervalTimer that is typically set to a value at
   least equal to the DeadIntervalTimer value) to wait for the
   election to finish and to discover who is the best candidate. In
   this state, X keeps track of the best (or better) candidate seen
   so far (including itself). Whenever it receives another
   DSBM_WILLING message, it updates its notion of the best (or
   better) candidate based on the priority (and tie-breaking)
   criterion. During the ElectionInterval, X sends out a DSBM_WILLING
   message every RefreshInterval to (re)assert its candidacy.

   At the end of the ElectionInterval, X checks whether it is the
   best candidate so far.
   If so, it declares itself to be the DSBM (by sending out the
   I_AM_DSBM advertisement) and enters the I_AM_DSBM state;
   otherwise, it decides to wait for the best candidate to declare
   itself the winner. To wait, X re-initializes its ElectDSBM state
   and continues to wait for another round of election (each round
   lasts for an ElectionInterval duration).

   A SBM is in Idle state when no election is in progress and the
   DSBM is already elected (and happens to be someone else). In this
   state, it listens for incoming I_AM_DSBM advertisements and uses
   a DSBMDeadInterval timer to detect the failure of DSBM. Every time
   the advertisement is received, the timer is restarted. If the
   timer fires, the SBM goes into the DetectDSBM state to prepare to
   elect the new DSBM. If a SBM receives a DSBM_WILLING message from
   the current DSBM in this state, the SBM enters the ElectDSBM state
   after sending out a DSBM_WILLING message (to announce its own
   candidacy).

   In the I_AM_DSBM state, the DSBM sends out I_AM_DSBM
   advertisements every refresh interval. If the DSBM wishes to shut
   down (gracefully terminate), it sends out a DSBM_WILLING message
   (with SBM priority value set to zero) to initiate the election
   procedure. The priority value zero effectively removes the
   outgoing DSBM from the election procedure and makes way for the
   election of a different DSBM.

A.3. Recovering from DSBM Failure

   When a DSBM fails (DSBMDeadInterval timer fires), all the SBMs
   enter the ElectDSBM state and start the election process.

   At the end of the ElectionInterval, the elected DSBM sends out an
   I_AM_DSBM advertisement and the DSBM is then operational.

A.4. DSBM Advertisements

   The I_AM_DSBM advertisement contains the following information:

   1. DSBM address information -- contains the IP and L2 addresses
      of the DSBM and its SBM priority (a configuration parameter
      -- priority specified by a network administrator). The
      priority value is used to choose among candidate SBMs during
      the election algorithm. Higher integer values indicate higher
      priority and the value is in the range 0..255. The value zero
      indicates that the SBM is not eligible to be the DSBM. The
      IP address is required and used for breaking ties. The L2
      address is for the interface of the managed segment.

   2. refresh interval -- contains the value of the refresh
      interval in seconds. Value zero indicates the parameter has
      been omitted in the message. Receivers may substitute their
      own default value in this case.

   3. DSBMDeadInterval -- contains the value of the
      DSBMDeadInterval in seconds. If the value is omitted (or value
      zero is specified), a default value (from initial
      configuration) should be used.

A.5. DSBM_WILLING Messages

   When a SBM wishes to declare its candidacy to be the DSBM during
   an election phase, it sends out a DSBM_WILLING message. The
   DSBM_WILLING message contains the following information:

   1. DSBM address information -- Contains the SBM's own addresses
      (IP and L2 address), if it wishes to be the DSBM. The IP
      address is required and used for breaking ties. The L2
      address is the address of the interface for the managed
      segment in question. Also, the DSBM address information
      includes the corresponding priority of the SBM whose address
      is given above.

A.6. SBM State Variables

   For each network interface, a SBM maintains the following state
   variables related to the election of the DSBM for the L2 domain on
   that interface:

   a) LocalDSBMAddrInfo -- current DSBM's IP address (initially,
      0.0.0.0) and priority. All IP addresses are assumed to be in
      network byte order. In addition, current DSBM's L2 address is
      also stored as part of this state information.

   b) OwnAddrInfo -- SBM's own IP address and L2 address for the
      interface and its own priority (a configuration parameter).

   c) DSBM RefreshInterval in seconds. When the DSBM is not yet
      elected, it is set to a default value specified as a
      configuration parameter.

   d) DSBMDeadInterval in seconds. When the DSBM is not yet
      elected, it is initially set to a default value specified as
      a configuration parameter.

   f) ListenInterval in seconds -- a configuration parameter
      that decides how long a SBM spends in the DetectDSBM state
      (see below).

   g) ElectionInterval in seconds -- a configuration parameter
      that decides how long a SBM spends in the ElectDSBM state
      when it has declared its candidacy.

   Figure 3 shows the state transition diagram for the election
   protocol and the various states are described below. A complete
   description of the state machine is provided in Section A.10.

A.7. DSBM Election States

      DOWN -- SBM is not operational.

      DetectDSBM -- typically, the initial state of a SBM when it
      starts up. In this state, it checks to see whether a DSBM
      already exists in its domain.

      Idle -- SBM is in this state when no election is in progress
      and it is not the DSBM. In this state, SBM passively monitors
      the state of the DSBM.

      ElectDSBM -- SBM is in this state when a DSBM election is in
      progress.
1933 IAMDSBM -- SBM is in this state when it is the DSBM for the 1934 L2 domain. 1936 A.8. Events that cause state changes 1938 StartUp -- SBM starts operation. 1940 ListenInterval Timeout -- The ListenInterval timer has fired. 1941 This means that the SBM has monitored its domain to check for 1942 an existing DSBM or to check whether there are candidates 1943 (other than itself) willing to be the DSBM. 1945 DSBM_WILLING message received -- This means that the SBM 1946 received a DSBM_WILLING message from some other SBM. Such a 1947 message is sent when a SBM wishes to declare its candidacy to 1948 be the DSBM. 1950 I_AM_DSBM message received -- SBM received a DSBM advertise- 1951 ment from the DSBM in its L2 domain. 1953 SBMDeadInterval Timeout -- The SBMDeadInterval timer has 1954 fired. This means that the SBM did not receive even one DSBM 1955 advertisement during this period and indicates possible 1956 failure of the DSBM. 1958 RefreshInterval Timeout -- The RefreshInterval timer has 1960 SBM (Subnet Bandwidth Manager) March, 1998 1962 fired. In the I_AM_DSBM state, this means it is the time for 1963 sending out the next DSBM advertisement. In the ElectDSBM 1964 state, the event means that it is the time to send out 1965 another DSBM_WILLING message. 1967 ElectionInterval Timeout -- The ElectionInterval timer has 1968 fired. This means that the SBM has waited long enough after 1969 declaring its candidacy to determine whether or not it suc- 1970 ceeded. 1972 CONTINUED ON NEXT PAGE 1974 SBM (Subnet Bandwidth Manager) March, 1998 1976 A.9. 
State Transition Diagram (Figure 3)

                    +-----------+
+--<--------------<-|DetectDSBM |---->------+
|                   +-----------+           |
|                                           |
|                                           |
|                                           |
|     +-------------+       +---------+     |
+->---|    Idle     |--<>---|ElectDSBM|--<--+
      +-------------+       +---------+
             |                   |
             |                   |
             |                   |
             |   +-----------+   |
        +<<- +---|  IAMDSBM  |-<-+
        |        +-----------+
        |
        |   +-----------+
        +>>-| SHUTDOWN  |
            +-----------+

A.10. Election State Machine

Based on the events and states described above, the state changes
at an SBM are described below. Each state change is triggered by an
event and is typically accompanied by a sequence of actions. The
state machine is described assuming a single-threaded
implementation (to avoid race conditions between state changes and
timer events) with no timer events occurring during the execution
of the state machine.

The following routines will be frequently used in the description
of the state machine:

ComparePrio(FirstAddrInfo, SecondAddrInfo)
   -- determines whether the entity represented by the first
   parameter is better than the second entity, using the priority
   information and the IP address information in the two
   parameters. If either address is zero, that entity
   automatically loses. Otherwise, priorities are compared first
   and the higher-priority candidate wins. If there is a tie based
   on the priority value, the tie is broken using the IP addresses
   of the tied candidates (the one with the higher IP address in
   lexicographic order wins). Returns TRUE if the first entity is
   a better choice, FALSE otherwise.
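
The ComparePrio rules above can be illustrated with a short Python
sketch. It is not part of the protocol definition: the AddrInfo
record, its field names, and the use of dotted or colon-separated
address strings are assumptions made for illustration, and both
addresses are assumed to belong to the same address family. The
case where both addresses are zero is not covered by the text; the
sketch returns FALSE for it.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AddrInfo:
    """Hypothetical record holding an SBM's IP address and priority."""
    ip: str
    priority: int


def compare_prio(first: AddrInfo, second: AddrInfo) -> bool:
    """Return True if `first` is a better DSBM choice than `second`."""
    a = ipaddress.ip_address(first.ip)
    b = ipaddress.ip_address(second.ip)
    # An entity whose address is zero automatically loses.
    if int(a) == 0:
        return False
    if int(b) == 0:
        return True
    # Higher priority wins.
    if first.priority != second.priority:
        return first.priority > second.priority
    # Tie on priority: the higher address wins. Comparing the packed
    # (network byte order) bytes is a lexicographic comparison.
    return a.packed > b.packed
```

For example, compare_prio(AddrInfo("10.0.0.2", 5),
AddrInfo("10.0.0.9", 3)) is true because of the higher priority,
while the same call with equal priorities is false because
10.0.0.9 is the higher address.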

SendDSBMWillingMessage()
Begin
   Send out DSBM_WILLING message listing myself as a candidate for
   DSBM (copy OwnAddr and priority into appropriate fields)
   start RefreshIntervalTimer
   goto ElectDSBM state
End

AmIBetterDSBM(OtherAddrInfo)
Begin
   if (ComparePrio(OwnAddrInfo, OtherAddrInfo))
      return TRUE

   change LocalDSBMAddrInfo = OtherAddrInfo
   return FALSE
End

UpdateDSBMInfo()
/* invoked in an assignment such as
   LocalDSBMAddrInfo = OtherAddrInfo */
Begin
   update LocalDSBMAddrInfo such as IP addr, DSBM L2 address,
   DSBM priority, RefreshIntervalTimer, DSBMDeadIntervalTimer
End

A.10.1 State Changes

In the following, the action "continue" or "continue in current
state" means an "exit" from the current action sequence without a
state transition.

State:     DOWN
Event:     StartUp
New State: DetectDSBM
Action:    Initialize the local state variables (LocalDSBMAddr and
           LocalDSBMAddrInfo set to 0). Start the
           ListenIntervalTimer.

State:     DetectDSBM
Event:     I_AM_DSBM message received
New State: Idle
Action:    set LocalDSBMAddrInfo = IncomingDSBMAddrInfo
           start DSBMDeadIntervalTimer
           goto Idle state

State:     DetectDSBM
Event:     ListenIntervalTimer fired
New State: ElectDSBM
Action:    Start ElectionIntervalTimer
           SendDSBMWillingMessage()

State:     DetectDSBM
Event:     DSBM_WILLING message received
New State: ElectDSBM
Action:    Cancel any active timers
           Start ElectionIntervalTimer
           /* am I a better choice than the sender? */
           If (ComparePrio(OwnAddrInfo, IncomingDSBMAddrInfo)) {
              /* I am better */
              SendDSBMWillingMessage()
           } else {
              Change LocalDSBMAddrInfo = IncomingDSBMAddrInfo
              goto ElectDSBM state
           }

State:     Idle
Event:     DSBMDeadInterval timer fired
New State: ElectDSBM
Action:    start ElectionIntervalTimer
           set LocalDSBMAddrInfo = OwnAddrInfo
           SendDSBMWillingMessage()

State:     Idle
Event:     I_AM_DSBM message received
New State: Idle
Action:    /* first check whether anything has changed */
           if (!ComparePrio(LocalDSBMAddrInfo, IncomingDSBMAddrInfo))
              change LocalDSBMAddrInfo to reflect new info
           endif
           restart DSBMDeadIntervalTimer
           continue in current state

State:     Idle
Event:     DSBM_WILLING message received
New State: Depends on action (ElectDSBM or Idle)
Action:    /* check whether it is from the DSBM itself (shutdown) */
           if (IncomingDSBMAddr == LocalDSBMAddr) {
              cancel active timers
              Set LocalDSBMAddrInfo = OwnAddrInfo
              Start ElectionIntervalTimer
              SendDSBMWillingMessage() /* goto ElectDSBM state */
           }
           /* else, ignore it */
           continue in current state

State:     ElectDSBM
Event:     ElectionIntervalTimer fired
New State: Depends on action (I_AM_DSBM or current state)
Action:    If (LocalDSBMAddrInfo == OwnAddrInfo) {
              /* I won */
              send I_AM_DSBM message
              start RefreshIntervalTimer
              goto I_AM_DSBM state
           } else { /* someone else won, so wait for it to declare
                       itself to be the DSBM */
              set LocalDSBMAddrInfo = OwnAddrInfo
              start ElectionIntervalTimer
              SendDSBMWillingMessage()
              continue in current state
           }

State:     ElectDSBM
Event:     I_AM_DSBM message received
New State: Idle
Action:    set LocalDSBMAddrInfo = IncomingDSBMAddrInfo
           Cancel any active timers
           start DSBMDeadIntervalTimer
           goto Idle state

State:     ElectDSBM
Event:     DSBM_WILLING message received
New State: ElectDSBM
Action:    Check whether it is a loopback and, if so, discard it
           and continue;
           if (!AmIBetterDSBM(IncomingDSBMAddrInfo)) {
              Change LocalDSBMAddrInfo = IncomingDSBMAddrInfo
              Cancel
              RefreshIntervalTimer
           } else if (LocalDSBMAddrInfo == OwnAddrInfo) {
              SendDSBMWillingMessage()
           }
           continue in current state

State:     ElectDSBM
Event:     RefreshIntervalTimer fired
New State: ElectDSBM
Action:    /* continue to send DSBM_WILLING messages until
              election interval ends */
           SendDSBMWillingMessage()

State:     I_AM_DSBM
Event:     DSBM_WILLING message received
New State: I_AM_DSBM
Action:    send I_AM_DSBM message /* reassert myself */
           restart RefreshIntervalTimer

State:     I_AM_DSBM
Event:     RefreshIntervalTimer fired
New State: I_AM_DSBM
Action:    send I_AM_DSBM message
           restart RefreshIntervalTimer

State:     I_AM_DSBM
Event:     I_AM_DSBM message received
New State: Depends on action (I_AM_DSBM or Idle)
Action:    /* check whether the other DSBM is better */
           If (ComparePrio(OwnAddrInfo, IncomingDSBMAddrInfo)) {
              /* I am better */
              send I_AM_DSBM message
              restart RefreshIntervalTimer
              continue in current state
           } else {
              Set LocalDSBMAddrInfo = IncomingDSBMAddrInfo
              cancel active timers
              start DSBMDeadIntervalTimer
              goto Idle state
           }

State:     I_AM_DSBM
Event:     Want to shut myself down
New State: DOWN
Action:    send DSBM_WILLING message with my address filled in, but
           priority set to zero
           goto DOWN state

A.10.2 Suggested Values of Interval Timers

To avoid long DSBM outages, to ensure quick recovery from DSBM
failures, and to avoid timeout of PATH and RESV state at the edge
devices, we suggest the following values for the various timers.

Assuming that the RSVP implementations use a 30 second timeout for
PATH and RESV refreshes, we suggest that the RefreshIntervalTimer
be set to about 5 seconds with the DSBMDeadIntervalTimer set to
15 seconds (K*RefreshInterval, with K=3).
The DetectDSBMTimer (that is, the ListenIntervalTimer that bounds
the DetectDSBM state) should be set to a random value between
DSBMDeadIntervalTimer and 2*DSBMDeadIntervalTimer. The
ElectionIntervalTimer should be set at least to the value of
DSBMDeadIntervalTimer to ensure that each SBM has a chance to have
its DSBM_WILLING message (sent every RefreshInterval in the
ElectDSBM state) delivered to others.

A.10.3. Guidelines for Choice of Values for SBM_PRIORITY

Network administrators should configure the SBM protocol entity at
each SBM-capable device with the device's "SBM priority" for each
of the interfaces attached to a managed segment. SBM_PRIORITY is
an 8-bit, unsigned integer value (in the range 0-255) with higher
integer values denoting higher priority. The value zero for an
interface indicates that the SBM protocol entity on the device is
not eligible to be a DSBM for the segment attached to the
interface.

A separate range of values is reserved for each type of
SBM-capable device to reflect the relative priority among
different classes of L2/L3 devices. L2 devices get the highest
priority, followed by routers, followed by hosts. The priority
values in the range of 128..255 are reserved for L2 devices, the
values in the range of 64..127 are reserved for routers, and
values in the range of 1..63 are reserved for hosts.

A.11. DSBM Election over switched links

The election algorithm works as described before in this case,
except that each SBM-capable L2 device restricts the scope of the
election to its local segment. As described in Section B.1 below,
all messages related to the DSBM election are sent to a special
multicast address (AllSBMAddress).
AllSBMAddress (its corresponding MAC multicast address) is
configured in the permanent database of SBM-capable layer 2
devices so that all frames with AllSBMAddress as the destination
address are not forwarded and are instead directed to the SBM
management entity in those devices. Thus, a DSBM can be elected
separately on each point-to-point segment in a switched topology.
For example, in Figure 2, the DSBM for "segment A" will be elected
using the election algorithm between R1 and S1, and none of the
election-related messages on this segment will be forwarded by S1
beyond "segment A". Similarly, a separate election will take place
on each segment in this topology.

When a switched segment is a half-duplex segment, two senders (one
sender at each end of the link) share the link. In this case, one
of the two senders will win the DSBM election and will be
responsible for managing the segment.

If a switched segment is full-duplex, exactly one sender sends on
the link in each direction. In this case, either one or two DSBMs
can exist on such a managed segment. If a sender at each end
wishes to serve as a DSBM for that end, it can declare itself to
be the DSBM by sending out an I_AM_DSBM advertisement and start
managing the resources for the outgoing traffic over the segment.
If one of the two senders does not wish to be the DSBM, then the
other DSBM will not receive any DSBM advertisement from its peer
and will assume itself to be the DSBM for traffic traversing the
managed segment in both directions.
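
As a closing illustration for this appendix, the timer suggestions
of Section A.10.2 and the SBM_PRIORITY ranges of Section A.10.3 can
be written down as a small Python sketch. The function names and
the dictionary keys are hypothetical, and treating the
DetectDSBMTimer as the ListenInterval value is an assumption based
on the ListenInterval description in the state variables. The
ElectionInterval is set to its suggested minimum.

```python
import random


def suggested_timers(refresh_interval=5, k=3, rng=random):
    """Derive the suggested timer values (seconds) from RefreshInterval."""
    dead = k * refresh_interval  # DSBMDeadInterval = K * RefreshInterval
    return {
        "RefreshInterval": refresh_interval,
        "DSBMDeadInterval": dead,
        # Random value between DeadInterval and 2 * DeadInterval
        # (assumed to be the ListenInterval / DetectDSBMTimer).
        "ListenInterval": rng.uniform(dead, 2 * dead),
        # At least DeadInterval; the minimum is used here.
        "ElectionInterval": dead,
    }


def device_class(sbm_priority):
    """Map an SBM_PRIORITY value to the device class its range is reserved for."""
    if not 0 <= sbm_priority <= 255:
        raise ValueError("SBM_PRIORITY is an 8-bit unsigned value")
    if sbm_priority == 0:
        return "not eligible"
    if sbm_priority >= 128:
        return "L2 device"
    if sbm_priority >= 64:
        return "router"
    return "host"
```

With the defaults, suggested_timers() yields a DSBMDeadInterval of
15 seconds and a ListenInterval somewhere between 15 and 30
seconds.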

APPENDIX B
Message Encapsulation and Formats

To minimize changes to the existing RSVP implementations and to
ensure quick deployment of an SBM in conjunction with RSVP, all
communication to and from a DSBM will be performed using messages
constructed using the current rules for RSVP message formats and
raw IP encapsulation. For more details on the RSVP message
formats, refer to the RSVP specification (RFC 2205). No changes to
the RSVP message formats are proposed, but new message types and
new L2-specific objects are added to the RSVP message formats to
accommodate DSBM-related messages. These additions are described
below.

B.1. Message Addressing

For the purpose of DSBM election and detection, AllSBMAddress is
used as the destination address when sending out both
DSBM_WILLING and I_AM_DSBM messages. A DSBM client first detects a
managed segment by listening to I_AM_DSBM advertisements and
records the DSBMAddress (unicast IP address of the DSBM).

B.2. Message Sizes

Each message must occupy exactly one IP datagram. If it exceeds
the MTU, such a datagram will be fragmented by IP and reassembled
at the recipient node. As a consequence, a single message may not
exceed the maximum IP datagram size, approximately 64K bytes.

B.3. RSVP-related Message Formats

All RSVP messages directed to and from a DSBM may contain various
RSVP objects defined in the RSVP specification, and messages
continue to follow the formatting rules specified in the RSVP
specification. In addition, an RSVP implementation must also
recognize new object classes that are described below.

B.3.1. Object Formats

All objects are defined using the format specified in the RSVP
specification.
Each object has a 32-bit header that contains the length (of the
object in bytes, including the object header), the object class
number, and a C-Type. All unused fields should be set to zero and
ignored on receipt.

B.3.2. LAN_NHOP, RSVP_HOP_L2, and LAN_LOOPBACK Objects

LAN_NHOP, LAN_LOOPBACK, and RSVP_HOP_L2 objects are identified as
separate object classes and the value of Class_Num for the objects
is chosen so that non-SBM aware RSVP nodes will ignore the objects
without forwarding them or generating an error message.

B.3.3. IEEE 802 Canonical Address Format

The 48-bit MAC Addresses used by IEEE 802 were originally defined
in terms of wire order transmission of bits in the source and
destination MAC address fields. The same wire order applied to
both Ethernet and Token Ring. Since the bit transmission order of
Ethernet and Token Ring data differ - Ethernet octets are
transmitted least significant bit first, Token Ring most
significant first - the numeric values naturally associated with
the same address on different 802 media differ. To facilitate the
communication of address values in higher layer protocols which
might span both Token Ring and Ethernet attached systems connected
by bridges, it was necessary to define one reference format - the
so-called canonical format for these addresses. Formally, the
canonical format defines the value of the address, separate from
the encoding rules used for transmission. It comprises a sequence
of octets derived from the original wire order transmission bit
order as follows.
The least significant bit of the first octet is the first bit
transmitted, the next least significant bit the second bit, and so
on to the most significant bit of the first octet being the 8th
bit transmitted; the least significant bit of the second octet is
the 9th bit transmitted, and so on to the most significant bit of
the sixth octet of the canonical format being the last bit of the
address transmitted.

This canonical format corresponds to the natural value of the
address octets for Ethernet. The actual transmission order or
formal encoding rules for addresses on media which do not transmit
bit serially are derived from the canonical format octet values.

This document requires that all L2 addresses used in conjunction
with the SBM protocol be encoded in the canonical format as a
sequence of 6 octets. In the following, we define the object
formats for objects that contain L2 addresses that are based on
the canonical representation.

B.3.4. RSVP_HOP_L2 object

RSVP_HOP_L2 object uses object class = 161; it contains the L2
address of the previous hop L3 device in the IEEE Canonical
address format discussed above.

RSVP_HOP_L2 object: class = 161, C-Type represents the addressing
format used. In our case, C-Type=1 represents the IEEE Canonical
Address format.

        0               1               2               3
+---------------+---------------+---------------+----------------+
|            Length             |      161      |C-Type(addrtype)|
+---------------+---------------+---------------+----------------+
|                  Variable length Opaque data                   |
+---------------+---------------+---------------+----------------+

C-Type = 1 (IEEE Canonical Address format)

When C-Type=1, the object format is:

        0               1               2               3
+---------------+---------------+---------------+---------------+
|              12               |      161      |       1       |
+---------------+---------------+---------------+---------------+
|                 Octets 0-3 of the MAC address                 |
+---------------+---------------+---------------+---------------+
|  Octets 4-5 of the MAC addr.  |     /////     |     ////      |
+---------------+---------------+---------------+---------------+

//// -- unused (set to zero)

B.3.5. LAN_NHOP object

LAN_NHOP object represents two objects, namely, LAN_NHOP_L3
address object and LAN_NHOP_L2 address object.

<LAN_NHOP object> ::= <LAN_NHOP_L3 object> <LAN_NHOP_L2 object>

LAN_NHOP_L2 address object uses object class = 162 and uses the
same format (but different class number) as the RSVP_HOP_L2
object. It provides the L2 or MAC address of the next hop L3
device.

        0               1               2               3
+---------------+---------------+---------------+----------------+
|            Length             |      162      |C-Type(addrtype)|
+---------------+---------------+---------------+----------------+
|                  Variable length Opaque data                   |
+---------------+---------------+---------------+----------------+

C-Type = 1 (IEEE 802 Canonical Address Format as defined above)
See the RSVP_HOP_L2 address object for more details.

LAN_NHOP_L3 object uses object class = 163 and gives the L3 or IP
address of the next hop L3 device.

LAN_NHOP_L3 object: class = 163, C-Type specifies IPv4 or IPv6
address family used.

IPv4 LAN_NHOP_L3 object: class = 163, C-Type = 1

+---------------+---------------+---------------+---------------+
|          Length = 8           |      163      |       1       |
+---------------+---------------+---------------+---------------+
|                       IPv4 NHOP address                       |
+---------------------------------------------------------------+

IPv6 LAN_NHOP_L3 object: class = 163, C-Type = 2

+---------------+---------------+---------------+---------------+
|          Length = 20          |      163      |       2       |
+---------------+---------------+---------------+---------------+
//                 IPv6 NHOP address (16 bytes)                 |
+---------------------------------------------------------------+

B.3.6. LAN_LOOPBACK Object

The LAN_LOOPBACK object gives the IP address of the outgoing
interface for a PATH message and uses object class = 164; both
IPv4 and IPv6 formats are specified.

IPv4 LAN_LOOPBACK object: class = 164, C-Type = 1

        0               1               2               3
+---------------+---------------+---------------+---------------+
|            Length             |      164      |       1       |
+---------------+---------------+---------------+---------------+
|                 IPv4 address of an interface                  |
+---------------+---------------+---------------+---------------+

IPv6 LAN_LOOPBACK object: class = 164, C-Type = 2

+---------------+---------------+---------------+---------------+
|            Length             |      164      |       2       |
+---------------+---------------+---------------+---------------+
|                                                               |
+                                                               +
|                                                               |
+                 IPv6 address of an interface                  +
|                                                               |
+                                                               +
|                                                               |
+---------------+---------------+---------------+---------------+

B.3.7. TCLASS Object

TCLASS object (traffic class based on IEEE 802.1p) uses object
class = 165.

        0               1               2               3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |      165      |       1       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      ///      |      ///      |     ////      |  ////   | PV  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Only 3 bits in data contain the user_priority value (PV).

B.4. RSVP PATH and PATH_TEAR Message Formats

As specified in the RSVP specification, PATH and PATH_TEAR
messages contain the RSVP Common Header and the relevant RSVP
objects. For the RSVP Common Header, refer to the RSVP
specification (RFC 2205). Enhancements to an RSVP_PATH message
include additional objects as specified below.

    ::=  []

         []
         []

    ::=  []

         []

If the INTEGRITY object is present, it must immediately follow the
RSVP common header. L2-specific objects must always precede the
SESSION object.

B.5. RSVP RESV Message Format

As specified in the RSVP specification, an RSVP_RESV message
contains the RSVP Common Header and relevant RSVP objects. In
addition, it may contain an optional TCLASS object as described
earlier.

B.6. Additional RSVP message types to handle SBM interactions

New RSVP message types are introduced to allow interactions
between a DSBM and an RSVP node (host/router) for the purpose of
discovering and binding to a DSBM. The new RSVP message types
needed are as follows:

RSVP Msg Type (8 bits)     Value
DSBM_WILLING                66
I_AM_DSBM                   67

All SBM-specific messages are formatted as RSVP messages with an
RSVP common header followed by SBM-specific objects.

    ::=

   where  ::=  []

For each SBM message type, there is a set of rules for the
permissible choice of object types.
These rules are specified using Backus-Naur Form (BNF) augmented
with square brackets surrounding optional sub-sequences. The BNF
implies an order for the objects in a message. However, in many
(but not all) cases, object order makes no logical difference. An
implementation should create messages with the objects in the
order shown here, but accept the objects in any permissible order.
Any exceptions to this rule will be pointed out in the specific
message formats.

DSBM_WILLING Message

    ::=

I_AM_DSBM Message

    ::=

All I_AM_DSBM messages are multicast to the well known
AllSBMAddress. The default priority of an SBM is 1, and higher
priority values represent higher precedence. The priority value
zero indicates that the SBM is not eligible to be the DSBM.

Relevant Objects

DSBM IP ADDRESS objects use object class = 42; the IPv4 DSBM IP
ADDRESS object uses C-Type = 1 and the IPv6 DSBM IP ADDRESS
object uses C-Type = 2.

IPv4 DSBM IP ADDRESS object: class = 42, C-Type = 1

        0               1               2               3
+---------------+---------------+---------------+---------------+
|                     IPv4 DSBM IP Address                      |
+---------------+---------------+---------------+---------------+

IPv6 DSBM IP ADDRESS object: class = 42, C-Type = 2

+---------------+---------------+---------------+---------------+
|                                                               |
+                                                               +
|                                                               |
+                     IPv6 DSBM IP Address                      +
|                                                               |
+                                                               +
|                                                               |
+---------------+---------------+---------------+---------------+

The DSBM L2 address object is the same as the RSVP_HOP_L2 object
with C-Type = 1 for IEEE Canonical Address format.

<DSBM L2 address> ::= <RSVP_HOP_L2>

An SBM may omit this object by including a NULL L2 address object.
For C-Type=1 (IEEE Canonical address format), such a version of
the L2 address object contains the value zero in the six octets
corresponding to the MAC address (see Section B.3.4 for the exact
format).

SBM_PRIORITY Object: class = 43, C-Type = 1

        0               1               2               3
+---------------+---------------+---------------+---------------+
|     ////      |     ////      |     ////      | SBM priority  |
+---------------+---------------+---------------+---------------+

TIMER INTERVAL VALUES.

The two timer intervals, namely, DSBM Dead Interval and DSBM
Refresh Interval, are specified as integer values, each in the
range of 0..255 seconds. Both values are included in a single
"DSBM Timer Intervals" object described below.

DSBM Timer Intervals Object: class = 44, C-Type = 1

+---------------+---------------+---------------+----------------+
|     ////      |     ////      | DeadInterval  |Refresh Interval|
+---------------+---------------+---------------+----------------+

SBM_INFO Object.

The SBM_INFO object is designed to provide additional information
about the managed segment. This object uses class = 45,
C-Type = 1 and includes information such as media type (shared or
switched, half duplex vs full duplex, etc.) and whether (and how
much) traffic a sender can send before receiving a RESV message
from a receiver.

SBM_INFO Object: class = 45, C-Type = 1

        0               1               2               3
+---------------+---------------+---------------+----------------+
|     ////      |     ////      |     ////      |   Media Type   |
+---------------+---------------+---------------+----------------+
| OptFlowSpec (limit on traffic allowed to send without RESV)    |
|                                                                |
+---------------+---------------+---------------+----------------+

Media Type values:  0 (shared segment; the default)
                    1 (switched, half duplex)
                    2 (switched, full duplex)

OptFlowSpec:
This parameter specifies whether or not a sender can send traffic
when its RESV request fails. The parameter is an Intserv
SENDER_TSPEC object (see RFC 2210 for contents and encoding
rules). If the token bucket rate (r) specified in this parameter
is zero, it indicates that the sender(s) must not send traffic if
their RESV request fails; otherwise, the parameter specifies a
per-session limit on the amount of traffic that can be sent when
the RESV attempt for the session fails.

<OptFlowSpec> ::= <Intserv SENDER_TSPEC object>
                  (class=12, C-Type=2)

ACKNOWLEDGEMENTS

The authors are grateful to Eric Crawley (Argon), Russ Fenger
(Intel), David Melman (Siemens), Ramesh Pabbati (Microsoft), Mick
Seaman (3COM), and Andrew Smith (Extreme Networks) for their
constructive comments on the SBM design and the earlier versions
of this document.

6. Authors' Addresses

Raj Yavatkar
Intel Corporation
2111 N.E.
25th Avenue,
Hillsboro, OR 97124
USA
phone: +1 503-264-9077
email: yavatkar@ibeam.intel.com

Don Hoffman
Teledesic Corporation
2300 Carillon Point
Kirkland, WA 98033
USA
phone: +1 425-602-0000

Yoram Bernet
Microsoft
1 Microsoft Way
Redmond, WA 98052
USA
phone: +1 206 936 9568
email: yoramb@microsoft.com

Fred Baker
Cisco Systems
519 Lado Drive
Santa Barbara, California 93111
USA
phone: +1 408 526 4257
email: fred@cisco.com

Michael Speer
Sun Microsystems, Inc
901 San Antonio Road UMPK15-215
Palo Alto, CA 94303
phone: +1 650-786-6368
email: speer@Eng.Sun.COM