idnits 2.17.1 draft-ietf-issll-is802-framework-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-19) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of too long lines in the document, the longest one being 3 characters in excess of 72. ** The abstract seems to contain references ([13], [14]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 1739 has weird spacing: '... 1.2 ms unb...' == Line 1740 has weird spacing: '... 120 us unb...' == Line 1741 has weird spacing: '... 12 us unb...' == Line 1770 has weird spacing: '... 1.2 ms unb...' == Line 1771 has weird spacing: '... 120 us unb...' == (1 more instance...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 1998) is 9471 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '16' is mentioned on line 428, but not defined == Unused Reference: '11' is defined on line 2011, but no explicit reference was found in the text == Unused Reference: '12' is defined on line 2014, but no explicit reference was found in the text == Unused Reference: '17' is defined on line 2041, but no explicit reference was found in the text == Unused Reference: '18' is defined on line 2044, but no explicit reference was found in the text == Unused Reference: '20' is defined on line 2052, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' ** Downref: Normative reference to an Informational RFC: RFC 1633 (ref. '8') ** Downref: Normative reference to an Informational RFC: RFC 2216 (ref. '10') ** Downref: Normative reference to an Historic RFC: RFC 1819 (ref. '12') -- Possible downref: Non-RFC (?) normative reference: ref. '13' -- Possible downref: Non-RFC (?) normative reference: ref. '14' -- Possible downref: Non-RFC (?) normative reference: ref. '15' -- Possible downref: Non-RFC (?) normative reference: ref. '18' -- Possible downref: Non-RFC (?) normative reference: ref. '19' -- Possible downref: Non-RFC (?) normative reference: ref. '20' Summary: 14 errors (**), 0 flaws (~~), 13 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force Anoop Ghanwani 2 INTERNET DRAFT J. 
Wayne Pace 3 Vijay Srinivasan 4 (IBM) 5 Andrew Smith 6 (Extreme Networks) 7 Mick Seaman 8 (3Com) 9 May 1998 11 A Framework for Providing Integrated Services 12 Over Shared and Switched IEEE 802 LAN Technologies 14 draft-ietf-issll-is802-framework-05.txt 16 Status of This Memo 18 This document is an Internet-Draft. Internet Drafts are working 19 documents of the Internet Engineering Task Force (IETF), its areas, 20 and its working groups. Note that other groups may also distribute 21 working documents as Internet Drafts. 23 Internet Drafts are draft documents valid for a maximum of six 24 months, and may be updated, replaced, or obsoleted by other documents 25 at any time. It is not appropriate to use Internet Drafts as 26 reference material, or to cite them other than as a ``working draft'' 27 or ``work in progress.'' 29 To view the entire list of current Internet-Drafts, please check 30 the "1id-abstracts.txt" listing contained in the Internet-Drafts 31 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net 32 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au 33 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu 34 (US West Coast). 36 Abstract 38 This memo describes a framework for supporting IETF Integrated 39 Services on shared and switched LAN infrastructure. It includes 40 background material on the capabilities of IEEE 802 like networks 41 with regard to parameters that affect Integrated Services such as 42 access latency, delay variation and queueing support in LAN switches. 43 It discusses aspects of IETF's Integrated Services model that cannot 44 easily be accommodated in different LAN environments. It outlines 45 a functional model for supporting the Resource Reservation Protocol 46 (RSVP) in such LAN environments. Details of extensions to RSVP for 47 use over LANs are described in an accompanying memo [14]. Mappings 48 of the various Integrated Services onto IEEE 802 LANs are described 49 in another memo [13]. 
51 Contents 53 Status of This Memo i 55 Abstract ii 57 1. Introduction 1 59 2. Document Outline 1 61 3. Definitions 2 63 4. Frame Forwarding in IEEE 802 Networks 3 64 4.1. General IEEE 802 Service Model . . . . . . . . . . . . . 3 65 4.2. Ethernet/IEEE 802.3 . . . . . . . . . . . . . . . . . . . 5 66 4.3. Token Ring/IEEE 802.5 . . . . . . . . . . . . . . . . . . 6 67 4.4. Fiber Distributed Data Interface . . . . . . . . . . . . 7 68 4.5. Demand Priority/IEEE 802.12 . . . . . . . . . . . . . . . 8 70 5. Requirements and Goals 9 71 5.1. Requirements . . . . . . . . . . . . . . . . . . . . . . 9 72 5.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 11 73 5.3. Non-goals . . . . . . . . . . . . . . . . . . . . . . . . 12 74 5.4. Assumptions . . . . . . . . . . . . . . . . . . . . . . . 12 76 6. Basic Architecture 13 77 6.1. Components . . . . . . . . . . . . . . . . . . . . . . . 13 78 6.1.1. Requester Module . . . . . . . . . . . . . . . . 13 79 6.1.2. Bandwidth Allocator . . . . . . . . . . . . . . . 14 80 6.1.3. Communication Protocols . . . . . . . . . . . . . 14 81 6.2. Centralized vs. Distributed Implementations . . . . . . 15 83 7. Model of the Bandwidth Manager in a Network 17 84 7.1. End Station Model . . . . . . . . . . . . . . . . . . . . 17 85 7.1.1. Layer 3 Client Model . . . . . . . . . . . . . . 17 86 7.1.2. Requests to Layer 2 ISSLL . . . . . . . . . . . . 17 87 7.1.3. At the Layer 3 Sender . . . . . . . . . . . . . . 18 88 7.1.4. At the Layer 3 Receiver . . . . . . . . . . . . . 19 89 7.2. Switch Model . . . . . . . . . . . . . . . . . . . . . . 21 90 7.2.1. Centralized Bandwidth Allocator . . . . . . . . . 21 91 7.2.2. Distributed Bandwidth Allocator . . . . . . . . . 21 92 7.3. Admission Control . . . . . . . . . . . . . . . . . . . . 22 93 7.4. QoS Signaling . . . . . . . . . . . . . . . . . . . . . . 24 94 7.4.1. Client Service Definitions . . . . . . . . . . . 24 95 7.4.2. Switch Service Definitions . . . . . . . . . . . 25 97 8. 
Implementation Issues 27 98 8.1. Switch Characteristics . . . . . . . . . . . . . . . . . 27 99 8.2. Queueing . . . . . . . . . . . . . . . . . . . . . . . . 28 100 8.3. Mapping of Services to Link Level Priority . . . . . . . 29 101 8.4. Re-mapping of Non-conforming Aggregated Flows . . . . . . 29 102 8.5. Override of Incoming User Priority . . . . . . . . . . . 30 103 8.6. Different Reservation Styles . . . . . . . . . . . . . . 30 104 8.7. Receiver Heterogeneity . . . . . . . . . . . . . . . . . 31 106 9. Network Topology Scenarios 34 107 9.1. Full Duplex Switched Networks . . . . . . . . . . . . . . 35 108 9.2. Shared Media Ethernet Networks . . . . . . . . . . . . . 35 109 9.3. Half Duplex Switched Ethernet Networks . . . . . . . . . 36 110 9.4. Half Duplex Switched and Shared Token Ring Networks . . . 37 111 9.5. Half Duplex and Shared Demand Priority Networks . . . . . 38 113 10. Justification 41 115 11. Summary 41 116 1. Introduction 118 The Internet has traditionally provided support for best effort 119 traffic only. However, with the recent advances in link layer 120 technology, and with numerous emerging real time applications such 121 as video conferencing and Internet telephony, there has been much 122 interest for developing mechanisms which enable real time services 123 over the Internet. A framework for meeting these new requirements 124 was set out in RFC 1633 [8] and this has driven the specification of 125 various classes of network service by the Integrated Services working 126 group of the IETF, such as Controlled Load and Guaranteed Service 127 [6,7]. Each of these service classes is designed to provide certain 128 Quality of Service (QoS) to traffic conforming to a specified set 129 of parameters. Applications are expected to choose one of these 130 classes according to their QoS requirements. 
One mechanism for end 131 stations to utilize such services in an IP network is provided by 132 a QoS signaling protocol, the Resource Reservation Protocol (RSVP) 133 [5] developed by the RSVP working group of the IETF. The IEEE under 134 its Project 802 has defined standards for many different local area 135 network technologies. These all typically offer the same MAC layer 136 datagram service [1] to higher layer protocols such as IP although 137 they often provide different dynamic behavior characteristics -- it 138 is these that are important when considering their ability to support 139 real time services. Later in this memo we describe some of the 140 relevant characteristics of the different MAC layer LAN technologies. 141 In addition, IEEE 802 has defined standards for bridging multiple LAN 142 segments together using devices known as "MAC Bridges" or "Switches" 143 [2]. Recent work has also defined traffic classes, multicast 144 filtering, and virtual LAN capabilities for these devices [3,4]. 145 Such LAN technologies often constitute the last hop(s) between users 146 and the Internet as well as being a primary building block for entire 147 campus networks. It is therefore necessary to provide standardized 148 mechanisms for using these technologies to support end-to-end real 149 time services. In order to do this, there must be some mechanism 150 for resource management at the data link layer. Resource management 151 in this context encompasses the functions of admission control, 152 scheduling, traffic policing, etc. The ISSLL (Integrated Services 153 over Specific Link Layers) working group in the IETF was chartered 154 with the purpose of exploring and standardizing such mechanisms for 155 various link layer technologies. 157 2. Document Outline 159 This document is concerned with specifying a framework for providing 160 Integrated Services over shared and switched LAN technologies such 161 as Ethernet/IEEE 802.3, Token Ring/IEEE 802.5, FDDI, etc. 
We begin
162     in Section 4 with a discussion of the capabilities of various IEEE
163     802 MAC layer technologies. Section 5 lists the requirements and
164     goals for a mechanism capable of providing Integrated Services in
165     a LAN. The resource management functions outlined in Section 5 are
166     provided by an entity referred to as a Bandwidth Manager (BM). The
167     architectural model of the BM is described in Section 6 and its
168     various components are discussed in Section 7. Some implementation
169     issues with respect to link layer support for Integrated Services are
170     examined in Section 8. Section 9 discusses a taxonomy of topologies
171     for the LAN technologies under consideration with an emphasis
172     on the capabilities of each which can be leveraged for enabling
173     Integrated Services. This framework makes no assumptions about the
174     topology at the link layer. The framework is intended to be as
175     exhaustive as possible; this means that some of the functions
176     discussed may not be supportable by a particular topology or
177     technology, but this should not preclude the usage of this model
178     for it.

180  3. Definitions

182     The following is a list of terms used in this and other ISSLL
183     documents.

185      - Link Layer or Layer 2 or L2: Data link layer technologies such
186        as Ethernet/IEEE 802.3 and Token Ring/IEEE 802.5 are referred to
187        as Layer 2 or L2.

189      - Link Layer Domain or Layer 2 Domain or L2 Domain: Refers to a
190        set of nodes and links interconnected without passing through an
191        L3 forwarding function. One or more IP subnets can be overlaid
192        on an L2 domain.

194      - Layer 2 or L2 Devices: Devices that only implement Layer 2
195        functionality are referred to as Layer 2 or L2 devices. These
196        include IEEE 802.1D [2] bridges or switches.

198      - Internetwork Layer or Layer 3 or L3: Refers to Layer 3 of the
199        ISO OSI model. This memo is primarily concerned with networks
200        that use the Internet Protocol (IP) at this layer.
202 - Layer 3 Device or L3 Device or End Station: These include hosts 203 and routers that use L3 and higher layer protocols or application 204 programs that need to make resource reservations. 206 - Segment: A physical L2 segment that is shared by one or more 207 senders. Examples of segments include: (a) a shared Ethernet or 208 Token Ring wire resolving contention for media access using CSMA 209 or token passing; (b) a half duplex link between two stations or 210 switches; (c) one direction of a switched full duplex link. 212 - Managed Segment: A managed segment is a segment with a DSBM 213 present and responsible for exercising admission control over 214 requests for resource reservation. A managed segment includes 215 those interconnected parts of a shared LAN that are not separated 216 by DSBMs. 218 - Traffic Class: Refers to an aggregation of data flows which are 219 given similar service within a switched network. 221 - Subnet: Used in this memo to indicate a group of L3 devices 222 sharing a common L3 network address prefix along with the set of 223 segments making up the L2 domain in which they are located. 225 - Bridge/Switch: A Layer 2 forwarding device as defined by IEEE 226 802.1D [2]. The terms bridge and switch are used synonymously in 227 this memo. 229 4. Frame Forwarding in IEEE 802 Networks 231 4.1. General IEEE 802 Service Model 233 The user_priority is a value associated with the transmission 234 and reception of all frames in the IEEE 802 service model. It 235 is supplied by the sender that is using the MAC service and is 236 provided along with the data to a receiver using the MAC service. 237 It may or may not be actually carried over the network. Token 238 Ring/IEEE 802.5 carries this value encoded in its FC octet while 239 basic Ethernet/IEEE 802.3 does not carry it. IEEE 802.12 may or 240 may not carry it depending on the frame format in use. When the 241 frame format in use is IEEE 802.5, the user_priority is carried 242 explicitly. 
When the IEEE 802.3 frame format is used, only the two
243     levels of priority (high/low) that are used to determine access
244     priority can be recovered. This is based on the value of priority
245     encoded in the start delimiter of the IEEE 802.3 frame.

247     IEEE 802.1D [3] (1) defines a consistent way to carry this value over a
248     bridged network consisting of Ethernet, Token Ring, Demand Priority,

250     ----------------------------
251     1. The original IEEE 802.1D standard [2] contains the specifications for
252     the operation of MAC bridges. This has recently been extended to
253     include support for traffic classes and dynamic multicast filtering [3].
254     In this document, the reader should be aware that references to the
255     IEEE 802.1D standard refer to [3], unless explicitly noted otherwise.

257     FDDI or other MAC layer media using an extended frame format. The
258     usage of user_priority is summarized below. We refer the interested
259     reader to the IEEE 802.1D specification for further information.

261     If the user_priority is carried explicitly in packets, its utility is
262     as a simple label enabling packets within a data stream in different
263     classes to be discriminated easily by downstream nodes without having
264     to parse the packet in more detail.

266     Apart from making the job of desktop or wiring closet switches
267     easier, an explicit field means they do not have to change hardware
268     or software as the rules for classifying packets evolve; e.g.
269     based on new protocols or new policies. More sophisticated Layer
270     3 switches, perhaps deployed in the core of a network, may be able
271     to provide added value by performing packet classification more
272     accurately and, hence, utilizing network resources more efficiently
273     and providing better isolation between flows. This appears to be
274     a good economic choice since there are likely to be very many more
275     desktop/wiring closet switches in a network than switches requiring
276     Layer 3 functionality.
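
The value of an explicit label can be illustrated with a minimal
sketch of how a downstream node might read user_priority without
parsing higher layer headers. The sketch assumes the frame uses IEEE
802.1Q tagging, in which a tag type of 0x8100 follows the two MAC
addresses and the priority occupies the top three bits of the
following 16-bit tag control field. The function name and the
"default" parameter (standing in for local policy on untagged frames)
are hypothetical and are not taken from this memo.

```python
import struct

TPID_8021Q = 0x8100  # tag type that marks an IEEE 802.1Q tag (assumed framing)


def user_priority_from_frame(frame: bytes, default: int = 0) -> int:
    """Return the 3-bit user_priority of an Ethernet frame.

    Tagged frames carry the value in the top three bits of the tag
    control field. Untagged frames carry no priority, so the
    caller-supplied default is returned instead.
    """
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet MAC header")
    # Bytes 0-11 are the destination and source MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return default
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci >> 13  # keep only the three priority bits
```

A switch that only needs to pick a queue can stop after these few
bytes, which is the economy the preceding paragraphs describe.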
278 The IEEE 802 specifications make no assumptions about how 279 user_priority is to be used by end stations or by the network. 280 Although IEEE 802.1D defines static priority queueing as the default 281 mode of operation of switches that implement multiple queues, the 282 user_priority is really a priority only in a loose sense since it 283 depends on the number of traffic classes actually implemented by a 284 switch. The user_priority is defined as a 3 bit quantity with a 285 value of 7 representing the highest priority and a value of 0 as 286 the lowest. The general switch algorithm is as follows. Packets 287 are queued within a particular traffic class based on the received 288 user_priority, the value of which is either obtained directly from 289 the packet if an IEEE P802.1Q header or IEEE 802.5 network is used, 290 or is assigned according to some local policy. The queue is selected 291 based on a mapping from user_priority (0 through 7) onto the number 292 of available traffic classes. A switch may implement one or more 293 traffic classes. The advertised IntServ parameters and the switch's 294 admission control behavior may be used to determine the mapping from 295 user_priority to traffic classes within the switch. A switch is 296 not precluded from implementing other scheduling algorithms such as 297 weighted fair queueing and round robin. 299 IEEE 802.1D makes no recommendations about how a sender should 300 select the value for user_priority. One of the primary purposes of 301 this document is to propose such usage rules, and to discuss the 302 communication of the semantics of these values between switches and 303 end stations. In the remainder of this document we use the term 304 traffic class synonymously with user_priority. 306 4.2. Ethernet/IEEE 802.3 308 There is no explicit traffic class or user_priority field carried in 309 Ethernet packets. 
This means that user_priority must be regenerated
310     at a downstream receiver or switch according to some defaults or by
311     parsing further into higher layer protocol fields in the packet.
312     Alternatively, IEEE P802.1Q encapsulation [4] may be used which
313     provides an explicit user_priority field on top of the basic MAC
314     frame format.

316     For the different IP packet encapsulations used over Ethernet/IEEE
317     802.3, admission control calculations must be adjusted according
318     to the framing and padding requirements shown in Table 1.

320                    Table 1: Ethernet encapsulations

322     ---------------------------------------------------------------
323     Encapsulation                           Framing Overhead IP MTU
324                                                bytes/pkt      bytes
325     ---------------------------------------------------------------
326     IP EtherType (ip_len<=46 bytes)            64-ip_len      1500
327                  (1500>=ip_len>=46 bytes)         18          1500

329     IP EtherType over 802.1D/Q (ip_len<=42)    64-ip_len      1500*
330                  (1500>=ip_len>=42 bytes)         22          1500*

332     IP EtherType over LLC/SNAP (ip_len<=40)    64-ip_len      1492
333                  (1500>=ip_len>=40 bytes)         24          1492
334     ---------------------------------------------------------------

336     *Note that the draft IEEE P802.1Q specification exceeds the current
337     IEEE 802.3 maximum packet length values by 4 bytes. The change of
338     maximum MTU size for IEEE P802.1Q frames is being accommodated by
339     IEEE P802.3ac.

341  4.3. Token Ring/IEEE 802.5

343     The Token Ring standard [6] provides a priority mechanism that can
344     be used to control both the queueing of packets for transmission and
345     the access of packets to the shared media. The priority mechanisms
346     are implemented using bits within the Access Control (AC) and the
347     Frame Control (FC) fields of an LLC frame. The first three bits of
348     the AC field, the Token Priority bits, together with the last three
349     bits of the AC field, the Reservation bits, regulate which stations
350     get access to the ring.
The last three bits of the FC field of an 351 LLC frame, the User Priority bits, are obtained from the higher layer 352 in the user_priority parameter when it requests transmission of a 353 packet. This parameter also establishes the Access Priority used 354 by the MAC. The user_priority value is conveyed end-to-end by the 355 User Priority bits in the FC field and is typically preserved through 356 Token Ring bridges of all types. In all cases, 0 is the lowest 357 priority. 359 Token Ring also uses a concept of Reserved Priority which relates to 360 the value of priority which a station uses to reserve the token for 361 the next transmission on the ring. When a free token is circulating, 362 only a station having an Access Priority greater than or equal to the 363 Reserved Priority in the token will be allowed to seize the token for 364 transmission. Readers are referred to [14] for further discussion of 365 this topic. 367 A Token Ring station is theoretically capable of separately queueing 368 each of the eight levels of requested user_priority and then 369 transmitting frames in order of priority. A station sets Reservation 370 bits according to the user_priority of frames that are queued 371 for transmission in the highest priority queue. This allows the 372 access mechanism to ensure that the frame with the highest priority 373 throughout the entire ring will be transmitted before any lower 374 priority frame. Annex I to the IEEE 802.5 Token Ring standard 375 recommends that stations send/relay frames as follows. 377 To reduce frame jitter associated with high priority traffic, the 378 annex also recommends that only one frame be transmitted per token 379 and that the maximum information field size be 4399 octets whenever 380 delay sensitive traffic is traversing the ring. Most existing 381 implementations of Token Ring bridges forward all LLC frames with 382 a default access priority of 4. 
Annex I recommends that bridges
383     forward LLC frames that have a user_priority greater than 4 with
384     a reservation equal to the user_priority (although the draft IEEE
385     802.1D [3] permits network management to override this behavior). The
386     capabilities provided by the Token Ring architecture, such as User
387     Priority and Reserved Priority, can provide effective support for
388     Integrated Services flows that require QoS guarantees.

390          Table 2: Recommended use of Token Ring User Priority

392                 -------------------------------------
393                 Application             User Priority
394                 -------------------------------------
395                 Non-time-critical data        0
396                       -                       1
397                       -                       2
398                       -                       3
399                 LAN management                4
400                 Time-sensitive data           5
401                 Real-time-critical data       6
402                 MAC frames                    7
403                 -------------------------------------

405     For the different IP packet encapsulations used over Token Ring/IEEE
406     802.5, it will be necessary to adjust any admission control
407     calculations according to the framing requirements as shown in Table
408     3.

410                   Table 3: Token Ring encapsulations

412     ---------------------------------------------------------------
413     Encapsulation                           Framing Overhead IP MTU
414                                                bytes/pkt      bytes
415     ---------------------------------------------------------------
416     IP EtherType over 802.1D/Q                    29          4370*
417     IP EtherType over LLC/SNAP                    25          4370*
418     ---------------------------------------------------------------

420     *The suggested MTU from RFC 1042 [13] is 4464 bytes but there are
421     issues related to discovering what the maximum supported MTU is
422     between any two points both within and between Token Ring subnets.
423     The MTU reported here is consistent with the IEEE 802.5 Annex I
424     recommendation.

426  4.4. Fiber Distributed Data Interface

428     The Fiber Distributed Data Interface (FDDI) standard [16] provides
429     a priority mechanism that can be used to control both the queueing
430     of packets for transmission and the access of packets to the shared
431     media.
The priority mechanisms are similar to those
432     of Token Ring described above. The standard also makes
433     provision for "Synchronous" data traffic with strict media access and
434     delay guarantees. This mode of operation is not discussed further
435     here and represents an area within the scope of the ISSLL working
436     group that requires further work. In the remainder of this document,
437     for the discussion of QoS mechanisms, FDDI is treated as a 100 Mbps
438     Token Ring technology using a service interface compatible with IEEE
439     802 networks.

441  4.5. Demand Priority/IEEE 802.12

443     IEEE 802.12 [19] is a standard for a shared 100 Mbps LAN. Data
444     packets are transmitted using either the IEEE 802.3 or IEEE 802.5
445     frame format. The MAC protocol is called Demand Priority. Its main
446     characteristics with respect to QoS are the support of two service
447     priority levels, normal priority and high priority, and the order of
448     service for each of these. Data packets from all network nodes (end
449     hosts and bridges/switches) are served using a simple round robin
450     algorithm.

452     If the IEEE 802.3 frame format is used for data transmission then
453     the user_priority is encoded in the starting delimiter of the IEEE
454     802.12 data packet. If the IEEE 802.5 frame format is used then the
455     user_priority is additionally encoded in the YYY bits of the FC field
456     in the IEEE 802.5 packet header (see also Section 4.3). Furthermore,
457     the IEEE P802.1Q encapsulation with its own user_priority field may
458     also be applied in IEEE 802.12 networks. In all cases, switches are
459     able to recover any user_priority supplied by a sender.

461     The same rules apply for IEEE 802.12 user_priority mapping in a
462     bridge as with other media types. The only additional information
463     is that normal priority is used by default for user_priority values
464     0 through 4 inclusive, and high priority is used for user_priority
465     levels 5 through 7.
This ensures that the default Token Ring 466 user_priority level of 4 for IEEE 802.5 bridges is mapped to normal 467 priority on IEEE 802.12 segments. 469 The medium access in IEEE 802.12 LANs is deterministic. The Demand 470 Priority mechanism ensures that, once the normal priority service 471 has been preempted, all high priority packets have strict priority 472 over packets with normal priority. In the abnormal situation that 473 a normal priority packet has been waiting at the head of line of a 474 MAC transmit queue for a time period longer than PACKET_PROMOTION 475 (200 - 300 ms) [19], its priority is automatically promoted to 476 high priority. Thus, even normal priority packets have a maximum 477 guaranteed access time to the medium. 479 Integrated Services can be built on top of the IEEE 802.12 medium 480 access mechanism. When combined with admission control and bandwidth 481 enforcement mechanisms, delay guarantees as required for a Guaranteed 482 Service can be provided without any changes to the existing IEEE 483 802.12 MAC protocol. 485 Since the IEEE 802.12 standard supports the IEEE 802.3 and IEEE 802.5 486 frame formats, the same framing overhead as reported in Sections 4.2 487 and 4.3 must be considered in the admission control computations for 488 IEEE 802.12 links. 490 5. Requirements and Goals 492 This section discusses the requirements and goals which should drive 493 the design of an architecture for supporting Integrated Services over 494 LAN technologies. The requirements refer to functions and features 495 which must be supported, while goals refer to functions and features 496 which are desirable, but are not an absolute necessity. Many of the 497 requirements and goals are driven by the functionality supported by 498 Integrated Services and RSVP. 500 5.1. Requirements 502 - Resource Reservation: The mechanism must be capable of reserving 503 resources on a single segment or multiple segments and at 504 bridges/switches connecting them. 
It must be able to provide 505 reservations for both unicast and multicast sessions. It should 506 be possible to change the level of reservation while the session 507 is in progress. 509 - Admission Control: The mechanism must be able to estimate 510 the level of resources necessary to meet the QoS requested by 511 the session in order to decide whether or not the session can 512 be admitted. For the purpose of management, it is useful to 513 provide the ability to respond to queries about availability of 514 resources. It must be able to make admission control decisions 515 for different types of services such as Guaranteed Service, 516 Controlled Load, etc. 518 - Flow Separation and Scheduling: It is necessary to provide a 519 mechanism for traffic flow separation so that real time flows can 520 be given preferential treatment over best effort flows. Packets 521 of real time flows can then be isolated and scheduled according 522 to their service requirements. 524 - Policing/Shaping: Traffic must be shaped and/or policed by 525 end stations (workstations, routers) to ensure conformance to 526 negotiated traffic parameters. Shaping is the recommended 527 behavior for traffic sources. A router initiating an ISSLL 528 session must have implemented traffic control mechanisms 529 according to the IntServ requirements which would ensure that 530 all flows sent by the router are in conformance. The ISSLL 531 mechanisms at the link layer rely heavily on the correct 532 implementation of policing/shaping mechanisms at higher layers by 533 devices capable of doing so. This is necessary because bridges 534 and switches are not typically capable of maintaining per flow 535 state which would be required to check flows for conformance. 536 Policing is left as an option for bridges and switches, which if 537 implemented, may be used to enforce tighter control over traffic 538 flows. This issue is further discussed in Section 8. 
540 - Soft State: The mechanism must maintain soft state information 541 about the reservations. This means that state information must 542 periodically be refreshed if the reservation is to be maintained; 543 otherwise the state information and corresponding reservations 544 will expire after some pre-specified interval. 546 - Centralized or Distributed Implementation: In the case of a 547 centralized implementation, a single entity manages the resources 548 of the entire subnet. This approach has the advantage of being 549 easier to deploy since bridges and switches may not need to be 550 upgraded with additional functionality. However, this approach 551 scales poorly with the geographical size of the subnet and the number 552 of end stations attached. In a fully distributed implementation, 553 each segment will have a local entity managing its resources. 554 This approach has better scalability than the former. However, 555 it requires that all bridges and switches in the network support 556 new mechanisms. It is also possible to have a semi-distributed 557 implementation where there is more than one entity, each managing 558 the resources of a subset of segments and bridges/switches 559 within the subnet. Ideally, the implementation should be flexible; 560 i.e. a centralized approach may be used for small subnets and a 561 distributed approach can be used for larger subnets. Examples 562 of centralized and distributed implementations are discussed in 563 Section 6. 565 - Scalability: The mechanism and protocols should have a low 566 overhead and should scale to the largest receiver groups likely 567 to occur within a single link layer domain. 569 - Fault Tolerance and Recovery: The mechanism must be able to 570 function in the presence of failures; i.e. there should not 571 be a single point of failure. For instance, in a centralized 572 implementation, some mechanism must be specified for back-up and 573 recovery in the event of failure.
575 - Interaction with Existing Resource Management Controls: The 576 interaction with existing infrastructure for resource management 577 needs to be specified. For example, FDDI has a resource 578 management mechanism called the "Synchronous Bandwidth Manager". 579 The mechanism must be designed so that it takes advantage of, 580 and specifies the interaction with, existing controls where 581 available. 583 5.2. Goals 585 - Independence from higher layer protocols: The mechanism should, 586 as far as possible, be independent of higher layer protocols such 587 as RSVP and IP. Independence from RSVP is desirable so that it 588 can interwork with other reservation protocols such as ST2 [10]. 589 Independence from IP is desirable so that it can interwork with 590 other network layer protocols such as IPX, NetBIOS, etc. 592 - Receiver heterogeneity: this refers to multicast communication 593 where different receivers request different levels of service. 594 For example, in a multicast group with many receivers, it 595 is possible that one of the receivers desires a lower delay 596 bound than the others. A better delay bound may be provided 597 by increasing the amount of resources reserved along the path 598 to that receiver while leaving the reservations for the other 599 receivers unchanged. In its most complex form, receiver 600 heterogeneity implies the ability to simultaneously provide 601 various levels of service as requested by different receivers. 602 In its simplest form, receiver heterogeneity will allow a 603 scenario where some of the receivers use best effort service and 604 those requiring service guarantees make a reservation. Receiver 605 heterogeneity, especially for the reserved/best effort scenario, 606 is a very desirable function. More details on supporting 607 receiver heterogeneity are provided in Section 8. 
609 - Support for different filter styles: It is desirable to provide 610 support for the different filter styles defined by RSVP such as 611 fixed filter, shared explicit and wildcard. Some of the issues 612 with respect to supporting such filter styles in the link layer 613 domain are examined in Section 8. 615 - Path Selection: In source routed LAN technologies such as 616 Token Ring/IEEE 802.5, it may be useful for the mechanism to 617 incorporate the function of path selection. Using an appropriate 618 path selection mechanism may optimize utilization of network 619 resources. 621 5.3. Non-goals 623 This document describes service mappings onto existing IEEE and ANSI 624 defined standard MAC layers and uses standard MAC layer services 625 as in IEEE 802.1 bridging. It does not attempt to make use of or 626 describe the capabilities of other proprietary or standard MAC layer 627 protocols although it should be noted that published work regarding 628 MAC layers suitable for QoS mappings exists. These are outside the 629 scope of the ISSLL working group charter. 631 5.4. Assumptions 633 This framework assumes that typical subnetworks that are concerned 634 about QoS will be "switch rich"; most communication between 635 end stations using integrated services support is expected to 636 pass through at least one switch. The mechanisms and protocols 637 described will be trivially extensible to communicating systems on 638 the same shared medium, but it is important not to allow problem 639 generalization to complicate the targeted practical application which 640 is switch rich LAN topologies. There have also been developments in 641 the area of MAC enhancements to ensure delay deterministic access on 642 network links e.g. IEEE 802.12 [19] and also proprietary schemes. 644 Although we illustrate most examples for this model using RSVP as 645 the upper layer QoS signaling protocol, there are actually no real 646 dependencies on this protocol. 
RSVP could be replaced by some other 647 dynamic protocol, or the requests could be made by network management 648 or other policy entities. The SBM signaling protocol [14], which is 649 based upon RSVP, is designed to work seamlessly in the architecture 650 described in this memo. 652 There may be a heterogeneous mix of switches with different 653 capabilities, all compliant with IEEE 802.1D [2,3], but implementing 654 varied queueing and forwarding mechanisms ranging from simple systems 655 with two queues per port and static priority scheduling, to more 656 complex systems with multiple queues using WFQ or other algorithms. 658 The problem is decomposed into smaller independent parts, which may 659 lead to sub-optimal use of the network resources, but we contend that 660 the benefits forgone are often equivalent to a very small improvement in 661 network efficiency in a LAN environment. Therefore, it is a goal 662 that the switches in a network operate using a much simpler set of 663 information than the RSVP engine in a router. In particular, it is 664 assumed that such switches do not need to implement per flow queueing 665 and policing (although they may do so). 667 A fundamental assumption of the IntServ model is that flows are 668 isolated from each other throughout their transit across a network. 669 Intermediate queueing nodes are expected to shape or police the traffic 670 to ensure conformance to the negotiated traffic flow specification. 671 In the architecture proposed here for mapping to Layer 2, we 672 diverge from that assumption in the interest of simplicity. The 673 policing/shaping functions are assumed to be implemented in end 674 stations.
In some LAN environments, it is reasonable to assume that 675 end stations are trusted to adhere to their negotiated contracts at 676 the inputs to the network, and that we can afford to over-allocate 677 resources during admission control to compensate for the inevitable 678 packet jitter/bunching introduced by the switched network itself. 680 This divergence has some implications on the types of receiver 681 heterogeneity that can be supported and the statistical multiplexing 682 gains that may be exploited, especially for Controlled Load flows. 683 This is discussed in Section 8.7 of this document. 685 6. Basic Architecture 687 The functional requirements described in Section 5 will be performed 688 by an entity which we refer to as the Bandwidth Manager (BM). The BM 689 is responsible for providing mechanisms for an application or higher 690 layer protocol to request QoS from the network. For architectural 691 purposes, the BM consists of the following components. 693 6.1. Components 695 6.1.1. Requester Module 697 The Requester Module (RM) resides in every end station in the subnet. 698 One of its functions is to provide an interface between applications 699 or higher layer protocols such as RSVP, ST2, SNMP, etc. and the BM. 700 An application can invoke the various functions of the BM by using 701 the primitives for communication with the RM and providing it with 702 the appropriate parameters. To initiate a reservation, in the link 703 layer domain, the following parameters must be passed to the RM: the 704 service desired (Guaranteed Service or Controlled Load), the traffic 705 descriptors contained in the TSpec, and an RSpec specifying the 706 amount of resources to be reserved [9]. More information on these 707 parameters may be found in the relevant Integrated Services documents 708 [6,7,8,9]. 
When RSVP is used for signaling at the network layer, 709 this information is available and needs to be extracted from the RSVP 710 PATH and RSVP RESV messages (see [5] for details). In addition to 711 these parameters, the network layer addresses of the end points must 712 be specified. The RM must then translate the network layer addresses 713 to link layer addresses and convert the request into an appropriate 714 format which is understood by other components of the BM responsible 715 for admission control. The RM is also responsible for returning the 716 status of requests processed by the BM to the invoking application or 717 higher layer protocol. 719 6.1.2. Bandwidth Allocator 721 The Bandwidth Allocator (BA) is responsible for performing admission 722 control and maintaining state about the allocation of resources 723 in the subnet. An end station can request various services, e.g. 724 bandwidth reservation, modification of an existing reservation, 725 queries about resource availability, etc. These requests are 726 processed by the BA. The communication between the end station and 727 the BA takes place through the RM. The location of the BA will 728 depend largely on the implementation method. In a centralized 729 implementation, the BA may reside on a single station in the 730 subnet. In a distributed implementation, the functions of the BA 731 may be distributed in all the end stations and bridges/switches as 732 necessary. The BA is also responsible for deciding how to label 733 flows, e.g. based on the admission control decision, the BA may 734 indicate to the RM that packets belonging to a particular flow be 735 tagged with some priority value which maps to the appropriate traffic 736 class. 738 6.1.3. Communication Protocols 740 The protocols for communication between the various components of the 741 BM system must be specified.
These include the following: 743 - Communication between the higher layer protocols and the RM: 744 The BM must define primitives for the application to initiate 745 reservations, query the BA about available resources, and 746 change or delete reservations, etc. These primitives could be 747 implemented as an API for an application to invoke functions of 748 the BM via the RM. 750 - Communication between the RM and the BA: A signaling mechanism 751 must be defined for the communication between the RM and the BA. 752 This protocol will specify the messages which must be exchanged 753 between the RM and the BA in order to service various requests by 754 the higher layer entity. 756 - Communication between peer BAs: If there is more than one BA in 757 the subnet, a means must be specified for inter-BA communication. 758 Specifically, the BAs must be able to decide among themselves 759 about which BA would be responsible for which segments and 760 bridges or switches. Further, if a request is made for resource 761 reservation along the domain of multiple BAs, the BAs must be 762 able to handle such a scenario correctly. Inter-BA communication 763 will also be responsible for back-up and recovery in the event of 764 failure. 766 6.2. Centralized vs. Distributed Implementations 768 Example scenarios are provided showing the location of the 769 components of the bandwidth manager in centralized and fully 770 distributed implementations. Note that in either case, the RM must 771 be present in all end stations which desire to make reservations. 772 Essentially, centralized or distributed refers to the implementation 773 of the BA, the component responsible for resource reservation 774 and admission control. In the figures below, "App" refers to 775 the application making use of the BM. It could either be a user 776 application, or a higher layer protocol process such as RSVP. 778 +---------+ 779 .-->| BA |<--. 780 / +---------+ \ 781 / .-->| Layer 2 |<--.
\ 782 / / +---------+ \ \ 783 / / \ \ 784 / / \ \ 785 +---------+ / / \ \ +---------+ 786 | App |<----- /-/---------------------------\-\----->| App | 787 +---------+ / / \ \ +---------+ 788 | RM |<----. / \ .--->| RM | 789 +---------+ / +---------+ +---------+ \ +---------+ 790 | Layer 2 |<------>| Layer 2 |<------>| Layer 2 |<------>| Layer 2 | 791 +---------+ +---------+ +---------+ +---------+ 793 RSVP Host/ Intermediate Intermediate RSVP Host/ 794 Router Bridge/Switch Bridge/Switch Router 796 Figure 1: Bandwidth Manager with centralized Bandwidth Allocator 798 Figure 1 shows a centralized implementation where a single BA is 799 responsible for admission control decisions for the entire subnet. 801 Every end station contains a RM. Intermediate bridges and switches 802 in the network need not have any functions of the BM since they will 803 not be actively participating in admission control. The RM at the 804 end station requesting a reservation initiates communication with 805 its BA. For larger subnets, a single BA may not be able to handle 806 the reservations for the entire subnet. In that case it would be 807 necessary to deploy multiple BAs, each managing the resources of a 808 non-overlapping subset of segments. In a centralized implementation, 809 the BA must have some model of the Layer 2 topology of the subnet 810 e.g. link layer spanning tree information, in order to be able to 811 reserve resources on appropriate segments. Without this topology 812 information, the BM would have to reserve resources on all segments 813 for all flows which, in a switched network, would lead to very 814 inefficient utilization of resources. 
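The role of the topology model in a centralized BA can be illustrated with a minimal sketch. It assumes the BA has learned the active spanning tree as a list of links and tracks one reservable bandwidth figure per segment. The class name CentralizedBA, its methods, and the per-segment capacity model are hypothetical and are not defined by this document or by the SBM protocol [14]. Because the active topology is a tree, the path between two stations is unique, so resources are committed only on the segments that the flow actually crosses.

```python
from collections import deque


class CentralizedBA:
    """Hypothetical centralized Bandwidth Allocator (illustrative only)."""

    def __init__(self, links, capacity_bps):
        # links: (node_a, node_b) pairs forming the active spanning tree.
        # capacity_bps: reservable bandwidth assumed for every segment.
        self.adj = {}
        self.free = {}
        for a, b in links:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)
            self.free[frozenset((a, b))] = capacity_bps

    def path_segments(self, src, dst):
        # Breadth-first search; on a tree the path found is the only path.
        parent = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                break
            for nxt in self.adj.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        if dst not in parent:
            return None  # destination unknown or unreachable
        segments = []
        node = dst
        while parent[node] is not None:
            segments.append(frozenset((node, parent[node])))
            node = parent[node]
        return segments

    def reserve(self, src, dst, rate_bps):
        # Admit only if every segment on the path has enough free bandwidth,
        # then commit the reservation on those segments and no others.
        segments = self.path_segments(src, dst)
        if segments is None:
            return False
        if any(self.free[s] < rate_bps for s in segments):
            return False
        for s in segments:
            self.free[s] -= rate_bps
        return True
```

With links h1-sw1, sw1-sw2, sw2-h2 and sw1-h3, a reservation from h1 to h2 consumes bandwidth on three segments and leaves the sw1-h3 segment untouched. A BA without the topology model would have to debit all four segments.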
816 +---------+ +---------+ 817 | App |<-------------------------------------------->| App | 818 +---------+ +---------+ +---------+ +---------+ 819 | RM/BA |<------>| BA |<------>| BA |<------>| RM/BA | 820 +---------+ +---------+ +---------+ +---------+ 821 | Layer 2 |<------>| Layer 2 |<------>| Layer 2 |<------>| Layer 2 | 822 +---------+ +---------+ +---------+ +---------+ 824 RSVP Host/ Intermediate Intermediate RSVP Host/ 825 Router Bridge/Switch Bridge/Switch Router 827 Figure 2: Bandwidth Manager with fully 828 distributed Bandwidth Allocator 830 Figure 2 depicts the scenario of a fully distributed bandwidth 831 manager. In this case, all devices in the subnet have BM 832 functionality. All the end hosts are still required to have a 833 RM. In addition, all stations actively participate in admission 834 control. With this approach, each BA would need only local topology 835 information since it is responsible for the resources on segments 836 that are directly connected to it. This local topology information, 837 such as a list of ports active on the spanning tree and which unicast 838 addresses are reachable from which ports, is readily available in 839 today's switches. Note that in the figures above, the arrows between 840 peer layers are used to indicate logical connectivity. 842 7. Model of the Bandwidth Manager in a Network 844 In this section we describe how the model above fits with the 845 existing IETF Integrated Services model of IP hosts and routers. 846 First, we describe Layer 3 host and router implementations. Next, we 847 describe how the model is applied in Layer 2 switches. Throughout 848 we indicate any differences between centralized and distributed 849 implementations. 851 7.1. End Station Model 853 7.1.1. Layer 3 Client Model 855 We assume the same client model as IntServ and RSVP where we use the 856 term "client" to mean the entity handling QoS in the Layer 3 device 857 at each end of a Layer 2 hop. 
In this model, the sending client 858 is responsible for local admission control and packet scheduling 859 onto its link in accordance with the negotiated service. As with 860 the IntServ model, this involves per flow scheduling with possible 861 traffic shaping/policing in every such originating source. 863 For now, we assume that the client runs an RSVP process which 864 presents a session establishment interface to applications, signals 865 over the network, programs a scheduler and classifier in the driver, 866 and interfaces to a policy control module. In particular, RSVP also 867 interfaces to a local admission control module which is the focus of 868 this section. 870 Figure 3, reproduced from the RSVP specification, depicts 871 the RSVP process in sending hosts. 873 7.1.2. Requests to Layer 2 ISSLL 875 The local admission control entity within a client is responsible for 876 mapping Layer 3 session establishment requests into Layer 2 language. 878 The upper layer entity makes a request to ISSLL, in generalized terms, 879 of the form: 881 "May I reserve for traffic with <traffic characteristic> 882 with <performance requirements> from <here> to <there> and 883 how should I label it?" 885 where 887 <traffic characteristic> = Sender Tspec (e.g. bandwidth, burstiness, MTU) 888 +-----------------------------+ 889 | +-------+ +-------+ | RSVP 890 | |Appli- | | RSVP <-------------------> 891 | | cation<--> | | 892 | | | |process| +-----+| 893 | +-+-----+ | +->Polcy|| 894 | | +--+--+-+ |Cntrl|| 895 | |data | | +-----+| 896 |===|===========|==|==========| 897 | | +--------+ | +-----+| 898 | | | | +--->Admis|| 899 | +-V--V-+ +---V----+ |Cntrl|| 900 | |Class-| | Packet | +-----+| 901 | | ifier|==>Schedulr|===================> 902 | +------+ +--------+ | data 903 +-----------------------------+ 905 Figure 3: RSVP in Sending Hosts 908 <performance requirements> = FlowSpec (e.g. latency, jitter bounds) 909 <here> = IP address(es) 910 <there> = IP address(es) - may be multicast 912 7.1.3. At the Layer 3 Sender 914 The ISSLL functionality in the sender is illustrated in Figure 4.
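The generalized request of Section 7.1.2 can be pictured as a small data structure together with the answer it expects. The following is a sketch only: the type names (TSpec, FlowSpec, L2Request, L2Answer), the field names and units, and the function ask_issll are all hypothetical and do not appear in this document or in the SBM protocol [14]. The admission decision and the labelling rule are passed in as callables because this framework leaves both to the Bandwidth Allocator.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class TSpec:
    """Traffic characteristic (hypothetical fields and units)."""
    rate_bps: int
    burst_bytes: int
    mtu_bytes: int


@dataclass(frozen=True)
class FlowSpec:
    """Performance requirements (hypothetical fields and units)."""
    max_latency_us: int
    max_jitter_us: int


@dataclass(frozen=True)
class L2Request:
    tspec: TSpec
    flowspec: FlowSpec
    src_ips: Tuple[str, ...]   # "from" end point address(es)
    dst_ips: Tuple[str, ...]   # "to" end point address(es), may be multicast

    @property
    def is_multicast(self) -> bool:
        return any(ipaddress.ip_address(a).is_multicast for a in self.dst_ips)


@dataclass(frozen=True)
class L2Answer:
    admitted: bool
    user_priority: Optional[int]   # the label to use, if admitted


def ask_issll(req: L2Request,
              admit: Callable[[L2Request], bool],
              label: Callable[[L2Request], int]) -> L2Answer:
    # "May I reserve ... and how should I label it?"
    if not admit(req):
        return L2Answer(False, None)
    return L2Answer(True, label(req))
```

A caller would build an L2Request from the RSVP PATH and RESV contents and store the returned user_priority for use in the Layer 2 headers of that session's packets.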
916 The functions of the Requester Module may be summarized as follows: 918 - Maps the endpoints of the conversation to Layer 2 addresses 919 in the LAN, so that the client can determine what traffic is 920 going where. This function probably makes reference to the ARP 921 protocol cache for unicast or performs an algorithmic mapping for 922 multicast destinations. 924 - Communicates with any local Bandwidth Allocator module for local 925 admission control decisions. 927 - Formats a SBM request to the network with the mapped addresses 928 and flow/filter specs. 930 from IP from RSVP 931 +----|------------|------------+ 932 | +--V----+ +---V---+ | 933 | | Addr <---> | | SBM signaling 934 | |mapping| |Request|<-----------------------> 935 | +---+---+ |Module | | 936 | | | | | 937 | +---+---+ | | | 938 | | 802 <---> | | 939 | | header| +-+-+-+-+ | 940 | +--+----+ / | | | 941 | | / | | +-----+ | 942 | | +-----+ | +->|Band-| | 943 | | | | |width| | 944 | +--V-V-+ +-----V--+ |Alloc| | 945 | |Class-| | Packet | +-----+ | 946 | | ifier|==>Schedulr|=========================> 947 | +------+ +--------+ | data 948 +------------------------------+ 950 Figure 4: ISSLL in a Sending End Station 952 - Receives a response from the network and reports the admission 953 control decision to the higher layer entity, along with any 954 negotiated modifications to the session parameters. 956 - Saves any returned user_priority to be associated with this 957 session in a "802 header" table. This will be used when 958 constructing the Layer 2 headers for future data packets 959 belonging to this session. This table might, for example, be 960 indexed by the RSVP flow identifier. 962 The Bandwidth Allocator (BA) component is only present when a 963 distributed BA model is implemented. When present, its function is 964 basically to apply local admission control for the outgoing link 965 bandwidth and driver's queueing resources. 967 7.1.4. 
At the Layer 3 Receiver 969 The ISSLL functionality in the receiver is simpler and is illustrated in 970 Figure 5. 972 The functions of the Requester Module may be summarized as follows: 974 to RSVP to IP 975 ^ ^ 976 +----|------------|------+ 977 | +--+----+ | | 978 SBM signaling | |Request| +---+---+ | 979 <-------------> |Module | | Strip | | 980 | +--+---++ |802 hdr| | 981 | | \ +---^---+ | 982 | +--v----+\ | | 983 | | Band- | \ | | 984 | | width| \ | | 985 | | Alloc | . | | 986 | +-------+ | | | 987 | +------+ +v---+----+ | 988 data | |Class-| | Packet | | 989 <==============>| ifier|==>|Scheduler| | 990 | +------+ +---------+ | 991 +------------------------+ 993 Figure 5: ISSLL in a Receiving End Station 995 - Handles any received SBM protocol indications. 997 - Communicates with any local BA for local admission control 998 decisions. 1000 - Passes indications up to RSVP if OK. 1002 - Accepts confirmations from RSVP and relays them back via SBM 1003 signaling towards the requester. 1005 - May program a receive classifier and scheduler, if used, to 1006 identify traffic classes of received packets and accord them 1007 appropriate treatment e.g. reservation of buffers for particular 1008 traffic classes. 1010 - Programs the receiver to strip away link layer header information 1011 from received packets. 1013 The Bandwidth Allocator, present only in a distributed implementation, 1014 applies local admission control to see if a request can be supported 1015 with appropriate local receive resources. 1017 7.2. Switch Model 1019 7.2.1. Centralized Bandwidth Allocator 1021 Where a centralized Bandwidth Allocator model is implemented, 1022 switches do not take part in the admission control process. 1023 Admission control is implemented by a centralized BA, e.g. a "Subnet 1024 Bandwidth Manager" (SBM) as described in [14].
This centralized BA 1025 may actually be co-located with a switch but its functions would 1026 not necessarily then be closely tied with the switch's forwarding 1027 functions as is the case with the distributed BA described below. 1029 7.2.2. Distributed Bandwidth Allocator 1031 The model of Layer 2 switch behavior described here uses the 1032 terminology of the SBM protocol as an example of an admission control 1033 protocol. The model is equally applicable when other mechanisms, 1034 e.g. static configuration or network management, are in use for 1035 admission control. We define the following entities within the 1036 switch: 1038 - Local Admission Control Module: One of these on each port 1039 accounts for the available bandwidth on the link attached to that 1040 port. For half duplex links, this involves taking account of the 1041 resources allocated to both transmit and receive flows. For full 1042 duplex links, the input port accountant's task is trivial. 1044 - Input SBM Module: One instance on each port performs the 1045 "network" side of the signaling protocol for peering with clients 1046 or other switches. It also holds knowledge about the mappings of 1047 IntServ classes to user_priority. 1049 - SBM Propagation Module: Relays requests that have passed 1050 admission control at the input port to the relevant output ports' 1051 SBM modules. This will require access to the switch's forwarding 1052 table (Layer-2 "routing table" cf. RSVP model) and port spanning 1053 tree state. 1055 - Output SBM Module: Forwards requests to the next Layer 2 or 1056 Layer 3 hop. 1058 - Classifier, Queue and Scheduler Module: The functions of this 1059 module are basically as described by the Forwarding Process of 1060 IEEE 802.1D (see Section 3.7 of [3]). 
The Classifier module 1061 identifies the relevant QoS information from incoming packets and 1062 uses this, together with the normal bridge forwarding database, 1063 to decide at which output port and traffic class to enqueue 1064 the packet. Different types of switches will use different 1065 techniques for flow identification (see Section 8.1). In IEEE 1066 802.1D switches this information is the regenerated user_priority 1067 parameter which has already been decoded by the receiving MAC 1068 service and potentially remapped by the forwarding process (see 1069 Section 3.7.3 of [3]). This does not preclude more sophisticated 1070 classification rules such as the classification of individual 1071 IntServ flows. The Queue and Scheduler hold the output queues 1072 for ports and provide the algorithm for servicing the queues 1073 for transmission onto the output link in order to provide the 1074 promised IntServ service. Switches will implement one or more 1075 output queues per port and all will implement at least a basic 1076 static priority dequeueing algorithm as their default, in 1077 accordance with IEEE 802.1D. 1079 - Ingress Traffic Class Mapping and Policing Module: Its functions 1080 are as described in IEEE 802.1D Section 3.7. This optional 1081 module may police the data within traffic classes for conformance 1082 to the negotiated parameters, and may discard packets or re-map 1083 the user_priority. The default behavior is to pass things 1084 through unchanged. 1086 - Egress Traffic Class Mapping Module: Its functions are as 1087 described in IEEE 802.1D Section 3.7. This optional module may 1088 perform re-mapping of traffic classes on a per output port basis. 1089 The default behavior is to pass things through unchanged. 1091 Figure 6 shows all of the modules in an ISSLL enabled switch. The 1092 ISSLL model is a superset of the IEEE 802.1D bridge model. 1094 7.3.
Admission Control 1096 On receipt of an admission control request, a switch performs the 1097 following actions, again using SBM as an example. The behavior 1098 is different depending on whether the "Designated SBM" for this 1099 segment is within this switch or not. See [14] for a more detailed 1100 specification of the DSBM/SBM actions. 1102 - If the ingress SBM is the "Designated SBM" for this link, it 1103 either translates any received user_priority or selects a Layer 1104 2 traffic class which appears compatible with the request and 1105 whose use does not violate any administrative policies in force. 1106 In effect, it matches the requested service with the available 1107 traffic classes and chooses the "best" one. It ensures that, 1108 if this reservation is successful, the value of user_priority 1109 corresponding to that traffic class is passed back to the client. 1111 +-------------------------------+ 1112 SBM signaling | +-----+ +------+ +------+ | SBM signaling 1113 <------------------>| IN |<->| SBM |<->| OUT |<----------------> 1114 | | SBM | | prop.| | SBM | | 1115 | +-++--+ +---^--+ /----+-+ | 1116 | / | | / | | 1117 ______________| / | | | | +-------------+ 1118 | \ /+--V--+ | | +--V--+ / | 1119 | \ ____/ |Local| | | |Local| / | 1120 | \ / |Admis| | | |Admis| / | 1121 | \/ |Cntrl| | | |Cntrl| / | 1122 | +-----V+\ +-----+ | | +-----+ /+-----+ | 1123 | |traff | \ +---+--+ +V-------+ / |egrss| | 1124 | |class | \ |Filter| |Queue & | / |traff| | 1125 | |map & |=====|==========>|Data- |=| Packet |=|===>|class| | 1126 | |police| | | base| |Schedule| | |map | | 1127 | +------+ | +------+ +--------+ | +-+---+ | 1128 +----^---------+-------------------------------+------|------+ 1129 data in | |data out 1130 ========+ +========> 1132 Figure 6: ISSLL in a Switch 1134 - The ingress DSBM observes the current state of allocation of 1135 resources on the input port/link and then determines whether 1136 the new resource allocation from the mapped traffic class 
can 1137 be accommodated. The request is passed to the reservation 1138 propagator if accepted. 1140 - If the ingress SBM is not the "Designated SBM" for this link then 1141 it directly passes the request on to the reservation propagator. 1143 - The reservation propagator relays the request to the bandwidth 1144 accountants on each of the switch's outbound links to which 1145 this reservation would apply. This implies an interface to 1146 routing/forwarding database. 1148 - The egress bandwidth accountant observes the current state 1149 of allocation of queueing resources on its outbound port and 1150 bandwidth on the link itself and determines whether the new 1151 allocation can be accommodated. Note that this is only a local 1152 decision at this switch hop; further Layer 2 hops through the 1153 network may veto the request as it passes along. 1155 - The request, if accepted by this switch, is propagated on 1156 each output link selected. Any user_priority described in the 1157 forwarded request must be translated according to any egress 1158 mapping table. 1160 - If accepted, the switch must notify the client of the 1161 user_priority to be used for packets belonging to that flow. 1162 Again, this is an optimistic approach assuming that admission 1163 control succeeds; downstream switches may refuse the request. 1165 - If this switch wishes to reject the request, it can do so by 1166 notifying the original client by means of its Layer 2 address. 1168 7.4. QoS Signaling 1170 The mechanisms described in this document make use of a signaling 1171 protocol for devices to communicate their admission control requests 1172 across the network. The service definitions to be provided by 1173 such a protocol e.g. [14] are described below. We illustrate the 1174 primitives and information that need to be exchanged with such a 1175 signaling protocol entity. 
In all of the examples, appropriate 1176 delete/cleanup mechanisms will also have to be provided for tearing 1177 down established sessions. 1179 7.4.1. Client Service Definitions 1181 The following interfaces can be identified from Figures 4 and 5. 1183 - SBM <-> Address Mapping 1185 This is a simple lookup function which may require ARP protocol 1186 interactions or an algorithmic mapping. The Layer 2 addresses 1187 are needed by SBM for inclusion in its signaling messages to 1188 avoid requiring that switches participating in the signaling have 1189 Layer 3 information to perform the mapping. 1191 l2_addr = map_address( ip_addr ) 1193 - SBM <-> Session/Link Layer Header 1195 This is for notifying the transmit path of how to add Layer 2 1196 header information, e.g. user_priority values to the traffic 1197 of each outgoing flow. The transmit path will provide the 1198 user_priority value when it requests a MAC layer transmit 1199 operation for each packet. The user_priority is one of the 1200 parameters passed in the packet transmit primitive defined by the 1201 IEEE 802 service model. 1203 bind_l2_header( flow_id, user_priority ) 1205 - SBM <-> Classifier/Scheduler 1207 This is for notifying transmit classifier/scheduler of any 1208 additional Layer 2 information associated with scheduling the 1209 transmission of a packet flow. This primitive may be unused in 1210 some implementations or it may be used, for example, to provide 1211 information to a transmit scheduler that is performing per 1212 traffic class scheduling in addition to the per flow scheduling 1213 required by IntServ; the Layer 2 header may be a pattern (in 1214 addition to the FilterSpec) to be used to identify the flow's 1215 traffic. 1217 bind_l2schedulerinfo( flow_id, , l2_header, traffic_class ) 1219 - SBM <-> Local Admission Control 1221 This is used for applying local admission control for a session 1222 e.g. 
is there enough transmit bandwidth still uncommitted 1223 for this potential new session? Are there sufficient receive 1224 buffers? This should commit the necessary resources if it 1225 succeeds. It will be necessary to release these resources at 1226 a later stage if the admission control fails at a later stage. 1227 This call would be made, for example, by a segment's Designated 1228 SBM. 1230 status = admit_l2session( flow_id, Tspec, FlowSpec ) 1232 - SBM <-> RSVP 1234 This is outlined above in Section 7.1.2 and fully described in 1235 [14]. 1237 - Management Interfaces 1239 Some or all of the modules described by this model will also 1240 require configuration management. It is expected that details of 1241 the manageable objects will be specified by future work in the 1242 ISSLL WG. 1244 7.4.2. Switch Service Definitions 1246 The following interfaces are identified from Figure 6. 1248 - SBM <-> Classifier 1250 This is for notifying the receive classifier of how to match 1251 incoming Layer 2 information with the associated traffic class. 1252 It may in some cases consist of a set of read only default 1253 mappings. 1255 bind_l2classifierinfo( flow_id, l2_header, traffic_class ) 1257 - SBM <-> Queue and Packet Scheduler 1259 This is for notifying transmit scheduler of additional Layer 2 1260 information associated with a given traffic class. It may be 1261 unused in some cases (see discussion in previous section). 1263 bind_l2schedulerinfo( flow_id, l2_header, traffic_class ) 1265 - SBM <-> Local Admission Control 1267 Same as for the host discussed above. 1269 - SBM <-> Traffic Class Map and Police 1271 Optional configuration of any user_priority remapping that 1272 might be implemented on ingress to and egress from the ports of 1273 a switch. For IEEE 802.1D switches, it is likely that these 1274 mappings will have to be consistent across all ports. 
1276 bind_l2ingressprimap( inport, in_user_pri, internal_priority ) 1277 bind_l2egressprimap( outport, internal_priority, out_user_pri ) 1279 Optional configuration of any Layer 2 policing function to be 1280 applied on a per class basis to traffic matching the Layer 2 1281 header. If the switch is capable of per flow policing then 1282 existing IntServ/RSVP models will provide a service definition 1283 for that configuration. 1285 bind_l2policing( flow_id, l2_header, Tspec, FlowSpec ) 1287 - SBM <-> Filtering Database 1289 SBM propagation rules need access to the Layer 2 forwarding 1290 database to determine where to forward SBM messages. This is 1291 analogous to RSRR interface in Layer 3 RSVP. 1293 output_portlist = lookup_l2dest( l2_addr ) 1295 - Management Interfaces 1296 Some or all of the modules described by this model will also 1297 require configuration management. It is expected that details of 1298 the manageable objects will be specified by future work in the 1299 ISSLL working group. 1301 8. Implementation Issues 1303 As stated earlier, the Integrated Services working group has defined 1304 various service classes offering varying degrees of QoS guarantees. 1305 Initial effort will concentrate on enabling the Controlled Load [6] 1306 and Guaranteed Service classes [7]. The Controlled Load service 1307 provides a loose guarantee, informally stated as "the same as best 1308 effort would be on an unloaded network". The Guaranteed Service 1309 provides an upper bound on the transit delay of any packet. The 1310 extent to which these services can be supported at the link layer 1311 will depend on many factors including the topology and technology 1312 used. Some of the mapping issues are discussed below in light of 1313 the emerging link layer standards and the functions supported by 1314 higher layer protocols. 
Considering the limitations of some of the 1315 topologies, it may not be possible to satisfy all the requirements 1316 for Integrated Services on a given topology. In such cases, it 1317 is useful to consider providing support for an approximation of 1318 the service which may suffice in most practical instances. For 1319 example, it may not be feasible to provide policing/shaping at each 1320 network element (bridge/switch) as required by the Controlled Load 1321 specification. But if this task is left to the end stations, a 1322 reasonably good approximation to the service can be obtained. 1324 8.1. Switch Characteristics 1326 There are many LAN bridges/switches with varied capabilities for 1327 supporting QoS. We discuss below the various kinds of devices that 1328 one may expect to find in a LAN environment. 1330 The most basic bridge is one which conforms to IEEE 802.1D [2]. This 1331 device has a single queue per output port, and uses the spanning tree 1332 algorithm to eliminate topology loops. Networks constructed from 1333 this kind of device cannot be expected to provide service guarantees 1334 of any kind because of the complete lack of traffic isolation. 1336 The next level of bridges/switches are those which conform to the 1337 more recently revised IEEE 802.1D specification. These will include 1338 support for queueing up to eight traffic classes separately. The 1339 level of traffic isolation provided is coarse because all flows 1340 corresponding to a particular traffic class are aggregated. Further, 1341 it is likely that more than one priority will map to a traffic class 1342 depending on the number of queues implemented in the switch. It 1343 would be difficult for such a device to offer protection against 1344 misbehaving flows. GMRP may be used to limit the scope of multicast 1345 traffic to only those segments which are on the path to interested 1346 receivers.
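The aggregation of several user_priority values into one traffic class, described above for switches with fewer than eight queues, can be illustrated with a small sketch. The even split used here is an assumption made purely for illustration; it is not the mapping table recommended by IEEE 802.1D, and it ignores any special ordering that the standard gives to particular user_priority values. The function names are hypothetical.

```python
def priority_to_traffic_class(user_priority: int, num_classes: int) -> int:
    """Return the queue index (0 = lowest) for a user_priority of 0..7.

    Assumption: the eight priorities are split evenly across the
    available queues. A real switch would use its configured table.
    """
    if not 0 <= user_priority <= 7:
        raise ValueError("user_priority must be in 0..7")
    if not 1 <= num_classes <= 8:
        raise ValueError("num_classes must be in 1..8")
    return user_priority * num_classes // 8


def mapping_table(num_classes: int) -> list:
    """Traffic class chosen for each user_priority 0..7, in order."""
    return [priority_to_traffic_class(p, num_classes) for p in range(8)]


if __name__ == "__main__":
    # Show how coarse the isolation becomes as queues are removed.
    for queues in (1, 2, 4, 8):
        print(queues, "queue(s):", mapping_table(queues))
```

With two queues, four user_priority values share each queue, so flows carrying different priorities receive identical treatment inside the switch.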
1348 The next step up from these devices consists of bridges/switches which implement 1349 optional parts of the IEEE 802.1D specification such as mapping the 1350 received user_priority to some internal set of canonical values 1351 on a per-input-port basis. They may also support the mapping of 1352 these internal canonical values onto transmitted user_priority on 1353 a per-output-port basis. With these extra capabilities, network 1354 administrators can perform mapping of traffic classes between 1355 specific pairs of ports, and in doing so gain more control over 1356 admission of traffic into the protected classes. 1358 Other entirely optional features that some bridges/switches may 1359 support include classification of IntServ flows using fields in the 1360 network layer header, per-flow policing and/or reshaping, which is 1361 essential for supporting Guaranteed Service, and more sophisticated 1362 scheduling algorithms such as variants of weighted fair queueing to 1363 limit the bandwidth consumed by a traffic class. Note that it is 1364 advantageous to perform flow isolation and for all network elements 1365 to police each flow in order to support the Controlled Load and 1366 Guaranteed Service. 1368 8.2. Queueing 1370 Connectionless packet networks in general, and LANs in particular, 1371 work today because of scaling choices in network provisioning. 1372 Typically, excess bandwidth and buffering are provisioned in the 1373 network to absorb the traffic sourced by higher layer protocols, 1374 often sufficient to cause their transmission windows to run out on a 1375 statistical basis, so that network overloads are rare and transient 1376 and the expected loading is very low. 1378 With the advent of time-critical traffic such over-provisioning 1379 has become much harder to achieve. Time-critical frames may be 1380 queued for annoyingly long periods of time behind temporary bursts 1381 of file transfer traffic, particularly at network bottleneck points, 1382 e.g.
at the 100 Mbps to 10 Mbps transition that might occur between 1383 the riser to the wiring closet and the final link to the user from 1384 a desktop switch. In this case, however, if it is known a priori 1385 (either by application design, on the basis of statistics, or 1386 by administrative control) that time-critical traffic is a small 1387 fraction of the total bandwidth, it suffices to give it strict 1388 priority over the non-time-critical traffic. The worst case delay 1389 experienced by the time-critical traffic is roughly the maximum 1390 transmission time of a maximum length non-time-critical frame -- about 1391 1.2 ms for 10 Mbps Ethernet (see Table 4), and well below the end to 1392 end delay budget based on human perception times. 1394 When more than one priority service is to be offered by a network 1395 element, e.g. one which supports both Controlled Load and 1396 Guaranteed Service, the requirements for the scheduling discipline 1397 become more complex. In order to provide the required isolation 1398 between the service classes, it will probably be necessary to queue 1399 them separately. There is then an issue of how to service the 1400 queues which requires a combination of admission control and more 1401 intelligent queueing disciplines. As with the service specifications 1402 themselves, the specification of queueing algorithms is beyond the 1403 scope of this document. 1405 8.3. Mapping of Services to Link Level Priority 1407 The number of traffic classes supported and access methods of the 1408 technology under consideration will determine how many and what 1409 services may be supported. Native Token Ring/IEEE 802.5, for 1410 instance, supports eight priority levels which may be mapped to 1411 one or more traffic classes. Ethernet/IEEE 802.3 has no support 1412 for signaling priorities within frames.
However, the IEEE 802 1413 standards committee has recently developed a new standard for 1414 bridges/switches related to multimedia traffic expediting and 1415 dynamic multicast filtering [3]. A packet format for carrying a 1416 user_priority field on all IEEE 802 LAN media types is now defined 1417 in [4]. These standards allow for up to eight traffic classes 1418 on all media. The user_priority bits carried in the frame are 1419 mapped to a particular traffic class within a bridge/switch. The 1420 user_priority is signaled on an end-to-end basis, unless overridden 1421 by bridge/switch management. The traffic class that is used by a 1422 flow should depend on the quality of service desired and whether the 1423 reservation is successful or not. Therefore, a sender should use the 1424 user_priority value which maps to the best effort traffic class until 1425 told otherwise by the BM. The BM will, upon successful completion of 1426 resource reservation, specify the value of user_priority to be used 1427 by the sender for that session's data. An accompanying memo [13] 1428 addresses the issue of mapping the various Integrated Services to 1429 appropriate traffic classes. 1431 8.4. Re-mapping of Non-conforming Aggregated Flows 1433 One other topic under discussion in the IntServ context is how to 1434 handle the traffic for data flows from sources that exceed their 1435 negotiated traffic contract with the network. An approach that shows 1436 some promise is to treat such traffic with "somewhat less than best 1437 effort" service in order to protect traffic that is normally given 1438 "best effort" service from having to back off. Best effort traffic 1439 is often adaptive, using TCP or other congestion control algorithms, 1440 and it would be unfair to penalize those flows due to badly behaved 1441 traffic from reserved flows which are often set up by non-adaptive 1442 applications. 
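Deciding whether a flow has exceeded its negotiated traffic contract is commonly done with a token bucket test against the rate and bucket depth in the flow's TSpec. The following is a minimal sketch of such a test, under stated assumptions: it models only the rate and depth, ignores the peak rate and packet size limits that a full TSpec carries, and takes timestamps from the caller. The class and method names are hypothetical. What is done with a non-conforming packet (dropping it, or re-marking it) is deliberately left to the caller.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float          # token fill rate in bytes per second
    depth: float         # bucket depth in bytes
    tokens: float = 0.0  # current fill level, set in __post_init__
    last: float = 0.0    # time of the previous check, in seconds

    def __post_init__(self) -> None:
        # Assumption: the bucket starts full, so an initial burst
        # of up to `depth` bytes conforms.
        self.tokens = self.depth

    def conforms(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes at time `now` conforms.

        `now` is assumed to be non-decreasing across calls.
        """
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # excess traffic: tokens are left untouched


if __name__ == "__main__":
    bucket = TokenBucket(rate=1000.0, depth=1500.0)
    for t, size in [(0.0, 1500), (0.0, 100), (1.0, 1000), (1.5, 600)]:
        verdict = "conforming" if bucket.conforms(t, size) else "excess"
        print(f"t={t:.1f}s size={size}: {verdict}")
```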
1444 A possible solution might be to assign normal best effort traffic 1445 to one user_priority and to label excess non-conforming traffic 1446 as a lower user_priority. However, the re-ordering problems that 1447 might arise from doing this may make this solution undesirable, 1448 particularly if the flows are using TCP. For this reason the 1449 Controlled Load service recommends dropping excess traffic, rather 1450 than re-mapping to a lower priority. This is further discussed 1451 below. 1453 8.5. Override of Incoming User Priority 1455 In some cases, a network administrator may not trust the 1456 user_priority values contained in packets from a source and may wish 1457 to map these into some more suitable set of values. Alternatively, 1458 due perhaps to equipment limitations or transition periods, values 1459 may need to be re-mapped as the data flows to/from different regions 1460 of a network. 1462 Some switches may implement such a function on input that maps 1463 received user_priority to some internal set of values. This 1464 function is provided by a table known in IEEE 802.1D as the User 1465 Priority Regeneration Table (Table 3-1 in [3]). These values can 1466 then be mapped using an output table described above onto outgoing 1467 user_priority values. These same mappings must also be used when 1468 applying admission control to requests that use the user_priority 1469 values (see e.g. [14]). More sophisticated approaches are also 1470 possible where a device polices traffic flows and adjusts their 1471 onward user_priority based on their conformance to the admitted 1472 traffic flow specifications. 1474 8.6. Different Reservation Styles 1476 In Figure 7, SW is a bridge/switch in the link layer domain. 1477 S1, S2, S3, R1 and R2 are end stations which are members of a group 1478 associated with the same RSVP flow. S1, S2 and S3 are upstream 1479 end stations.
R1 and R2 are the downstream end stations which 1480 receive traffic from all the senders.
1481    +-----+       +-----+       +-----+
1482    | S1  |       | S2  |       | S3  |
1483    +-----+       +-----+       +-----+
1484       |             |             |
1485       |             v             |
1486       |          +-----+          |
1487       +--------->| SW  |<---------+
1488                  +-----+
1489                   |   |
1490              +----+   +----+
1491              |             |
1492              v             v
1493           +-----+       +-----+
1494           | R1  |       | R2  |
1495           +-----+       +-----+
1497 Figure 7: Illustration of filter styles
1499 RSVP allows receivers R1 and R2 to specify reservations which can apply to: (a) one specific 1500 sender only (fixed filter); (b) any of two or more explicitly 1501 specified senders (shared explicit filter); and (c) any sender in 1502 the group (shared wildcard filter). Support for the fixed filter 1503 style is straightforward; a separate reservation is made for the 1504 traffic from each of the senders. However, support for the other 1505 two filter styles has implications regarding policing; i.e. the 1506 merged flow from the different senders must be policed so that it 1507 conforms to traffic parameters specified in the filter's RSpec. This 1508 scenario is further complicated if the services requested by R1 and 1509 R2 are different. Therefore, in the absence of policing within 1510 bridges/switches, it may be possible to support only fixed filter 1511 reservations at the link layer. 1513 8.7. Receiver Heterogeneity 1515 At Layer 3, the IntServ model allows heterogeneous receivers 1516 for multicast flows where different branches of a tree can have 1517 different types of reservations for a given multicast destination. 1518 It also supports the notion that trees may have some branches with 1519 reserved flows and some using best effort service. If we were 1520 to treat a Layer 2 subnet as a single network element as defined 1521 in [8], then all of the branches of the distribution tree that 1522 lie within the subnet could be assumed to require the same QoS 1523 treatment and be treated as an atomic unit as regards admission 1524 control, etc.
With this assumption, the model and protocols 1525 defined by IntServ and RSVP already provide sufficient support for 1526 multicast heterogeneity. Note, however, that an admission control 1527 request may well be rejected because just one link in the subnet is 1528 oversubscribed, leading to rejection of the reservation request for 1529 the entire subnet.
1531                 +-----+
1532                 |  S  |
1533                 +-----+
1534                    |
1535                    v
1536    +-----+      +-----+      +-----+
1537    | R1  |<-----| SW  |----->| R2  |
1538    +-----+      +-----+      +-----+
1540 Figure 8: Example of receiver heterogeneity
1542 As an example, consider Figure 8, where SW is a Layer 2 device 1543 (bridge/switch) participating in resource reservation, S is the 1544 upstream source end station and R1 and R2 are downstream end station 1545 receivers. R1 would like to make a reservation for the flow while R2 1546 would like to receive the flow using best effort service. S sends 1547 RSVP PATH messages which are multicast to both R1 and R2. R1 sends 1548 an RSVP RESV message to S requesting the reservation of resources. 1550 If the reservation is successful at Layer 2, the frames addressed to 1551 the group will be categorized in the traffic class corresponding to 1552 the service requested by R1. At SW, there must be some mechanism 1553 which forwards the packet providing service corresponding to the 1554 reserved traffic class at the interface to R1 while using the best 1555 effort traffic class at the interface to R2. This may involve 1556 changing the contents of the frame itself, or ignoring the frame 1557 priority at the interface to R2. 1559 Another possibility for supporting heterogeneous receivers would 1560 be to have separate groups with distinct MAC addresses, one for 1561 each class of service. By default, a receiver would join the "best 1562 effort" group where the flow is classified as best effort. If the 1563 receiver makes a reservation successfully, it can be transferred to 1564 the group for the class of service desired.
The dynamic multicast 1565 filtering capabilities of bridges and switches implementing the IEEE 1566 802.1D standard would be a very useful feature in such a scenario. 1567 A given flow would be transmitted only on those segments which are 1568 on the path between the sender and the receivers of that flow. The 1569 obvious disadvantage of such an approach is that the sender needs to 1570 send out multiple copies of the same packet corresponding to each 1571 class of service desired thus potentially duplicating the traffic on 1572 a portion of the distribution tree. 1574 The above approaches would provide very sub-optimal utilization of 1575 resources given the expected size and complexity of the Layer 2 1576 subnets. Therefore, it is desirable to enable switches to apply QoS 1577 differently on different egress branches of a tree that divide at 1578 that switch. 1580 IEEE 802.1D specifies a basic model for multicast whereby a switch 1581 makes multicast forwarding decisions based on the destination 1582 address. This would produce a list of output ports to which the 1583 packet should be forwarded. In its default mode, such a switch 1584 would use the user_priority value in received packets, or a value 1585 regenerated on a per input port basis in the absence of an explicit 1586 value, to enqueue the packets at each output port. Any IEEE 802.1D 1587 switch which supports multiple traffic classes can support this 1588 operation. 1590 If a switch selects per port output queues based only on the incoming 1591 user_priority, as described by IEEE 802.1D, it must treat all 1592 branches of all multicast sessions within that user_priority class 1593 with the same queueing mechanism. 
Receiver heterogeneity is then 1594 not possible and this could well lead to the failure of an admission 1595 control request for the whole multicast session due to a single 1596 link being oversubscribed. Note that in the Layer 2 case, as distinct 1597 from the Layer 3 case with RSVP/IntServ, the option of having some 1598 receivers getting the session with the requested QoS and some getting 1599 it best effort does not exist as basic IEEE 802.1 switches are unable 1600 to re-map the user_priority on a per link basis. This could become 1601 an issue with heavy use of dynamic multicast sessions. If a switch 1602 were to implement a separate user_priority mapping at each output 1603 port, then, in some cases, reservations could use a different traffic 1604 class on different paths that branch at such a switch in order to 1605 provide multiple receivers with different QoS. This is possible if 1606 all flows within a traffic class at the ingress to a switch egress 1607 in the same traffic class on a port. For example, traffic may be 1608 forwarded using user_priority 4 on one branch where receivers have 1609 performed admission control and as user_priority 0 on ones where 1610 they have not. We assume that per user_priority queueing without 1611 taking account of input or output ports is the minimum standard 1612 functionality for switches in a LAN environment (IEEE 802.1D) 1613 but that more functional Layer 2 or even Layer 3 switches (i.e. 1614 routers) can be used if even more flexible forms of heterogeneity are 1615 considered necessary to achieve more efficient resource utilization. 1617 The behavior of Layer 3 switches in this context is already well 1618 standardized by the IETF. 1620 9. Network Topology Scenarios 1622 The extent to which service guarantees can be provided by a 1623 network depends to a large degree on the ability to provide the key 1624 functions of flow identification and scheduling in addition to 1625 admission control and policing.
This section discusses some of the 1626 capabilities of the LAN technologies under consideration and provides 1627 a taxonomy of possible topologies emphasizing the capabilities 1628 of each with regard to supporting the above functions. For the 1629 technologies considered here, the basic topology of a LAN may be 1630 shared, switched half duplex or switched full duplex. In the shared 1631 topology, multiple senders share a single segment. Contention for 1632 media access is resolved using protocols such as CSMA/CD in Ethernet 1633 and token passing in Token Ring and FDDI. Switched half duplex, 1634 is essentially a shared topology with the restriction that there 1635 are only two transmitters contending for resources on any segment. 1636 Finally, in a switched full duplex topology, a full bandwidth path is 1637 available to the transmitter at each end of the link at all times. 1638 Therefore, in this topology, there is no need for any access control 1639 mechanism such as CSMA/CD or token passing as there is no contention 1640 between the transmitters. Obviously, this topology provides the best 1641 QoS capabilities. Another important element in the discussion of 1642 topologies is the presence or absence of support for multiple traffic 1643 classes. These were discussed earlier in Section 4.1. Depending on 1644 the basic topology used and the ability to support traffic classes, 1645 we identify six scenarios as follows: 1647 1. Shared topology without traffic classes. 1648 2. Shared topology with traffic classes. 1649 3. Switched half duplex topology without traffic classes. 1650 4. Switched half duplex topology with traffic classes. 1651 5. Switched full duplex topology without traffic classes. 1652 6. Switched full duplex topology with traffic classes. 1654 There is also the possibility of hybrid topologies where two or more 1655 of the above coexist. 
For instance, it is possible that within a 1656 single subnet, there are some switches which support traffic classes 1657 and some which do not. If the flow in question traverses both 1658 kinds of switches in the network, the least common denominator will 1659 prevail. In other words, as far as that flow is concerned, the 1660 network is of the type corresponding to the least capable topology 1661 that is traversed. In the following sections, we present these 1662 scenarios in further detail for some of the different IEEE 802 1663 network types with discussion of their abilities to support the 1664 IntServ services. 1666 9.1. Full Duplex Switched Networks 1668 On a full duplex switched LAN, the MAC protocol is unimportant 1669 as far as access is concerned, but must be factored in to the 1670 characterization parameters advertised by the device since the 1671 access latency is equal to the time required to transmit the largest 1672 packet. Approximate values for the characteristics on various media 1673 are provided in the following tables.
1675 Table 4: Full duplex switched media access latency
1677 --------------------------------------------------
1678 Type             Speed      Max Pkt   Max Access
1679                             Length    Latency
1680 --------------------------------------------------
1681 Ethernet         10 Mbps    1.2 ms    1.2 ms
1682                  100 Mbps   120 us    120 us
1683                  1 Gbps     12 us     12 us
1684 Token Ring       4 Mbps     9 ms      9 ms
1685                  16 Mbps    9 ms      9 ms
1686 FDDI             100 Mbps   360 us    8.4 ms
1687 Demand Priority  100 Mbps   120 us    120 us
1688 --------------------------------------------------
1690 These delays should also be considered in the context of the speed of light delay, which is 1691 approximately 400 ns for typical 100 m UTP links and 7 us for typical 1692 2 km multimode fiber links. 1694 Full duplex switched network topologies offer good QoS capabilities 1695 for both Controlled Load and Guaranteed Service when supported by 1696 suitable queueing strategies in the switches. 1698 9.2.
Shared Media Ethernet Networks 1700 Thus far, we have not discussed the difficulty of dealing with 1701 allocation on a single shared CSMA/CD segment. As soon as any 1702 CSMA/CD algorithm is introduced the ability to provide any form of 1703 Guaranteed Service is seriously compromised in the absence of any 1704 tight coupling between the multiple senders on the link. There are a 1705 number of reasons for not offering a better solution to this problem. 1707 Firstly, we do not believe this is a truly solvable problem as 1708 it would require changes to the MAC protocol. IEEE 802.1 has 1709 examined research showing disappointing simulation results for 1710 performance guarantees on shared CSMA/CD Ethernet without MAC 1711 enhancements. There have been proposals for enhancements to the 1712 MAC layer protocols, e.g. BLAM and enhanced flow control in IEEE 1713 802.3. However, any solution involving an enhanced software MAC 1714 running above the traditional IEEE 802.3 MAC, or other proprietary 1715 MAC protocols, is outside the scope of the ISSLL working group and 1716 this document. Secondly, we are not convinced that it is really an 1717 interesting problem. While there will be end stations on repeated 1718 segments for some time to come, the number of deployed switches is 1719 steadily increasing relative to the number of stations on repeated 1720 segments. This trend is proceeding to the point where it may be 1721 satisfactory to have a solution which assumes that any network 1722 communication requiring resource reservations will take place 1723 through at least one switch or router. Put another way, the easiest 1724 upgrade to existing Layer 2 infrastructure for QoS support is the 1725 installation of segment switching. Only when this has been done 1726 is it worthwhile to investigate more complex solutions involving 1727 admission control. 
Thirdly, the core of campus networks typically 1728 consists of solutions based on switches rather than on repeated 1729 segments. There may be special circumstances in the future, e.g. 1730 Gigabit buffered repeaters, but the characteristics of these devices 1731 are different from existing CSMA/CD repeaters anyway.
1733 Table 5: Shared Ethernet media access latency
1735 --------------------------------------------------
1736 Type             Speed      Max Pkt   Max Access
1737                             Length    Latency
1738 --------------------------------------------------
1739 Ethernet         10 Mbps    1.2 ms    unbounded
1740                  100 Mbps   120 us    unbounded
1741                  1 Gbps     12 us     unbounded
1742 --------------------------------------------------
1744 9.3. Half Duplex Switched Ethernet Networks 1746 Many of the same arguments for sub-optimal support of Guaranteed 1747 Service on shared media Ethernet also apply to half duplex switched 1748 Ethernet. In essence, this topology is a medium that is shared 1749 between at least two senders contending for packet transmission. 1750 Unless these are tightly coupled and cooperative, there is always the 1751 chance that the best effort traffic of one will interfere with the 1752 reserved traffic of the other. Dealing with such a coupling would 1753 seem to require some form of modification to the MAC protocol. 1755 Notwithstanding the above argument, half duplex switched topologies 1756 do seem to offer the chance to provide Controlled Load service. With 1757 the knowledge that there are exactly two potential senders that are 1758 both using prioritization for their Controlled Load traffic over best 1759 effort flows, and with admission control having been done for those 1760 flows based on that knowledge, the media access characteristics, while 1761 not deterministic, are somewhat predictable. This is probably a close 1762 enough approximation to the Controlled Load service to be useful.
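The "Max Pkt Length" figures in the Ethernet tables are the time taken to transmit one maximum-length frame at the given link speed. As a worked check, the sketch below computes that time. It assumes a 1518 byte frame (the maximum untagged Ethernet frame, header and FCS included) and ignores the preamble and inter-frame gap; the function name is hypothetical. The results come to roughly 1214 us, 121 us and 12 us, which the tables round to 1.2 ms, 120 us and 12 us.

```python
MAX_ETHERNET_FRAME_BYTES = 1518  # assumption: untagged frame incl. FCS


def frame_time_us(frame_bytes: int, link_bps: float) -> float:
    """Time in microseconds to serialize a frame onto the link."""
    if frame_bytes <= 0 or link_bps <= 0:
        raise ValueError("frame size and link speed must be positive")
    return frame_bytes * 8 / link_bps * 1e6


if __name__ == "__main__":
    for name, bps in [("10 Mbps", 10e6), ("100 Mbps", 100e6), ("1 Gbps", 1e9)]:
        t = frame_time_us(MAX_ETHERNET_FRAME_BYTES, bps)
        print(f"{name}: {t:.1f} us")
```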
1764 Table 6: Half duplex switched Ethernet media access latency
1766 ------------------------------------------
1767 Type       Speed      Max Pkt   Max Access
1768                       Length    Latency
1769 ------------------------------------------
1770 Ethernet   10 Mbps    1.2 ms    unbounded
1771            100 Mbps   120 us    unbounded
1772            1 Gbps     12 us     unbounded
1773 ------------------------------------------
1775 9.4. Half Duplex Switched and Shared Token Ring Networks 1777 In a shared Token Ring network, the network access time for 1778 high priority traffic at any station is bounded and is given 1779 by (N+1)*THTmax, where N is the number of stations sending high 1780 priority traffic and THTmax is the maximum token holding time 1781 [14]. This assumes that network adapters have priority queues 1782 so that reservation of the token is done for traffic with the 1783 highest priority currently queued in the adapter. It is easy to 1784 see that access times can be improved by reducing N or THTmax. The 1785 recommended default for THTmax is 10 ms [6]. N is an integer from 2 1786 to 256 for a shared ring and 2 for a switched half duplex topology. 1787 A similar analysis applies for FDDI. Using the default values gives the maximum access latencies listed in Table 7. 1789 Given that access time is bounded, it is possible to provide an 1790 upper bound for end-to-end delays as required by Guaranteed Service 1791 assuming that traffic of this class uses the highest priority 1792 allowable for user traffic. The actual number of stations that send 1793 traffic mapped into the same traffic class as Guaranteed Service may 1794 vary over time but, from an admission control standpoint, this value 1795 is needed a priori.
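As a worked example of the (N+1)*THTmax bound given above, the sketch below evaluates it for the default THTmax of 10 ms. With N = 256 (the largest shared ring) it gives 2570 ms, and with N = 2 (switched half duplex) it gives 30 ms, matching the Token Ring entries in Table 7. The function names are hypothetical, and the admission helper is only an illustration of holding the number of high priority senders at or below the fixed N used for the bound.

```python
def token_ring_access_bound_ms(n_senders: int, tht_max_ms: float = 10.0) -> float:
    """Upper bound on high priority access time: (N + 1) * THTmax."""
    if n_senders < 1:
        raise ValueError("n_senders must be at least 1")
    if tht_max_ms <= 0:
        raise ValueError("tht_max_ms must be positive")
    return (n_senders + 1) * tht_max_ms


def may_admit_sender(current_senders: int, n_limit: int) -> bool:
    """True if one more high priority sender keeps the count within N."""
    return current_senders + 1 <= n_limit


if __name__ == "__main__":
    print("shared ring, N=256:", token_ring_access_bound_ms(256), "ms")
    print("switched half duplex, N=2:", token_ring_access_bound_ms(2), "ms")
```

Choosing a smaller fixed N tightens the advertised bound, at the cost of admitting fewer high priority senders.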
1797 Table 7: Half duplex switched and shared Token
1798          Ring media access latency
1799 ----------------------------------------------------
1800 Type        Speed               Max Pkt   Max Access
1801                                 Length    Latency
1802 ----------------------------------------------------
1803 Token Ring  4/16 Mbps shared    9 ms      2570 ms
1804             4/16 Mbps switched  9 ms      30 ms
1805 FDDI        100 Mbps            360 us    8 ms
1806 ----------------------------------------------------
1808 The admission control entity must therefore use a fixed value for N, which may be the total number of stations on the ring or some lower value if it is desired to keep the offered delay 1809 guarantees smaller. If the value of N used is lower than the total 1810 number of stations on the ring, admission control must ensure that 1811 the number of stations sending high priority traffic never exceeds 1812 this number. This approach allows admission control to estimate 1813 worst case access delays assuming that all of the N stations are 1814 sending high priority data even though, in most cases, this will mean 1815 that delays are significantly overestimated. 1817 Assuming that Controlled Load flows use a traffic class lower than 1818 that used by Guaranteed Service, no upper bound on access latency 1819 can be provided for Controlled Load flows. However, Controlled Load 1820 flows will receive better service than best effort flows. 1822 Note that on many existing shared Token Rings, bridges transmit 1823 frames using an Access Priority (see Section 4.3) value of 4 1824 irrespective of the user_priority carried in the frame control 1825 field of the frame. Therefore, existing bridges would need to be 1826 reconfigured or modified before the above access time bounds can 1827 actually be used. 1829 9.5. Half Duplex and Shared Demand Priority Networks 1831 In IEEE 802.12 networks, communication between end nodes and hubs and 1832 between the hubs themselves is based on the exchange of link control 1833 signals.
These signals are used to control access to the shared
medium. If a hub, for example, receives a high priority request
while another hub is in the process of serving normal priority
requests, then the service of the latter hub can effectively be
preempted in order to serve the high priority request first. After
the network has processed all high priority requests, it resumes the
normal priority service at the point in the network at which it was
interrupted.

The network access time for high priority packets is essentially the
time needed to preempt normal priority network service. This access
time is bounded and depends on the physical layer and on the
topology of the shared network. The physical layer has a significant
impact when operating in half duplex mode, e.g., across unshielded
twisted pair (UTP) links, because link control signals cannot be
exchanged while a packet is being transmitted over the link.
Therefore, the network topology has to be considered since, in
larger shared networks, the link control signals may have to
traverse several links and hubs before they reach the hub which
has the network control function. This may delay the preemption of
the normal priority service and hence increase the upper bound that
may be guaranteed.

Upper bounds on the high priority access time are given below for a
UTP physical layer and a cable length of 100 m between all end nodes
and hubs using a maximum propagation delay of 570 ns as defined in
[19]. These values consider the worst case signaling overhead and
assume the transmission of maximum sized normal priority data packets
while the normal priority service is being preempted.

Table 8: Half duplex switched Demand Priority UTP access latency

--------------------------------------------------------------
Type             Speed                     Max Pkt  Max Access
                                           Length   Latency
--------------------------------------------------------------
Demand Priority  100 Mbps, 802.3 pkt, UTP  120 us   254 us
                           802.5 pkt, UTP  360 us   733 us
--------------------------------------------------------------

Shared IEEE 802.12 topologies can be classified using the hub
cascading level "N". The simplest topology is the single hub network
(N = 1). For a UTP physical layer, a maximum cascading level of
N = 5 is supported by the standard. Large shared networks with
many hundreds of nodes may be built with a level 2 topology. The
bandwidth manager could be informed about the actual cascading level
by network management mechanisms and can use this information in its
admission control algorithms.

In contrast to UTP, the fiber optic physical layer operates in dual
simplex mode. Upper bounds for the high priority access time are
given below for 2 km multimode fiber links with a propagation delay
of 10 us.

Table 9: Shared Demand Priority UTP access latency

-------------------------------------------------------------------
Type             Speed                Max Pkt  Max Access  Topology
                                      Length   Latency
-------------------------------------------------------------------
Demand Priority  100 Mbps, 802.3 pkt  120 us   262 us      N = 1
                                      120 us   554 us      N = 2
                                      120 us   878 us      N = 3
                                      120 us   1.24 ms     N = 4
                                      120 us   1.63 ms     N = 5

Demand Priority  100 Mbps, 802.5 pkt  360 us   722 us      N = 1
                                      360 us   1.41 ms     N = 2
                                      360 us   2.32 ms     N = 3
                                      360 us   3.16 ms     N = 4
                                      360 us   4.03 ms     N = 5
-------------------------------------------------------------------

Table 10: Half duplex switched Demand Priority
          fiber access latency

----------------------------------------------------------------
Type             Speed                       Max Pkt  Max Access
                                             Length   Latency
----------------------------------------------------------------
Demand Priority  100 Mbps, 802.3 pkt, fiber  120 us   139 us
                           802.5 pkt, fiber  360 us   379 us
----------------------------------------------------------------

For shared media with distances of up to 2 km between all end nodes
and hubs, the IEEE 802.12 standard allows a maximum cascading level
of 2. Higher levels of cascaded topologies are supported but require
a reduction of the distances [15].

The bounded access delay and deterministic network access allow the
support of service commitments required for Guaranteed Service and
Controlled Load, even on shared media topologies. The support of
just two priority levels in 802.12, however, limits the number of
services that can simultaneously be implemented across the network.
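
The text above notes that a bandwidth manager can use the cascading
level in its admission control algorithms. One possible way to make
the tabulated upper bounds available to such an algorithm is a lookup
keyed by physical layer, packet format and cascading level N. The
sketch below is illustrative only: the table and function names are
hypothetical, and the values are transcribed from the shared Demand
Priority access latency tables of this document (Tables 9 and 11).

```python
# Upper bounds on high priority access time, in microseconds,
# transcribed from Table 9 (UTP) and Table 11 (fiber).
# Keys: (physical layer, packet format) -> {cascading level N: bound}
SHARED_DP_ACCESS_US = {
    ("utp", "802.3"): {1: 262, 2: 554, 3: 878, 4: 1240, 5: 1630},
    ("utp", "802.5"): {1: 722, 2: 1410, 3: 2320, 4: 3160, 5: 4030},
    ("fiber", "802.3"): {1: 160, 2: 202},
    ("fiber", "802.5"): {1: 400, 2: 682},
}


def dp_access_bound_us(phy: str, pkt_format: str, level: int) -> int:
    """Return the tabulated access bound for a shared 802.12 segment.

    Raises ValueError when the combination is not listed in the
    tables, e.g. a fiber topology with cascading level above 2.
    """
    try:
        return SHARED_DP_ACCESS_US[(phy, pkt_format)][level]
    except KeyError:
        raise ValueError(
            f"no tabulated bound for {phy}, {pkt_format}, N = {level}"
        ) from None


# Example: bound for a level 2 UTP topology carrying 802.3 packets.
print(dp_access_bound_us("utp", "802.3", 2))
```

A dictionary is used here simply because the bounds are given as
discrete table entries rather than as a closed formula.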

Table 11: Shared Demand Priority fiber access latency

-------------------------------------------------------------------
Type             Speed                Max Pkt  Max Access  Topology
                                      Length   Latency
-------------------------------------------------------------------
Demand Priority  100 Mbps, 802.3 pkt  120 us   160 us      N = 1
                                      120 us   202 us      N = 2

Demand Priority  100 Mbps, 802.5 pkt  360 us   400 us      N = 1
                                      360 us   682 us      N = 2
-------------------------------------------------------------------

10. Justification

An obvious concern is the complexity of this model. It essentially
does what RSVP already does at Layer 3, so why do we think we can do
better by reinventing the solution to this problem at Layer 2?

The key is that there are a number of simple Layer 2 scenarios
that cover a considerable portion of the real QoS problems that
will occur. A solution that covers the majority of problems at
significantly lower cost is beneficial. Full RSVP/IntServ with per
flow queueing in strategically positioned high function switches or
routers may be needed to completely resolve all issues, but devices
implementing the architecture described herein will allow for a
significantly simpler network.

11. Summary

This document has specified a framework for providing Integrated
Services over shared and switched LAN technologies. The ability to
provide QoS guarantees necessitates some form of admission control
and resource management. The requirements and goals of a resource
management scheme for subnets have been identified and discussed.
We refer to the entire resource management scheme as a Bandwidth
Manager. Architectural considerations were discussed and examples
were provided to illustrate possible implementations of a Bandwidth
Manager. Some of the issues involved in mapping the services
from higher layers to the link layer have also been discussed.
Accompanying memos from the ISSLL working group address service
mapping issues [13] and provide a protocol specification for the
Bandwidth Manager protocol [14] based on the requirements and goals
discussed in this document.

References

[1]  IEEE Standards for Local and Metropolitan Area Networks: Overview
     and Architecture, ANSI/IEEE Std 802, 1990.

[2]  ISO/IEC 10038 Information technology - Telecommunications and
     information exchange between systems - Local area networks - Media
     Access Control (MAC) Bridges, (also ANSI/IEEE Std 802.1D-1993),
     1993.

[3]  ISO/IEC Final CD 15802-3 Information technology -
     Telecommunications and information exchange between systems -
     Local and metropolitan area networks - Common specifications -
     Part 3: Media Access Control (MAC) bridges, (current draft
     available as IEEE P802.1D/D15).

[4]  IEEE Standards for Local and Metropolitan Area Networks: Draft
     Standard for Virtual Bridged Local Area Networks, P802.1Q/D8,
     January 1998.

[5]  B. Braden, L. Zhang, S. Berson, S. Herzog and S. Jamin, Resource
     Reservation Protocol (RSVP) - Version 1 Functional Specification,
     RFC 2205, September 1997.

[6]  J. Wroclawski, Specification of the Controlled Load Network Element
     Service, RFC 2211, September 1997.

[7]  S. Shenker, C. Partridge and R. Guerin, Specification of Guaranteed
     Quality of Service, RFC 2212, September 1997.

[8]  R. Braden, D. Clark and S. Shenker, Integrated Services in the
     Internet Architecture: An Overview, RFC 1633, June 1994.

[9]  J. Wroclawski, The Use of RSVP with IETF Integrated Services,
     RFC 2210, September 1997.

[10] S. Shenker and J. Wroclawski, Network Element Service Specification
     Template, RFC 2216, September 1997.

[11] S. Shenker and J. Wroclawski, General Characterization Parameters
     for Integrated Service Network Elements, RFC 2215, September 1997.

[12] L. Delgrossi and L. Berger (Editors), Internet Stream Protocol
     Version 2 (ST2) Protocol Specification - Version ST2+,
     RFC 1819, August 1995.

[13] M. Seaman, A. Smith, and E. Crawley, Integrated Service Mappings on
     IEEE 802 Networks, work in progress, Internet Draft,
     November 1997.

[14] R. Yavatkar, D. Hoffman, Y. Bernet, and F. Baker, SBM
     (Subnet Bandwidth Manager): Protocol for RSVP-based Admission
     Control over IEEE 802-style networks, work in progress,
     Internet Draft, November 1997.

[15] ISO/IEC 8802-3 Information technology - Telecommunications and
     information exchange between systems - Local and metropolitan
     area networks - Common specifications - Part 3: Carrier Sense
     Multiple Access with Collision Detection (CSMA/CD) Access Method
     and Physical Layer Specifications, (also ANSI/IEEE Std 802.3-1996),
     1996.

[16] ISO/IEC 8802-5 Information technology - Telecommunications and
     information exchange between systems - Local and metropolitan
     area networks - Common specifications - Part 5: Token Ring Access
     Method and Physical Layer Specifications, (also
     ANSI/IEEE Std 802.5-1995), 1995.

[17] J. Postel and J. Reynolds, A Standard for the Transmission of
     IP Datagrams over IEEE 802 Networks, RFC 1042, February 1988.

[18] C. Bisdikian, B. V. Patel, F. Schaffa, and M. Willebeek-LeMair,
     The Use of Priorities on Token Ring Networks for Multimedia
     Traffic, IEEE Network, Nov/Dec 1995.

[19] IEEE Standards for Local and Metropolitan Area Networks:
     Demand Priority Access Method, Physical Layer and Repeater
     Specification for 100 Mb/s Operation, IEEE Std 802.12-1995.

[20] Fiber Distributed Data Interface MAC, ANSI Std. X3.139-1987.

Security Considerations

Implementation of the model described in this memo creates no known
new avenues for malicious attack on the network infrastructure
although readers are referred to Section 2.8 of the RSVP
specification [5] for a discussion of the impact of the use of
admission control signaling protocols on network security.

Acknowledgements

Much of the work presented in this document has benefited greatly
from discussion held at the meetings of the Integrated Services
over Specific Link Layers (ISSLL) working group. We would like to
acknowledge contributions from the many participants via discussion
at these meetings and on the mailing list. We would especially like
to thank Eric Crawley, Don Hoffman and Raj Yavatkar for contributions
via previous Internet drafts, and Peter Kim for contributing the text
about Demand Priority networks.

Authors' Addresses

Anoop Ghanwani
IBM Corporation
P. O. Box 12195
Research Triangle Park, NC 27709
USA
+1-919-254-0260
anoop@raleigh.ibm.com

J. Wayne Pace
IBM Corporation
P. O. Box 12195
Research Triangle Park, NC 27709
USA
+1-919-254-4930
pacew@raleigh.ibm.com

Vijay Srinivasan
IBM Corporation
P. O. Box 12195
Research Triangle Park, NC 27709
USA
+1-919-254-2730
vijay@raleigh.ibm.com

Andrew Smith
Extreme Networks
10460 Bandley Drive
Cupertino CA 95014
USA
+1-408-863-2821
andrew@extremenetworks.com

Mick Seaman
3Com Corporation
5400 Bayfront Plaza
Santa Clara CA 95052-8145
USA
+1-408-764-5000
mick_seaman@3com.com