2 Internet Engineering Task Force Anoop Ghanwani 3 INTERNET DRAFT J. Wayne Pace 4 Expires May 1998 Vijay Srinivasan 5 IBM Corp. 6 Andrew Smith 7 Extreme Networks 8 Mick Seaman 9 3Com Corp. 10 November 1997 12 A Framework for Providing Integrated Services 13 Over Shared and Switched IEEE 802 LAN Technologies 15 draft-ietf-issll-is802-framework-03.txt 17 Status of This Memo 19 This document is an Internet-Draft. Internet Drafts are working 20 documents of the Internet Engineering Task Force (IETF), its areas, 21 and its working groups. Note that other groups may also distribute 22 working documents as Internet Drafts. Internet Drafts are draft 23 documents valid for a maximum of six months, and may be updated, 24 replaced, or obsoleted by other documents at any time.
It is not 25 appropriate to use Internet Drafts as reference material, or to cite 26 them other than as a ``working draft'' or ``work in progress.'' To 27 view the entire list of current Internet-Drafts, please check the 28 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 29 Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe), 30 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 31 ftp.isi.edu (US West Coast). This document is a product of the IS802 32 subgroup of the ISSLL working group of the Internet Engineering Task 33 Force. Comments are solicited and should be addressed to the working 34 group's mailing list at issll@mercury.lcs.mit.edu and/or the authors. 36 Abstract 38 This memo describes a framework for supporting IETF Integrated Services 39 on shared and switched LAN infrastructures. It includes background 40 material on the capabilities of IEEE 802-like networks with regard to 41 parameters that affect Integrated Services such as access latency, delay 42 variation and queueing support in LAN switches. It discusses aspects of 43 IETF's Integrated Services model that cannot easily be accommodated in 44 different LAN environments. It outlines a functional model for 45 supporting the Resource Reservation Protocol (RSVP) in such LAN 46 environments. Details of extensions to RSVP for use over LANs are 47 described in an accompanying memo [14]. Mappings of the assorted 48 Integrated Services onto IEEE LANs are described in another memo [13]. 50 1 Introduction 52 The Internet has traditionally provided support for best effort traffic 53 only. However, with the recent advances in link layer technology, and 54 with numerous emerging real-time applications such as video conferencing 55 and Internet telephony, there has been much interest in developing 56 mechanisms that enable real-time services over the Internet. A 57 framework for meeting these new requirements was set out in RFC 1633 [8] 58 and this has driven the specification of various classes of network 59 service by the Integrated Services working group of the IETF, such as 60 Controlled Load RFC 2211 [6] and Guaranteed Service RFC 2212 [7]. Each 61 of these service classes is designed to provide a certain Quality of 62 Service (QoS) to traffic conforming to a specified set of parameters. 63 Applications are expected to choose one of these classes according to 64 their QoS requirements. One mechanism for end-stations to utilise such 65 services in an IP network is provided by a QoS signaling protocol, the 66 Resource Reservation Protocol (RSVP) RFC 2205 [5] developed by the RSVP 67 working group of the IETF. The IEEE under its Project 802 has defined 68 standards for many different local area network technologies. These all 69 typically offer the same "MAC-layer" datagram service [1] to upper-layer 70 protocols such as IP although they often provide different dynamic 71 behaviour characteristics - it is these that are important when 72 considering their ability to support real-time services. Later in this 73 memo we describe some of the relevant characteristics of different MAC- 74 layer LAN technologies. In addition, IEEE 802 has defined standards 75 for bridging multiple LAN segments together using devices known as "MAC 76 Bridges" or "Switches" [2]. Newer work has also defined enhanced queuing 77 [3] and "virtual LAN" [4] capabilities for these devices.
Such LANs 78 often constitute the last hop or hops between users and the Internet as 79 well as being a primary building-block for complete private campus 80 networks. It is therefore necessary to provide standardized mechanisms 81 for using these technologies to support end-to-end real-time services. 82 In order to do this, there must be some mechanism for resource 83 management at the data-link layer. Resource management in this 84 context encompasses the functions of admission control, scheduling, 85 traffic policing, etc. The ISSLL (Integrated Services over Specific 86 Link Layers) working group in the IETF was chartered with the purpose of 87 exploring and standardizing such mechanisms for various link layer 88 technologies. 2 Document Outline 90 This document is concerned with specifying a framework for providing 91 Integrated Services over shared and switched LAN technologies such as 92 Ethernet/802.3, token ring/802.5, FDDI, etc. We begin in section 4 with 93 a discussion of the capabilities of various IEEE 802 MAC-layer 94 technologies. Section 5 lists the requirements and goals for a mechanism 95 capable of providing Integrated Services in a LAN. The resource 96 management functions outlined in Section 5 are provided by an entity 97 referred to as a Bandwidth Manager (BM): the architectural model of the 98 BM is described in section 6 and its various components are 99 discussed in section 7. Some implementation issues with respect to link 100 layer support for Integrated Services are examined in Section 8. We then 101 in section 9 discuss a taxonomy of topologies for the LAN technologies 102 under consideration with an emphasis on the capabilities of each which 103 can be leveraged for enabling Integrated Services. In this framework, no 104 assumptions are made about the topology at the link layer. The 105 framework is intended to be as exhaustive as possible; this means that 106 some of the functions discussed may not be supportable 107 by a particular topology or technology, but this should not preclude the 108 use of this model for it. 110 3 Definitions 112 The following is a list of terms used in this and other ISSLL 113 documents. 115 - Link Layer or Layer 2 or L2: We refer to data-link layer technologies 116 such as IEEE 802.3/Ethernet as L2 or layer 2. 118 - Link Layer Domain or Layer 2 domain or L2 domain: a set of nodes and 119 links interconnected without passing through an L3 forwarding function. 120 One or more IP subnets can be overlaid on an L2 domain. 122 - Layer 2 or L2 devices: We refer to devices that only implement Layer 2 123 functionality as Layer 2 or L2 devices. These include 802.1D bridges or 124 switches. 126 - Internetwork Layer or Layer 3 or L3: Layer 3 of the ISO 7 layer model. 127 This memo is primarily concerned with networks that use the Internet 128 Protocol (IP) at this layer. 130 - Layer 3 Device or L3 Device or End-Station: these include hosts and 131 routers that use L3 and higher layer protocols or application programs 132 that need to make resource reservations. 134 - Segment: An L2 physical segment that is shared by one or more senders. 135 Examples of segments include (a) a shared Ethernet or Token-Ring wire 136 resolving contention for media access using CSMA or token passing, (b) a 137 half duplex link between two stations or switches, (c) one direction of 138 a switched full-duplex link.
140 - Managed segment: A managed segment is a segment with a DSBM present 141 and responsible for exercising admission control over requests for 142 resource reservation. A managed segment includes those interconnected 143 parts of a shared LAN that are not separated by DSBMs. 145 - Traffic Class: An aggregation of data flows which are given similar 146 service within a switched network. 148 - Subnet: used in this memo to indicate a group of L3 devices sharing a 149 common L3 network address prefix along with the set of segments making 150 up the L2 domain in which they are located. 152 - Bridge/Switch: a layer 2 forwarding device as defined by IEEE 802.1D. 153 The terms bridge and switch are used synonymously in this memo. 155 4 Frame Forwarding in IEEE 802 Networks 157 4.1 General IEEE 802 Service Model 159 User_priority is a value associated with the transmission and reception 160 of all frames in the IEEE 802 service model: it is supplied by the 161 sender that is using the MAC service. It is provided along with the data 162 to a receiver using the MAC service. It may or may not be actually 163 carried over the network: Token-Ring/802.5 carries this value (encoded 164 in its FC octet), basic Ethernet/802.3 does not, 802.12 may or may not 165 depending on the frame format in use. 802.1p defines a consistent way to 166 carry this value over the bridged network on Ethernet, Token Ring, 167 Demand-Priority, FDDI or other MAC-layer media using an extended frame 168 format. The usage of user_priority is summarised below but is more fully 169 described in section 2.5 of 802.1D [2] and 802.1p [3] "Support of the 170 Internal Layer Service by Specific MAC Procedures" and readers are 171 referred to these documents for further information. 173 If the "user_priority" is carried explicitly in packets, its utility is 174 as a simple label in the data stream enabling packets in different 175 classes to be discriminated easily by downstream nodes without their 176 having to parse the packet in more detail. 178 Apart from making the job of desktop or wiring-closet switches easier, 179 an explicit field means they do not have to change hardware or software 180 as the rules for classifying packets evolve (e.g. based on new protocols 181 or new policies). More sophisticated layer-3 switches, perhaps deployed 182 towards the core of a network, can provide added value here by 183 performing the classification more accurately and, hence, utilising 184 network resources more efficiently or providing better protection of 185 flows from one another: this appears to be a good economic choice since 186 there are likely to be very many more desktop/wiring closet switches in 187 a network than switches requiring layer-3 functionality. 189 The IEEE 802 specifications make no assumptions about how user_priority 190 is to be used by end stations or by the network. In particular it can 191 only be considered a "priority" in a loose sense, although 802.1p 192 defines static priority queuing as the default mode of operation of 193 switches that implement multiple queues (user_priority is defined as a 194 3-bit quantity, so strict priority queueing would give value 7 = high 195 priority, 0 = low priority). The general switch algorithm is as follows: 196 packets are placed onto a particular queue based on the received 197 user_priority (perhaps directly from the packet if an 802.1p header or 198 802.5 network was used or else invented according to some local policy 199 if not).
The selection of a queue is based on a mapping from user_priority 200 [0,1,2,3,4,5,6 or 7] onto the number of available queues. Note that 201 switches may implement any number of queues from 1 upwards and it may 202 not be visible externally, except through any advertised int-serv 203 parameters and the switch's admission control behaviour, which 204 user_priority values get mapped internally onto the same or different 205 queues. Other algorithms that a switch might implement include, 206 for example, weighted fair queueing and round robin. 208 In particular, IEEE makes no recommendations about how a sender should 209 select the value for user_priority: one of the main purposes of this 210 document is to propose such usage rules and how to communicate 211 the semantics of the values between switches, end-stations and routers. 212 In the remainder of this document we use the term "traffic class" 213 synonymously with user_priority. 215 4.2 Ethernet/802.3 217 There is no explicit traffic class or user_priority field carried in 218 Ethernet packets. This means that user_priority must be regenerated at a 219 downstream receiver or switch according to some defaults or by parsing 220 further into higher-layer protocol fields in the packet. Alternatively, 221 the IEEE 802.1Q encapsulation [4] may be used which provides an explicit 222 traffic class field on top of a basic MAC format. 224 For the different IP packet encapsulations used over Ethernet/802.3, it 225 will be necessary to adjust any admission-control calculations according 226 to the framing and to the padding requirements: 228 Encapsulation Framing Overhead IP MTU 229 bytes/pkt bytes 231 IP EtherType (ip_len<=46 bytes) 64-ip_len 1500 232 (1500>=ip_len>=46 bytes) 18 1500 234 IP EtherType over 802.1p/Q (ip_len<=42) 64-ip_len 1500* 235 (1500>=ip_len>=42 bytes) 22 1500* 237 IP EtherType over LLC/SNAP (ip_len<=40) 64-ip_len 1492 238 (1500>=ip_len>=40 bytes) 24 1492 240 * note that the draft IEEE 802.1Q specification exceeds the current IEEE 241 802.3 maximum packet length values by 4 bytes although work is 242 proceeding within IEEE to address this issue. 244 4.3 Token-Ring/802.5 246 The token ring standard [6] provides a priority mechanism that can be 247 used to control both the queuing of packets for transmission and the 248 access of packets to the shared media. The priority mechanisms are 249 implemented using bits within the Access Control (AC) and the Frame 250 Control (FC) fields of an LLC frame. The first three bits of the AC 251 field, the Token Priority bits, together with the last three bits of the 252 AC field, the Reservation bits, regulate which stations get access to 253 the ring. The last three bits of the FC field of an LLC frame, the User 254 Priority bits, are obtained from the higher layer in the user_priority 255 parameter when it requests transmission of a packet. This parameter also 256 establishes the Access Priority used by the MAC. The user_priority value 257 is conveyed end-to-end by the User Priority bits in the FC field and is 258 typically preserved through Token-Ring bridges of all types. In all 259 cases, 0 is the lowest priority. 261 Token-Ring also uses a concept of Reserved Priority: this relates to the 262 value of priority which a station uses to reserve the token for the next 263 transmission on the ring.
When a free token is circulating, only a 264 station having an Access Priority greater than or equal to the Reserved 265 Priority in the token will be allowed to seize the token for 266 transmission. Readers are referred to [14] for further discussion of 267 this topic. 269 A token ring station is theoretically capable of separately queuing each 270 of the eight levels of requested user priority and then transmitting 271 frames in order of priority. A station sets Reservation bits according 272 to the user priority of frames that are queued for transmission in the 273 highest priority queue. This allows the access mechanism to ensure that 274 the frame with the highest priority throughout the entire ring will be 275 transmitted before any lower priority frame. Annex I to the IEEE 802.5 276 token ring standard recommends that stations send/relay frames as 277 follows: 279 Application user_priority 281 non-time-critical data 0 282 - 1 283 - 2 284 - 3 285 LAN management 4 286 time-sensitive data 5 287 real-time-critical data 6 288 MAC frames 7 290 To reduce frame jitter associated with high-priority traffic, the annex 291 also recommends that only one frame be transmitted per token and that 292 the maximum information field size be 4399 octets whenever delay- 293 sensitive traffic is traversing the ring. Most existing implementations 294 of token ring bridges forward all LLC frames with a default access 295 priority of 4. Annex I recommends that bridges forward LLC frames that 296 have a user priority greater than 4 with a reservation equal to the 297 user priority (although the draft IEEE P802.1p [2] permits network 298 management to override this behaviour). The capabilities provided by token 299 ring's user and reservation priorities and by IEEE 802.1p can provide 300 effective support for Integrated Services flows that request QoS using 301 RSVP. These mechanisms can provide, with few or no additions to the 302 token ring architecture, bandwidth guarantees with the network flow 303 control necessary to support such guarantees. 305 For the different IP packet encapsulations used over Token Ring/802.5, 306 it will be necessary to adjust any admission-control calculations 307 according to the framing requirements: 309 Encapsulation Framing Overhead IP MTU 310 bytes/pkt bytes 312 IP EtherType over 802.1p/Q 29 4370* 313 IP EtherType over LLC/SNAP 25 4370* 315 *the suggested MTU from RFC 1042 [13] is 4464 bytes but there are issues 316 related to discovering the maximum supported MTU between any two 317 points both within and between Token Ring subnets. We recommend here an 318 MTU consistent with the 802.5 Annex I recommendation. 320 4.4 FDDI 322 The Fiber Distributed Data Interface standard [16] provides a priority 323 mechanism that can be used to control both the queuing of packets for 324 transmission and the access of packets to the shared media. The priority 325 mechanisms are implemented using mechanisms similar to those of Token-Ring 326 described above. The standard also makes provision for "Synchronous" 327 data traffic with strict media access and delay guarantees - this mode 328 of operation is not discussed further here: this is an area within the 329 scope of the ISSLL WG that requires further work. In the remainder of 330 this document we treat FDDI as a 100Mbps Token Ring (which it is) using 331 a service interface compatible with IEEE 802 networks. 333 4.5 Demand-Priority/802.12 334 IEEE 802.12 [19] is a standard for a shared 100Mbit/s LAN.
Data packets 335 are transmitted using either 802.3 or 802.5 frame formats. The MAC 336 protocol is called Demand Priority. Its main characteristics with respect 337 to QoS are the support of two service priority levels (normal- and high- 338 priority) and the service order: data packets from all network nodes 339 (e.g. end-hosts and bridges/switches) are served using a simple round 340 robin algorithm. 342 If the 802.3 frame format is used for data transmission then 343 user_priority is encoded in the starting delimiter of the 802.12 data 344 packet. If the 802.5 frame format is used then the priority is 345 additionally encoded in the YYY bits of the AC field in the 802.5 packet 346 header (see also section 4.3). Furthermore, the 802.1p/Q encapsulation 347 may also be applied in 802.12 networks with its own user_priority field. 348 Thus, in all cases, switches are able to recover any user_priority 349 supplied by a sender. 351 The same rules apply for 802.12 user_priority mapping through a bridge 352 as with other media types: the only additional information is that 353 "normal" priority is used by default for user_priority values 0 through 354 4 inclusive and "high" priority is used for user_priority levels 5 355 through 7: this ensures that the default Token-Ring user_priority level 356 of 4 for 802.5 bridges is mapped to "normal" on 802.12 segments. 358 The medium access in 802.12 LANs is deterministic: the demand priority 359 mechanism ensures that, once the normal priority service has been pre- 360 empted, all high priority packets have strict priority over packets with 361 normal priority. In the abnormal situation that a normal-priority packet 362 has been waiting at the front of a MAC transmit queue for a time period 363 longer than PACKET_PROMOTION (200 - 300 ms [15]), its priority is 364 automatically 'promoted' to high priority. Thus, even normal-priority 365 packets have a maximum guaranteed access time to the medium. 367 Integrated Services can be built on top of the 802.12 medium access 368 mechanism. When combined with admission control and bandwidth 369 enforcement mechanisms, delay guarantees as required for a Guaranteed 370 Service can be provided without any changes to the existing 802.12 MAC 371 protocol. 373 Since the 802.12 standard supports the 802.3 and 802.5 frame formats, 374 the same framing overhead as reported in sections 4.2 and 4.3 must be 375 considered in the admission control equations for 802.12 links. 377 5 Requirements and Goals 379 This section discusses the requirements and goals which should drive the 380 design of an architecture for supporting Integrated Services over LAN 381 technologies. The requirements refer to functions and features which 382 must be supported, while goals refer to functions and features which are 383 desirable, but are not an absolute necessity. Many of the requirements 384 and goals are driven by the functionality supported by Integrated 385 Services and RSVP. 387 5.1 Requirements 389 - Resource Reservation: The mechanism must be capable of reserving 390 resources on a single segment or multiple segments and at 391 bridges/switches connecting them. It must be able to provide 392 reservations for both unicast and multicast sessions. It should be 393 possible to change the level of reservation while the session is in 394 progress.
396 - Admission Control: The mechanism must be able to estimate the level of 397 resources necessary to meet the QoS requested by the session in order to 398 decide whether or not the session can be admitted. For the purpose of 399 management, it is useful to provide the ability to respond to queries 400 about availability of resources. It must be able to make admission 401 control decisions for different types of services such as guaranteed 402 delay, controlled load, etc. 404 - Flow Separation and Scheduling: It is necessary to provide a 405 mechanism for traffic flow separation so that real-time flows can be 406 given preferential treatment over best effort flows. Packets of real- 407 time flows can then be isolated and scheduled according to their service 408 requirements. 410 - Policing: Traffic policing must be performed in order to ensure that 411 sources adhere to their negotiated traffic specifications. Policing must 412 be implemented at the sources and must ensure that violating traffic is 413 either dropped or transmitted as best effort. Policing may optionally be 414 implemented in the bridges and switches. Alternatively, traffic may be 415 shaped to ensure conformance to the negotiated parameters. 417 - Soft State: The mechanism must maintain soft state information about 418 the reservations. This means that state information must be 419 periodically refreshed if the reservation is to be maintained; otherwise 420 the state information and corresponding reservations will expire after 421 some pre-specified interval (see the illustrative sketch in section 5.4 below). 423 - Centralized or Distributed Implementation: In the case of a 424 centralized implementation, a single entity manages the resources of the 425 entire subnet. This approach has the advantage of being easier to deploy 426 since bridges and switches may not need to be upgraded with additional 427 functionality. However, this approach scales poorly with the geographical 428 size of the subnet and the number of end stations attached. In a fully 429 distributed implementation, each segment will have a local entity 430 managing its resources. This approach has better scalability than the 431 former. However, it requires that all bridges and switches in the 432 network support new mechanisms. It is also possible to have a semi- 433 distributed implementation where there is more than one entity, each 434 managing the resources of a subset of segments and bridges/switches 435 within the subnet. Ideally, implementation should be flexible; i.e. a 436 centralized approach may be used for small subnets and a distributed 437 approach can be used for larger subnets. Examples of centralized and 438 distributed implementations are discussed in Section 6. 440 - Scalability: The mechanism and protocols should have a low overhead 441 and should scale to the largest receiver groups likely to occur within a 442 single link layer domain. 444 - Fault Tolerance and Recovery: The mechanism must be able to function 445 in the presence of failures; i.e. there should not be a single point of 446 failure. For instance, in a centralized implementation, some mechanism 447 must be specified for back-up and recovery in the event of failure. 449 - Interaction with Existing Resource Management Controls: The 450 interaction with existing infrastructure for resource management needs 451 to be specified. For example, FDDI has a resource management mechanism 452 called the "Synchronous Bandwidth Manager".
The mechanism must be 453 designed so that it takes advantage of, and specifies the interaction 454 with, existing controls where available. 456 5.2 Goals 458 - Independence from higher layer protocols: The mechanism should, as far 459 as possible, be independent of higher layer protocols such as RSVP and 460 IP. Independence from RSVP is desirable so that it can interwork with 461 other reservation protocols such as ST2 [10]. Independence from IP is 462 desirable so that it can interwork with network layer protocols such as 463 IPX, NetBIOS, etc. 465 - Receiver heterogeneity: this refers to multicast communication where 466 different receivers request different levels of service. For example, in 467 a multicast group with many receivers, it is possible that one of the 468 receivers desires a lower delay bound than the others. A better delay 469 bound may be provided by increasing the amount of resources reserved 470 along the path to that receiver while leaving the reservations for the 471 other receivers unchanged. In its most complex form, receiver 472 heterogeneity implies the ability to simultaneously provide various 473 levels of service as requested by different receivers. In its simplest 474 form, receiver heterogeneity will allow a scenario where some of the 475 receivers use best effort service and those requiring service guarantees 476 make a reservation. Receiver heterogeneity, especially for the 477 reserved/best effort scenario, is a very desirable function. More 478 details on supporting receiver heterogeneity are provided in Section 6. 480 - Support for different filter styles: It is desirable to provide 481 support for the different filter styles defined by RSVP such as fixed 482 filter, shared explicit and wildcard. Some of the issues with respect 483 to supporting such filter styles in the link layer domain are examined 484 in Section 6. 486 - Path Selection: In source routed LAN technologies such as token 487 ring/802.5, it may be useful for the mechanism to incorporate the 488 function of path selection. Using an appropriate path selection 489 mechanism may optimize utilization of network resources. 491 5.3 Non-goals 493 This document describes service mappings onto existing IEEE- and ANSI- 494 defined standard MAC layers and uses standard MAC-layer services as in 495 IEEE 802.1 bridging. It does not attempt to make use of or describe the 496 capabilities of other proprietary or standard MAC-layer protocols 497 although it should be noted that there exists published work regarding 498 MAC layers suitable for QoS mappings: these are outside the scope of the 499 IETF ISSLL working group charter. 501 5.4 Assumptions 503 For this framework, it is assumed that typical subnetworks that are 504 concerned about quality-of-service will be "switch-rich": that is to say 505 most communication between end stations using integrated services 506 support will pass through at least one switch. The mechanisms and 507 protocols described will be trivially extensible to communicating 508 systems on the same shared media, but it is important not to allow 509 problem generalisation to complicate the practical application that we 510 target: the access characteristics of Ethernet and Token-Ring LANs are 511 forcing a trend to switch-rich topologies. In addition, there have been 512 developments in the area of MAC enhancements to ensure delay- 513 deterministic access on network links e.g. IEEE 802.12 [19] and also 514 proprietary schemes. 
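As a concrete illustration of the Admission Control and Soft State requirements of section 5.1 (referenced there), consider the following sketch. It is illustrative only: the class and function names are hypothetical and are defined by no ISSLL document. It shows a per-segment bandwidth accountant that admits or re-sizes a flow only while uncommitted bandwidth remains, and that silently expires reservations which are not refreshed within a soft-state timeout.

   import time

   class SegmentBandwidthAccountant:
       """Illustrative per-segment admission control with soft-state expiry.

       capacity_bps is the bandwidth assumed reservable on the managed
       segment; timeout is the soft-state lifetime after which a
       reservation that has not been refreshed is discarded.
       """

       def __init__(self, capacity_bps, timeout=30.0):
           self.capacity_bps = capacity_bps
           self.timeout = timeout
           self.reservations = {}  # flow_id -> (rate_bps, last_refresh)

       def _expire(self, now):
           # Soft state: drop any reservation not refreshed within 'timeout'.
           stale = [f for f, (_, t) in self.reservations.items()
                    if now - t > self.timeout]
           for f in stale:
               del self.reservations[f]

       def admitted_bps(self):
           return sum(rate for rate, _ in self.reservations.values())

       def admit(self, flow_id, rate_bps, now=None):
           """Admit a new flow, or re-size an existing one, only if it fits."""
           now = time.time() if now is None else now
           self._expire(now)
           already = self.reservations.get(flow_id, (0, 0))[0]
           if self.admitted_bps() - already + rate_bps > self.capacity_bps:
               return False  # rejected; the flow stays best effort
           self.reservations[flow_id] = (rate_bps, now)
           return True

       def refresh(self, flow_id, now=None):
           """Periodic refresh keeps the soft state alive."""
           now = time.time() if now is None else now
           if flow_id not in self.reservations:
               return False
           rate, _ = self.reservations[flow_id]
           self.reservations[flow_id] = (rate, now)
           return True

   # Example: a 10 Mb/s shared segment with a 30 second soft-state timeout.
   ba = SegmentBandwidthAccountant(capacity_bps=10_000_000, timeout=30.0)
   assert ba.admit("flow-1", 4_000_000) is True
   assert ba.admit("flow-2", 5_000_000) is True
   assert ba.admit("flow-3", 2_000_000) is False   # would exceed capacity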
516 Note that we illustrate most examples in this model using RSVP as an 517 "upper-layer" QoS signaling protocol but there are actually no real 518 dependencies on this protocol: RSVP could be replaced by some other 519 dynamic protocol or else the requests could be made by network 520 management or other policy entities. In particular, the SBM signaling 521 protocol [14], which is based upon RSVP, is designed to work seamlessly 522 in the architecture described in this memo. 524 There may be a heterogeneous mixture of switches with different 525 capabilities, all compliant with IEEE 802.1D [2] [3], but implementing 526 queuing and forwarding mechanisms in a range from simple 2-queue per 527 port, strict priority, up to more complex multi-queue (maybe even one 528 per-flow) WFQ or other algorithms. 530 The problem is broken down into smaller independent pieces: this may 531 lead to sub-optimal usage of the network resources but we contend that, 532 in a LAN environment, the benefits forgone amount to only very small 533 improvements in network efficiency. Therefore, it is a goal that the 534 switches in the network operate using a much simpler set of information 535 than the RSVP engine in a router. In particular, it is assumed that such 536 switches do not need to implement per-flow queuing and policing 537 (although they might do so). 539 It is a fundamental assumption of the int-serv model that flows are 540 isolated from each other throughout their transit across a network. 541 Intermediate queueing nodes are expected to police the traffic to ensure 542 that it conforms to the pre-agreed traffic flow specification. In the 543 architecture proposed here for mapping to layer-2, we diverge from that 544 assumption in the interests of simplicity: the policing function is 545 assumed to be implemented in the transmit schedulers of the layer-3 546 devices (end stations, routers). In the LAN environments envisioned, it 547 is reasonable to assume that end stations are "trusted" to adhere to 548 their agreed contracts at the inputs to the network and that we can 549 afford to over-allocate resources at admission-control time to 550 compensate for the inevitable extra jitter/bunching introduced by the 551 switched network itself. 553 These divergences have some implications for the types of receiver 554 heterogeneity that can be supported and the statistical multiplexing 555 gains that might have been exploited, especially for Controlled Load 556 flows: this is discussed in a later section of this document. 558 6 Basic Architecture 560 The functional requirements described in Section 5 will be performed by 561 an entity which we refer to as the Bandwidth Manager (BM). The BM is 562 responsible for providing mechanisms for an application or higher layer 563 protocol to request QoS from the network. For architectural purposes, 564 the BM consists of the following components. 566 6.1 Components 568 6.1.1 Requester Module 570 The Requester Module (RM) resides in every end station in the subnet. 572 One of its functions is to provide an interface between applications or 573 higher layer protocols such as RSVP, STII, SNMP, etc. and the BM. An 574 application can invoke the various functions of the BM by using the 575 primitives for communication with the RM and providing it with the 576 appropriate parameters.
To initiate a reservation in the link layer 577 domain, the following parameters must be passed to the RM: the service 578 desired (Guaranteed Service or Controlled Load), the traffic descriptors 579 contained in the TSpec, and an RSpec specifying the amount of resources 580 to be reserved [9]. More information on these parameters may be found 581 in the relevant Integrated Services documents [6,7,8,9]. When RSVP is 582 used for signaling at the network layer, this information is available 583 and needs to be extracted from the RSVP PATH and RSVP RESV messages (See 584 [5] for details). In addition to these parameters, the network layer 585 addresses of the end points must be specified. The RM must then 586 translate the network layer addresses to link layer addresses and 587 convert the request into an appropriate format which is understood by 588 other components of the BM responsible for admission control. The RM is 589 also responsible for returning the status of requests processed by the 590 BM to the invoking application or higher layer protocol. 592 6.1.2 Bandwidth Allocator 594 The Bandwidth Allocator (BA) is responsible for performing admission 595 control and maintaining state about the allocation of resources in the 596 subnet. An end station can request various services, e.g. bandwidth 597 reservation, modification of an existing reservation, queries about 598 resource availability, etc. These requests are processed by the BA. The 599 communication between the end station and the BA takes place through the 600 RM. The location of the BA will depend largely on the implementation 601 method. In a centralized implementation, the BA may reside on a single 602 station in the subnet. In a distributed implementation, the functions of 603 the BA may be distributed in all the end stations and bridges/switches 604 as necessary. The BA is also responsible for deciding how to label 605 flows, e.g. based on the admission control decision, the BA may 606 indicate to the RM that packets belonging to a particular flow be tagged 607 with some priority value which maps to the appropriate traffic class. 609 6.1.3 Communication Protocols 611 The protocols for communication between the various components of the BM 612 system must be specified. These include the following: 614 - Communication between the higher layer protocols and the RM: The BM 615 must define primitives for the application to initiate reservations, 616 query the BA about available resources, and change or delete 617 reservations, etc. These primitives could be implemented as an API for 618 an application to invoke functions of the BM via the RM. 620 - Communication between the RM and the BA: A signaling mechanism must be 621 defined for the communication between the RM and the BA. This protocol 622 will specify the messages which must be exchanged between the RM and the 623 BA in order to service various requests by the higher layer entity. 625 - Communication between peer BAs: If there is more than one BA in the 626 subnet, a means must be specified for inter-BA communication. 627 Specifically, the BAs must be able to decide among themselves about 628 which BA would be responsible for which segments and bridges or 629 switches. Further, if a request is made for resource reservation spanning 630 the domains of multiple BAs, the BAs must be able to handle such a 631 scenario correctly. Inter-BA communication will also be responsible for 632 back-up and recovery in the event of failure. 634 6.2 Centralised vs.
Distributed Implementations 636 Example scenarios are provided showing the location of the 637 components of the bandwidth manager in centralized and fully distributed 638 implementations. Note that in either case, the RM must be present in 639 all end stations which desire to make reservations. Essentially, 640 centralized or distributed refers to the implementation of the BA, the 641 component responsible for resource reservation and admission control. 642 In the figures below, "App" refers to the application making use of the 643 BM. It could either be a user application, or a higher layer protocol 644 process such as RSVP. 646 +---------+ 647 .-->| BA |<--. 648 / +---------+ \ 649 / .-->| Layer 2 |<--. \ 650 / / +---------+ \ \ 651 / / \ \ 652 / / \ \ 653 +---------+ / / \ \ +---------+ 654 | App |<----- /-/---------------------------\-\----->| App | 655 +---------+ / / \ \ +---------+ 656 | RM |<----. / \ .--->| RM | 657 +---------+ / +---------+ +---------+ \ +---------+ 658 | Layer 2 |<------>| Layer 2 |<------>| Layer 2 |<------>| Layer 2 | 659 +---------+ +---------+ +---------+ +---------+ 661 RSVP Host/ Intermediate Intermediate RSVP Host/ 662 Router Bridge/Switch Bridge/Switch Router 664 Figure 1 - Bandwidth Manager with centralized Bandwidth Allocator 666 Figure 1 shows a centralized implementation where a single BA is 667 responsible for admission control decisions for the entire subnet. Every 668 end station contains an RM. Intermediate bridges and switches in the 669 network need not have any functions of the BM since they will not be 670 actively participating in admission control. The RM at the end station 671 requesting a reservation initiates communication with its BA. For larger 672 subnets, a single BA may not be able to handle the reservations for the 673 entire subnet. In that case it would be necessary to deploy multiple 674 BAs, each managing the resources of a non-overlapping subset of 675 segments. In a centralized implementation, the BA must have some model 676 of the layer-2 topology of the subnet e.g. link layer spanning tree 677 information, in order to be able to reserve resources on appropriate 678 segments. Without this topology information, the BM would have to 679 reserve resources on all segments for all flows which, in a switched 680 network, would lead to very inefficient utilization of resources. 682 +---------+ +---------+ 683 | App |<-------------------------------------------->| App | 684 +---------+ +---------+ +---------+ +---------+ 685 | RM/BA |<------>| BA |<------>| BA |<------>| RM/BA | 686 +---------+ +---------+ +---------+ +---------+ 687 | Layer 2 |<------>| Layer 2 |<------>| Layer 2 |<------>| Layer 2 | 688 +---------+ +---------+ +---------+ +---------+ 690 RSVP Host/ Intermediate Intermediate RSVP Host/ 691 Router Bridge/Switch Bridge/Switch Router 693 Figure 2 - Bandwidth Manager with fully distributed Bandwidth Allocator 695 Figure 2 depicts the scenario of a fully distributed bandwidth manager. 696 In this case, all devices in the subnet have BM functionality. All the 697 end hosts are still required to have an RM. In addition, all stations 698 actively participate in admission control. With this approach, each BA 699 would need only local topology information since it is responsible for 700 the resources on segments that are directly connected to it.
This local 701 topology information, such as a list of ports active on the spanning 702 tree and which unicast addresses are reachable from which ports, is 703 readily available in today's switches. Note that in the figures above, 704 the arrows between peer layers are used to indicate logical 705 connectivity. 707 7 Model of the Bandwidth Manager in a Network 709 In this section we describe how the model above fits with the existing 710 IETF Integrated Services model of IP hosts and IP routers. First we 711 describe layer-3 host and router implementations; later we describe how 712 the model is applied in layer-2 switches. Throughout we indicate any 713 differences between centralised and distributed implementations. 715 7.1 End-station model 717 7.1.1 Layer-3 Client Model 719 We assume the same client model as int-serv and RSVP where we use the 720 term "client" to mean the entity handling QoS in the layer-3 device at 721 each end of a layer-2 hop (e.g. end-station, router). In this model, the 722 sending client is responsible for local admission control and scheduling 723 packets onto its link in accordance with the service agreed. As with the 724 current int-serv model, this involves per-flow scheduling (a.k.a. 725 traffic shaping) in every such originating source. 727 For now, we assume that the client is running an RSVP process which 728 presents a session establishment interface to applications, signals over 729 the network, programs a scheduler and classifier in the driver and 730 interfaces to a policy control module. In particular, RSVP also 731 interfaces to a local admission control module: it is this entity that 732 we focus on here. 734 The following diagram is taken from the RSVP specification [5]: 735 _____________________________ 736 | _______ | 737 | | | _______ | 738 | |Appli- | | | | RSVP 739 | | cation| | RSVP <--------------------> 740 | | <--> | | 741 | | | |process| _____ | 742 | |_._____| | -->Polcy|| 743 | | |__.__._| |Cntrl|| 744 | |data | | |_____|| 745 |===|===========|==|==========| 746 | | --------| | _____ | 747 | | | | ---->Admis|| 748 | _V__V_ ___V____ |Cntrl|| 749 | | | | | |_____|| 750 | |Class-| | Packet | | 751 | | ifier|==>Schedulr|====================> 752 | |______| |________| | data 753 | | 754 |_____________________________| 756 Figure 3 - RSVP in Sending Hosts 758 Note that we illustrate examples in this document using RSVP as the 759 "upper-layer" signaling protocol but there are no actual dependencies on 760 this protocol: RSVP could be replaced by some other dynamic protocol or 761 else the requests could be made by network management or other policy 762 entities. 764 7.1.2 Requests to layer-2 ISSLL 766 The local admission control entity within a client is responsible for 767 mapping these layer-3 session-establishment requests into layer-2 768 language. 770 The upper-layer entity makes a request, in generalised terms, to ISSLL of 771 the form: 773 "May I reserve for traffic with <traffic characteristic> with 774 <performance requirements> from <here> to <there> and how should I 775 label it?" 777 where 778 <traffic characteristic> = Sender Tspec 779 (e.g. bandwidth, burstiness, MTU) 780 <performance requirements> = FlowSpec 781 (e.g. latency, jitter bounds) 782 <here> = IP address(es) 783 <there> = IP address(es) - may be multicast 785 7.1.3 At the Layer-3 Sender 787 The ISSLL functionality in the sender is illustrated in Figure 4.
789 from IP from RSVP 790 ____|____________|____________ 791 | | | | 792 | __V____ ___V___ | 793 | | | | | | 794 | | Addr |<->| | | SBM signaling 795 | |mapping| |Request|<------------------------> 796 | |_______| |Module | | 797 | ___|___ | | | 798 | | |<->| | | 799 | | 802 | |_______| | 800 | | header| / | | | 801 | |_______| / | | | 802 | | / | | _____ | 803 | | +-----/ | +->|Band-| | 804 | __V_V_ _____V__ |width| | 805 | | | | | |Alloc| | 806 | |Class-| | Packet | |_____| | 807 | | ifier|==>Schedulr|======================> 808 | |______| |________| | data 809 |______________________________| 811 Figure 4 - ISSLL in End-station Sender 813 The functions of the Requestor Module may be summarised as: - maps the 814 endpoints of the conversation to layer-2 addresses in the LAN, so that 815 the client can figure out what traffic is really going where (probably 816 makes reference to the ARP protocol cache for unicast or an algorithmic 817 mapping for multicast destinations). 819 - communicates with any local Bandwidth Allocator module for local 820 admission control decisions 822 - formats a SBM request to the network with the mapped addresses and 823 filter/flow specs 825 - receives response from the network and reports the YES/NO admission 826 control answer back to the upper layer entity, along with any negotiated 827 modifications to the session parameters. 829 - saves any returned user_priority to be associated with this session in 830 a "802 header" table: this will be used when adding layer-2 header 831 before sending any future data packet belonging to this session. This 832 table might, for example, be indexed by the RSVP flow identifier. 834 The Bandwidth Allocator (BA) component is only present when a 835 distributed BA model is implemented: when present, its functions can be 836 summarised as: - applies local admission control on outgoing link 837 bandwidth and driver queueing resources 839 7.1.4 At the Layer-3 Receiver 841 The ISSLL functionality in the receiver is simpler. It is summarised 842 below and is illustrated by Figure 5. 844 The Requestor Module 846 - handles any received SBM protocol indications. 848 - communicates with any local BA for local admission control decisions 850 - passes indications up to RSVP if OK. 852 - accepts confirmations from RSVP and relays them back via SBM signaling 853 towards the requester. 855 - may program a receive classifier and scheduler, if any is used, to 856 identify traffic classes of received packets and accord them appropriate 857 treatment e.g. reserve some buffers for particular traffic classes. 859 - programs receiver to strip any 802 header information from received 860 packets. 862 The Bandwidth Allocator, present only in a distributed implementation 864 - applies local admission control to see if a request can be supported 865 with appropriate local receive resources. 
867 to RSVP to IP 868 ^ ^ 869 ____|____________|___________ 870 | | | | 871 | __|____ | | 872 | | | | | 873 SBM signaling | |Request| ___|___ | 874 <-----------------> |Module | | Strip | | 875 | |_______| |802 hdr| | 876 | | \ |_______| | 877 | __v___ \ ^ | 878 | | Band- |\ | | 879 | | width| \ | | 880 | | Alloc | \ | | 881 | |_______| \ | | 882 | ______ v___|____ | 883 | |Class-| | Packet | | 884 ===================>| ifier|==>|Scheduler| | 885 data | |______| |_________| | 886 |_____________________________| 888 Figure 5 - ISSLL in End-station Receiver 890 7.2 Switch Model 892 7.2.1 Centralised BA 894 Where a centralised Bandwidth Allocator model is implemented, switches 895 do not take part in the admission control process: all admission control 896 is implemented by a central BA e.g. a "Subnet Bandwidth Manager" (SBM) 897 as described in [14]. Note that this centralised BA may actually be co- 898 located with a switch but its functions would not necessarily then be 899 closely tied to the switch's forwarding functions as is the case with 900 the distributed BA described below. 902 7.2.2 Distributed BA 904 The model of layer-2 switch behaviour described here uses the 905 terminology of the SBM protocol as an example of an admission control 906 protocol: the model is equally applicable when other mechanisms, e.g. 907 static configuration or network management, are in use for admission 908 control. We define the following entities within the switch: 910 * Local admission control - one of these on each port accounts for the 911 available bandwidth on the link attached to that port. For half-duplex 912 links, this involves taking account of the resources allocated to both 913 transmit and receive flows. For full-duplex, the input port accountant's 914 task is trivial. 916 * Input SBM module: one instance on each port, performs the "network" 917 side of the signaling protocol for peering with clients or other 918 switches. Also holds knowledge of the mappings of int-serv classes to 919 user_priority. 921 * SBM propagation - relays requests that have passed admission control 922 at the input port to the relevant output ports' SBM modules. This will 923 require access to the switch's forwarding table (layer-2 "routing table" 924 cf. RSVP model) and port spanning-tree states. 926 * Output SBM module - forwards requests to the next layer-2 or -3 927 network hop. 929 * Classifier, Queueing and Scheduler - these functions are basically as 930 described by the Forwarding Process of IEEE 802.1p (see section 3.7 of 931 [3]). The Classifier module identifies the relevant QoS information from 932 incoming packets and uses this, together with the normal bridge 933 forwarding database, to decide to which output queue of which output 934 port to enqueue the packet. Different types of switches will use 935 different techniques for flow identification - see section 8.1 for details 936 of a taxonomy of switch types. In Class I switches, this information is 937 the "regenerated user_priority" parameter which has already been decoded 938 by the receiving MAC service and potentially re-mapped by the 802.1p 939 forwarding process (see description in section 3.7.3 of [3]). This does 940 not preclude more sophisticated classification rules which may be 941 applied in more complex Class III switches e.g. matching on individual 942 int-serv flows.
944 The Queueing and Scheduler module holds the output queues for ports and 945 provides the algorithm for servicing the queues for transmission onto 946 the output link in order to provide the promised int-serv service. 947 Switches will implement one or more output queues per port and all will 948 implement at least a basic strict priority dequeueing algorithm as their 949 default, in accordance with 802.1p. 951 * Ingress traffic class mapper and policing - as described in 802.1p 952 section 3.7. This optional module may check on whether the data within 953 traffic classes are conforming to the patterns currently agreed: 954 switches may police this and discard or re-map packets. The default 955 behaviour is to pass things through unchanged. 957 * Egress traffic class mapper - as described in 802.1p section 3.7. This 958 optional module may apply re-mapping of traffic classes e.g. on a per- 959 output port basis. The default behaviour is to pass things through 960 unchanged. 962 These are shown by the following diagram which is a superset of the IEEE 963 802.1D bridge model: 965 _______________________________ 966 | _____ ______ ______ | 967 SBM signaling | | | | | | | | SBM signaling 968 <------------------>| IN |<->| SBM |<->| OUT |<----------------> 969 | | SBM | | prop.| | SBM | | 970 | |_____| |______| |______| | 971 | / | ^ / | | 972 ______________| / | | | | |_____________ 973 | \ / __V__ | | __V__ / | 974 | \ ____/ |Local| | | |Local| / | 975 | \ / |Admis| | | |Admis| / | 976 | \/ |Cntrl| | | |Cntrl| / | 977 | _____V \ |_____| | | |_____| / _____ | 978 | |traff | \ ___|__ V_______ / |egrss| | 979 | |class | \ |Filter| |Queue & | / |traff| | 980 | |map & |=====|==========>|Data- |=| Packet |=|===>|class| | 981 | |police| | | base| |Schedule| | |map | | 982 | |______| | |______| |________| | |_____| | 983 |____^_________|_______________________________|______|______| 984 data in | |data out 985 ========+ +========> 986 Figure 6 - ISSLL in Switches 988 7.3 Admission Control 990 On reception of an admission control request, a switch performs the 991 following actions, again using SBM as an example: the behaviour is 992 different depending on whether the "Designated SBM" for this segment is 993 within this switch or not - see [14] for a more detailed specification 994 of the DSBM/SBM actions: 996 * if the ingress SBM is the "Designated SBM" for this link/segment, it 997 translates any received user_priority or else selects a layer-2 traffic 998 class which appears compatible with the request and whose use does not 999 violate any administrative policies in force. In effect, it matches up 1000 the requested service with those available in each of the user_priority 1001 classes and chooses the "best" one. It ensures that, if this reservation 1002 is successful, the selected value is passed back to the client. 1004 * ingress DSBM observes the current state of allocation of resources on 1005 the input port/link and then determines whether the new resource 1006 allocation from the mapped traffic class would be excessive. The request 1007 is passed to the reservation propagator if accepted so far. 1009 * if the ingress SBM is not the "Designated SBM" for this link/segment 1010 then it passes the request on directly to the reservation propagator 1012 * reservation propagator relays the request to the bandwidth accountants 1013 on each of the switch's outbound links to which this reservation would 1014 apply (implied interface to routing/forwarding database). 
1016 * egress bandwidth accountant observes the current state of allocation 1017 of queueing resources on its outbound port and bandwidth on the link 1018 itself and determines whether the new allocation would be excessive. 1019 Note that this is only the local decision of this switch hop: each 1020 further layer-2 hop through the network gets a chance to veto the 1021 request as it passes along. 1023 * the request, if accepted by this switch, is then passed on down the 1024 line on each output link selected. Any user_priority described in the 1025 forwarded request must be translated according to any egress mapping 1026 table. 1028 * if accepted, the switch must notify the client of the user_priority to 1029 use for packets belonging to this flow. Note that this is a 1030 "provisional YES" - we assume an optimistic approach here: later 1031 switches can still say "NO" later. 1033 * if this switch wishes to reject the request, it can do so by notifying 1034 the original client (by means of its layer-2 address). 1036 7.4 QoS Signaling 1038 The mechanisms described in this document make use of a signaling 1039 protocol for devices to communicate their admission control requests 1040 across the network: the service definitions to be provided by such a 1041 protocol e.g. [14] are described below. Below, we illustrate the 1042 primitives and information that need to be exchanged with such a 1043 signaling protocol entity - in all these examples, appropriate 1044 delete/cleanup mechanisms will also have to be provided for when 1045 sessions are torn down. 1047 7.4.1 Client service definitions 1049 The following interfaces can be identified from Figure 4 and Figure 5 1051 * SBM <-> Address mapping 1053 This is a simple lookup function which may cause ARP protocol 1054 interactions, may be just a lookup of an existing ARP cache entry or may 1055 be an algorithmic mapping. The layer-2 addresses are needed by SBM for 1056 inclusion in its signaling messages to/from switches which avoids the 1057 switches having to perform the mapping and, hence, have knowledge of 1058 layer-3 information for the complete subnet: 1060 l2_addr = map_address( ip_addr ) 1062 * SBM <-> Session/802 header 1064 This is for notifying the transmit path of how to add layer-2 header 1065 information e.g. user_priority values to the traffic of each outgoing 1066 flow: the transmit path will provide the user_priority value when it 1067 requests a MAC-layer transmit operation for each packet (user_priority 1068 is one of the parameters passed in the packet transmit primitive defined 1069 by the IEEE 802 service model): 1071 bind_l2_header( flow_id, user_priority ) 1073 * SBM <-> Classifier/Scheduler 1075 This is for notifying transmit classifier/scheduler of any additional 1076 layer-2 information associated with scheduling the transmission of a 1077 flow packets: this primitive may be unused in some implementations or it 1078 may be used, for example, to provide information to a transmit scheduler 1079 that is performing per-traffic_class scheduling in addition to the per- 1080 flow scheduling required by int-serv: the l2_header may be a pattern 1081 (additional to the FilterSpec) to be used to identify the flow's 1082 traffic. 1084 bind_l2schedulerinfo( flow_id, , l2_header, traffic_class ) 1086 * SBM <-> Local Admission Control 1088 For applying local admission control for a session e.g. is there enough 1089 transmit bandwidth still uncommitted for this potential new session? Are 1090 there sufficient receive buffers? 
This should commit the necessary 1091 resources if OK: it will be necessary to release these resources at a 1092 later stage if the session setup process fails. This call would be made 1093 by a segment's Designated SBM for example: 1095 status = admit_l2session( flow_id, Tspec, FlowSpec ) 1097 * SBM <-> RSVP - this is outlined above in section 7.1.2 and fully 1098 described in [14]. 1100 * Management Interfaces 1101 Some or all of the modules described by this model will also require 1102 configuration management: it is expected that details of the manageable 1103 objects will be specified by future work in the ISSLL WG. 1105 7.4.2 Switch service definitions 1107 The following interfaces are identified from Figure 6: 1109 * SBM <-> Classifier 1111 This is for notifying receive classifier of how to match up incoming 1112 layer-2 information with the associated traffic class: it may in some 1113 cases consist of a set of read-only default mappings: 1115 bind_l2classifierinfo( flow_id, l2_header, traffic_class ) 1117 * SBM <-> Queue and Packet Scheduler 1119 This is for notifying transmit scheduler of additional layer-2 1120 information associated with a given traffic class (it may be unused in 1121 some cases - see discussion in previous section): 1123 bind_l2schedulerinfo( flow_id, l2_header, traffic_class ) 1125 * SBM <-> Local Admission Control 1127 As for host above. 1129 * SBM <-> Traffic Class Map and Police 1131 Optional configuration of any user_priority remapping that might be 1132 implemented on ingress to and egress from the ports of a switch (note 1133 that, for Class I switches, it is likely that these mappings will have 1134 to be consistent across all ports): 1136 bind_l2ingressprimap( inport, in_user_pri, internal_priority ) 1137 bind_l2egressprimap( outport, internal_priority, out_user_pri ) 1139 Optional configuration of any layer-2 policing function to be applied 1140 on a per-class basis to traffic matching the l2_header. If the switch is 1141 capable of per-flow policing then existing int-serv/RSVP models will 1142 provide a service definition for that configuration: 1144 bind_l2policing( flow_id, l2_header, Tspec, FlowSpec ) 1146 * SBM <-> Filtering Database 1148 SBM propagation rules need access to the layer-2 forwarding database to 1149 determine where to forward SBM messages (analogous to RSRR interface in 1150 L3 RSVP): 1152 output_portlist = lookup_l2dest( l2_addr ) 1154 * Management Interfaces 1156 Some or all of the modules described by this model will also require 1157 configuration management: it is expected that details of the manageable 1158 objects will be specified by future work in the ISSLL WG. 1160 8 Implementation Issues 1162 As stated earlier, the Integrated Services working group has defined 1163 various service classes offering varying degrees of QoS guarantees. 1164 Initial effort will concentrate on enabling the Controlled Load [6] and 1165 Guaranteed Service classes [7]. The Controlled Load service provides a 1166 loose guarantee, informally stated as "the same as best effort would be 1167 on an unloaded network". The Guaranteed Service provides an upper-bound 1168 on the transit delay of any packet. The extent to which these services 1169 can be supported at the link layer will depend on many factors including 1170 the topology and technology used. Some of the mapping issues are 1171 discussed below in light of the emerging link layer standards and the 1172 functions supported by higher layer protocols. 
Given the limitations of some of the topologies under consideration, it
may not be possible to satisfy all the requirements for Integrated
Services on a given topology. In such cases, it is useful to consider
providing support for an approximation of the service which may suffice
in most practical instances. For example, it may not be feasible to
provide policing/shaping at each network element (bridge/switch) as
required by the Controlled Load specification. But if this task is left
to the end stations, a reasonably good approximation to the service can
be obtained.

8.1 Switch characteristics

For the sake of illustration, we divide layer-2 bridges/switches into
several categories, based on the level of sophistication of their QoS
and software protocol capabilities. These categories are not intended
to represent all possible implementation choices but, instead, to aid
discussion of what QoS capabilities can be expected from a network made
of these devices (the basic "Class 0" device is included for
completeness but cannot really provide useful integrated service).

Class 0
 - 802.1D MAC bridging
 - single queue per output port, no separation of traffic classes
 - Spanning-Tree to remove topology loops (single active path)

Class I
 - 802.1p priority queueing between traffic classes
 - no multicast heterogeneity
 - 802.1p GARP/GMRP pruning of individual multicast addresses

Class II - as (I) plus:
 - can map received user_priority on a per-input-port basis to some
   internal set of canonical values
 - can map internal canonical values onto transmitted user_priority on
   a per-output-port basis, giving some limited form of multicast
   heterogeneity
 - may implement IGMP snooping for pruning

Class III - as (II) plus:
 - per-flow classification
 - possibly per-flow policing and/or reshaping
 - more complex transmit scheduling (probably not per-flow)

8.2 Queueing

Connectionless packet-based networks in general, and LAN-switched
networks in particular, work today because of scaling choices in
network provisioning. Consciously or (more usually) unconsciously,
enough excess bandwidth and buffering is provisioned to absorb the
traffic sourced by higher-layer protocols, or to cause their
transmission windows to run out, so that, statistically, the network is
overloaded only for short periods and the average expected loading is
less than 60% (usually much less).

With the advent of time-critical traffic, such over-provisioning has
become far less easy to achieve. Time-critical frames may find
themselves queued for annoyingly long periods behind temporary bursts
of file transfer traffic, particularly at network bottleneck points,
e.g. at the 100 Mb/s to 10 Mb/s transition that might occur between the
riser to the wiring closet and the final link to the user from a
desktop switch. In this case, however, if it is known (guaranteed by
application design, merely expected on the basis of statistics, or
simply because this is all that the network guarantees to support) that
the time-critical traffic is a small fraction of the total bandwidth,
it suffices to give it strict priority over the "normal" traffic.
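As a sketch only (the data structures are illustrative and not drawn
from any 802.1 text), the strict-priority service referred to above
amounts to always taking from the highest non-empty queue; a frame
already on the wire is never preempted, which is what produces the
one-maximum-frame bound discussed next.

   #include <stddef.h>

   #define NUM_CLASSES 2           /* 0 = "normal", 1 = time-critical */

   struct frame { struct frame *next; /* payload omitted */ };

   struct out_port {
       struct frame *head[NUM_CLASSES];
       struct frame *tail[NUM_CLASSES];
   };

   /* Strict priority: always serve the highest-numbered non-empty
    * queue; lower classes are served only when all higher classes
    * are empty. */
   static struct frame *dequeue_strict(struct out_port *p)
   {
       for (int cls = NUM_CLASSES - 1; cls >= 0; cls--) {
           struct frame *f = p->head[cls];
           if (f != NULL) {
               p->head[cls] = f->next;
               if (p->head[cls] == NULL)
                   p->tail[cls] = NULL;
               return f;
           }
       }
       return NULL;                 /* nothing queued */
   }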
The worst-case delay experienced by the time-critical traffic is then
roughly the maximum transmission time of a maximum-length
non-time-critical frame - less than a millisecond for 10 Mb/s Ethernet,
and well below an end-to-end budget based on human perception times.

When more than one "priority" service is to be offered by a network
element, e.g. it supports Controlled Load as well as Guaranteed
Service, the queueing discipline becomes more complex. In order to
provide the required isolation between the service classes, it will
probably be necessary to queue them separately. There is then an issue
of how to service the queues: a combination of admission control and
more intelligent queueing disciplines, e.g. weighted fair queueing, may
be required in such cases. As with the service specifications
themselves, it is not the place of this document to specify queueing
algorithms, merely to require that the external behaviour meets the
services' requirements.

8.3 Mapping of Services to Link Level Priority

The number of traffic classes supported, and the access method of the
technology under consideration, determine how many and which services
may be supported. Native token ring/802.5, for instance, supports eight
priority levels which may be mapped to one or more traffic classes.
Ethernet/802.3 has no support for signaling priorities within frames.
However, the IEEE 802 standards committee has recently developed a new
standard for bridges/switches covering multimedia traffic expediting
and dynamic multicast filtering [3], and a packet format for carrying a
User Priority field on all IEEE 802 media types is now defined in [4].
Together these standards allow for up to eight traffic classes on all
media. The User Priority bits carried in the frame are mapped to a
particular traffic class within a bridge/switch. The User Priority is
signaled on an end-to-end basis, unless overridden by bridge/switch
management. The traffic class used by a flow should depend on the
quality of service desired and on whether the reservation is successful
or not. Therefore, a sender should use the User Priority value which
maps to the best effort traffic class until told otherwise by the BM.
The BM will, upon successful completion of resource reservation,
specify the User Priority to be used by the sender for that session's
data. An accompanying memo [13] addresses the issue of mapping the
various Integrated Services to appropriate traffic classes.

8.4 Re-mapping of non-conformant aggregated flows

One other topic under discussion in the int-serv context is how to
handle traffic from data flows whose sources exceed their currently
agreed traffic contract with the network. An approach that shows some
promise is to give such traffic "somewhat less than best effort"
service in order to protect traffic that is normally given "best
effort" service from having to back off. Best effort traffic is often
"adaptive", using TCP or other congestion control algorithms, and it
would be unfair to penalise it because of badly behaved traffic from
reserved flows, which are often set up by non-adaptive applications.
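Whatever a network element decides to do with excess packets, it first
has to classify them as conformant or not; in the int-serv model this
is a token bucket parameterised from the flow's TSpec. A minimal
sketch, with hypothetical names and the drop-or-remap decision left to
the caller:

   #include <stdbool.h>
   #include <stdint.h>

   /* Token bucket state for one flow, taken from its TSpec:
    * token rate r in bytes/second and bucket depth b in bytes. */
   struct policer {
       double rate;         /* r: token fill rate (bytes/s)  */
       double depth;        /* b: maximum credit (bytes)     */
       double tokens;       /* current credit, <= depth      */
       double last_time;    /* time of the previous update   */
   };

   /* Returns true if a packet of 'length' bytes arriving at time
    * 'now' (seconds) conforms to the agreed TSpec.  The caller
    * decides what to do with non-conformant packets: drop them, as
    * Controlled Load recommends, or re-mark them to a "lower"
    * user_priority. */
   static bool tb_conforms(struct policer *p, uint32_t length, double now)
   {
       p->tokens += (now - p->last_time) * p->rate;
       if (p->tokens > p->depth)
           p->tokens = p->depth;
       p->last_time = now;

       if ((double)length <= p->tokens) {
           p->tokens -= (double)length;
           return true;             /* within the traffic contract */
       }
       return false;                /* excess traffic */
   }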
1291 One solution here might be to assign normal best effort traffic to one 1292 user_priority and to label excess non-conformant traffic as a "lower" 1293 user_priority although the re-ordering problems that might arise from 1294 doing this may make this solution undesirable, particularly if the flows 1295 are using TCP: for this reason the controlled load service recommends 1296 dropping excess traffic, rather than re-mapping to a lower priority. 1297 This topic is further discussed below. 1299 8.5 Override of incoming user_priority 1301 In some cases, a network administrator may not trust the user_priority 1302 values contained in packets from a source and may wish to map these into 1303 some more suitable set of values. Alternatively, due perhaps to 1304 equipment limitations or transition periods, values may need to be 1305 mapped to/from different regions of a network. 1307 Some switches may implement such a function on input that maps received 1308 user_priority into some internal set of values (this table is known in 1309 802.1p as the "user_priority regeneration table"). These values can then 1310 be mapped using the output table described above onto outgoing 1311 user_priority values: these same mappings must also be used when 1312 applying admission control to requests that use the user_priority values 1313 (see e.g. [14]). More sophisticated approaches may also be envisioned 1314 where a device polices traffic flows and adjusts their onward 1315 user_priority based on their conformance to the admitted traffic flow 1316 specifications. 1318 8.6 Support for Different Reservation Styles 1320 +-----+ +-----+ +-----+ 1321 | S1 | | S2 | | S3 | 1322 +-----+ +-----+ +-----+ 1323 | | | 1324 | v | 1325 | +-----+ | 1326 +--------->| SW |<---------+ 1327 +-----+ 1328 | | 1329 +----+ +----+ 1330 | | 1331 v V 1332 +-----+ +-----+ 1333 | R1 | | R2 | 1334 +-----+ +-----+ 1336 Figure 7 - Illustration of filter styles. 1338 In the figure above, SW is a bridge/switch in the link layer domain. S1, 1339 S2, S3, R1 and R2 are end stations which are members of a group 1340 associated with the same RSVP flow. S1, S2 and S3 are upstream end 1341 stations. R1 and R2 are the downstream end-stations which receive 1342 traffic from all the senders. RSVP allows receivers R1 and R2 to 1343 specify reservations which can apply to: (a) one specific sender only 1344 (fixed filter); (b) any of two or more explicitly specified senders 1345 (shared explicit filter); and (c) any sender in the group (shared 1346 wildcard filter). Support for the fixed filter style is 1347 straightforward; a separate reservation is made for the traffic from 1348 each of the senders. However, support for the other two filter styles 1349 has implications regarding policing; i.e. the merged flow from the 1350 different senders must be policed so that they conform to traffic 1351 parameters specified in the filter's RSpec. This scenario is further 1352 complicated if the services requested by R1 and R2 are different. 1353 Therefore, in the absence of policing within bridges/switches, it may be 1354 possible to support only fixed filter reservations at the link layer. 1356 8.7 Supporting Receiver Heterogeneity 1358 At layer-3, the int-serv model allows heterogeneous multicast flows 1359 where different branches of a tree can have different types of 1360 reservations for a given multicast destination. It also supports the 1361 notion that trees may have some branches with reserved flows and some 1362 using best effort (default) service. 
If we were to treat a layer-2 1363 subnet as a single "network element", as defined in [8], then all of the 1364 branches of the distribution tree that lie within the subnet could be 1365 assumed to require the same QoS treatment and be treated as an atomic 1366 unit as regards admission control etc.. With this assumption, the model 1367 and protocols already defined by int-serv and RSVP already provide 1368 sufficient support for multicast heterogeneity. Note, however, that an 1369 admission control request may well be rejected because just one link in 1370 the subnet has reached its traffic limit and that this will lead to 1371 rejection of the request for the whole subnet. 1373 +-----+ 1374 | S | 1375 +-----+ 1376 | 1377 v 1378 +-----+ +-----+ +-----+ 1379 | R1 |<-----| SW |----->| R2 | 1380 +-----+ +-----+ +-----+ 1382 Figure 8 - Example of receiver heterogeneity 1384 As an example, consider Figure 8, SW is a Layer 2 device (bridge/switch) 1385 participating in resource reservation, S is the upstream source end 1386 station and R1 and R2 are downstream end station receivers. R1 would 1387 like to make a reservation for the flow while R2 would like to receive 1388 the flow using best effort service. S sends RSVP PATH messages which 1389 are multicast to both R1 and R2. R1 sends an RSVP RESV message to S 1390 requesting the reservation of resources. 1392 If the reservation is successful at Layer 2, the frames addressed to the 1393 group will be categorized in the traffic class corresponding to the 1394 service requested by R1. At SW, there must be some mechanism which 1395 forwards the packet providing service corresponding to the reserved 1396 traffic class at the interface to R1 while using the best effort traffic 1397 class at the interface to R2. This may involve changing the contents of 1398 the frame itself, or ignoring the frame priority at the interface to R2. 1400 Another possibility for supporting heterogeneous receivers would be to 1401 have separate groups with distinct MAC addresses, one for each class of 1402 service. By default, a receiver would join the "best effort" group 1403 where the flow is classified as best effort. If the receiver makes a 1404 reservation successfully, it can be transferred to the group for the 1405 class of service desired. The dynamic multicast filtering capabilities 1406 of bridges and switches implementing the emerging IEEE 802.1p standard 1407 would be a very useful feature in such a scenario. A given flow would 1408 be transmitted only on those segments which are on the path between the 1409 sender and the receivers of that flow. The obvious disadvantage of such 1410 an approach is that the sender needs to send out multiple copies of the 1411 same packet corresponding to each class of service desired thus 1412 potentially duplicating the traffic on a portion of the distribution 1413 tree. 1415 The above approaches would provide very sub-optimal utilisation of 1416 resources given the size and complexity of the layer-2 subnets 1417 envisioned by this document. Therefore, it is desirable to support the 1418 ability of layer-2 switches to apply QoS differently on different egress 1419 branches of a tree that divides at that switch: this is discussed in the 1420 following paragraphs. 1422 IEEE 802.1D and 802.1p specify a basic model for multicast whereby a 1423 switch performs multicast routing decisions based on the destination 1424 address: this would produce a list of output ports to which the packet 1425 should be forwarded. 
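To illustrate how this forwarding decision combines with the
per-output-port re-mapping of a Class II switch (section 8.1), a rough
sketch follows; the table and function names are inventions of this
example rather than anything defined by 802.1D/p.

   #include <stdint.h>

   #define NUM_PORTS       16
   #define NUM_PRIORITIES   8

   /* Per-output-port egress mapping from the switch's internal
    * canonical priority to the user_priority actually transmitted;
    * this is what permits the limited receiver heterogeneity
    * discussed below, e.g. priority 4 on a branch with a reservation
    * and 0 on a branch without one. */
   static uint8_t egress_map[NUM_PORTS][NUM_PRIORITIES];

   /* Consult the layer-2 filtering database for the set of output
    * ports; stubbed here - a real bridge uses its learned, static and
    * GMRP-registered entries. */
   static uint32_t lookup_l2dest_stub(const uint8_t dest_mac[6])
   {
       (void)dest_mac;
       return 0x0006;               /* e.g. forward on ports 1 and 2 */
   }

   /* Forward one multicast frame carried internally at priority
    * 'internal_pri': each selected output port applies its own egress
    * mapping before the frame is enqueued there. */
   static void forward_multicast(const uint8_t dest_mac[6],
                                 uint8_t internal_pri)
   {
       uint32_t out_ports = lookup_l2dest_stub(dest_mac);

       for (unsigned port = 0; port < NUM_PORTS; port++) {
           if (!(out_ports & (1u << port)))
               continue;
           uint8_t tx_pri = egress_map[port][internal_pri & 7];
           (void)tx_pri;  /* enqueue on 'port' in the queue for tx_pri,
                             as per the Queueing and Scheduler module */
       }
   }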
In its default mode, such a switch would use the user_priority value in
received packets (or a value regenerated on a per-input-port basis in
the absence of an explicit value) to enqueue the packets at each output
port. All of the classes of switch identified above can support this
operation.

If a switch selects per-port output queues based only on the incoming
user_priority, as described by 802.1p, it must treat all branches of
all multicast sessions within that user_priority class with the same
queueing mechanism: no heterogeneity is then possible, and this could
well lead to the failure of an admission control request for the whole
multicast session because a single link is at its maximum allocation,
as described above. Note that, in the layer-2 case, as distinct from
the layer-3 case with RSVP/int-serv, the option of having some
receivers get the session with the requested QoS while others get it
best effort does not exist, because Class I switches are unable to
re-map the user_priority on a per-link basis; this could well become an
issue with heavy use of dynamic multicast sessions. If a switch
implements a separate user_priority mapping at each output port, as
described under "Class II switch" above, then some limited form of
receiver heterogeneity can be supported, e.g. forwarding traffic as
user_priority 4 on a branch where receivers have performed admission
control reservations and as user_priority 0 on one where they have not.
We assume that per-user_priority queueing, without taking account of
input or output ports, is the minimum standard functionality for
switches in a LAN environment (the Class I switch defined above), but
that more functional layer-2 or even layer-3 switches (a.k.a. routers)
can be used if more flexible forms of heterogeneity are considered
necessary to achieve more efficient resource utilisation; note that the
behaviour of layer-3 switches in this context is already well
standardised by the IETF.

9 Network Topology Scenarios

As stated earlier, this memo is concerned with specifying a framework
for supporting Integrated Services in LAN technologies such as
Ethernet/IEEE 802.3, token ring/IEEE 802.5 and FDDI. The extent to
which service guarantees can be provided by a network depends to a
large degree on the ability to provide the key functions of flow
identification and scheduling, in addition to admission control and
policing. This section discusses some of the capabilities of these LAN
technologies and provides a taxonomy of possible topologies,
emphasizing the capabilities of each with regard to supporting the
above functions. For the technologies considered here, the basic
topology of a LAN may be shared, switched half duplex or switched full
duplex. In the shared topology, multiple senders share a single
segment. Contention for media access is resolved using protocols such
as CSMA/CD in Ethernet and token passing in token ring and FDDI.
Switched half duplex is essentially a shared topology with the
restriction that there are only two transmitters contending for
resources on any segment. Finally, in a switched full duplex topology,
a full bandwidth path is available to the transmitter at each end of
the link at all times.
Therefore, in this topology, there is no need for any access control
mechanism such as CSMA/CD or token passing, as there is no contention
between the transmitters; obviously, this topology provides the best
QoS capabilities. Another important element in the discussion of
topologies is the presence or absence of support for multiple traffic
classes; these were discussed earlier in section 4.1. Depending on the
basic topology used and the ability to support traffic classes, we
identify six scenarios as follows:

 1. Shared topology without traffic classes
 2. Shared topology with traffic classes
 3. Switched half duplex topology without traffic classes
 4. Switched half duplex topology with traffic classes
 5. Switched full duplex topology without traffic classes
 6. Switched full duplex topology with traffic classes

There is also the possibility of hybrid topologies where two or more of
the above coexist. For instance, it is possible that, within a single
subnet, some switches support traffic classes and some do not. If the
flow in question traverses both kinds of switches in the network, the
least common denominator will prevail. In other words, as far as that
flow is concerned, the network is of the type corresponding to the
least capable topology that is traversed. In the following sections, we
present these scenarios in further detail for some of the different
IEEE 802 network types, with discussion of their abilities to support
the Integrated Service classes.

9.1 Full-duplex switched networks

We have up to now ignored the MAC access protocol. On a full-duplex
switched LAN (of either the Ethernet or Token-Ring type - the MAC
algorithm is, by definition, unimportant) this can be factored into the
characterisation parameters advertised by the device, since the access
latency is well controlled (jitter = one largest packet time). Some
example characteristics (approximate):

   Type              Speed     Max Pkt    Max Access
                               Length     Latency

   Ethernet          10Mbps    1.2ms      1.2ms
                     100Mbps   120us      120us
                     1Gbps     12us       12us
   Token-Ring        4Mbps     9ms        9ms
                     16Mbps    9ms        9ms
   FDDI              100Mbps   360us      8.4ms
   Demand-Priority   100Mbps   120us      253us

        Table 1 - Full-duplex switched media access latency

These delays should also be considered in the context of speed-of-light
delays of e.g. ~400ns for typical 100m UTP links and ~7us for typical
2km multimode fibre links.

We therefore see full-duplex switched network topologies as offering
good QoS capabilities for both Controlled Load and Guaranteed Service
when supported by suitable queueing strategies in the switch nodes.

9.2 Shared-media Ethernet networks

We have not yet mentioned the difficulty of dealing with allocation on
a single shared CSMA/CD segment: as soon as any CSMA/CD algorithm is
introduced, the ability to provide any form of Guaranteed Service is
seriously compromised in the absence of tight coupling between the
multiple senders on the link. There are a number of reasons for not
offering a better solution for this issue.

Firstly, we do not believe this is a truly solvable problem: it would
seem to require a new MAC protocol. There have been proposals for
enhancements to the MAC layer protocols e.g.
BLAM and enhanced flow- 1543 control in IEEE 802.3; IEEE 802.1 has examined research showing 1544 disappointing simulation results for performance guarantees on shared 1545 CSMA/CD Ethernet without MAC enhancements. However, any solution 1546 involving a new "software MAC" running above the traditional 802.3 MAC 1547 or other proprietary MAC protocols is clearly outside the scope of the 1548 work of the ISSLL WG and this document. Secondly, we are not convinced 1549 that it is really an interesting problem. While not everyone in the 1550 world is buying desktop switches today and there will be end stations 1551 living on repeated segments for some time to come, the number of 1552 switches is going up and the number of stations on repeated segments is 1553 going down. This trend is proceeding to the point that we may be happy 1554 with a solution which assumes that any network conversation requiring 1555 resource reservations will take place through at least one switch (be it 1556 layer-2 or layer-3). Put another way, the easiest QoS upgrade to a 1557 layer-2 network is to install segment switching: only when this has been 1558 done is it worthwhile to investigate more complex solutions involving 1559 admission control. 1561 Thirdly, in the core of the network (as opposed to at the edges), there 1562 does not seem to be wide deployment of repeated segments as opposed to 1563 switched solutions. There may be special circumstances in the future 1564 (e.g. Gigabit buffered repeaters) but these have differing 1565 characteristics to existing CSMA/CD repeaters anyway. 1567 Type Speed Max Pkt Max Access 1568 Length Latency 1570 Ethernet 10Mbps 1.2ms unbounded 1571 100Mbps 120us unbounded 1572 1Gbps 12us unbounded 1574 Table 2 - Shared Ethernet media access latency 1576 9.3 Half-duplex switched Ethernet networks 1578 Many of the same arguments for sub-optimal support of Guaranteed Service 1579 apply to half-duplex switched Ethernet as to shared media: in essence, 1580 this topology is a medium that *is* shared between at least two senders 1581 contending for each packet transmission opportunity. Unless these are 1582 tightly coupled and cooperative then there is always the chance that the 1583 best-effort traffic of one will interfere with the important traffic of 1584 the other. Such coupling would seem to need some form of modifications 1585 to the MAC protocol (see above). 1587 Notwithstanding the above, half-duplex switched topologies do seem to 1588 offer the chance to provide Controlled Load service: with the knowledge 1589 that there are only a small limited number (e.g. two) of potential 1590 senders that are both using prioritisation for their CL traffic (with 1591 admission control for those CL flows based on the knowledge of the 1592 number of potential senders) over best effort, the media access 1593 characteristics, whilst not deterministic in the true mathematical 1594 sense, are somewhat predictable. This is probably a close enough 1595 approximation to CL to be useful. 
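One way such admission control might be realised - a sketch under
assumed names and thresholds, not a prescription of this framework - is
to account Controlled Load allocations for both senders against a
configured fraction of the segment, since on a half-duplex link
transmit and receive flows consume the same capacity (compare the local
admission control entity of section 7.2.2).

   #include <stdbool.h>

   /* Per-segment accounting for a half-duplex link: allocations made
    * for transmit and receive flows both draw on the same shared
    * capacity. */
   struct hd_segment {
       double link_bps;      /* raw media rate                       */
       double cl_fraction;   /* fraction usable by CL, e.g. 0.5      */
       double cl_allocated;  /* sum of admitted CL TSpec rates (bps) */
   };

   /* Admit a new Controlled Load flow of 'rate_bps' (from its TSpec)
    * only if the total stays within the configured CL share of the
    * segment; the resources must be released again at teardown. */
   static bool admit_cl_halfduplex(struct hd_segment *seg, double rate_bps)
   {
       double limit = seg->link_bps * seg->cl_fraction;
       if (seg->cl_allocated + rate_bps > limit)
           return false;             /* would overcommit the segment */
       seg->cl_allocated += rate_bps;
       return true;
   }

The raw access-latency figures for this topology are summarised in
Table 3 below.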
1597 Type Speed Max Pkt Max Access 1598 Length Latency 1600 Ethernet 10Mbps 1.2ms unbounded 1601 100Mbps 120us unbounded 1602 1Gbps 12us unbounded 1604 Table 3 - Half-duplex switched Ethernet media access latency 1606 9.4 Half-duplex and shared Token Ring networks 1608 In a shared Token Ring network, the network access time for high 1609 priority traffic at any station is bounded and is given by (N+1)*THTmax, 1610 where N is the number of stations sending high priority traffic and 1611 THTmax is the maximum token holding time [14]. This assumes that 1612 network adapters have priority queues so that reservation of the token 1613 is done for traffic with the highest priority currently queued in the 1614 adapter. It is easy to see that access times can be improved by 1615 reducing N or THTmax. The recommended default for THTmax is 10 ms [6]. 1616 N is an integer from 2 to 256 for a shared ring and 2 for a switched 1617 half duplex topology. A similar analysis applies for FDDI. Using default 1618 values gives: 1620 Type Speed Max Pkt Max Access 1621 Length Latency 1623 Token-Ring 4/16Mbps shared 9ms 2570ms 1624 4/16Mbps switched 9ms 30ms 1625 FDDI 100Mbps 360us 8ms 1627 Table 4 - Half-duplex and shared Token-Ring media access latency 1629 Given that access time is bounded, it is possible to provide an upper 1630 bound for end-to-end delays as required by Guaranteed Service assuming 1631 that traffic of this class uses the highest priority allowable for user 1632 traffic. The actual number of stations that send traffic mapped into 1633 the same traffic class as GS may vary over time but, from an admission 1634 control standpoint, this value is needed a priori. The admission 1635 control entity must therefore use a fixed value for N, which may be the 1636 total number of stations on the ring or some lower value if it is 1637 desired to keep the offered delay guarantees smaller. If the value of N 1638 used is lower than the total number of stations on the ring, admission 1639 control must ensure that the number of stations sending high priority 1640 traffic never exceeds this number. This approach allows admission 1641 control to estimate worst case access delays assuming that all of the N 1642 stations are sending high priority data even though, in most cases, this 1643 will mean that delays are significantly overestimated. 1645 Assuming that Controlled Load flows use a traffic class lower than that 1646 used by GS, no upper-bound on access latency can be provided for CL 1647 flows. However, CL flows will receive better service than best effort 1648 flows. 1650 Note that, on many existing shared token rings, bridges will transmit 1651 frames using an Access Priority (see section 4.3) value 4 irrespective 1652 of the user_priority carried in the frame control field of the frame. 1653 Therefore, existing bridges would need to be reconfigured or modified 1654 before the above access time bounds can actually be used. 1656 9.5 Half-duplex and shared Demand-Priority networks 1658 In 802.12 networks, communication between end-nodes and hubs and between 1659 the hubs themselves is based on the exchange of link control signals. 1660 These signals are used to control the shared medium access. If a hub, 1661 for example, receives a high-priority request while another hub is in 1662 the process of serving normal-priority requests, then the service of the 1663 latter hub can effectively be pre-empted in order to serve the high- 1664 priority request first. 
After the network has processed all high- 1665 priority requests, it resumes the normal-priority service at the point 1666 in the network at which it was interrupted. 1668 The time needed to preempt normal-priority network service (the high- 1669 priority network access time) is bounded: the bound depends on the 1670 physical layer and on the topology of the shared network. The physical 1671 layer has a significant impact when operating in half-duplex mode as 1672 e.g. used across unshielded twisted-pair cabling (UTP) links, because 1673 link control signals cannot be exchanged while a packet is transmitted 1674 over the link. Therefore the network topology has to be considered 1675 since, in larger shared networks, the link control signals must 1676 potentially traverse several links (and hubs) before they can reach the 1677 hub which possesses the network control. This may delay the preemption 1678 of the normal priority service and hence increase the upper bound that 1679 may be guaranteed. 1681 Upper bounds on the high-priority access time are given below for a UTP 1682 physical layer and a cable length of 100 m between all end-nodes and 1683 hubs using a maximum propagation delay of 570ns as defined in [15]. 1684 These values consider the worst case signaling overhead and assume the 1685 transmission of maximum-sized normal-priority data packets while the 1686 normal-priority service is being pre-empted. 1688 Type Speed Max Pkt Max Access 1689 Length Latency 1691 Demand Priority 100Mbps, 802.3pkt, UTP 120us 253us 1692 802.5pkt, UTP 360us 733us 1694 Table 5 - Half-duplex switched Demand-Priority UTP access latency 1696 Shared 802.12 topologies can be classified using the hub cascading level 1697 "N". The simplest topology is the single hub network (N = 1). For a UTP 1698 physical layer, a maximum cascading level of N = 5 is supported by the 1699 standard. Large shared networks with many hundreds nodes can however 1700 already be built with a level 2 topology. The bandwidth manager could be 1701 informed about the actual cascading level by using network management 1702 mechanisms and use this information in its admission control algorithms. 1704 Type Speed Max Pkt Max Access Topology 1705 Length Latency 1707 Demand Priority 100Mbps, 802.3pkt 120us 262us N=1 1708 120us 554us N=2 1709 120us 878us N=3 1710 120us 1.24ms N=4 1711 120us 1.63ms N=5 1713 Demand Priority 100Mbps, 802.5pkt 360us 722us N=1 1714 360us 1.41ms N=2 1715 360us 2.32ms N=3 1716 360us 3.16ms N=4 1717 360us 4.03ms N=5 1719 Table 6 - Shared Demand-Priority UTP access latency 1721 In contrast to UTP, the fibre-optic physical layer operates in dual 1722 simplex mode: Upper bounds for the high-priority access time are given 1723 below for 2 km multimode fibre links with a propagation delay of 10 us. 1725 Type Speed Max Pkt Max Access 1726 Length Latency 1728 Demand Priority 100Mbps,802.3pkt,Fibre 120us 139us 1729 802.5pkt,Fibre 360us 379us 1731 Table 7 - Half-duplex switched Demand-Priority Fibre access latency 1733 For shared-media with distances of 2km between all end-nodes and hubs, 1734 the 802.12 standard allows a maximum cascading level of 2. Higher levels 1735 of cascaded topologies are supported but require a reduction of the 1736 distances [15]. 
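These topology-dependent bounds are exactly the kind of information a
bandwidth manager, once told the cascading level via management, could
fold into the fixed-delay term it uses for Guaranteed Service admission
control. A small sketch (values transcribed from Table 6; the function
itself is illustrative only):

   /* Worst-case high-priority access time (microseconds) for a shared
    * 802.12 UTP topology carrying 802.3-format packets, indexed by
    * the hub cascading level N = 1..5 (values from Table 6). */
   static const double dp_utp_8023_access_us[6] =
       { 0.0, 262.0, 554.0, 878.0, 1240.0, 1630.0 };

   /* Return the access-latency bound to advertise for this topology,
    * or a negative value if the cascading level is unknown. */
   static double dp_access_bound_us(int cascade_level)
   {
       if (cascade_level < 1 || cascade_level > 5)
           return -1.0;
       return dp_utp_8023_access_us[cascade_level];
   }

The corresponding figures for fibre topologies, where the maximum
cascading level is 2, are given in Table 8 below.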
   Type              Speed                Max Pkt  Max Access  Topology
                                          Length   Latency

   Demand Priority   100Mbps, 802.3pkt    120us    160us       N=1
                                          120us    202us       N=2

   Demand Priority   100Mbps, 802.5pkt    360us    400us       N=1
                                          360us    682us       N=2

        Table 8 - Shared Demand-Priority Fibre access latency

The bounded access delay and deterministic network access allow the
support of service commitments required for Guaranteed Service and
Controlled Load, even on shared-media topologies. The support of just
two priority levels in 802.12, however, limits the number of services
that can be implemented simultaneously across the network.

10 Justification

An obvious criticism is that this whole model is too complex: it is
what RSVP already does, so why do we think we can do better by
reinventing the solution to this problem at layer-2?

The key is that there are a number of simple layer-2 scenarios that
cover a considerable proportion of the real QoS problems that will
occur: a solution that covers nearly all of the problems at
significantly lower cost is beneficial. Full RSVP/int-serv with
per-flow queueing in strategically positioned high-function switches or
routers may be needed to solve all issues completely, but devices
implementing the architecture described in this document will allow a
significantly simpler network.

11 Summary

This document has specified a framework for providing Integrated
Services over shared and switched LAN technologies. The ability to
provide QoS guarantees necessitates some form of admission control and
resource management. The requirements and goals of a resource
management scheme for subnets have been identified and discussed; we
refer to the entire resource management scheme as a Bandwidth Manager.
Architectural considerations were discussed and examples were provided
to illustrate possible implementations of a Bandwidth Manager. Some of
the issues involved in mapping the services from higher layers to the
link layer have also been discussed. Accompanying memos from the ISSLL
working group address service mapping issues [13] and provide a
protocol specification for the Bandwidth Manager protocol [14], based
on the requirements and goals discussed in this document.

12 References

[1]  IEEE Standards for Local and Metropolitan Area Networks: Overview
     and Architecture, ANSI/IEEE Std 802.1.

[2]  ISO/IEC 10038, ANSI/IEEE Std 802.1D-1993, "MAC Bridges".

[3]  ISO/IEC 15802-3, "Information technology - Telecommunications and
     information exchange between systems - Local and metropolitan
     area networks - Common specifications - Part 3: Media Access
     Control (MAC) Bridges" (current draft available as IEEE
     P802.1p/D8).

[4]  IEEE Standards for Local and Metropolitan Area Networks: Draft
     Standard for Virtual Bridged Local Area Networks, P802.1Q/D7,
     October 1997.

[5]  R. Braden, L. Zhang, S. Berson, S. Herzog and S. Jamin, "Resource
     Reservation Protocol (RSVP) - Version 1 Functional
     Specification", RFC 2205, September 1997.

[6]  J. Wroclawski, "Specification of the Controlled Load Network
     Element Service", RFC 2211, September 1997.

[7]  S. Shenker, C. Partridge and R. Guerin, "Specification of
     Guaranteed Quality of Service", RFC 2212, September 1997.

[8]  R. Braden, D. Clark and S. Shenker, "Integrated Services in the
     Internet Architecture: An Overview", RFC 1633, June 1994.

[9]  J. Wroclawski, "The Use of RSVP with IETF Integrated Services",
     RFC 2210, September 1997.

[10] S. Shenker and J. Wroclawski, "Network Element Service
     Specification Template", Internet Draft.

[11] S. Shenker and J. Wroclawski, "General Characterization
     Parameters for Integrated Service Network Elements", RFC 2215,
     September 1997.

[12] L. Delgrossi and L. Berger (Editors), "Internet Stream Protocol
     Version 2 (ST2) Protocol Specification - Version ST2+", RFC 1819,
     August 1995.

[13] M. Seaman, A. Smith and E. Crawley, "Integrated Service Mappings
     on IEEE 802 Networks", Internet Draft, November 1997.

[14] D. Hoffman et al., "SBM (Subnet Bandwidth Manager): A Proposal
     for Admission Control over Ethernet", Internet Draft, November
     1997.

[15] "Carrier Sense Multiple Access with Collision Detection (CSMA/CD)
     Access Method and Physical Layer Specifications", ANSI/IEEE Std
     802.3-1985.

[16] "Token-Ring Access Method and Physical Layer Specifications",
     ANSI/IEEE Std 802.5-1995.

[17] "A Standard for the Transmission of IP Datagrams over IEEE 802
     Networks", RFC 1042, February 1988.

[18] C. Bisdikian, B. V. Patel, F. Schaffa and M. Willebeek-LeMair,
     "The Use of Priorities on Token-Ring Networks for Multimedia
     Traffic", IEEE Network, Nov/Dec 1995.

[19] "Demand Priority Access Method, Physical Layer and Repeater
     Specification for 100Mbit/s", IEEE Std 802.12-1995.

[20] "Fiber Distributed Data Interface MAC", ANSI Std X3.139-1987.

13 Security Considerations

Implementation of the model described in this memo creates no known new
avenues for malicious attack on the network infrastructure, although
readers are referred to section 2.8 of the RSVP specification [5] for a
discussion of the impact of the use of admission control signaling
protocols on network security.

14 Acknowledgements

Much of the work presented in this document has benefited greatly from
discussion held at the meetings of the Integrated Services over
Specific Link Layers (ISSLL) working group. In particular we would like
to thank Eric Crawley, Don Hoffman and Raj Yavatkar.

Authors' Addresses

   Anoop Ghanwani
   IBM Corporation
   P.O. Box 12195
   Research Triangle Park, NC 27709
   USA
   +1 (919) 254-0260
   anoop@raleigh.ibm.com

   J. Wayne Pace
   IBM Corporation
   P.O. Box 12195
   Research Triangle Park, NC 27709
   USA
   +1 (919) 254-4930
   pacew@raleigh.ibm.com

   Vijay Srinivasan
   IBM Corporation
   P.O. Box 12195
   Research Triangle Park, NC 27709
   USA
   +1 (919) 254-2730
   vijay@raleigh.ibm.com

   Andrew Smith
   Extreme Networks
   10460 Bandley Drive
   Cupertino, CA 95014
   USA
   +1 (408) 863 2821
   andrew@extremenetworks.com

   Mick Seaman
   3Com Corp.
   5400 Bayfront Plaza
   Santa Clara, CA 95052-8145
   USA
   +1 (408) 764 5000
   mick_seaman@3com.com