Internet Draft                                               Mick Seaman
Expires November 1997                                               3Com
draft-ietf-issll-802-01.txt                                 Andrew Smith
                                                        Extreme Networks
                                                            Eric Crawley
                                                    Gigapacket Networks
                                                               June 1997

          Integrated Services over IEEE 802.1D/802.1p Networks

Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups.
Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress."

Please check the I-D abstract listing contained in each Internet Draft directory to learn the current status of this or any other Internet Draft.

Abstract

This document describes the support of IETF Integrated Services over LANs built from IEEE 802 network segments which may be interconnected by draft standard IEEE P802.1p switches.

It describes the practical capabilities and limitations of this technology for supporting Controlled Load [8] and Guaranteed Service [9] using the inherent capabilities of the relevant 802 technologies [5], [6] etc. and the proposed 802.1p queuing features in switches. IEEE P802.1p [2] is a superset of the existing IEEE 802.1D bridging specification. This document provides a functional model for the layer 3 to layer 2 and user-to-network dialogue which supports admission control, and defines requirements for interoperability between switches. The special case of such networks where the sender and receiver are located on the same segment is also discussed.

This scheme expands on the ISSLL over 802 LANs framework described in [7]. It makes reference to an admission control signaling protocol developed by the ISSLL WG, known as the "Subnet Bandwidth Manager". This is an extension to the IETF's RSVP protocol [4] and is described in a separate document [10].

1. Introduction

The IEEE 802.1 Interworking Task Group is currently enhancing the basic MAC Service provided in Bridged Local Area Networks (aka "switched LANs").
As a supplement to the original IEEE MAC Bridges standard [1], the update P802.1p [2] proposes differential traffic class queuing and access to media on the basis of a "user_priority" signaled in frames.

In this document we
* review the meaning and use of user_priority in LANs and the frame forwarding capabilities of a standard LAN switch.
* examine alternatives for identifying layer 2 traffic flows for admission control.
* review the options available for policing traffic flows.
* derive requirements for consistent traffic class handling in a network of switches, and use these requirements to discuss queue handling alternatives for 802.1p and the way in which these meet administrative and interoperability goals.
* consider the benefits and limitations of this switch-based approach, contrasting it with a full router-based RSVP implementation in terms of complexity, utilisation of transmission resources and administrative controls.

The model used is outlined in the "framework document" [7] which in summary:
* partitions the admission control process into two separable operations:
  * an interaction between the user of the integrated service and the local network elements ("provision of the service" in the terms of 802.1D) to confirm the availability of transmission resources for traffic to be introduced.
  * selection of an appropriate user_priority for that traffic on the basis of the service and service parameters to be supported.
* distinguishes between the user to network interface above and the mechanisms used by the switches ("support of the service"). These include communication between the switches (network to network signaling).
* describes a simple architecture for the provision and support of these services, broken down into components with functional and interface descriptions:
  * a single "user" component: a layer-3 to layer-2 negotiation and translation component for both sending and receiving, with interfaces to other components residing in the station.
  * processes residing in a bridge/switch to handle admission control and mapping requests, including proposals for actual traffic mappings to user_priority values.
* identifies a need for a signaling protocol to carry admission control requests between devices.

It will be noted that this document is written from the pragmatic viewpoint that there will be a widely deployed network technology and that we are evaluating it for its ability to support some or all of the defined IETF integrated services: this approach is intended to ensure development of a system which can provide useful new capabilities in existing (and soon to be deployed) network infrastructures.

2. Goals and Assumptions

It is assumed that typical subnetworks that are concerned about quality of service will be "switch-rich": that is to say, most communication between end stations using integrated services support will pass through at least one switch. The mechanisms and protocols described will be trivially extensible to communicating systems on the same shared media, but it is important not to allow problem generalisation to complicate the practical application that we target: the access characteristics of Ethernet and Token-Ring LANs are forcing a trend to switch-rich topologies, along with MAC enhancements to ensure access predictability on half-duplex switch-to-switch links.
Note that we illustrate most examples in this document using RSVP as an "upper-layer" QoS signaling protocol, but there are actually no real dependencies on this protocol: RSVP could be replaced by some other dynamic protocol, or else the requests could be made by network management or other policy entities. In any event, no extra modifications to the RSVP protocol are assumed.

There may be a heterogeneous mixture of switches with different capabilities, all compliant with IEEE 802.1p, but implementing queuing and forwarding mechanisms ranging from simple 2-queue-per-port strict priority up to more complex multi-queue (maybe even per-flow) WFQ or other algorithms.

The problem is broken down into smaller independent pieces: this may lead to sub-optimal usage of the network resources, but we contend that such benefits are often equivalent to very small improvements in network efficiency in a LAN environment. Therefore, it is a goal that the switches in the network operate using a much simpler set of information than the RSVP engine in a router. In particular, it is assumed that such switches do not need to implement per-flow queuing and policing (although they might do so).

It is a fundamental assumption of the int-serv model that flows are isolated from each other throughout their transit across a network. Intermediate queueing nodes are expected to police the traffic to ensure that it conforms to the pre-agreed traffic flow specification. In the architecture proposed here for mapping to layer-2, we diverge from that assumption in the interests of simplicity: the policing function is assumed to be implemented in the transmit schedulers of the layer-3 devices (end stations, routers).
In the LAN environments envisioned, it is reasonable to assume that end stations are "trusted" to adhere to their agreed contracts at the inputs to the network, and that we can afford to over-allocate resources at admission-control time to compensate for the inevitable extra jitter/bunching introduced by the switched network itself.

These divergences have some implications for the receiver heterogeneity that can be supported and the statistical multiplexing gains that might have been exploited, especially for Controlled Load flows.

3. User Priority and Frame Forwarding in IEEE 802 Networks

3.1 General IEEE 802 Service Model

User_priority is a value associated with the transmission and reception of all frames in the IEEE 802 service model: it is supplied by the sender that is using the MAC service, and is provided along with the data to a receiver using the MAC service. It may or may not be actually carried over the network: Token-Ring/802.5 carries this value (encoded in its FC octet); basic Ethernet/802.3 does not. 802.1p defines a way to carry this value over the network in a consistent way on Ethernet, Token Ring, FDDI or other MAC-layer media using an extended frame format. The usage of user_priority is summarised below but is more fully described in section 2.5 of 802.1D [1] and in 802.1p [2] "Support of the Internal Layer Service by Specific MAC Procedures"; readers are referred to these documents for further information.

If the user_priority is carried explicitly in packets, its utility is as a simple label in the data stream enabling packets in different classes to be easily discriminated by downstream nodes without their having to parse the packet in more detail.
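To make the labelling concrete, the following sketch recovers the user_priority label from a raw Ethernet frame, assuming the 802.1Q/p extended frame format (TPID 0x8100 in the EtherType position, followed by a 16-bit TCI whose top three bits are the user_priority). The function name and the default-value handling are illustrative, not taken from any specification:

```python
TPID_8021Q = 0x8100  # tag protocol identifier in the EtherType position

def user_priority(frame: bytes, default: int = 0) -> int:
    """Return the user_priority (0-7) signalled in an Ethernet frame."""
    if len(frame) >= 16:
        ethertype = (frame[12] << 8) | frame[13]
        if ethertype == TPID_8021Q:
            tci = (frame[14] << 8) | frame[15]
            return tci >> 13              # top 3 bits of the TCI
    return default                        # untagged: regenerate locally

# A tagged frame with user_priority 5 (TCI = 5 << 13, VID = 1):
hdr = bytes(12) + bytes([0x81, 0x00, 0xA0, 0x01])
assert user_priority(hdr) == 5
```

The `default` path mirrors the regeneration behaviour discussed below for basic Ethernet/802.3, where no explicit field exists and a receiver or switch must assign a value by local policy.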
Apart from making the job of desktop or wiring-closet switches easier, an explicit field means they do not have to change hardware or software as the rules for classifying packets evolve (e.g. based on new protocols or new policies). More sophisticated layer-3 switches, perhaps deployed towards the core of a network, can add value here by performing the classification more accurately and, hence, utilising network resources more efficiently or providing better protection of flows from one another: this appears to be a good economic choice since there are likely to be very many more desktop/wiring-closet switches in a network than switches requiring layer-3 functionality.

The IEEE 802 specifications make no assumptions about how user_priority is to be used by end stations or by the network. In particular, it can only be considered a "priority" in a loose sense, although the current 802.1p draft defines static priority queuing as the default mode of operation of switches that implement multiple queues (user_priority is defined as a 3-bit quantity, so strict priority queueing would give value 7 = high priority, 0 = low priority). The general switch algorithm is as follows: packets are placed onto a particular queue based on the received user_priority (taken from the packet if an 802.1p header or an 802.5 network was used, invented according to some local policy if not). The selection of queue is based on a mapping from user_priority [0,1,2,3,4,5,6 or 7] onto the number of available queues. Note that switches may implement any number of queues from 1 upwards, and it may not be visible externally, except through any advertised switch parameters and the switch's admission control behaviour, which user_priority values get mapped to the same vs. different queues internally. Other algorithms that a switch might implement include, e.g.,
weighted fair queueing or round robin.

In particular, IEEE makes no recommendations about how a sender should select the value for user_priority: one of the main purposes of this document is to propose such usage rules, and to describe how to communicate the semantics of the values between switches, end stations and routers. For the remainder of this document we use the term "traffic class" when discussing the treatment of packets with one of the user_priority values.

3.2 Ethernet/802.3

There is no explicit traffic class or user_priority field carried in Ethernet packets. This means that user_priority must be regenerated at a downstream receiver or switch according to some defaults or by parsing further into higher-layer protocol fields in the packet. Alternatively, the IEEE 802.1Q encapsulation [11] may be used, which provides an explicit traffic class field on top of a basic MAC format.

For the different IP packet encapsulations used over Ethernet/802.3, it will be necessary to adjust any admission-control calculations according to the framing and padding requirements:

   Encapsulation                              Framing Overhead  IP MTU
                                                  bytes/pkt      bytes

   IP EtherType            (ip_len<=46 bytes)     64-ip_len      1500
                           (1500>=ip_len>=46)         18         1500

   IP EtherType over 802.1p/Q   (ip_len<=42)      64-ip_len      1500*
                           (1500>=ip_len>=42)         22         1500*

   IP EtherType over LLC/SNAP   (ip_len<=40)      64-ip_len      1492
                           (1500>=ip_len>=40)         24         1492

   * Note that the draft IEEE 802.1Q specification exceeds the IEEE
     802.3 maximum packet length values by 4 bytes.

3.3 Token-Ring/802.5

The token ring standard [6] provides a priority mechanism that can be used to control both the queuing of packets for transmission and the access of packets to the shared media. The priority mechanisms are implemented using bits within the Access Control (AC) and the Frame Control (FC) fields of an LLC frame.
The first three bits of the AC field, the Token Priority bits, together with the last three bits of the AC field, the Reservation bits, regulate which stations get access to the ring. The last three bits of the FC field of an LLC frame, the User Priority bits, are obtained from the higher layer in the user_priority parameter when it requests transmission of a packet. This parameter also establishes the Access Priority used by the MAC. The user_priority value is conveyed end-to-end by the User Priority bits in the FC field and is typically preserved through Token-Ring bridges of all types. In all cases, 0 is the lowest priority.

Token-Ring also uses a concept of Reserved Priority: this is the priority value which a station uses to reserve the token for its next transmission on the ring. When a free token is circulating, only a station having an Access Priority greater than or equal to the Reserved Priority in the token will be allowed to seize the token for transmission. Readers are referred to [14] for further discussion of this topic.

A token ring station is theoretically capable of separately queuing each of the eight levels of requested user priority and then transmitting frames in order of priority. A station sets the Reservation bits according to the user priority of frames that are queued for transmission in the highest priority queue. This allows the access mechanism to ensure that the frame with the highest priority throughout the entire ring will be transmitted before any lower priority frame.
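The bit positions described above can be sketched directly as mask operations. This is a minimal illustration of the layout as stated in this section (AC: Token Priority in the first three bits, Reservation in the last three; FC of an LLC frame: User Priority in the last three bits); the helper names and the token-seizure check are illustrative only:

```python
def token_priority(ac: int) -> int:
    # First (most significant) three bits of the AC octet.
    return (ac >> 5) & 0x7

def reservation(ac: int) -> int:
    # Last three bits of the AC octet.
    return ac & 0x7

def fc_user_priority(fc: int) -> int:
    # Last three bits of the FC octet of an LLC frame.
    return fc & 0x7

def may_seize_token(access_priority: int, ac: int) -> bool:
    # Per the rule above: a station may seize a free token only if its
    # Access Priority is >= the Reserved Priority carried in the token.
    return access_priority >= reservation(ac)

# AC = 0b101_00_011: Token Priority 5, Reservation 3.
assert token_priority(0b10100011) == 5
assert reservation(0b10100011) == 3
```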
Annex I to the IEEE 802.5 token ring standard recommends that stations send/relay frames as follows:

   Application               user_priority
   non-time-critical data          0
   -                               1
   -                               2
   -                               3
   LAN management                  4
   time-sensitive data             5
   real-time-critical data         6
   MAC frames                      7

To reduce frame jitter associated with high-priority traffic, the annex also recommends that only one frame be transmitted per token and that the maximum information field size be 4399 octets whenever delay-sensitive traffic is traversing the ring. Most existing implementations of token ring bridges forward all LLC frames with a default access priority of 4. Annex I recommends that bridges forward LLC frames that have a user priority greater than 4 with a reservation equal to the user priority (although the draft IEEE P802.1p [2] permits network management to override this behaviour). The capabilities provided by token ring's user and reservation priorities and by IEEE 802.1p can provide effective support for Integrated Services flows that request QoS using RSVP. These mechanisms can provide, with few or no additions to the token ring architecture, bandwidth guarantees with the network flow control necessary to support such guarantees.

For the different IP packet encapsulations used over Token Ring/802.5, it will be necessary to adjust any admission-control calculations according to the framing requirements:

   Encapsulation                 Framing Overhead  IP MTU
                                     bytes/pkt      bytes

   IP EtherType over 802.1p/Q          29           4370*
   IP EtherType over LLC/SNAP          25           4370*

   * The suggested MTU from RFC 1042 [13] is 4464 bytes, but there are
     issues related to discovering the maximum supported MTU between
     any two points both within and between Token Ring subnets. We
     recommend here an MTU consistent with the 802.5 Annex I
     recommendation.
4. Integrated services through layer-2 switches

4.1 Summary of switch characteristics

For the sake of illustration, we divide layer-2 bridges/switches into several categories based on the level of sophistication of their QoS and software protocol capabilities. These categories are not intended to represent all possible implementation choices but, instead, to aid discussion of what QoS capabilities can be expected from a network made of these devices.

Class I   - 802.1p priority queueing between traffic classes.
          - No multicast heterogeneity.
          - 802.1p GARP/GMRP pruning of individual multicast addresses.

Class II  As (I) plus:
          - can map received user_priority on a per-input-port basis to
            some internal set of canonical values.
          - can map internal canonical values onto transmitted
            user_priority on a per-output-port basis, giving some
            limited form of multicast heterogeneity.
          - maybe implements IGMP snooping for pruning.

Class III As (II) plus:
          - per-flow classification.
          - maybe per-flow policing and/or reshaping.
          - WFQ or other transmit scheduling (probably not per-flow).

4.2 Queueing

Connectionless packet-based networks in general, and LAN-switched networks in particular, work today because of scaling choices in network provisioning. Consciously or (more usually) unconsciously, enough excess bandwidth and buffering is provisioned in the network to absorb the traffic sourced by higher-layer protocols, or to cause their transmission windows to run out, on a statistical basis, so that the network is only overloaded for a short duration and the average expected loading is less than 60% (usually much less).

With the advent of time-critical traffic such overprovisioning has become far less easy to achieve.
Time-critical frames may find themselves queued for annoyingly long periods of time behind temporary bursts of file transfer traffic, particularly at network bottleneck points, e.g. at the 100 Mb/s to 10 Mb/s transition that might occur between the riser to the wiring closet and the final link to the user from a desktop switch. In this case, however, if it is known (guaranteed by application design, merely expected on the basis of statistics, or just that this is all that the network guarantees to support) that the time-critical traffic is a small fraction of the total bandwidth, it suffices to give it strict priority over the "normal" traffic. The worst-case delay experienced by the time-critical traffic is roughly the maximum transmission time of a maximum-length non-time-critical frame: less than a millisecond for 10 Mb/s Ethernet, and well below an end-to-end budget based on human perception times.

When more than one "priority" service is to be offered by a network element, e.g. it supports Controlled Load as well as Guaranteed Service, the queuing discipline becomes more complex. In order to provide the required isolation between the service classes, it will probably be necessary to queue them separately. There is then an issue of how to service the queues; a combination of admission control and maybe weighted fair queuing may be required in such cases. As with the service specifications themselves, it is not the place of this document to specify queuing algorithms, merely to observe that the external behaviour must meet the services' requirements.

4.3 Multicast Heterogeneity

IEEE 802.1D and 802.1p specify a basic model for multicast whereby a switch makes multicast forwarding decisions based on the destination address: this produces a list of output ports to which the packet should be forwarded.
In its default mode, such a switch would use the user_priority value in received packets to enqueue the packets at each output port. All of the classes of switch identified above can support this operation.

At layer-3, the int-serv model allows heterogeneous multicast flows where different branches of a tree can have different types of reservations for a given multicast destination, and even supports the notion that some trees will have some branches with reserved flows and some using best effort (default) service.

If a switch selects per-port output queues based only on the incoming user_priority, as described by 802.1p, it must treat all branches of all multicast sessions within that user_priority class with the same queuing mechanism: no heterogeneity is then possible. If a switch were to implement a separate user_priority mapping at each output port, as described under "Class II switch" above, then some limited form of receiver heterogeneity can be supported, e.g. forwarding of traffic as user_priority 4 on one branch where receivers have performed admission control reservations and as user_priority 0 on one where they have not. We assume that per-user_priority queuing, without taking account of input or output ports, is the minimum standard functionality for systems in a LAN environment (Class I switch, as defined above). More functional layer-2 switches or even layer-3 switches (a.k.a. routers) can be used if even more flexible forms of heterogeneity are considered necessary: their behaviour is well standardised.

4.4 Override of incoming user_priority

In some cases, a network administrator may not trust the user_priority values contained in packets from a source and may wish to map these into some more suitable set of values.
Alternatively, due perhaps to equipment limitations or transition periods, values may need to be mapped to/from different regions of a network.

Some switches may implement such a function on input, mapping received user_priority into some internal set of values (this table is known in 802.1p as the "user_priority regeneration table"). These values can then be mapped using the output table described above onto outgoing user_priority values: these same mappings must also be used when applying admission control to requests that use the user_priority values (see e.g. [10]). More sophisticated approaches may also be envisioned where a device polices traffic flows and adjusts their onward user_priority based on their conformance to the admitted traffic flow specifications.

4.5 Remapping of non-conformant aggregated flows

One other topic under discussion in the int-serv context is how to handle the traffic for data flows from sources that are exceeding their currently agreed traffic contract with the network. An approach that shows much promise is to treat such traffic with "somewhat less than best effort" service in order to protect traffic that is normally given "best effort" service from having to back off (such traffic is often "adaptive", using TCP or other congestion control algorithms, and it would be unfair to penalise it due to badly behaved traffic from reserved flows, which are usually set up by non-adaptive applications).

A solution here might be to assign normal best effort traffic to one user_priority and to label excess non-conformant traffic as a "lower" user_priority. This topic is further discussed below.

5. Selecting traffic classes

One fundamental question is "who gets to decide what the classes mean and who gets access to them?"
One approach would be for the meanings of the classes to be "well-known": we would then need to standardise a set of classes, e.g. 1 = best effort, 2 = controlled-load, 3 = guaranteed (loose delay bound, high bandwidth), 4 = guaranteed (slightly tighter delay), etc. Choosing the values to encode in such a table in end stations, in isolation from the network to which they are connected, is problematic: one approach could be to define one user_priority value per int-serv service and leave it at that (reserving the rest of the combinations for future traffic classes - there are sure to be plenty!).

We propose here a more flexible mapping: clients ask "the network" which user_priority traffic class to use for a given traffic flow, as categorised by its flow-spec and layer-2 endpoints. The network provides a value back to the requester which is appropriate to the current network topology, load conditions, other admitted flows etc. The task of configuring switches with this mapping (e.g. through network management, a switch-switch protocol, or via some network-wide QoS-mapping directory service) is an order of magnitude less complex than performing the same function in end stations. Also, when new services (or other network reconfigurations) are added to such a network, the network elements will typically be the ones to be upgraded with new queuing algorithms etc. and can be provided with new mappings at this time.

Given a new session or "flow" requiring some QoS support, a client then needs answers to the following questions:

1. Which traffic class do I add this flow to?
   The client needs to know how to label the packets of the flow as it places them into the network.

2. Who do I ask/tell?
   The proposed model is that a client ask "the network" which user_priority traffic class to use for a given traffic flow.
This has several benefits as compared to a model which allows clients to select a class for themselves.

3. How do I ask/tell them?
   A request/response protocol is needed between client and network: in fact, the request can be piggy-backed onto an admission control request, and the response can be piggy-backed onto an admission control acknowledgment. This "one pass" assignment has the benefit of completing the admission control in a timely way and of reducing the exposure to changing conditions which could occur if clients cached the knowledge for extensive periods.

The network (i.e. the first network element encountered downstream from the client) must then answer the following questions:

1. Which traffic class do I add this flow to?
   This is a packing problem, difficult to solve in general, but many simplifying assumptions can be made: presumably some simple form of allocation can be done without a more complex scheme able to dynamically shift flows around between classes.

2. Which traffic class has worst-case parameters which meet the needs of this flow?
   This might be an ordering/comparison problem: which of two service classes is "better" than the other? Again, we can make this tractable by observing that all of the current int-serv classes can be ranked (best effort <= Controlled Load <= Guaranteed Service) in a simple manner. If any classes are implemented in the future that cannot be simply ranked, then the issue can be finessed either by a priori knowledge about what classes are supported or by configuration.

The network must then return the chosen user_priority value to the client.

Note that the client may be an end station, a router, or a first switch acting as a proxy for a client which does not participate in these protocols for whatever reason. Note also that a device, e.g.
a server or router, may choose to implement both the "client" and the "network" portions of this model so that it can select its own user_priority values. Such an implementation would, however, be discouraged unless the device really does have a close tie-in with the network topology and resource allocation policies, but it would work in some cases where there is known over-provisioning of resources.

6. Flow Identification

Several previous proposals for int-serv over lower layers have treated switches very much as a special case of routers: in particular, they assume that switches along the data path will make packet handling decisions based on the RSVP flow and filter specifications and use them to classify the corresponding data packets. However, filtering to the per-flow level becomes cost-prohibitive with increasing switch speed: devices with such filtering capabilities are unlikely to have a very different implementation cost from IP routers, in which case we must question whether a specification oriented toward switched networks is of any benefit at all.

This document proposes that "aggregated flow" identification based on user_priority be the minimum required of switches.

7. Reserving Network Resources - Admission Control

So far we have not discussed admission control. In fact, without admission control it is possible to scratch-build a LAN of some size capable of supporting real-time services, provided that the traffic fits within certain scaling constraints (relative link speeds, numbers of ports etc. - see below). This is not surprising, since it is possible to run a fair approximation to real-time services on small LANs today with no admission control or help from encoded priority bits.

Imagine a campus network providing dedicated 10 Mbps connections to each user.
Each floor of each building supports up to 96 users, organized into groups of 24, with each group supported by a 100 Mbps downlink to a basement switch. Each basement switch concentrates 5 floors (20 x 100 Mbps) and a data center (4 x 100 Mbps) onto a 1 Gbps link to an 8 Gbps central campus switch, which in turn hooks 6 buildings together (with 2 x 1 Gbps full-duplex links to support a corporate server farm). Such a network could support 1.5 Mbps of voice/video from every user to any other user or (for half the population) to the server farm, provided the video ran at high priority: this gives 3000 users, all with desktop video conferencing running alongside file transfer/email etc. In such a network, RSVP's role would be limited to ensuring resource availability at the communicating end stations and for connection to the wide area.

In such a network, a discussion as to the best service policy to apply to high and low priority queues may prove academic: while it is true that "normal" traffic may be delayed by bunches of high priority frames, queuing theory tells us that the average queue occupancy in the high priority queue at any switch port will be somewhat less than 1; with real user behaviour (i.e. not everyone watching video conferences all the time) it should be far less. A cheaper alternative to buying equipment with a fancy queue service policy may be to buy equipment with more bandwidth, lowering the average link utilisation by a few per cent.

In practice, a number of objections can be made to such a simple solution. There may be long-established, expensive equipment in the network which does not provide all the bandwidth required. There will be considerable concern over who is allowed to say what traffic is high priority. There may be a wish to give some form of "prioritised" service to crucial business applications, above that given to experimental video-conferencing.
The task that faces us is to provide a degree of control without making that control so elaborate to implement that the control-oriented solution is simply rejected in favor of providing yet more bandwidth at a lower cost.

The proposed admission control mechanism requires a query-response interaction, with the network returning a "YES/NO" answer and, if successful, a user_priority value with which to tag the data frames of this flow.

The relevant int-serv specifications describe the parameters which need to be considered when making an admission control decision at each node in the network path between sender and receiver. We discuss below how to calculate these parameters for different network technologies, but we do not specify admission control algorithms or mechanisms for progressing the admission control process across the network. One such mechanism, SBM, is described in [10].

Where there are multiple mechanisms in use for allocating resources, e.g. some combination of SBM and network management, it will be necessary to ensure that network resources are partitioned amongst the different mechanisms in some way: this could be by configuration, or perhaps by having the mechanisms allocate from a common resource pool within any device.

8. Mapping of integrated services to layer-2 in layer-3 devices

8.1 Layer-3 client

We assume the same client model as int-serv and RSVP, where we use the term "client" to mean the entity handling QoS in the layer-3 device at each end of a layer-2 hop (e.g. end station, router). The sending client itself is responsible for local admission control and for scheduling packets onto its link in accordance with the service agreed. Just as in the int-serv model, this involves per-flow schedulers (a.k.a. shapers) in every such data source.
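Such a per-flow shaper is conventionally modelled as a token bucket driven by the flow's Tspec. The following is a minimal sketch of that idea - the class name, interface and parameter choices are our own illustration, not taken from this document:

```python
class TokenBucketShaper:
    """Per-flow shaper: a packet of pkt_len bytes conforms once the
    bucket (rate r bytes/s, depth b bytes) holds enough tokens."""

    def __init__(self, rate, depth):
        self.rate = float(rate)     # r: sustained rate, bytes/second
        self.depth = float(depth)   # b: maximum burst, bytes
        self.tokens = float(depth)  # bucket starts full
        self.last = 0.0             # time of last update, seconds

    def delay_for(self, pkt_len, now):
        """Seconds to hold this packet before it conforms to the Tspec."""
        # refill at rate r, capped at the bucket depth b
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len  # conformant: send immediately
            return 0.0
        wait = (pkt_len - self.tokens) / self.rate
        # model the packet leaving at now + wait with an empty bucket
        self.tokens = 0.0
        self.last = now + wait
        return wait

# e.g. a 1 Mbps flow (125,000 bytes/s) allowed a 1500-byte burst
shaper = TokenBucketShaper(rate=125_000, depth=1500)
```

A scheduler in the driver would hold each packet for `delay_for()` seconds before transmission, guaranteeing that the flow's output never exceeds its declared burstiness.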
The client runs an RSVP process which presents a session establishment interface to applications, signals RSVP over the network, programs a scheduler and classifier in the driver, and interfaces to a policy control module. In particular, RSVP also interfaces to a local admission control module: it is this entity that we focus on here.

The following diagram is taken from the RSVP specification [4]:

       _____________________________
      |  _______                    |
      | |       |   _______         |
      | |Appli- |  |       |  RSVP
      | | cation|  | RSVP <-------------------->
      | |       <-->       |        |
      | |       |  |process|  _____ |
      | |_._____|  |       -->Polcy||
      |   |        |__.__._|  |Cntrl||
      |   |data       |  |    |_____||
      |===|===========|==|==========|
      |   |   --------|  |    _____ |
      |   |  |           ---->Admis||
      |  _V__V_   ___V____    |Cntrl||
      | |      | |        |   |_____||
      | |Class-| | Packet |         |
      | | ifier|==>Schedulr|====================>
      | |______| |________|         | data
      |                             |
      |_____________________________|

            Figure 1 - RSVP in Sending Hosts

Note that we illustrate examples in this document using RSVP as the "upper-layer" signaling protocol, but there are no actual dependencies on this protocol: RSVP could be replaced by some other dynamic protocol, or else the requests could be made by network management or other policy entities.

8.2 Requests to layer-2

The local admission control entity within a client is responsible for mapping these layer-3 requests into layer-2 language.

The upper-layer entity requests from ISSLL:

   "May I reserve for traffic with <traffic characteristics> with
   <performance requirements> from <here> to <there> and how should I
   label it?"

where
   <traffic characteristics> = Flow Spec, Tspec, Rspec (e.g.
                               bandwidth, burstiness, MTU etc.)
   <performance requirements> = latency, jitter bounds etc.
   <here>                    = IP address(es)
   <there>                   = IP address(es) - may be multicast

8.3 Sender

The ISSLL functionality in the sender is illustrated below and may be summarised as:
* maps the endpoints of the conversation to layer-2 addresses in the LAN, so it can determine where the traffic is really going (probably by reference to the ARP protocol cache for unicast, or an algorithmic mapping for multicast destinations).
* applies local admission control on the outgoing link and driver.
* formats an SBM request to the network with the mapped addresses and filter/flow specs.
* receives the response from the network and reports the YES/NO admission control answer back to the upper-layer entity, along with any negotiated modifications to the session parameters.
* stores any resulting user_priority to be associated with this session in an "802 header" lookup table for use when sending any future data packets.

          from IP       from RSVP
      ______|____________|____________
     |      |            |            |
     |    __V____     ___V___         |
     |   |       |   |       |        |
     |   | Addr  |<->|       |   SBM signaling
     |   |mapping|   |  SBM  |<------------------------>
     |   |_______|   |Client |        |
     |    ___|___    |       |        |
     |   |       |<->|       |        |
     |   |  802  |   |_______|        |
     |   | header|   /   |            |
     |   |_______|  /    |            |
     |       |     /     |     _____  |
     |       +----/      |  +->|Local| |
     |    __V_V_     ____V___  |Admis| |
     |   |      |   |        | |Cntrl| |
     |   |Class-|   | Packet | |_____| |
     |   | ifier|==>|Schedulr|======================>
     |   |______|   |________|    | data
     |________________________________|

        Figure 2 - ISSLL in End-station Sender

ISSLL manageable objects in the sender:
* 802 header table
* Local admission control resource status
* L2 additions to classifier/scheduler int-serv tables

8.4 Receiver

The ISSLL functionality in the receiver is a good deal simpler. It is summarised below and illustrated by the following picture:
* handles any received SBM protocol indications.
* applies local admission control to see if a request can be supported with appropriate local receive resources.
* passes indications up to RSVP if OK.
* accepts confirmations from RSVP and relays them back via SBM signaling towards the requester.
* may program a receive classifier and scheduler, if any is used, to identify traffic classes of received packets and accord them appropriate treatment, e.g. reserving some buffers for particular traffic classes.
* programs the receiver to strip any 802 header information from received packets.

                 to RSVP      to IP
                    ^           ^
      ______________|___________|___________
     |              |           |           |
     |            __|____       |           |
     |           |       |      |           |
  SBM signaling  |  SBM  |   ___|___        |
 <---------------|Client |  | Strip |       |
     |           |_______|  |802 hdr|       |
     |              |    \  |_______|       |
     |           ___v___  \     ^           |
     |          | Local |  \    |           |
     |          | Admis |   \   |           |
     |          | Cntrl |    \  |           |
     |          |_______|     \ |           |
     |           ______      v__|______     |
     |          |Class-|    | Packet   |    |
 ==============>| ifier|==>| Scheduler |    |
      data      |______|    |__________|    |
     |______________________________________|

       Figure 3 - ISSLL in End-station Receiver

9. Layer-2 Switch Functions

9.1 Switch Model

In this model of layer-2 switch behaviour, we define the following entities within the switch:

* Local admission control - one of these on each port accounts for the available bandwidth on the link attached to that port. For half-duplex links, this involves taking account of the resources allocated to both transmit and receive flows. For full-duplex links, the input port accountant's task is trivial.

* Input SBM module - one instance on each port performs the "network" side of the signaling protocol for peering with clients or other switches. It also holds knowledge of the mappings of int-serv classes to user_priority.

* SBM propagation - relays requests that have passed admission control at the input port to the relevant output ports' SBM modules.
This will require access to the switch's forwarding table (the layer-2 "routing table" - cf. the RSVP model) and port spanning-tree states.

* Output SBM module - forwards requests to the next layer-2 or layer-3 network hop.

* Classifier, Queueing and Scheduler - these functions are basically as described by the Forwarding Process of IEEE 802.1p (see section 3.7 of [2]). The Classifier module identifies the relevant QoS information from incoming packets and uses this, together with the normal bridge forwarding database, to decide to which output queue of which output port to enqueue the packet. In Class I switches, this information is the "regenerated user_priority" parameter, which has already been decoded by the receiving MAC service and potentially re-mapped by the 802.1p forwarding process (see the description in section 3.7.3 of [2]). This does not preclude more sophisticated classification rules, which may be applied in more complex Class III switches, e.g. matching on individual int-serv flows.

The Queueing and Scheduler module holds the output queues for ports and provides the algorithm for servicing the queues for transmission onto the output link in order to provide the promised int-serv service. Switches will implement one or more output queues per port, and all will implement at least a basic strict priority dequeueing algorithm as their default, in accordance with 802.1p.

* Ingress traffic class mapper and policing - as described in 802.1p section 3.7. This optional module may check whether the data within traffic classes conform to the patterns currently agreed: switches may police this and discard or re-map packets. The default behaviour is to pass things through unchanged.

* Egress traffic class mapper - as described in 802.1p section 3.7. This optional module may apply re-mapping of traffic classes, e.g. on a per-output-port basis.
The default behaviour is to pass things through unchanged.

These entities are shown in the following diagram, which is a superset of the IEEE 802.1D/802.1p bridge model:

                 _______________________________
                |  _____      ______     ______ |
  SBM signaling | |     |    |      |   |      || SBM signaling
 <----------------->| IN |<->| SBM  |<->| OUT |<---------------->
                | | SBM |    | prop.|   | SBM  ||
                | |_____|    |______|   |______||
                |    /  |       ^  /       |    |
  ______________|   /   |       |  |       |    |_____________
 |              \  /   __V__    |  |     __V__  /             |
 |               \    |Local|   |  |    |Local| /             |
 |                \   |Admis|   |  |    |Admis|/              |
 |                 \  |Cntrl|   |  |    |Cntrl|               |
 |   ______         \ |_____|   |  |    |_____|      _____    |
 |  |traff |         \   ___|__  V_______   /        |egrss|  |
 |  |class |          \ |Filter| |Queue & |/         |traff|  |
 |  |map & |===========>|Data-  |=| Packet |========>|class|  |
 |  |police|            |  base | |Schedule|         |map  |  |
 |  |______|            |_______| |________|         |_____|  |
 |_____^______________________________________________|_______|
  data in                                             |data out
 =======+                                             +=======>

            Figure 4 - ISSLL in Switches

9.2 Admission Control

On reception of an admission control request, a switch performs the following actions:

* ingress SBM module translates any received user_priority, or else selects a layer-2 traffic class which appears compatible with the request and whose use does not violate any administrative policies in force. In effect, it matches the requested service against those available in each of the user_priority classes and chooses the "best" one. It ensures that, if the reservation is successful, the selected value is passed back to the client.
* ingress SBM observes the current state of allocation of resources on the input port/link and then determines whether the new resource allocation from the mapped traffic class would be excessive. If accepted so far, the request is passed to the reservation propagator.
* reservation propagator relays the request to the bandwidth accountants on each of the switch's outbound links to which this reservation would apply (an implied interface to the routing/forwarding database).
* egress bandwidth accountant observes the current state of allocation of queueing resources on its outbound port and of bandwidth on the link itself, and determines whether the new allocation would be excessive. Note that this is only the local decision of this switch hop: each further layer-2 hop through the network gets a chance to veto the request as it passes along.
* the request, if accepted by this switch, is then passed on down the line on each output link selected. Any user_priority described in the forwarded request must be translated according to any egress mapping table.
* if accepted, the switch must notify the client of the user_priority to use for packets belonging to this flow. Note that this is only a "provisional YES" - we assume an optimistic approach here: switches further along the path can still say "NO" later.
* if this switch wishes to reject the request, it can do so by notifying the original client (by means of its layer-2 address).

10. Mappings from int-serv service models to IEEE 802

It is assumed that admission control will be applied when deciding whether or not to admit a new flow through a given network element, and that a device sending onto a link will be proxying the parameters and admission control decisions on behalf of that link: this process will require the device to be able to determine (by estimation, measurement or calculation) several parameters. It is assumed that details of the potential flow are provided to the device by some means (e.g. a signaling protocol, or network management). The service definition specifications themselves provide some implementation guidance as to how to calculate some of these quantities.
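The per-link bandwidth accounting performed by the ingress and egress accountants above can be sketched as follows. This is an illustrative model only: the class name, and the policy of making only a fraction of the raw link bandwidth reservable, are our assumptions rather than requirements of this document:

```python
class LinkAccountant:
    """Tracks reserved bandwidth on one link; admits a flow only if
    the sum of promised rates stays within a configured ceiling."""

    def __init__(self, link_bps, reservable_fraction=0.75):
        # Leave headroom for best-effort traffic by making only a
        # fraction of the link reservable (an illustrative policy).
        self.ceiling = link_bps * reservable_fraction
        self.reserved = 0.0
        self.flows = {}

    def admit(self, flow_id, rate_bps):
        """Return True (and commit the resources) if the flow fits."""
        if self.reserved + rate_bps > self.ceiling:
            return False                  # this hop vetoes the request
        self.flows[flow_id] = rate_bps
        self.reserved += rate_bps
        return True

    def release(self, flow_id):
        """Session tear-down: return the flow's bandwidth to the pool."""
        self.reserved -= self.flows.pop(flow_id, 0.0)

# e.g. a dedicated 10 Mbps link with 7.5 Mbps reservable
link = LinkAccountant(10_000_000)
assert link.admit("voice-1", 1_500_000)
```

Each hop runs such an accountant independently, which is why an upstream "provisional YES" can still be vetoed downstream.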
The accuracy of calculation of these parameters may not be very critical: indeed, it is an assumption of this model's use with relatively simple Class I switches that they merely provide values which describe the device and admit flows conservatively.

10.1 General characterisation parameters

There are some general parameters that a device will need to use and/or supply for all service types:

- Ingress link.
- Egress links and their MTUs, framing overheads and minimum packet sizes (see the media-specific information presented above).
- Available path bandwidth: updated hop-by-hop by any device along the path of the flow.
- Minimum latency.

10.2 Parameters to implement Guaranteed Service

A network element must be able to determine the following parameters:

- Constant delay bound through this device (in addition to any value provided by "minimum latency" above) and up to the receiver at the next network element for the packets of this flow if it were to be admitted: this includes any access latency bound to the outgoing link as well as propagation delay across that link.
- Rate-proportional delay bound through this device and up to the receiver at the next network element for the packets of this flow if it were to be admitted.
- Receive resources that would need to be associated with this flow (e.g. buffering, bandwidth) if it were to be admitted, such that it suffers no packet loss provided it keeps within its supplied Tspec/Rspec.
- Transmit resources that would need to be associated with this flow (e.g. buffering, bandwidth, constant and rate-proportional delay bounds) if it were to be admitted.

10.3 Parameters to implement Controlled Load

A network element must be able to determine the following parameters, which can be extracted from [8]:

- Receive resources that would need to be associated with this flow (e.g.
buffering) if it were to be admitted.
- Transmit resources that would need to be associated with this flow (e.g. buffering) if it were to be admitted.

10.4 Parameters to implement Best Effort

For a network element to implement best effort service, there are no explicit parameters that need to be characterised.

10.5 Mapping to IEEE 802 user_priority

There are many options available for mapping aggregations of flows described by the int-serv service models (Best Effort, Controlled Load and Guaranteed Service are the services considered here) onto user_priority classes. Very little practical experience currently exists with particular mappings to help determine the "best" mapping. In that spirit, the following options are presented in order to stimulate experimentation in this area. Note that this does not dictate what mechanisms/algorithms a network element (e.g. an Ethernet switch) needs to perform to implement these mappings: that is an implementation choice and does not matter so long as the requirements for the particular service model are met. Having said that, we do explore below the ability of a switch implementing strict priority queueing to support some or all of the service types under discussion: this is worthwhile because strict priority is likely to be the most widely deployed dequeueing algorithm in simple switches, as it is the default specified in 802.1p.

In order to reduce administrative problems, such a mapping table is held by *switches* (and routers if desired) but generally not by end-station hosts, and it is a read-write table. The values proposed below are defaults and can be overridden by management control, so long as all switches agree to some extent (the required level of agreement requires further analysis).

It is possible that some form of network-wide lookup service could be implemented that serviced requests from clients, e.g.
traffic_class = getQoSbyName("H.323 video"), and that notified switches of what sorts of traffic categories they were likely to encounter and how to allocate those requests into traffic classes: such mechanisms are for further study.

Proposal: A Simple Scheme

   user_priority    Service

   0                "less than" Best Effort
   1                Best Effort
   2                reserved
   3                reserved
   4                Controlled Load
   5                Guaranteed Service, 100ms bound
   6                Guaranteed Service, 10ms bound
   7                reserved

In this proposal, all traffic that uses the Controlled Load service is mapped to a single 802.1p user_priority, whilst Guaranteed Service traffic is placed into one of two user_priority classes with different delay bounds. Unreserved best effort traffic is mapped to another.

The use of classes 4, 5 and 6 for Controlled Load and Guaranteed Service is somewhat arbitrary so long as they are increasing. Any two classes greater than Best Effort can be used, as long as GS is "greater" than CL, although those proposed here have the advantage that, for transit through 802.1p switches with only two-level strict priority queuing, they both get "high priority" treatment (the current 802.1p default split is 0-3 and 4-7 for a device with two queues). The choice of delay bound is also arbitrary but potentially very significant: it can lead to a much more efficient allocation of resources as well as greater (though still not very good) isolation between flows.

The "less than best effort" class might be useful for devices that wish to tag packets which exceed a committed network capacity and which can optionally be discarded by a downstream device. Note that this is not *required* by any current int-serv models but is under study.

The advantage of this approach is that it puts some real delay bounds on the Guaranteed Service without adding any additional complexity to the other services.
It still ignores the amount of *bandwidth* available for each class. This should behave reasonably well as long as the total traffic of CL and GS flows does not exceed any resource capacities in the device. Some isolation between very delay-critical GS and less critical GS flows is provided, but there is still an overall assumption that flows will in general be well-behaved. In addition, this mapping still leaves room for future service models.

Expanding the number of classes for CL service is not as appealing, since there is no need to map to a particular delay bound. There may be cases where an administrator might map CL onto more classes for particular bandwidths or policy levels. It may also be desirable to further subdivide CL traffic in cases where the traffic is frequently non-conformant for certain applications.

11. Network Topology Scenarios

11.1 Switched networks using priority scheduling algorithms

In general, the int-serv standards work has tried to avoid any specification of scheduling algorithms, instead relying on implementers to deduce appropriate algorithms from the service definitions and on users to apply measurable benchmarks to check for conformance. However, since one standards body has chosen to specify a single default scheduling algorithm for switches [2], it seems appropriate to examine, to some degree, how well this "implementation" might actually support some or all of the int-serv services.

If the mappings of the simple scheme proposed above are applied in a switch implementing strict priority queueing between the 8 traffic classes (7 = highest), the result will be that all Guaranteed Service packets are transmitted in preference to any other service. Controlled Load packets will be transmitted next, with everything else waiting until both of these queues are empty.
If the admission control algorithms in use on the switch ensure that the sum of the "promised" bandwidths of all of the GS and CL sessions is never allowed to exceed the available link bandwidth, then things are looking good.

11.2 Full-duplex switched networks

We have up to now ignored the MAC access protocol. On a full-duplex switched LAN (of either the Ethernet or Token-Ring type - the MAC algorithm is, by definition, unimportant) this can be factored into the characterisation parameters advertised by the device, since the access latency is well controlled (jitter = one largest packet time). Some example characteristics (approximate):

   Type         Speed     Max Pkt   Max Access
                          Length    Latency

   Ethernet     10Mbps    1.2ms     1.2ms
                100Mbps   120us     120us
                1Gbps     12us      12us
   Token-Ring   4Mbps     9ms       9ms
                16Mbps    9ms       9ms
   FDDI         100Mbps   360us     8.4ms

These delays should also be considered in the context of speed-of-light delays of e.g. ~400ns for typical 100m UTP links and ~7us for typical 2km multimode fibre links.

We therefore see full-duplex switched network topologies as offering good QoS capabilities for both Controlled Load and Guaranteed Service.

11.3 Shared-media Ethernet networks

We have not mentioned the difficulty of dealing with allocation on a single shared CSMA/CD segment: as soon as any CSMA/CD algorithm is introduced, the ability to provide any form of Guaranteed Service is seriously compromised in the absence of any tight coupling between the multiple senders on the link. There are a number of reasons for not offering a better solution to this issue.

Firstly, we do not believe this is a truly solvable problem: it would seem to require a new MAC protocol.
Those who are interested in solving this problem per se should probably be following the BLAM developments in 802.3, but we would be suspicious of the interoperability characteristics of a series of new software MACs running above the traditional 802.3 MAC.

Secondly, we are not convinced that it is really an interesting problem. While not everyone in the world is buying desktop switches today, and there will be end stations living on repeated segments for some time to come, the number of switches is going up and the number of stations on repeated segments is going down. This trend is proceeding to the point that we may be happy with a solution which assumes that any network conversation requiring resource reservations will take place through at least one switch (be it layer-2 or layer-3). Put another way, the easiest QoS upgrade to a layer-2 network is to install segment switching: only when this has been done is it worthwhile to investigate more complex solutions involving admission control.

Thirdly, in the core of the network (as opposed to at the edges), there does not seem to be enough economic benefit in repeated-segment solutions as opposed to switched solutions. While repeated solutions *may* be 50% cheaper, their cost impact on the entire network is amortised across all of the edge ports. There may be special circumstances in the future (e.g. Gigabit buffered repeaters) but these have differing characteristics from existing CSMA/CD repeaters anyway.
   Type         Speed     Max Pkt   Max Access
                          Length    Latency

   Ethernet     10Mbps    1.2ms     unbounded
                100Mbps   120us     unbounded
                1Gbps     12us      unbounded

11.4 Half-duplex switched Ethernet networks

Many of the same arguments for sub-optimal support of Guaranteed Service apply to half-duplex switched Ethernet as to shared media: in essence, this topology is a medium that *is* shared between at least two senders contending for each packet transmission opportunity. Unless these are tightly coupled and cooperative, there is always the chance that the junk traffic of one will interfere with the other's important traffic. Such coupling would seem to require some form of modification to the MAC protocol (see above).

Notwithstanding this, these topologies do seem to offer the chance to provide Controlled Load service: with the knowledge that there are only a small limited number (e.g. two) of potential senders that are both using prioritisation for their CL traffic (with admission control for those CL flows based on the knowledge of the number of potential senders) over best effort, the media access characteristics, whilst not deterministic in the true mathematical sense, are somewhat predictable. This is probably a close enough approximation to CL to be useful.

   Type         Speed     Max Pkt   Max Access
                          Length    Latency

   Ethernet     10Mbps    1.2ms     unbounded
                100Mbps   120us     unbounded
                1Gbps     12us      unbounded

11.5 Half-duplex and shared Token Ring networks

In a shared Token Ring network, the network access time for high priority traffic at any station is bounded and is given by (N+1)*THTmax, where N is the number of stations sending high priority traffic and THTmax is the maximum token holding time [14].
This assumes that network adapters have priority queues, so that reservation of the token is done for the traffic with the highest priority currently queued in the adapter. It is easy to see that access times can be improved by reducing N or THTmax. The recommended default for THTmax is 10 ms [6]. N is an integer from 2 to 256 for a shared ring and 2 for a switched half-duplex topology. A similar analysis applies to FDDI. Using default values gives:

   Type                            Max Pkt   Max Access
                                   Length    Latency

   Token-Ring  4/16Mbps shared     9ms       2570ms
               4/16Mbps switched   9ms       30ms
   FDDI        100Mbps             360us     8ms

Given that access time is bounded, it is possible to provide an upper bound for end-to-end delays, as required by Guaranteed Service, assuming that traffic of this class uses the highest priority allowable for user traffic. The actual number of stations that send traffic mapped into the same traffic class as GS may vary over time but, from an admission control standpoint, this value is needed a priori. The admission control entity must therefore use a fixed value for N, which may be the total number of stations on the ring or some lower value if it is desired to keep the offered delay guarantees smaller. If the value of N used is lower than the total number of stations on the ring, admission control must ensure that the number of stations sending high priority traffic never exceeds this number. This approach allows admission control to estimate worst-case access delays on the assumption that all N stations are sending high priority data, even though in most cases this will mean that delays are significantly overestimated.

Assuming that Controlled Load flows use a traffic class lower than that used by GS, no upper bound on access latency can be provided for CL flows. However, CL flows will receive better service than best effort flows.
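The (N+1)*THTmax bound above can be checked directly; the helper below reproduces the "Max Access Latency" column of the table (the function name is our own):

```python
def token_ring_access_bound_ms(n_high_priority_stations, tht_max_ms=10.0):
    """Worst-case token access latency (N+1)*THTmax, where N is the
    number of stations sending high priority traffic and THTmax is
    the maximum token holding time (default 10 ms per the text)."""
    return (n_high_priority_stations + 1) * tht_max_ms

# Shared ring with the maximum N = 256 stations, and a switched
# half-duplex link with N = 2:
assert token_ring_access_bound_ms(256) == 2570.0   # ms, shared
assert token_ring_access_bound_ms(2) == 30.0       # ms, switched
```

This makes the admission control trade-off explicit: fixing a smaller N a priori tightens the delay bound, at the cost of having to police how many stations may send high priority traffic.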
   Note that, on many existing shared token rings, bridges will
   transmit frames using an Access Priority (see section 3.3) value of
   4 irrespective of the user_priority carried in the frame control
   field of the frame.  Therefore, existing bridges would need to be
   reconfigured or modified before the above access time bounds can
   actually be used.

12. Signaling protocol

   The mechanisms described in this document make use of a signaling
   protocol for devices to communicate their admission control requests
   across the network: the service definitions to be provided by such a
   protocol are described below.  The candidate IETF protocol for this
   purpose is called "Subnet Bandwidth Manager" and is described in
   [10].

   In all these cases, appropriate delete/cleanup mechanisms will also
   have to be provided for use when sessions are torn down.  All
   interactions are assumed to provide read as well as write
   capabilities.

12.1 Client service definitions

   The following interfaces are identified from Figures 2 and 3:

   SBM <-> Address mapping

   This is a simple lookup function which may cause ARP protocol
   interactions, may be just a lookup of an existing ARP cache entry,
   or may be an algorithmic mapping.
   The layer-2 addresses are needed by SBM for inclusion in its
   signaling messages to/from switches, which avoids the switches
   having to perform the mapping and, hence, having to know layer-3
   information for the complete subnet:

      l2_addr = map_address( ip_addr )

   SBM <-> Session/802 header

   This is for notifying the transmit path of how to associate
   user_priority values with the traffic of each outgoing session: the
   transmit path will provide the user_priority value when it requests
   a MAC-layer transmit operation for each packet (user_priority is one
   of the parameters defined by the IEEE 802 service model):

      bind_802_header( sessionid, user_priority )

   SBM <-> Classifier/Scheduler

   This is for notifying the transmit classifier/scheduler of
   additional layer-2 information associated with scheduling the
   transmission of a session's packets (it may be unused in some
   cases):

      bind_l2sessioninfo( sessionid, l2_header, traffic_class )

   SBM <-> Local Admission Control

   This is for applying local admission control for a session, e.g. is
   there enough transmit bandwidth still uncommitted for this potential
   new session?  Are there sufficient receive buffers?  This should
   commit the necessary resources if the checks succeed; it will be
   necessary to release these resources if a later stage of the session
   setup process fails.

      status = admit_l2txsession( Tspec, flowspec )
      status = admit_l2rxsession( Rspec, flowspec )

   SBM <-> RSVP

   This is outlined above in section 8.2 and fully described in [10].
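   As a non-normative sketch, the client-side interfaces above might be
   modelled as follows.  The class name, the static ARP-cache mapping
   and the simple committed-bandwidth admission test are assumptions of
   this illustration, not part of any specification:

   ```python
   class SbmClient:
       """Hypothetical sketch of the SBM client service interfaces of
       section 12.1; the bandwidth-counting admission test is an
       assumption for illustration only."""

       def __init__(self, link_capacity_bps, arp_cache):
           self.available_bps = link_capacity_bps
           self.arp_cache = arp_cache     # {ip_addr: l2_addr}
           self.sessions = {}             # sessionid -> user_priority

       def map_address(self, ip_addr):
           # SBM <-> Address mapping: here a plain cache lookup; a real
           # implementation might trigger ARP or map algorithmically.
           return self.arp_cache[ip_addr]

       def bind_802_header(self, sessionid, user_priority):
           # SBM <-> Session/802 header: record the user_priority that
           # the transmit path should use for this session's packets.
           self.sessions[sessionid] = user_priority

       def admit_l2txsession(self, tspec_rate_bps):
           # SBM <-> Local Admission Control: commit transmit bandwidth
           # if enough remains uncommitted; a real implementation must
           # release it again if a later setup stage fails.
           if tspec_rate_bps <= self.available_bps:
               self.available_bps -= tspec_rate_bps
               return True
           return False

   sbm = SbmClient(10_000_000, {"192.0.2.1": "00:00:5e:00:53:01"})
   print(sbm.map_address("192.0.2.1"))      # 00:00:5e:00:53:01
   print(sbm.admit_l2txsession(8_000_000))  # True
   print(sbm.admit_l2txsession(4_000_000))  # False: only 2Mbps left
   ```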
12.2 Switch service definitions

   The following interfaces are identified from Figure 4:

   SBM <-> Classifier

   This is for notifying the receive classifier of how to match up
   incoming layer-2 information with the associated traffic class: it
   may in some cases consist of a set of read-only default mappings:

      bind_l2classifierinfo( l2_header, traffic_class )

   SBM <-> Queue and Packet Scheduler

   This is for notifying the transmit scheduler of additional layer-2
   information associated with a given traffic class (it may be unused
   in some cases):

      bind_l2schedulerinfo( l2_header, traffic_class )

   SBM <-> Local Admission Control

   As for the host above.

   SBM <-> Traffic Class Map and Police

   Optional configuration of any layer-2 policing function and/or
   user_priority remapping that might be implemented on input to a
   switch:

      bind_l2classmapping( in_user_priority, remap_user_priority )
      bind_l2policing( l2_header, traffic_characteristics )

   SBM <-> Filtering Database

   SBM propagation rules need access to the layer-2 forwarding database
   to determine where to forward SBM messages (analogous to the RSRR
   interface in layer-3 RSVP):

      output_portlist = lookup_l2dest( l2_addr )

13. Compatibility and Interoperability with existing equipment

   Layer-2-only "standard" 802.1p switches will have to work together
   with routers and layer-3 switches.  Wide deployment of such 802.1p
   switches is envisaged, in a number of roles in the network.
   "Desktop switches" will provide dedicated 10/100 Mbps links to end
   stations at costs comparable with those of NICs/adapter cards.  Very
   high speed core switches may act as central campus switching points
   for layer-3 devices.  Real network deployments provide a wide range
   of examples today.  The question is "what functionality beyond that
   of the basic 802.1D bridge should such 802.1p switches provide?".
   In the abstract, the answer is "whatever they can do to broaden the
   applicability of the switching solution while still being
   economically distinct from the layer-3 switches in their cost of
   acquisition, speed/bandwidth, cost of ownership and administration".
   Broadening the applicability means both addressing the needs of new
   traffic types and building larger switched networks (or making
   larger portions of existing networks switched).  Thus one could
   imagine a network in which every device along a network path was
   layer-3 capable and intrusive into the full data stream; or one in
   which only the edge devices were pure layer-2; or one in which every
   alternate device lacked layer-3 functionality; or one in which most
   devices lacked it, excluding some key control points such as router
   firewalls.  Whatever the mix, the solution has to interoperate with
   these layer-3 QoS-aware devices.

   Of course, where int-serv flows pass through equipment which is
   ignorant of priority queuing and which places all packets through
   the same queuing/overload-dropping path, it is obvious that some of
   the characteristics of the flow become more difficult to support.
   Suitable courses of action in the cases where sufficient bandwidth
   or buffering is not available are of the form:

   (a) buy more (and bigger) routers
   (b) buy more capable switches
   (c) rearrange the network topology: 802.1Q VLANs [11] may help here
   (d) buy more bandwidth

   It would also be possible to pass more information between switches
   about the capabilities of their neighbours and to route around
   non-QoS-capable switches: such methods are for further study.

14. Justification

   An obvious comment is that this is all too complex and that it is
   what RSVP is doing already: why do we think we can do better by
   reinventing the solution to this problem at layer-2?
   The key is that we do not have to tackle the full problem space of
   RSVP: a number of simple scenarios cover a considerable proportion
   of the situations that occur in practice.  All we have to do here is
   cover 99% of the territory at significantly lower cost and leave the
   other applications to full RSVP running in strategically positioned
   high-function switches or routers.  This will allow a significant
   reduction in overall network cost (equipment and ownership).  This
   approach does mean that we have to discuss real-life situations
   instead of abstract topologies that "could happen".

   Sometimes, for example, simple bandwidth configuration in a few
   switches, e.g. to avoid overloading particular trunk links, can be
   used to overcome bottlenecks due to the network topology.  If there
   are issues with overloading end station "last hops", RSVP in the end
   stations would exert the correct controls simply by examining local
   resources, without much tie-in to the layer-2 topology.  In such
   cases there is no need to resort to any form of complex topology
   computation, and much complexity is avoided.

   In the more general case, there remains work to be done.  This will
   need to be done against the background constraint that changes to
   queue service policies and the addition of extra functionality to
   support new service disciplines will proceed at the rate of hardware
   product development cycles, and advance implementations of new
   algorithms may be pursued reluctantly or without the necessary 20/20
   foresight.

   However, compared to the alternative of no traffic classes at all,
   there is substantial benefit in even the simplest of approaches
   (e.g.
   2-4 queues with straight priority), so there is significant reward
   for doing something: wide acceptance of that "something" probably
   means that even the simplest queue service disciplines will be
   provided for.

15. References

   [1]  ISO/IEC 10038, ANSI/IEEE Std 802.1D-1993, "MAC Bridges"

   [2]  "Supplement to MAC Bridges: Traffic Class Expediting and
        Dynamic Multicast Filtering", IEEE P802.1p/D6, May 1997

   [3]  "Integrated Services in the Internet Architecture: an
        Overview", RFC 1633, June 1994

   [4]  "Resource Reservation Protocol (RSVP) - Version 1 Functional
        Specification", Internet Draft, June 1997

   [5]  "Carrier Sense Multiple Access with Collision Detection
        (CSMA/CD) Access Method and Physical Layer Specifications",
        ANSI/IEEE Std 802.3-1985

   [6]  "Token-Ring Access Method and Physical Layer Specifications",
        ANSI/IEEE Std 802.5-1995

   [7]  "A Framework for Providing Integrated Services Over Shared and
        Switched LAN Technologies", Internet Draft, May 1997

   [8]  "Specification of the Controlled-Load Network Element
        Service", Internet Draft, May 1997

   [9]  "Specification of Guaranteed Quality of Service", Internet
        Draft, February 1997

   [10] "SBM (Subnet Bandwidth Manager): A Proposal for Admission
        Control over Ethernet", Internet Draft, June 1997

   [11] "Draft Standard for Virtual Bridged Local Area Networks",
        IEEE P802.1Q/D6, May 1997

   [12] "General Characterization Parameters for Integrated Service
        Network Elements", Internet Draft, November 1996

   [13] "A Standard for the Transmission of IP Datagrams over IEEE
        802 Networks", RFC 1042, February 1988

   [14] C. Bisdikian, B. V. Patel, F. Schaffa and M. Willebeek-LeMair,
        "The Use of Priorities on Token-Ring Networks for Multimedia
        Traffic", IEEE Network, Nov/Dec 1995

16.
Security Considerations

   There are no known security issues over and above those inherent in
   the Integrated Services architecture and the network technologies
   referenced by this document.

17. Acknowledgments

   This document draws heavily on the work of the ISSLL WG of the IETF
   and the IEEE P802.1 Interworking Task Group.  In particular, it
   includes previous work on Token-Ring by Anoop Ghanwani, Wayne Pace
   and Vijay Srinivasan.

18. Authors' addresses

   Mick Seaman
   3Com Corp.
   5400 Bayfront Plaza
   Santa Clara CA 95052-8145
   USA
   +1 (408) 764 5000
   mick_seaman@3com.com

   Andrew Smith
   Extreme Networks
   10460 Bandley Drive
   Cupertino CA 95014
   USA
   +1 (408) 863 2821
   andrew@extremenetworks.com

   Eric Crawley
   Gigapacket Networks
   25 Porter Rd.
   Littleton MA 01460
   USA
   +1 (508) 486 0665
   esc@gigapacket.com