idnits 2.17.1 

draft-ietf-issll-802-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 10 instances of too long lines in the document, the longest
     one being 4 characters in excess of 72.

  ** The abstract seems to contain references ([5], [6], [7], [8], [9], [1]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 773 has weird spacing: '...gically  posit...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 1996) is 10023 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '3' is defined on line 809, but no explicit reference
     was found in the text

  == Unused Reference: '4' is defined on line 812, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Downref: Normative reference to an Informational RFC: RFC 1633 (ref. '3')

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'


     Summary: 12 errors (**), 0 flaws (~~), 4 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Draft                                          Mick Seaman
3	Expires May 1997                                         3Com Corp.
4	draft-ietf-issll-802-00.txt                            Andrew Smith
5	                                                   Extreme Networks
6	                                                       Eric Crawley
7	                                                       Bay Networks
8	                                                      November 1996

10	          Integrated Services over IEEE 802.1D/802.1p Networks

12	Status of this Memo

14	   This document is an Internet Draft.  Internet Drafts are working
15	   documents of the Internet Engineering Task Force (IETF), its Areas,
16	   and its Working Groups. Note that other groups may also distribute
17	   working documents as Internet Drafts.

19	   Internet Drafts are draft documents valid for a maximum of six
20	   months. Internet Drafts may be updated, replaced, or obsoleted by
21	   other documents at any time.  It is not appropriate to use Internet
22	   Drafts as reference material or to cite them other than as a "working
23	   draft" or "work in progress."

25	   Please check the I-D abstract listing contained in each Internet
26	   Draft directory to learn the current status of this or any other
27	   Internet Draft.

29	Abstract

31	This document describes the support of IETF Integrated Services over
32	LANs built from IEEE 802 network segments which are interconnected by
33	standard IEEE 802.1D [1] switches.

35	It describes the practical capabilities and limitations of this
36	technology for supporting Controlled Load [8] and Guaranteed Service [9]
37	using the inherent capabilities of the relevant 802 technologies [5], [6]
38	etc. and the proposed 802.1p queuing features in switches. It provides a
39	functional model for the layer 3 to layer 2 and user-to-network dialogue
40	which supports admission control and defines requirements for
41	interoperability between switches.

43	This scheme is consistent with the ISSLL over LANs framework discussed
44	at the October 1996 ISSLL interim meeting and described in [7].

46	1. Introduction

48	The IEEE 802.1 Interworking Task Group is currently enhancing the basic
49	MAC Service provided in Bridged Local Area Networks (aka "switched
50	LANs"). As a supplement to the IEEE MAC Bridges standard [1], P802.1p
51	[2] proposes differential traffic class queuing ("priorities") and
52	access to media on the basis of a "user_priority" signaled in frames.

54	In this document we
55	* review the meaning and use of user_priority in LANs and the frame
56	   forwarding capabilities of a standard LAN switch.
57	* examine alternatives for identifying layer 2 traffic flows for
58	   admission control.
59	* review the options available for policing traffic flows.
60	* derive requirements for consistent priority handling in a network of
61	   switches and use these requirements to discuss priority queue
62	   handling alternatives for 802.1p and the way in which these meet
63	   administrative and interoperability goals.
64	* consider the benefits and limitations of this switch-based approach,
65	   contrasting it with a full router-based RSVP implementation in terms
66	   of complexity, utilisation of transmission resources and
67	   administrative controls.

69	We then describe a model which:
70	* partitions the admission control process into two separable
71	   operations:
72	   * an interaction between the user of the integrated service and the
73	      local network elements ("provision of the service" in the terms of
74	      802.1D) to confirm the availability of transmission resources for
75	      traffic to be introduced.
76	   * selection of an appropriate user_priority for that traffic on the
77	      basis of the service and service parameters to be supported.
78	* distinguishes between the user to network interface above and the
79	   mechanisms used by the switches ("support of the service"). These
80	   include communication between the switches (network to network
81	   signaling).
82	* describes a simple architecture for the provision and support of these
83	   services, broken down into components with functional and interface
84	   descriptions:
85	   * a single "user" component: a layer-3 to layer-2 negotiation and
86	      translation component.
87	   * bridge/switch processes to handle admission control and mapping
88	      requests, including proposals for actual traffic mappings to
89	      user_priority values.
90	* proposes a set of protocol exchange primitives based on the functions
91	   introduced.

93	This document contains much background material that is used as
94	justification for the approach taken. It is anticipated that much of
95	this material will not form a part of the final specification.

97	It will be noted that this document is written from the pragmatic
98	viewpoint that there will be a widely deployed network technology and we
99	are evaluating it for its ability to support some or all of the defined
100	IETF integrated services: this approach is intended to ensure
101	development of a system which can provide useful new capabilities in
102	existing (and soon to be deployed) network infrastructure.

104	2. Goals and Assumptions

106	It is assumed that the network is "switch-rich": that is to say all
107	communication between end stations using integrated services support
108	will pass through at least one switch. Perhaps the mechanisms and
109	protocols described will be trivially extensible to communicating
110	systems on the same shared media, but it is important not to allow
111	problem generalisation to complicate the practical application that we
112	target: the access characteristics of Ethernet are forcing a trend to
113	switch-rich topologies together with MAC enhancements to ensure access
114	predictability on half-duplex switch to switch links.

116	It is assumed that layer-3 entities, including end-stations, are running
117	the RSVP protocol in support of integrated services at that layer. No
118	extra modifications to this protocol are assumed.

120	There may be a heterogeneous mixture of switches with different
121	capabilities, all compliant with IEEE 802.1p, but implementing queuing
122	and forwarding mechanisms in a range from simple 2-queue per port,
123	strict priority, up to more complex multi-queue (maybe even one per-
124	flow) WFQ or other algorithms.

126	The problem is broken down into smaller independent pieces: this may
127	lead to sub-optimal usage of the network resources but we contend that
128	a more optimal scheme would often yield only very small improvements in
129	network efficiency in a LAN environment. Therefore, it is a goal that the
130	switches in the network operate using a much simpler set of information
131	than the RSVP engine in a router. In particular, it is assumed that such
132	switches do not need to implement per-flow queuing and policing.

134	One corollary is that no per-flow policing function need take place in
135	the switches: it is a fundamental part of the intserv model that flows
136	are isolated from each other throughout their transit across a network.
137	Intermediate queuing nodes are expected to police the traffic to ensure
138	that it conforms to the pre-agreed traffic flow specification. In the
139	architecture proposed here for mapping to layer-2, that policing
140	function is assumed to be implemented in the transmit schedulers of the
141	layer-3 devices (end stations, routers): it is reasonable to assume that
142	end stations are "trusted" to adhere to their agreed contracts at the
143	inputs to the network and that we can afford to over-allocate resources
144	to compensate for the inevitable extra jitter/bunching introduced by the
145	switched network itself.

147	3. User Priority and Frame Forwarding

149	User_priority is a value associated with the transmission and reception
150	of all frames in the IEEE 802 service model: it is supplied by a sender
151	which is using the MAC service. It is provided to a receiver using the
152	MAC service. It may or may not be actually carried over the network:
153	Token-Ring/802.5 carries this value (encoded in its FC octet), basic
154	Ethernet/802.3 does not. 802.1p defines a way to carry this value over
155	the network in a similar way on Ethernet, Token Ring, FDDI or other MACs
156	using an extended frame format.

158	The "user_priority" or "traffic class" (the latter term is to be
159	preferred and it is the title of the 802.1p document) field in packets
160	is a simple label in the data stream enabling packets in different
161	classes to be discriminated by downstream nodes. Apart from making the
162	job of desktop or wiring-closet switches easier, it means they do not
163	have to change (hardware or software) as the rules for classifying
164	packets evolve (based on new protocols or new policies). Layer-3
165	switches do provide added value here by performing the classification
166	more accurately and, hence, utilising network resources more
167	efficiently: this appears to be a good economic choice since there are
168	likely to be very many more desktop/wiring closet switches in a network
169	than switches requiring layer 3 functionality.

171	The IEEE 802 specifications make no assumptions about how user_priority
172	is to be used by end stations or by the network, although the current
173	802.1p draft defines static priority queuing as the default mode of
174	operation of all switches (user_priority is defined as a 3-bit quantity
175	with value 7 = high priority, 0 = low priority). The switch algorithm in
176	this case is as follows: packets are placed onto a particular queue
177	based on the received user_priority (from the packet if an 802.1p header
178	or 802.5 network was used, invented according to some local policy if
179	not). The selection of queue is based on a mapping from user_priority
180	[0,1,2,3,4,5,6 or 7] onto the number of available queues - switches may
181	implement any number of queues from 1 upwards. On transmit, any/all
182	frames from a higher priority queue are sent first before transmitting
183	any from a lower priority queue.
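
The default behaviour described above can be sketched as a toy model of
one output port. This is only an illustration: the class and method
names are hypothetical, and the even spreading of the eight
user_priority values over the available queues is an assumption, since
the text only says that some mapping onto the number of queues exists.

```python
from collections import deque


class StrictPriorityPort:
    """Toy model of one switch output port with strict priority queuing."""

    def __init__(self, num_queues):
        if num_queues < 1:
            raise ValueError("a port needs at least one queue")
        self.queues = [deque() for _ in range(num_queues)]

    def queue_for(self, user_priority):
        # Assumed mapping: spread user_priority 0..7 evenly over the
        # available queues; a higher queue index means higher priority.
        if not 0 <= user_priority <= 7:
            raise ValueError("user_priority is a 3-bit value")
        return user_priority * len(self.queues) // 8

    def enqueue(self, frame, user_priority):
        self.queues[self.queue_for(user_priority)].append(frame)

    def dequeue(self):
        # Serve the highest-priority non-empty queue first.
        for q in reversed(self.queues):
            if q:
                return q.popleft()
        return None


port = StrictPriorityPort(num_queues=2)
port.enqueue("file-transfer-1", user_priority=0)
port.enqueue("video-1", user_priority=6)
port.enqueue("file-transfer-2", user_priority=1)
port.enqueue("video-2", user_priority=7)
order = [port.dequeue() for _ in range(4)]
print(order)
```

With two queues, both video frames leave the port before either file
transfer frame, even though one file transfer frame arrived first.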

185	In particular, IEEE makes no recommendations about how a sender should
186	select the value for user_priority: one of the main purposes of this
187	draft is to propose such usage rules.

189	Additionally, there are no IEEE 802-defined rules for switches to agree
190	on how to treat frames with different user_priority values: later on in
191	this draft we make some recommendations as to what information needs to
192	be shared amongst switches.

194	4. Mapping of integrated services to layer-2 in layer-3 devices

196	The end-station or router itself is responsible for local admission
197	control and scheduling packets onto its link in accordance with the
198	service agreed. Just as in the intserv model, this involves per-flow
199	schedulers somewhere in every such data source: it is an implementation
200	issue whether there are separate schedulers for layer-3 and layer-2 or
201	whether these are combined.

203	5. Mapping of integrated services through layer-2 switches

205	5.1 Queuing

207	Connectionless packet-based networks in general, and LAN switched
208	networks in particular, work today because of scaling choices in network
209	provisioning. Consciously or (more usually) unconsciously, enough excess
210	bandwidth and buffering is provisioned in the network to absorb the
211	traffic sourced by higher-layer protocols or cause their transmission
212	windows to run out, on a statistical basis, so that the network is only
213	overloaded for a short duration and the average expected loading is less
214	than 60% (usually much less).

216	With the advent of time-critical traffic such overprovisioning has
217	become far less easy to achieve. Time critical frames may find
218	themselves queued for annoyingly long periods of time behind temporary
219	bursts of file transfer traffic, particularly at network bottleneck
220	points, e.g. at the 100 Mb/s to 10 Mb/s transition that might occur
221	between the riser to the wiring closet and the final link to the user
222	from a desktop switch. In this case, however, if it is known (guaranteed
223	by application design, merely expected on the basis of statistics, or
224	just that this is all that the network guarantees to support) that the
225	time critical traffic is a small fraction of the total bandwidth, it
226	suffices to give it strict priority over the "normal" traffic. The worst
227	case delay experienced by the time critical traffic is roughly the
228	maximum transmission time of a maximum length non-time-critical frame -
229	around a millisecond for 10 Mb/s Ethernet (about 1.2 ms for a 1518-byte
230	frame), well below an end to end budget based on human perception times.
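
A quick calculation gives the scale of this worst-case delay. The sketch
below assumes a maximum untagged Ethernet frame of 1518 bytes (header
and FCS included, preamble and inter-frame gap ignored); the function
name is hypothetical.

```python
def frame_time_ms(frame_bytes, link_mbps):
    """Time to serialise one frame onto the link, in milliseconds."""
    bits = frame_bytes * 8
    return bits / (link_mbps * 1_000_000) * 1000


# Assumed maximum untagged Ethernet frame, header and FCS included.
MAX_ETHERNET_FRAME = 1518

for mbps in (10, 100):
    delay = frame_time_ms(MAX_ETHERNET_FRAME, mbps)
    print(f"{mbps} Mb/s: {delay:.3f} ms")
```

Under these assumptions a maximum-length frame occupies a 10 Mb/s link
for roughly 1.2 ms and a 100 Mb/s link for roughly 0.12 ms, both small
compared with human perception times.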

232	When more than one "priority" service is to be offered by a network
233	element e.g. it supports controlled-load as well as Guaranteed Service,
234	the queuing discipline becomes more complex. In order to provide the
235	required isolation between the service classes, it will probably be
236	necessary to queue them separately. There is then an issue of how to
237	service the queues - a combination of admission control and maybe
238	weighted fair queuing may be required in such cases. As with the service
239	specifications themselves, it is not the place for this document to
240	specify queuing algorithms, merely to observe that the external
241	behaviour must meet the services' requirements.

243	5.2 Multicast Heterogeneity

245	IEEE 802.1D and 802.1p use a model for multicast whereby a switch
246	performs multicast routing decisions based on the destination address:
247	this would produce a list of output ports to which the packet should be
248	forwarded. In its default mode, such a switch would use any
249	user_priority value in received packets to enqueue the packets at each
250	output port.

252	At layer-3, the intserv model allows heterogeneous multicast flows where
253	different branches of a tree can have different types of reservations
254	for a given multicast destination, or even supports the notion that some
255	trees will have some branches with reserved flows and some using best
256	effort (default) service.

258	If a switch is selecting per-port output queues based only on the
259	incoming user_priority, it will have to treat all branches of all
260	multicast sessions within that user_priority class with the same queuing
261	mechanism: no heterogeneity is then possible (if it were to implement a
262	separate mapping at each output port then some limited form of
263	heterogeneity could be supported). It is proposed that per-
264	user_priority queuing support is adequate as minimum standard
265	functionality for systems *in a LAN environment*. Layer-3 switches
266	(a.k.a. routers) can be used if more flexible forms of heterogeneity are
267	considered necessary: their behaviour is well standardised.

269	6. Selecting User Priority classes

271	One fundamental question is "who gets to decide what the classes mean
272	and who gets access to them?" One approach would be for the meanings of
273	the classes to be "well-known": we would then need to standardise a set
274	of classes e.g. 1 = best effort, 2 = controlled-load, 3 = guaranteed
275	(loose delay bound, high bandwidth), 4 = guaranteed (slightly tighter
276	delay) etc. Choosing the values to encode in such a table in end
277	stations, in isolation from the network to which they are connected, is
278	problematic: the best we could probably do would be to define one
279	user_priority value per intserv service type and leave it at that
280	(reserving the rest of the combinations for future traffic classes -
281	there are sure to be plenty!).

283	We propose a more flexible mapping: clients ask "the network" which
284	user_priority traffic class to use for a given traffic flow, as
285	categorised by its flow-spec and layer-2 endpoints. The network provides
286	a value back to the requester which is appropriate to the current
287	network topology, load conditions, other admitted flows etc. The task of
288	configuring switches with this mapping (e.g. through network management
289	or some other switch-switch protocol) is an order of magnitude less
290	complex than performing the same function in end stations. Also, when
291	new services (or other network reconfigurations) are added to such a
292	network, the network elements will typically be the ones to be upgraded
293	with new queuing algorithms etc. and can be provided with new mappings
294	at this time.

296	Given the need for a new session or "flow" requiring some QoS support, a
297	client then needs answers to the following questions:

299	1. which traffic class do I add this flow to?
300	    The client needs to know how to label the packets of the flow as it
301	    places them into the network.

303	2. who do I ask/tell?
304	    The proposed model is that a client asks "the network" which
305	    user_priority traffic class to use for a given traffic flow. This has
306	    several benefits as compared to a model which allows clients to select
307	    a class for themselves.

309	3. how do I ask/tell them?
310	    A request/response protocol is needed between client and network: in
311	    fact, the request can be piggy-backed onto an admission control request
312	    and the response can be piggy-backed onto an admission control
313	    acknowledgment.

315	The network (i.e. the first network element encountered downstream from
316	the client) must then answer the following questions:

318	1. which traffic class do I add this flow to?
319	    This is a packing problem, difficult to solve in general, but many
320	    simplifying assumptions can be made: presumably some simple form of
321	    allocation can be done without a more complex scheme able to
322	    dynamically shift flows around between classes.

324	2. which traffic class has worst-case parameters which meet the needs of
325	        this flow?
326	    This might be an ordering/comparison problem: which of two service
327	    classes is "better" than another? Again, we can make this tractable by
328	    observing that all of the current intserv classes can be ranked (best
329	    effort <= Controlled Load <= Guaranteed Service) in a simple manner. If
330	    any classes are implemented in the future that cannot be simply ranked
331	    then the issue can be finessed by either a priori knowledge about what
332	    classes are supported or by configuration.

334	and return the chosen user_priority value to the client.
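
The ranking argument above can be sketched as a small lookup. This is an
illustration under stated assumptions: the table contents reuse the
example numbering from earlier in this section and are not proposed
values, the function and table names are hypothetical, and the packing
problem of question 1 is not modelled.

```python
# Ranking taken from the text:
# best effort <= Controlled Load <= Guaranteed Service.
SERVICE_RANK = {"best_effort": 0, "controlled_load": 1, "guaranteed": 2}

# Hypothetical mapping table held by the network. The user_priority
# values are placeholders borrowed from the example in this section.
CLASS_TABLE = [
    {"user_priority": 1, "service": "best_effort"},
    {"user_priority": 2, "service": "controlled_load"},
    {"user_priority": 3, "service": "guaranteed"},
]


def select_user_priority(requested_service, table=CLASS_TABLE):
    """Return the user_priority of the lowest-ranked class that is at
    least as good as the requested service, or None if none qualifies."""
    wanted = SERVICE_RANK[requested_service]
    candidates = [c for c in table if SERVICE_RANK[c["service"]] >= wanted]
    if not candidates:
        return None
    best = min(candidates, key=lambda c: SERVICE_RANK[c["service"]])
    return best["user_priority"]


print(select_user_priority("controlled_load"))
```

If a switch does not implement a controlled-load class at all, the same
lookup falls back to the next better class, which is one way of reading
"worst-case parameters which meet the needs of this flow".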

336	Note that the client may be either an end station, router or a first
337	switch which may be acting as a proxy for a client which does not
338	participate in these protocols for whatever reason. Note also that a
339	device e.g. a server or router, may choose to implement both the
340	"client" as well as the "network" portion of this model so that it can
341	select its own user_priority values: such an implementation is, however,
342	discouraged unless the device really does have a close tie-in with the
343	network topology and resource allocation policies.

345	7. Flow Identification

347	Several previous proposals for intserv over lower-layers have treated
348	switches very much as a special case of routers: in particular, that
349	switches along the data path will make packet handling decisions based
350	on the RSVP flow and filter specifications and use them to classify the
351	corresponding data packets. However, filtering to the per-flow level
352	becomes cost-prohibitive with increasing switch speed: devices with such
353	filtering capabilities are unlikely to have a very different
354	implementation cost to IP routers, in which case we must question
355	whether a specification oriented toward switched networks is of any
356	benefit at all.

358	This document proposes that "flow" identification based on user_priority
359	be the minimum required of switches.

361	8. Reserving Network Resources - Admission Control

363	So far we have not discussed admission control. In fact, without
364	admission control it is possible to scratchbuild a LAN network of some
365	size capable of supporting real-time services, providing that the
366	traffic fits within certain scaling constraints (relative link speeds,
367	numbers of ports etc. - see below). This is not surprising since it is
368	possible to run a fair approximation to real time services on small LANs
369	today with no admission control or help from encoded priority bits.

371	Imagine a campus network providing dedicated 10 Mbps connections to each
372	user. Each floor of each building supports up to 96 users, organized
373	into groups of 24, with each group being supported by a 100 Mbps
374	downlink to a basement switch which concentrates 5 floors (20 x 100
375	Mbps) and a data center (4 x 100 Mbps) to a 1 Gbps link to an 8 Gbps
376	central campus switch, which in turn hooks 6 buildings together (with 2
377	x 1 Gbps full-duplex links to support a corporate server farm). Such a
378	network could support 1.5 Mb/s of voice/video from every user to any
379	other user or (for half the population) the server farm, provided the
380	video ran high priority: this gives 3000 users, all with desktop video
381	conferencing running along with file transfer/email etc. In such a
382	network RSVP's role would be limited to ensuring resource availability
383	at the communicating end stations and for connection to the wide area.
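
The scaling claim can be checked with simple arithmetic. The sketch
below uses the figures from the paragraph above and makes the
pessimistic assumption that every user sends 1.5 Mb/s of high priority
traffic at the same time and that all of it crosses each uplink; the
variable and function names are hypothetical.

```python
USERS_PER_GROUP = 24
GROUPS_PER_FLOOR = 4          # 96 users per floor / 24 per group
FLOORS_PER_BUILDING = 5
BUILDINGS = 6
RT_MBPS_PER_USER = 1.5        # voice/video per user, from the text

users_per_building = USERS_PER_GROUP * GROUPS_PER_FLOOR * FLOORS_PER_BUILDING
total_users = users_per_building * BUILDINGS


def utilisation(num_users, link_mbps):
    """Fraction of a link consumed by the users' real-time traffic."""
    return num_users * RT_MBPS_PER_USER / link_mbps


loads = {
    "user link (10 Mbps)": utilisation(1, 10),
    "group downlink (100 Mbps)": utilisation(USERS_PER_GROUP, 100),
    "building uplink (1 Gbps)": utilisation(users_per_building, 1000),
}

print(total_users)
for name, load in loads.items():
    print(f"{name}: {load:.0%}")
```

This gives 2880 users (the "3000 users" of the text) and high priority
loads of 15%, 36% and 72% on the three link types, so even this worst
case fits within each link, with the remainder left for normal traffic.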

385	In such a network, a discussion as to the best service policy to apply
386	to high and low priority queues may prove academic: while it is true
387	that "normal" traffic may be delayed by bunches of high priority frames,
388	queuing theory tells us that the average queue occupancy in the high
389	priority queue at any switch port will be somewhat less than 1 (with
390	real user behaviour, i.e. not all users watching video conferences all
391	the time, it should be far less). A cheaper alternative to buying equipment
392	with a fancy queue service policy may be to buy equipment with more
393	bandwidth to lower the average link utilisation by a few per cent.

395	In practice a number of objections can be made to such a simple
396	solution. There may be long established expensive equipment in the
397	network which does not provide all the bandwidth required. There will be
398	considerable concern over who is allowed to say what traffic is high
399	priority. There may be a wish to give some form of "prioritised" service
400	to crucial business applications, above that given to experimental
401	video-conferencing. The task that faces us is to provide a degree of
402	control without making that control so elaborate to implement that the
403	control-oriented solution is simply rejected in favor of providing
404	yet more bandwidth at a lower cost.

406	The proposed admission control mechanism requires a query-response
407	interaction with the network returning a "YES/NO" answer and, if
408	successful, the user_priority value with which to tag the data frames of
409	this flow.
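
The shape of this query-response interaction can be sketched as follows.
This is a minimal illustration, not the proposed protocol: the class and
function names are hypothetical, bandwidth is the only resource tracked,
and the reservable_fraction policy knob is an assumption that does not
come from this document.

```python
class PortAccountant:
    """Tracks bandwidth reserved on one link (hypothetical helper)."""

    def __init__(self, capacity_mbps, reservable_fraction=0.5):
        # reservable_fraction is an assumed policy limit on how much of
        # the link may be handed out to reserved flows.
        self.limit = capacity_mbps * reservable_fraction
        self.reserved = 0.0

    def admit(self, rate_mbps):
        if self.reserved + rate_mbps > self.limit:
            return False
        self.reserved += rate_mbps
        return True


def admission_query(accountant, rate_mbps, user_priority_for_flow):
    """Return (answer, user_priority); user_priority is None on refusal."""
    if accountant.admit(rate_mbps):
        return ("YES", user_priority_for_flow)
    return ("NO", None)


port = PortAccountant(capacity_mbps=10)
print(admission_query(port, 1.5, 5))   # fits within the assumed 5 Mb/s limit
print(admission_query(port, 4.0, 5))   # would exceed the limit, so refused
```

A refused request leaves the accountant's state unchanged, so later,
smaller requests can still succeed.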

411	9. Client mapping to layer 2

413	We assume the same host model as intserv and RSVP: the client is running
414	an RSVP process which presents a session establishment interface to
415	applications, signals RSVP over the network, programs scheduler and
416	classifiers in the driver and interfaces to a policy control module. In
417	particular, RSVP also interfaces to a local admission control module: it
418	is this entity that we focus on here.

420	The following diagram is taken from the RSVP spec:
421	                      _____________________________
422	                     |  _______                    |
423	                     | |       |   _______         |
424	                     | |Appli- |  |       |        |   RSVP
425	                     | | cation|  | RSVP <-------------------->
426	                     | |       <-->       |        |
427	                     | |       |  |process|  _____ |
428	                     | |_._____|  |       -->Polcy||
429	                     |   |        |__.__._| |Cntrl||
430	                     |   |data       |  |   |_____||
431	                     |===|===========|==|==========|
432	                     |   |   --------|  |    _____ |
433	                     |   |  |        |  ---->Admis||
434	                     |  _V__V_    ___V____  |Cntrl||
435	                     | |      |  |        | |_____||
436	                     | |Class-|  | Packet |        |
437	                     | | ifier|==>Schedulr|====================>
438	                     | |______|  |________|        |    data
439	                     |                             |
440	                     |_____________________________|

442	                      Figure 1 - RSVP in Hosts

444	The local admission control entity (known as "TUTU") within a client is
445	responsible for mapping these layer-3 requests in TO layer TwO language.

447	The upper-layer entity requests from TUTU:

449	    "May I reserve for traffic with <traffic characteristic> with
450	      <performance requirements> from <here> to <there> and how
451	      should I label it?"

453	where
454	  <traffic characteristic> = Flow Spec, Tspec, Rspec (e.g.
455	              bandwidth, burstiness, MTU etc.)
456	  <performance requirements> = latency, jitter bounds etc.
457	  <here> = IP address(es)
458	  <there> = IP address(es) - may be multicast

460	The TUTU entity:
461	* maps the endpoints of the conversation to layer-2 addresses in the
462	   LAN, so it can figure out what traffic is really going where.
463	* applies local admission control on outgoing link and driver (may have
464	   some interaction with classifier and scheduler here e.g. to give
465	   classifier information about which user_priority values to expect)
466	* formats a request to the network with the mapped addresses and flow
467	   specs
468	* receives response from the network and reports the YES/NO admission
469	   control answer and, for successful requests, the resulting
470	   user_priority back to the upper layer entity.

472	                    from IP     from RSVP
473	                   ____|____________|____________
474	                  |    |      |     |            |
475	                  |  __V____  |  ___V___         |
476	                  | |       | | |       |        |
477	                  | |  ARP  | | |       |        | ISSLL signaling
478	                  | |protocl| | | TUTU  |<------------------------
479	                  | |       |<-|       |        |
480	                  | |       | | |       |        |
481	                  | |_______| | |       |        |
482	                  |    |      | |_______|        |
483	                  |    |data  |    |  |          |
484	                  |====|===========|==|==========|
485	                  |    |  +--------|  |   _____  |
486	                  |    |  |        |  +->|Local| |
487	                  |  __V__V_   ____V___  |Admis| |
488	                  | |      |  |        | |Cntrl| |
489	                  | |Class-|  | Packet | |_____| |
490	                  | | ifier|==>Schedulr|======================>
491	                  | |______|  |________|         |  data
492	                  |                              |
493	                  |______________________________|

495	                Figure 2 - ISSLL in Hosts

497	10. Switch Functions

499	10.1 Admission Control

501	For the sake of this discussion, we define the following entities within
502	a layer-2 switch:
503	* traffic class mapping authority - this holds the mapping table of
504	   intserv classes to user_priority.
505	* reservation accountants - one of these on each port accounts for the
506	   available bandwidth on that link. For half-duplex links, this
507	   involves taking account of both transmit and receive flows. For
508	   full-duplex the input port accountant's task is trivial.
509	* reservation propagators - these propagate requests that have passed
510	   admission control at the input port's accountant to the relevant
511	   output ports' accountants. This will require access to the switch's
512	   forwarding table (layer-2 "routing table" - cf. RSVP model) and
513	   spanning-tree state.

515	These are shown by the following diagram:
516	                   _______________________________
517	                  |  _____     ______     _____   |
518	                  | |Span |   |filter|   |traff|  |
519	                  | |Tree |<->|data- |   |class|  |
520	                  | |Prot.|   |  base|   |map  |  |
521	                  | |_____|   |______|   |_____|  |
522	                  |              ^                |
523	                  |  _____     __|___     ______  |
524	 ISSLL signaling  | | in  |   |      |   | out  | | ISSLL signaling
525	<------------------>|resv |<->| resv |<->| resv |<---------------->
526	                  | |acct.|   | prop.|   | acct.| |
527	                  | |_____|   |______|  /|______| |
528	                  |    |   \           /     |    |
529	                  |====|====\=========|======|====|
530	                  |  __V__  |         |    __V__  |
531	                  | |Local| |         |   |Local| |
532	                  | |Admis| |         |   |Admis| |
533	                  | |Cntrl| |         |   |Cntrl| |
534	                  | |_____| |         |   |_____| |
535	                  |     ____V_      __V____       |
536	                  |    |Class-|    | Packet |     |
537	      ===============->| ifier|====>Schedulr|===================>
538	         data     |    |______|    |________|     |  data
539	                  |                               |
540	                  |_______________________________|

542	                    Figure 3 - ISSLL in Switches

544	On reception of an admission control request, a switch performs the
545	following actions:
546	* ingress bandwidth accountant observes the current state of allocation
547	   of resources on the input port/link and then determines whether the
548	   new allocation would be excessive. The request is passed to the
549	   reservation propagator if accepted so far.
550	* reservation propagator relays the request to the bandwidth accountants
551	   on each of the switch's outbound links to which this reservation
552	   would apply (implied interface to routing/forwarding database).
553	* egress bandwidth accountant observes the current state of allocation
554	   of queueing resources on its outbound port and bandwidth on the link
555	   itself and determines whether the new allocation would be excessive.
556	   Note that this is only the local decision of this switch hop: each
557	   further layer-2 hop through the network gets a chance to veto the
558	   request as it passes along.
559	* the request, if accepted by this switch, is then passed on down the
560	   line on each output link selected.
561	* if this is the first switch in line, the traffic class mapping
562	   authority selects a layer-2 traffic class which appears compatible
563	   with the request and whose use does not violate any administrative
564	   policies in force. In effect, it matches up the requested service with
565	   those available in each of the user_priority classes and chooses the
566	   "best" one. It ensures that, if this reservation is successful, the
567	   selected value is passed back to the client.
568	* if accepted, the switch must notify the client of the user_priority to
569	   use for packets belonging to this flow.  Note that this is a
570	   "provisional YES" - we assume an optimistic approach here: later
571	   switches can still say "NO".
572	* if this switch wishes to reject the request, it can do so by notifying
573	   the original client (by means of its layer-2 address).
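
The per-switch decision above might be sketched as follows. This is a
simplified illustration, not a specification: the names Accountant
and admit are hypothetical, only bandwidth is accounted for (not
queueing resources), flooding of unknown destinations is not
modelled, and the choice of user_priority by the traffic class
mapping authority is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Accountant:
    """Per-port reservation accountant (bandwidth only in this sketch)."""
    capacity_bps: int
    half_duplex: bool = False
    reserved_bps: int = 0

    def fits(self, bps: int) -> bool:
        return self.reserved_bps + bps <= self.capacity_bps


def admit(ports: Dict[int, Accountant], forwarding: Dict[str, List[int]],
          in_port: int, dst_mac: str, bps: int) -> Optional[List[int]]:
    """Return the output ports to pass the request on to, or None to reject."""
    ingress = ports[in_port]
    # Ingress accountant: only a half-duplex link shares capacity between
    # transmit and receive; for full-duplex the check is trivial.
    if ingress.half_duplex and not ingress.fits(bps):
        return None
    # Reservation propagator: consult the forwarding table for the
    # outbound links to which this reservation would apply.
    out_ports = [p for p in forwarding[dst_mac] if p != in_port]
    # Egress accountants: each outbound link makes its own local decision.
    if any(not ports[p].fits(bps) for p in out_ports):
        return None
    # Accepted by this switch: record the reservation. Later hops may
    # still veto it, so this is only a "provisional YES".
    if ingress.half_duplex:
        ingress.reserved_bps += bps
    for p in out_ports:
        ports[p].reserved_bps += bps
    return out_ports
```

Nothing is recorded until every accountant involved has agreed, so a
rejected request leaves the switch state unchanged.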

575	10.2 Mappings to IEEE 802 user_priority

577	There are several options available for mapping service models (Best
578	Effort, Controlled Load, and Guaranteed) to IEEE 802.1p user_priority
579	classes.  The problem with making choices at this time is that we don't
580	have much experience with any particular mappings to help make a
581	determination as to the "best" mapping. So, the following options are
582	presented to stimulate discussion in this area.  Note, this does not
583	dictate what mechanisms/algorithms a network element (e.g. an Ethernet
584	switch) needs to use to implement these mappings: this is an implementation
585	choice and does not matter so long as the requirements for the
586	particular service model are met.

588	In order to reduce the administrative problems of maintaining such
589	mappings, such a mapping table is held by *switches* only (and routers
590	if desired) and is a read-write table. The values proposed below are
591	defaults and can be overridden by management control so long as all
592	switches agree to some extent (the required level of agreement requires
593	further thought).

595	Option A:  The Simple Method

597	In this method, all traffic that uses a particular service model is
598	mapped to a single 802.1p user_priority.  This is fine as long as all
599	traffic for a given service model does not exceed any capacity in the
600	802 device and fine control of delay is not needed.  Here is an example:

602	      Priority  Service
603	        0       "less than" Best Effort
604	        1       Best Effort
605	        2       reserved
606	        3       reserved
607	        4       Controlled Load
608	        5       reserved
609	        6       Guaranteed Service
610	        7       reserved

612	The "less than" best effort service is useful for devices that wish to
613	tag packets that are exceeding a committed network capacity and can be
614	optionally discarded by a downstream device.  Note, this is not
615	necessarily incorporated in any current IntServ model.

617	The advantage of this mapping is that it leaves room for future service
618	models.  The choices of priority 4 and priority 6 for Controlled Load
619	and Guaranteed Service, respectively, are somewhat arbitrary.  Any two
620	priorities greater than Best Effort can be used as long as Guaranteed
621	Service is "greater" than Controlled Load, although those proposed
622	here have the advantage that, for transit through 802.1p switches with
623	only two-level strict priority queuing, they both get "high priority"
624	treatment (the current 802.1p split is 0-3 and 4-7 for 2 queues).

626	One disadvantage to this mapping is that it ignores the delay
627	characteristics of the guaranteed service and groups all guaranteed
628	traffic, no matter what the delay bound, into the same priority.
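
As an illustration only, the Option A defaults and the two-queue
split mentioned above could be written down as follows. The names
OPTION_A and queue_for are hypothetical; the values are those of the
example table and the 0-3 / 4-7 split.

```python
# Default Option A mapping of service model to 802.1p user_priority.
# Priorities 2, 3, 5 and 7 are reserved and so do not appear here.
OPTION_A = {
    "less than best effort": 0,
    "best effort": 1,
    "controlled load": 4,
    "guaranteed": 6,
}


def queue_for(user_priority: int) -> str:
    """Queue chosen by a switch with two-level strict priority queuing."""
    if not 0 <= user_priority <= 7:
        raise ValueError("user_priority must be in the range 0..7")
    # The split for two queues is 0-3 (low) and 4-7 (high).
    return "high" if user_priority >= 4 else "low"
```

With these defaults both Controlled Load and Guaranteed Service land
in the "high" queue, while both Best Effort values land in "low".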

630	Option B:  Two Classes of Guaranteed Service

632	For this method, we expand the number of priorities assigned to the
633	Guaranteed Service:
634	      Priority  Service
635	        0       "less than" Best Effort
636	        1       Best Effort
637	        2       reserved
638	        3       reserved
639	        4       Controlled Load
640	        5       Guaranteed Service, 100ms bound
641	        6       Guaranteed Service, 10ms bound
642	        7       reserved

644	Again, the choices of the exact priorities are somewhat arbitrary as
645	long as they are increasing.  Similarly, the choice of delay bound is
646	also arbitrary but potentially very significant.  One of the key
647	differences is that now there is a bound on delay through the network
648	(and hence through each device) which may be much harder to implement
649	although it can lead to a much more efficient allocation of resources.

651	The advantage to this approach is that it puts some real delay bounds on
652	the Guaranteed Service without adding any additional complexity to the
653	other services.  It still ignores the amount of *bandwidth* available
654	for each class.
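
One way a mapping authority might pick between the two Guaranteed
Service classes of Option B is sketched below. The selection rule is
our assumption, not part of the option: we choose the loosest class
whose bound still satisfies the requested delay, and refuse if none
does. The names GUARANTEED_CLASSES and guaranteed_priority are
hypothetical.

```python
from typing import Optional

# (delay bound in ms, user_priority), ordered from tightest to loosest
# bound; the values are those of the Option B table.
GUARANTEED_CLASSES = [(10.0, 6), (100.0, 5)]


def guaranteed_priority(requested_delay_ms: float) -> Optional[int]:
    """Return a user_priority whose delay bound meets the request, if any."""
    chosen = None
    for bound_ms, user_priority in GUARANTEED_CLASSES:
        if bound_ms <= requested_delay_ms:
            # A later entry has a looser bound; prefer it so that the
            # tighter class is kept for flows that really need it.
            chosen = user_priority
    return chosen
```

For example, a flow asking for a 50 ms bound is placed in the 10 ms
class, while a flow asking for 5 ms cannot be served by either class.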

656	Further derivations of this option could be made by dividing the
657	Guaranteed Service classes into more levels with particular delay
658	bounds.  Expanding the number of priorities for Controlled Load service
659	is not as appealing since there is no need to map to a particular delay
660	bound.  There may be cases where an administrator might map Controlled
661	Load to more priorities for particular bandwidths or policy levels. It
662	may also be necessary to further classify Controlled Load traffic in
663	cases where the Controlled Load traffic is frequently non-conformant
664	for certain applications.

666	10.3 Policy

668	A policy agent may also be implemented by a switch. This determines how
669	to interpret received user_priority values from packets, whether to
670	trust them and whether to map them to something else. The policies in
671	force may be configured by network management. Default is to use what is
672	received and pass it on unchanged.
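
A minimal sketch of such a policy agent follows. The function name
apply_policy and its parameters are hypothetical, and rewriting an
untrusted value to a fixed fallback is only one possible policy; the
default arguments give the default behaviour described above.

```python
from typing import Dict, Optional


def apply_policy(received: int, trust: bool = True,
                 remap: Optional[Dict[int, int]] = None,
                 fallback: int = 1) -> int:
    """Return the user_priority a packet carries onward from this switch."""
    if not 0 <= received <= 7:
        raise ValueError("user_priority must be in the range 0..7")
    if not trust:
        # Assumption: untrusted values are replaced by a configured
        # fallback (1 is Best Effort in the example tables).
        return fallback
    if remap is not None:
        # Values without an entry in the table are left alone.
        return remap.get(received, received)
    # Default: use what is received and pass it on unchanged.
    return received
```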

674	11. Signaling protocol

676	It is not the intention to precisely define a protocol in this document
677	at this time. For now, we propose only some issues that such a protocol
678	should consider:
679	* need to tackle problem of reservation request crossing on a shared
680	   medium ("collisions"): this needs some form of tie-breaker.
681	* failed reservation retry policy: may be a bad idea to retry but we
682	   have to specify behaviour.
683	* one simple approach might be to avoid the election of any "master"
684	   bandwidth arbiter on a segment: if we were to assume an optimistic
685	   approach to reservations with later "veto" power by subsequent
686	   switches or receivers then a large degree of complexity might be avoided.
687	* signaling protocol needs to be able to notify failure of admission
688	   control back to client or back to previous switch hop.

690	12. Shared media

692	The astute reader will have noticed that we have not mentioned the
693	difficulty of dealing with allocation on a single shared CSMA/CD
694	segment: there are a number of reasons for this.

696	Firstly, we do not believe this is a truly solvable problem: it would
697	seem to require a new MAC protocol. Those who are interested in solving
698	this problem per se should probably be following the BLAM developments
699	in 802.3 but we would be suspicious of the interoperability
700	characteristics of a series of new software MACs running above the
701	traditional 802.3 MAC.

703	Secondly, we are not convinced that it is really an interesting problem.
704	While not everyone in the world is buying desktop switches today and
705	there will be end stations living on repeated segments for some time to
706	come, the number of switches is going up and the number of stations on
707	repeated segments is going down. This trend is proceeding to the point
708	that we may be happy with a solution which assumes that any network
709	conversation requiring resource reservations will take place through at
710	least one switch (be it layer-2 or layer-3). Put another way, the
711	easiest QoS upgrade to a layer-2 network is to install segment
712	switching: only when this has been done is it worthwhile to investigate more
713	complex solutions involving admission control.

715	Thirdly, in the core of the network (as opposed to at the edges), there
716	does not seem to be enough economic benefit for repeated segment
717	solutions as opposed to switched solutions. While repeated solutions
718	*may* be 50% cheaper, their cost impact on the entire network is
719	amortised across all of the edge ports. There may be special
720	circumstances in the future (e.g. Gigabit buffered repeaters) but these
721	have differing characteristics to existing CSMA/CD repeaters anyway.

723	13. Compatibility and Interoperability with existing equipment

725	Layer-2-only "standard" 802.1p switches will have to work together with
726	routers and layer-3 switches. Wide deployment of such 802.1p switches is
727	envisaged, in a number of roles in the network. "Desktop switches" will
728	provide dedicated 10/100 Mbps links to end stations at costs
729	comparable/compatible with NICs/adapter cards. Very high speed core
730	switches may act as central campus switching points for layer 3 devices.
731	Real network deployments provide a wide range of examples today. The
732	question is "what functionality beyond that of the basic 802.1D bridge
733	should such 802.1p switches provide?". In the abstract the answer is
734	"whatever they can do to broaden the applicability of the switching
735	solution while still being economically distinct from the layer 3
736	switches in their cost of acquisition, speed/bandwidth, cost of
737	ownership and administration". Broadening the applicability means both
738	addressing the needs of new traffic types and building larger switched
739	networks (or making larger portions of existing networks switched). Thus
740	one could imagine a network in which every device (along a network path)
741	was layer-3 capable/intrusive into the full data stream; or one in which
742	only the edge devices were pure layer-2; or one in which every alternate
743	device lacked layer-3 functionality; or most do - excluding some key
744	control points such as router firewalls, for example. Whatever the mix,
745	the solution has to interoperate with these layer-3 QoS-aware devices.

747	Of course, where intserv flows pass through equipment which is ignorant
748	of priority queuing and which places all packets through the same
749	queuing/overload-dropping path, it is obvious that some of the
750	characteristics of the flow get more difficult to support. Suitable
751	courses of action in the cases where sufficient bandwidth or buffering
752	is not available are of the form:

754	(a)  buy more (and bigger) routers
755	(b)  buy more capable switches
756	(c)  rearrange the network topology: 802.1Q VLANs may help here.
757	(d)  buy more bandwidth: Gigabit Ethernet is nearly here.

759	It would also be possible to pass more information between switches
760	about the capabilities of their neighbours and to route around non-
761	QoS-capable switches: such methods are for further study.

763	14. Epilogue

765	An obvious comment is that this is all too complex, it's what RSVP is
766	doing already, why do we think we can do better by reinventing the
767	solution to this problem at layer-2?

769	The key is that we do not have to tackle the full problem space of RSVP:
770	there are a number of simple scenarios that cover a considerable
771	proportion of the real situations that occur: all we have to do here is
772	cover 99% of the territory at significantly lower cost and leave the
773	other applications to full RSVP running in strategically positioned
774	high-function switches or routers. This will allow a significant
775	reduction in overall network cost (equipment and ownership). This
776	approach does mean that we have to discuss real life situations instead
777	of abstract topologies that "could happen".

779	Sometimes, for example, simple bandwidth configuration in a few switches
780	e.g. to avoid overloading particular trunk links, can be used to
781	overcome bottlenecks due to the network topology: if there are issues
782	with overloading end station "last hops", RSVP in the end stations would
783	exert the correct controls simply by examining local resources without
784	much tie-in to the layer-2 topology. In this case there has been no need
785	to resort to any form of complex topology computation and much
786	complexity has been avoided.

788	In the more general case, there remains work to be done. This will need
789	to be done against the background constraint that the changing of queue
790	service policies and the addition of extra functionality to support new
791	service disciplines will proceed at the rate of hardware product
792	development cycles and advance implementations of new algorithms may be
793	pursued reluctantly or without the necessary 20-20 foresight.

795	However, compared to the alternative of no traffic classes at all, there
796	is substantial benefit in even the simplest of approaches (e.g. 2-4
797	queues with straight priority), so there is significant reward for doing
798	something: wide acceptance of that "something" probably means that even
799	the simplest queue service disciplines will be provided for.

801	15. References

803	[1] ISO/IEC 10038, ANSI/IEEE Std 802.1D-1993 "MAC Bridges"

805	[2] "MAC Bridges - Traffic Classes and Dynamic Multicast Filtering
806	        Services in Bridged Local Area Networks", October 1996
807	        IEEE P802.1p/D4

809	[3] "Integrated Services in the Internet Architecture: an Overview"
810	        RFC1633, June 1994

812	[4] "Resource Reservation Protocol (RSVP) - Version 1 Functional
813	       Specification" Internet Draft, November 1996
814	       <draft-ietf-rsvp-spec-14.ps>

816	[5] "Carrier Sense Multiple Access with Collision Detection
817	       (CSMA/CD) Access Method and Physical Layer Specifications"
818	       ANSI/IEEE Std 802.3-1985.

820	[6] "Token-Ring Media Access Control"
821	         IEEE Std 802.5

823	[7] "A Framework for Providing Integrated Services Over Shared and
824	       Switched LAN Technologies", Internet Draft, November 1996
825	       <draft-ghanwani-framework-is-lan-01.txt>

827	[8] "Specification of the Controlled-Load Network Element Service",
828	       Internet Draft, August 1996,
829	       <draft-ietf-intserv-ctrl-load-svc-03.txt>

831	[9] "Specification of Guaranteed Quality of Service",
832	       Internet Draft, August 1996,
833	       <draft-ietf-intserv-guaranteed-svc-06.txt>

835	16. Security Considerations

837	Security issues are not addressed in this memo.

839	17. Authors' addresses
840	Mick Seaman
841	3Com Corp.
842	5400 Bayfront Plaza
843	Santa Clara CA 95052-8145
844	USA
845	+1 (408) 764 5000
846	mick_seaman@3com.com

848	Andrew Smith
849	Extreme Networks
850	1601 S De Anza Blvd. #220
851	Cupertino CA 95014
852	USA
853	+1 (408) 342 0999
854	andrew@extremenetworks.com

856	Eric Crawley
857	Bay Networks
858	3 Federal St.
859	Billerica MA 01821
860	USA
861	+1 (508) 670 8888
862	esc@baynetworks.com