idnits 2.17.1 

draft-ietf-diffserv-model-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([DSARCH], [DSMIB],
     [QOSDEVMOD], [DSPIB]), which it shouldn't.  Please replace those with
     straight textual mentions of the documents in question.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 2 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 1095: '... such operations MUST NOT have the eff...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 154 has weird spacing: '...serving    ser...'

  == Line 177 has weird spacing: '...tioning  other...'

  == Line 184 has weird spacing: '...serving    ser...'

  == Line 1052 has weird spacing: '...ally to  const...'

  == Line 1153 has weird spacing: '...such as  the c...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 2000) is 8737 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'DSCP' is mentioned on line 339, but not defined

  ** Downref: Normative reference to an Informational RFC: RFC 2475 (ref.
     'DSARCH')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DSMIB'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DSPIB'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DSTERMS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'E2E'

  ** Obsolete normative reference: RFC 2598 (ref. 'EF-PHB') (Obsoleted by RFC
     3246)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GTC'

  ** Downref: Normative reference to an Informational RFC: RFC 1633 (ref.
     'INTSERV')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'POLTERM'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'QOSDEVMOD'

  ** Downref: Normative reference to an Informational RFC: RFC 2697 (ref.
     'SRTCM')

  ** Downref: Normative reference to an Informational RFC: RFC 2698 (ref.
     'TRTCM')


     Summary: 11 errors (**), 0 flaws (~~), 10 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                Y. Bernet
2	Diffserv Working Group                                         Microsoft
3	INTERNET-DRAFT                                                  A. Smith
4	Expires November 2000                                   Extreme Networks
5	draft-ietf-diffserv-model-03.txt                                S. Blake
6	                                                                Ericsson
7	                                                             D. Grossman
8	                                                                Motorola
9	                                                                May 2000
10	                A Conceptual Model for Diffserv Routers

12	Status of this Memo

14	This document is an Internet-Draft and is in full conformance with all
15	provisions of Section 10 of RFC2026.  Internet-Drafts are working
16	documents of the Internet Engineering Task Force (IETF), its areas, and
17	its working groups. Note that other groups may also distribute working
18	documents as Internet-Drafts.

20	Internet-Drafts are draft documents valid for a maximum of six months
21	and may be updated, replaced, or obsoleted by other documents at any
22	time. It is inappropriate to use Internet-Drafts as reference material
23	or to cite them other than as "work in progress."

25	The list of current Internet-Drafts can be accessed at
26	http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft
27	Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

29	This document is a product of the IETF's Differentiated Services Working
30	Group. Comments should be addressed to the WG's mailing list at
31	diffserv@ietf.org. The charter for Differentiated Services may be found
32	at http://www.ietf.org/html.charters/diffserv-charter.html. Copyright (C)
33	The Internet Society (2000). All Rights Reserved.

35	Distribution of this memo is unlimited.

37	Abstract

39	This draft proposes a conceptual model of Differentiated Services
40	(Diffserv) routers for use in their management and configuration.  This
41	model defines the general functional datapath elements (classifiers,
42	meters, markers, droppers, monitors, multiplexors, queues), their
43	possible configuration parameters, and how they might be interconnected
44	to realize the range of classification, traffic conditioning, and per-
45	hop behavior (PHB) functionalities described in [DSARCH]. The model is
46	intended to be abstract and capable of representing the configuration
47	parameters important to Diffserv functionality for a variety of specific
48	router implementations. It is not intended as a guide to hardware
49	implementation.

51	This model serves as the rationale for the design of an SNMP MIB [DSMIB]
52	and for other configuration interfaces (e.g.  [DSPIB]) and more detailed
53	models (e.g. [QOSDEVMOD]): these should all be based upon and consistent
54	with this model.

56	1.  Introduction

58	Differentiated Services (Diffserv) [DSARCH] is a set of technologies
59	which allow network service providers to offer different kinds of
60	network quality-of-service (QoS) to different customers and their
61	traffic streams. The premise of Diffserv networks is that routers within
62	the core of the network handle packets in different traffic streams by
63	forwarding them using different per-hop behaviors (PHBs).  The PHB to be
64	applied is indicated by a Diffserv codepoint (DSCP) in the IP header of
65	each packet [DSFIELD].  Note that this document uses the terminology
66	defined in [DSARCH, DSTERMS] and in Section 2.

68	The advantage of such a scheme is that many traffic streams can be
69	aggregated to one of a small number of behavior aggregates (BA) which
70	are each forwarded using the same PHB at the router, thereby simplifying
71	the processing and associated storage. In addition, there is no
72	signaling, other than what is carried in the DSCP of each packet, and no
73	other related processing that is required in the core of the Diffserv
74	network since QoS is invoked on a packet-by-packet basis.

76	The Diffserv architecture enables a variety of possible services which
77	could be deployed in a network. These services are reflected to
78	customers at the edges of the Diffserv network in the form of a Service
79	Level Specification (SLS) [DSTERMS]. The ability to provide these
80	services depends on the availability of cohesive management and
81	configuration tools that can be used to provision and monitor a set of
82	Diffserv routers in a coordinated manner. To facilitate the development
83	of such configuration and management tools it is helpful to define a
84	conceptual model of a Diffserv router that abstracts away implementation
85	details of particular Diffserv routers from the parameters of interest
86	for configuration and management. The purpose of this draft is to define
87	such a model.

89	The basic forwarding functionality of a Diffserv router is defined in
90	other specifications; e.g., [DSARCH, DSFIELD, AF-PHB, EF-PHB].

92	This document is not intended in any way to constrain or to dictate the
93	implementation alternatives of Diffserv routers. It is expected that
94	router implementers will demonstrate a great deal of variability in
95	their implementations. To the extent that implementers are able to model
96	their implementations using the abstractions described in this draft,
97	configuration and management tools will more readily be able to
98	configure and manage networks incorporating Diffserv routers of assorted
99	origins.

101	o    Section 3 starts by describing the basic high-level functional
102	     elements of a Diffserv router and describes its various
103	     components, then focuses on the Diffserv-specific components of
104	     the router and a hierarchical management model for these
105	     components.

107	o    Section 4 describes classification elements.

109	o    Section 5 discusses meter elements.

111	o    Section 6 discusses action elements.

113	o    Section 7 discusses the basic queueing elements and their
114	     functional behaviors (e.g. shaping).

116	o    Section 8 shows how the basic classification, meter, action and
117	     queueing elements can be combined to build modules called Traffic
118	     Conditioning Blocks (TCBs).

120	o    Section 9 discusses open issues with this document.

122	o    Section 10 discusses security concerns.

124	2.  Glossary

126	This memo uses terminology which is defined in [DSARCH] and in
127	[DSTERMS].  Some of the terms defined there are defined again here in
128	order to provide additional detail, along with some new terms specific
129	to this document.

131	   Classifier    A functional datapath element which consists of filters
132	                 which select packets based on the content of packet
133	                 headers or other packet data, and/or on implicit or
134	                 derived attributes associated with the packet, and
135	                 forwards the packet along a particular datapath within
136	                 the router. A classifier splits a single incoming
137	                 traffic stream into multiple outgoing ones.

139	   Counter       A functional datapath element which updates a packet
140	                 counter and also an octet counter for every
141	                 packet that passes through it. Used for collecting
142	                 statistics.

144	   Filter        A set of wildcard, prefix, masked, range and/or exact
145	                 match conditions on the components of a packet's
146	                 classification key. A filter is said to match only if
147	                 each condition is satisfied.

149	   Multiplexer   A functional datapath element that merges multiple
150	   (Mux)         traffic streams (datapaths) into a single traffic
151	                 stream (datapath).

153	   Non-work-     A property of a scheduling algorithm such that it
154	   conserving    services packets no sooner than a scheduled departure
155	                 time, even if this means leaving packets in a FIFO
156	                 while the link is idle.

158	   Queueing      A combination of functional datapath elements
159	   Block         that modulates the transmission of packets belonging
160	                 to a traffic stream and determines their
161	                 ordering, possibly storing them temporarily or
162	                 discarding them.

164	   Scheduling    An algorithm which determines which queue of a set
165	   algorithm     of queues to service next. This may be based on the
166	                 relative priority of the queues, on a weighted fair
167	                 bandwidth sharing policy or some other policy. Such
168	                 an algorithm may be either work-conserving or non-
169	                 work-conserving.

171	   Shaping       The process of delaying packets within a traffic stream
172	                 to cause it to conform to some defined traffic profile.
173	                 Shaping can be implemented using a queue serviced by a
174	                 non-work-conserving scheduling algorithm.

176	   Traffic       A logical datapath entity consisting of a number of
177	   Conditioning  other functional datapath entities interconnected in
178	   Block (TCB)   such a way as to perform a specific set of traffic
179	                 conditioning functions on an incoming traffic stream.
180	                 A TCB can be thought of as an entity with one
181	                 input and one output and a set of control parameters.

183	   Work-         A property of a scheduling algorithm such that it
184	   conserving    services a packet, if one is available, at every
185	                 transmission opportunity.
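
The work-conserving and non-work-conserving definitions above can be
contrasted with a minimal sketch. The function names, the (name,
departure time) packet tuples and the time values are assumptions made
for illustration only; they are not part of the model.

```python
from collections import deque

def work_conserving(fifo, now):
    """Serve the head packet at every opportunity, if one is queued."""
    return fifo.popleft() if fifo else None

def non_work_conserving(fifo, now):
    """Serve the head packet only once its scheduled departure time is reached."""
    if fifo and fifo[0][1] <= now:
        return fifo.popleft()
    return None  # the link stays idle even though a packet is waiting

# Hypothetical packet: (name, scheduled departure time)
q_wc = deque([("p1", 5)])
q_nwc = deque([("p1", 5)])

sent_early = work_conserving(q_wc, now=0)       # sent immediately
held_back = non_work_conserving(q_nwc, now=0)   # held: too early
sent_on_time = non_work_conserving(q_nwc, now=5)
```

Holding the packet until its departure time is what allows a queue
serviced this way to act as a shaper.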

187	3.  Conceptual Model

189	This section introduces a block diagram of a Diffserv router and
190	describes the various components illustrated. Note that a Diffserv core
191	router is assumed to include only a subset of these components: the
192	model presented here is intended to cover the case of both Diffserv edge
193	and core routers.

195	3.1.  Elements of a Diffserv Router

197	The conceptual model includes abstract definitions for the following:

199	   o    Traffic Classification elements.

201	   o    Metering functions.

203	   o    Traffic Conditioning (TC) actions of Marking, Absolute Dropping,
204	        Counting and Multiplexing.

206	   o    Queueing elements, including capabilities of algorithmic
207	        dropping.

209	   o    Certain combinations of traffic classification, traffic
210	        conditioning and queueing elements.

212	The components and combinations of components described in this document
213	form building blocks that need to be manageable by Diffserv
214	configuration and management tools. One of the goals of this document is
215	to show how a model of a Diffserv device can be built using these
216	component blocks. This model is in the form of a connected directed
217	acyclic graph (DAG) of functional datapath elements that describes the
218	traffic conditioning and queueing behaviors that any particular packet
219	will experience when forwarded to the Diffserv router.
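
As an illustration of the DAG form, the sketch below represents a small
datapath as a mapping from each element to the elements that feed it and
checks that it contains no cycle. The element names and the wiring are
hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter, CycleError

# element -> set of elements whose output feeds it (hypothetical wiring)
datapath = {
    "meter": {"classifier"},
    "marker": {"meter"},
    "dropper": {"meter"},
    "queue": {"marker"},
}

def is_acyclic(graph):
    """Return True if the element graph can be ordered, i.e. has no cycle."""
    try:
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False

# One valid processing order: every element appears after its inputs.
order = list(TopologicalSorter(datapath).static_order())

# Feeding the queue back into the classifier would break the DAG property.
looped = dict(datapath, classifier={"queue"})
```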

221	The following diagram illustrates the major functional blocks of a
222	Diffserv router:

224	3.1.1.  Datapath

226	An ingress interface, routing core and egress interface are illustrated
227	at the center of the diagram. In actual router implementations, there
228	may be an arbitrary number of ingress and egress interfaces
229	interconnected by the routing core. The routing core element serves as
230	an abstraction of a router's normal routing and switching functionality.
231	The routing core moves packets between interfaces according to policies
232	outside the scope of Diffserv. The actual queueing delay and packet loss
233	               +---------------+
234	               | Diffserv      |
235	        Mgmt   | configuration |
236	      <----+-->| & management  |------------------+
237	      SNMP,|   | interface     |                  |
238	      COPS |   +---------------+                  |
239	      etc. |        |                             |
240	           |        |                             |
241	           |        v                             v
242	           |   +-------------+                 +-------------+
243	           |   | ingress i/f |   +---------+   | egress i/f  |
244	     --------->|  classify,  |-->| routing |-->|  classify,  |---->
245	     data  |   |  meter,     |   |  core   |   |  meter      |data out
246	      in   |   |  action,    |   +---------+   |  action,    |
247	           |   |  queueing   |                 |  queueing   |
248	           |   +-------------+                 +-------------+
249	           |        ^                             ^
250	           |        |                             |
251	           |        |                             |
252	           |   +------------+                     |
253	           +-->| QOS agent  |                     |
254	      -------->| (optional) |---------------------+
255	        QOS    | (e.g. RSVP)|
256	        cntl   +------------+
257	        msgs
258	              Figure 1:  Diffserv Router Major Functional Blocks

260	behavior of a specific router's switching fabric/backplane is not
261	modeled by the routing core; these should be modeled using the
262	functional elements described later. The routing core should be thought
263	of as an infinite bandwidth, zero-delay backplane connecting ingress
264	and egress interfaces.

266	The components of interest on the ingress/egress interfaces are the
267	traffic classifiers, traffic conditioning (TC) components, and the
268	queueing components that support Diffserv traffic conditioning and per-
269	hop behaviors [DSARCH]. These are the fundamental components comprising
270	a Diffserv router and will be the focal point of our conceptual model.

272	3.1.2.  Configuration and Management Interface

274	Diffserv operating parameters are monitored and provisioned through this
275	interface. Monitored parameters include statistics regarding traffic
276	carried at various Diffserv service levels. These statistics may be
277	important for accounting purposes and/or for tracking compliance to
278	Traffic Conditioning Specifications (TCSs) [DSTERMS] negotiated with
279	customers. Provisioned parameters are primarily classification rules, TC
280	and PHB configuration parameters. The network administrator interacts
281	with the Diffserv configuration and management interface via one or more
282	management protocols, such as SNMP or COPS, or through other router
283	configuration tools such as serial terminal or telnet consoles.

285	Specific policy objectives are presumed to be installed by or retrieved
286	from policy management mechanisms. However, Diffserv routers are subject
287	to implementation decisions which form a meta-policy that scopes the
288	kinds of policies which can be created.

290	3.1.3.  Optional QoS Agent Module

292	Diffserv routers may snoop or participate in either per-microflow or
293	per-flow-aggregate signaling of QoS requirements [E2E], e.g. using the
294	RSVP protocol. Snooping of RSVP messages may be used, for example, to
295	learn how to classify traffic without actually participating as an RSVP
296	protocol peer. Diffserv routers may reject or admit RSVP reservation
297	requests to provide a means of admission control to Diffserv-based
298	services or they may use these requests to trigger provisioning changes
299	for a flow-aggregation in the Diffserv network. A flow-aggregation in
300	this context might be equivalent to a Diffserv BA or it may be more
301	fine-grained, relying on an MF classifier [DSARCH]. Note that the
302	conceptual model of such a router implements the Integrated Services
303	Model as described in [INTSERV], applying the control plane controls to
304	the data classified and conditioned in the data plane, as described in
305	[E2E].

307	Note that a QoS Agent component of a Diffserv router, if present, might
308	be active only in the control plane and not in the data plane. In this
309	scenario, RSVP could be used merely to signal reservation state without
310	installing any actual reservations in the data plane of the Diffserv
311	router: the data plane could still act purely on Diffserv DSCPs and
312	provide PHBs for handling data traffic without the normal per-microflow
313	handling expected to support some Intserv services.

315	3.2.  Hierarchical Model of Diffserv Components

317	This document focuses on the Diffserv-specific components of the router:
318	classification, traffic conditioning and queueing functions.  Figure 2
319	shows a high-level view of ingress and egress interfaces of a router.
320	The diagram illustrates two Diffserv router interfaces, each having an
321	ingress and an egress component. It shows classification, meter, action
322	and queueing elements which might be instantiated on each interface's
323	ingress and egress component. The TC functionality is implemented by a
324	combination of classification, action, meter and queueing elements.

326	In principle, if one were to construct a network entirely out of two-
327	port routers (in appropriate places connected by LANs or similar media),
328	then it would be necessary for each router to perform four QoS control
329	functions in the datapath on traffic in each direction:

331	-    Classify each message according to some set of rules.

333	-    If necessary, determine whether the data stream the message is part
334	     of is within or outside its rate by metering the stream.

336	-    Perform a set of resulting actions, including applying a drop
337	     policy appropriate to the classification and queue in question and
338	     perhaps additionally marking the traffic with a Differentiated
339	     Services Code Point (DSCP) as defined in [DSCP].

341	-    Enqueue the traffic for output in the appropriate queue, which may
342	     either shape the traffic or simply forward it with some minimum
343	     rate or maximum latency.
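
The four functions listed above can be sketched as a simple pipeline.
This is only an illustration under stated assumptions: the rule table,
the per-class byte budget used as a stand-in for a real meter, the
re-marking value and all function names are hypothetical.

```python
from collections import deque

def classify(packet, rules):
    """Step 1: return the class of the first rule whose DSCP matches."""
    for dscp, traffic_class in rules:
        if dscp is None or packet["dscp"] == dscp:  # None = wildcard
            return traffic_class
    raise ValueError("incomplete rule set")

def meter(packet, budget, traffic_class):
    """Step 2: in-profile while the class still has byte budget left."""
    if budget[traffic_class] >= packet["size"]:
        budget[traffic_class] -= packet["size"]
        return True
    return False

def act(packet, in_profile, remark_dscp):
    """Step 3: leave in-profile packets alone; re-mark the others."""
    return packet if in_profile else dict(packet, dscp=remark_dscp)

def enqueue(packet, queues, traffic_class):
    """Step 4: place the packet on the queue selected for its class."""
    queues[traffic_class].append(packet)

def process(packet, rules, budget, queues, remark_dscp=0):
    traffic_class = classify(packet, rules)
    in_profile = meter(packet, budget, traffic_class)
    packet = act(packet, in_profile, remark_dscp)
    enqueue(packet, queues, traffic_class)
    return packet

rules = [(0b101010, "A"), (None, "C")]
budget = {"A": 1500, "C": 10_000}            # bytes, hypothetical
queues = {"A": deque(), "C": deque()}

first = process({"dscp": 0b101010, "size": 1000}, rules, budget, queues)
second = process({"dscp": 0b101010, "size": 1000}, rules, budget, queues)
```

Here the second packet exceeds the remaining budget for class A and is
re-marked rather than dropped; an absolute dropper would be an equally
valid action at step 3.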

345	If the network is now built out of N-port routers, the expected behavior
346	of the network should be identical. Therefore, this model must provide
347	for essentially the same set of functions on the ingress as on the
348	egress port of the router. Some interfaces will be "edge" interfaces and
349	some will be "interior" to the Differentiated Services domain. The one
350	point of difference between an ingress and an egress interface is that

352	             Interface A                        Interface B
353	          +-------------+     +---------+     +-------------+
354	          | ingress i/f |     |         |     | egress i/f  |
355	          |   classify, |     |         |     |   classify, |
356	      --->|   meter,    |---->|         |---->|   meter,    |--->
357	          |   action,   |     |         |     |   action,   |
358	          |   queueing  |     |         |     |   queueing  |
359	          +-------------+     | routing |     +-------------+
360	                              |  core   |
361	          +-------------+     |         |     +-------------+
362	          | egress i/f  |     |         |     | ingress i/f |
363	          |   classify, |     |         |     |   classify, |
364	      <---|   meter,    |<----|         |<----|   meter,    |<---
365	          |   action,   |     |         |     |   action,   |
366	          |   queueing  |     +---------+     |   queueing  |
367	          +-------------+                     +-------------+

369	      Figure 2. Traffic Conditioning and Queueing Elements

371	all traffic on an egress interface is queued, while traffic on an
372	ingress interface will typically be queued only for shaping purposes, if
373	at all.  Therefore, equivalent functional elements are modelled on both
374	the ingress and egress components of an interface.

376	Note that it is not mandatory that each of these functional elements be
377	implemented on both ingress and egress components; equally, the model
378	allows that multiple sets of these elements may be placed in series
379	and/or in parallel at ingress or at egress. The arrangement of elements
380	is dependent on the service requirements on a particular interface on a
381	particular router. By modelling these elements on both ingress and
382	egress components, it is not implied that they must be implemented in
383	this way in a specific router. For example, a router may implement all
384	shaping and PHB queueing on the interface egress component or may
385	instead implement it only on the ingress component. Furthermore, the
386	classification needed to map a packet to an egress component queue (if
387	present) need not be implemented on the egress component but instead may
388	be implemented on the ingress component, with the packet passed through
389	the routing core with in-band control information to allow for egress
390	queue selection.

392	From a device-configuration and management perspective, the following
393	hierarchy exists:

395	     At the top level, the network administrator manages interfaces.
396	     Each interface consists of an ingress component and an egress
397	     component.  Each component may contain classifier, action, meter
398	     and queueing elements.

400	     At the next level, the network administrator manages groups of
401	     functional elements interconnected in a DAG. These elements are
402	     organized in self-contained Traffic Conditioning Blocks (TCBs)
403	     which are used to implement some desired network policy (see
404	     Section 8). One or more TCBs may be instantiated on each ingress or
405	     egress component; they may be connected in series and/or in
406	     parallel configurations on the multiple outputs of a classifier.
407	     The TCB is defined optionally to include classification and
408	     queueing elements so as to allow for flexible functionality. A TCB
409	     can be thought of as a "black box" with one input and one output in
410	     the data path. Each interface (ingress or egress) may have
411	     different TCB configurations.

413	     At the lowest level are individual functional elements, each with
414	     their own configuration parameters and management counters and
415	     flags.
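
The three management levels described above could be represented by
nested records such as the following. This is a sketch only; the class
and field names are assumptions and do not correspond to any MIB or PIB
object.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """Lowest level: one functional element with its own parameters."""
    kind: str                       # e.g. "classifier", "meter", "queue"
    params: dict = field(default_factory=dict)

@dataclass
class TCB:
    """Middle level: a group of interconnected functional elements."""
    name: str
    elements: list = field(default_factory=list)

@dataclass
class Component:
    """The ingress or egress half of an interface."""
    tcbs: list = field(default_factory=list)

@dataclass
class Interface:
    """Top level: what the network administrator manages directly."""
    name: str
    ingress: Component = field(default_factory=Component)
    egress: Component = field(default_factory=Component)

if_a = Interface("A")
if_a.ingress.tcbs.append(
    TCB("edge-policing", [Element("classifier"), Element("meter"),
                          Element("marker")]))
```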

417	4.  Classifiers

419	4.1.  Definition

421	Classification is performed by a classifier element. Classifiers are 1:N
422	(fan-out) devices: they take a single traffic stream as input and
423	generate N logically separate traffic streams as output. Classifiers are
424	parameterized by filters and output streams. Packets from the input
425	stream are sorted into various output streams by filters which match the
426	contents of the packet or possibly match other attributes associated
427	with the packet. Various types of classifiers are described in the
428	following sections.

430	We use the following diagram to illustrate a classifier, where the
431	outputs connect to succeeding functional elements:

433	      unclassified              classified
434	      traffic                   traffic
435	              +------------+
436	              |            |--> match Filter1 --> OutputA
437	      ------->| classifier |--> match Filter2 --> OutputB
438	              |            |--> no match      --> OutputC
439	              +------------+

441	      Figure 3. An Example Classifier

443	Note that we allow a multiplexor (see Section 6.5) before the classifier
444	to allow input from multiple traffic streams. For example, if multiple
445	ingress sub-interfaces feed through a single classifier then the
446	interface number can be considered by the classifier as a packet
447	attribute and be included in the packet's classification key. This
448	optimization may be important for scalability in the management plane.
449	Another example of a packet attribute could be an integer representing
450	the BGP community string associated with the packet's best-matching
451	route.

453	The following classifier separates traffic into one of three output
454	streams based on three filters:

456	      Filter Matched        Output Stream
457	      --------------       ---------------
458	      Filter1                    A
459	      Filter2                    B
460	      Filter3 (no match)         C

462	Where Filter1 and Filter2 are defined to be the following BA filters
463	([DSARCH], Section 4.2.1):

465	      Filter        DSCP
466	      ------       ------
467	        1           101010
468	        2           111111
469	        3           ****** (wildcard)
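
The two tables above translate directly into a small lookup. The
following sketch assumes an ordered list in which the wildcard entry is
tried last; the data layout and the function name are illustrative only.

```python
# (filter name, DSCP value or None for wildcard, output stream)
FILTERS = [
    ("Filter1", 0b101010, "A"),
    ("Filter2", 0b111111, "B"),
    ("Filter3", None, "C"),      # wildcard: matches every packet
]

def classify(dscp):
    """Return the output stream of the first filter matching the DSCP."""
    for _name, value, output in FILTERS:
        if value is None or dscp == value:
            return output
    raise ValueError("no filter matched")  # unreachable with a wildcard

stream_for_101010 = classify(0b101010)
stream_for_111111 = classify(0b111111)
stream_for_other = classify(0b000000)
```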

471	4.1.1.  Filters

473	A filter consists of a set of conditions on the component values of a
474	packet's classification key (the header values, contents, and attributes
475	relevant for classification). In the BA classifier example above, the
476	classification key consists of one packet header field, the DSCP, and
477	both Filter1 and Filter2 specify exact-match conditions on the value of
478	the DSCP. Filter3 is a wildcard default filter which matches every
479	packet, but which is only selected in the event that no other more
480	specific filter matches.

482	In general there is a set of possible component conditions including
483	exact, prefix, range, masked, and wildcard matches. Note that ranges can
484	be represented (with less efficiency) as a set of prefixes and that
485	prefix matches are just a special case of both masked and range matches.
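
The relationship between prefix, masked and range conditions can be
checked with a short sketch using Python's standard ipaddress module.
The addresses are documentation-range examples chosen for illustration.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/24")
addr = ipaddress.ip_address("192.0.2.77")

# A prefix match written as a masked match: (addr & mask) == value
mask = int(net.netmask)
value = int(net.network_address)
masked_match = (int(addr) & mask) == value

# The same prefix written as a range match: first <= addr <= last
range_match = net.network_address <= addr <= net.broadcast_address

# A range that is not itself a prefix needs several prefixes to cover it.
prefixes = [str(p) for p in ipaddress.summarize_address_range(
    ipaddress.ip_address("192.0.2.0"),
    ipaddress.ip_address("192.0.2.130"))]
```

The range 192.0.2.0 to 192.0.2.130 expands into three prefixes, which
illustrates why representing ranges as prefixes is less efficient.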

487	In the case of an MF classifier [DSARCH], the classification key consists
488	of a number of packet header fields. The filter may specify a different
489	condition for each key component, as illustrated in the example below
490	for an IPv4/TCP classifier:

492	      Filter   IP Src Addr    IP Dest Addr   TCP SrcPort  TCP DestPort
493	      ------   -------------  -------------  -----------  ------------
494	      Filter4  172.31.8.1/32  172.31.3.X/24       X           5003

496	In this example, the fourth octet of the destination IPv4 address and
497	the source TCP port are wildcard or "don't cares".
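
Filter4 can be expressed as one condition per key component. In the
sketch below the class name, the field names and the use of None as the
wildcard are assumptions; the address and port values are those of the
example above.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MFFilter:
    src: ipaddress.IPv4Network
    dst: ipaddress.IPv4Network
    src_port: Optional[int]      # None = wildcard ("don't care")
    dst_port: Optional[int]

    def matches(self, src, dst, src_port, dst_port):
        """A filter matches only if every condition is satisfied."""
        return (ipaddress.ip_address(src) in self.src
                and ipaddress.ip_address(dst) in self.dst
                and self.src_port in (None, src_port)
                and self.dst_port in (None, dst_port))

filter4 = MFFilter(
    src=ipaddress.ip_network("172.31.8.1/32"),   # exact match
    dst=ipaddress.ip_network("172.31.3.0/24"),   # fourth octet wildcarded
    src_port=None,                               # wildcard
    dst_port=5003,                               # exact match
)

hit = filter4.matches("172.31.8.1", "172.31.3.200", 40000, 5003)
wrong_port = filter4.matches("172.31.8.1", "172.31.3.200", 40000, 80)
wrong_src = filter4.matches("172.31.8.2", "172.31.3.200", 40000, 5003)
```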

499	MF classification of fragmented packets is impossible if the filter uses
500	transport-layer port numbers e.g. TCP port numbers. MTU-discovery is
501	therefore a prerequisite for proper operation of a Diffserv network that
502	uses such classifiers.

504	4.1.2.  Overlapping Filters

506	Note that it is easy to define sets of overlapping filters in a
507	classifier. For example:

509	      Filter5:
510	      Type:   Masked-DSCP
511	      Value:  111000 (binary)
512	      Mask:   111000 (binary)

514	      Filter6:
515	      Type:   Masked-DSCP
516	      Value:  000111 (binary)
517	      Mask:   000111 (binary)

519	A packet containing DSCP = 111111 cannot be uniquely classified by this
520	pair of filters and so a precedence must be established between Filter5
521	and Filter6 in order to break the tie. This precedence must be
522	established either (a) by a manager which knows that the router can
523	accomplish this particular ordering e.g. by means of reported
524	capabilities, or (b) by the router along with a mechanism to report to a
525	manager which precedence is being used. These ordering mechanisms must
526	be supported by the configuration and management protocols although
527	further discussion of this is outside the scope of this document.

529	As another example, one might want first to disallow certain
530	applications from using the network at all, or to classify some
531	individual traffic streams that are not Diffserv-marked. Traffic that is
532	not classified by those tests might then be inspected for a DSCP. The
533	word "then" implies sequence and this must be specified by means of
534	precedence.

536	An unambiguous classifier requires that every possible classification
537	key match at least one filter (possibly the wildcard default) and that
538	any ambiguity between overlapping filters be resolved by precedence.
539	Therefore, the classifiers on any given interface must be "complete" and
540	will often include an "everything else" filter as the lowest precedence
541	element in order for the result of classification to be deterministic.
542	Note that this completeness is only required of the first classifier
543	that incoming traffic will meet as it enters an interface - subsequent
544	classifiers on an interface need only handle the traffic that they are
545	known to receive.
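
The tie between Filter5 and Filter6, and the role of a lowest-precedence
"everything else" filter, can be sketched in Python as follows. The
numeric precedence values, the rule that a smaller number wins and the
name "Default" are assumptions for illustration only; the model leaves
the ordering mechanism to the configuration and management protocols.

```python
# (name, value, mask, precedence); a smaller number wins (assumed convention).
FILTERS = [
    ("Filter5", 0b111000, 0b111000, 1),
    ("Filter6", 0b000111, 0b000111, 2),
    ("Default", 0b000000, 0b000000, 99),  # wildcard "everything else" filter
]


def matching_filters(dscp):
    """Return the names of every filter whose masked value matches dscp."""
    return [name for name, value, mask, _ in FILTERS
            if (dscp & mask) == (value & mask)]


def classify(dscp):
    """Resolve overlaps by precedence; the default makes the result total."""
    hits = [(prec, name) for name, value, mask, prec in FILTERS
            if (dscp & mask) == (value & mask)]
    return min(hits)[1]


print(matching_filters(0b111111))  # more than one filter matches this DSCP
print(classify(0b111111))          # precedence picks exactly one of them
```

Because the wildcard entry matches every key, classify() always returns
a result, which is the "complete" property described above.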

547	4.2.  Examples

549	4.2.1.  Behaviour Aggregate (BA) Classifier

551	The simplest Diffserv classifier is a behavior aggregate (BA) classifier
552	[DSARCH]. A BA classifier uses only the Diffserv codepoint (DSCP) in a
553	packet's IP header to determine the logical output stream to which the
554	packet should be directed. We allow only an exact-match condition on
555	this field because the assigned DSCP values have no structure, and
556	therefore no subset of DSCP bits is significant.

558	The following defines a possible BA filter:

560	      Filter8:
561	      Type:   BA
562	      Value:  111000

564	4.2.2.  Multi-Field (MF) Classifier

566	Another type of classifier is a multi-field (MF) classifier [DSARCH].
567	This classifies packets based on one or more fields in the packet
568	(possibly including the DSCP). A common type of MF classifier is a 6-
569	tuple classifier that classifies based on six fields from the IP and TCP
570	or UDP headers (destination address, source address, IP protocol, source
571	port, destination port, and DSCP). MF classifiers may classify on other
572	fields such as MAC addresses, VLAN tags, link-layer traffic class fields
573	or other higher-layer protocol fields.

575	The following defines a possible MF filter:

577	      Filter9:
578	      Type:              IPv4-6-tuple
579	      IPv4DestAddrValue: 0.0.0.0
580	      IPv4DestAddrMask:  0.0.0.0
581	      IPv4SrcAddrValue:  172.31.8.0
582	      IPv4SrcAddrMask:   255.255.255.0
583	      IPv4DSCP:          28
584	      IPv4Protocol:      6
585	      IPv4DestL4PortMin: 0
586	      IPv4DestL4PortMax: 65535
587	      IPv4SrcL4PortMin:  20
588	      IPv4SrcL4PortMax:  20

590	A similar type of classifier can be defined for IPv6.
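
A minimal Python sketch of how a filter such as Filter9 might be
evaluated is given below. The class name, the field names and the
dictionary used to represent a packet are hypothetical, and the DSCP
value 28 is read here as a decimal number; none of this is prescribed
by the model.

```python
import ipaddress
from dataclasses import dataclass


def ip(text):
    """Convert a dotted-quad string to an integer."""
    return int(ipaddress.IPv4Address(text))


@dataclass(frozen=True)
class SixTupleFilter:
    dst_value: str
    dst_mask: str
    src_value: str
    src_mask: str
    dscp: int
    protocol: int
    dst_ports: range
    src_ports: range

    def matches(self, pkt):
        """pkt is a dict with keys src, dst, dscp, proto, sport, dport."""
        dst_ok = (ip(pkt["dst"]) & ip(self.dst_mask)) == (
            ip(self.dst_value) & ip(self.dst_mask))
        src_ok = (ip(pkt["src"]) & ip(self.src_mask)) == (
            ip(self.src_value) & ip(self.src_mask))
        return (dst_ok and src_ok
                and pkt["dscp"] == self.dscp
                and pkt["proto"] == self.protocol
                and pkt["dport"] in self.dst_ports
                and pkt["sport"] in self.src_ports)


# Filter9: any destination, source 172.31.8.0/24, DSCP 28, TCP,
# any destination port, source port 20.
FILTER9 = SixTupleFilter("0.0.0.0", "0.0.0.0", "172.31.8.0", "255.255.255.0",
                         dscp=28, protocol=6,
                         dst_ports=range(0, 65536), src_ports=range(20, 21))

packet = {"src": "172.31.8.77", "dst": "192.0.2.1", "dscp": 28,
          "proto": 6, "sport": 20, "dport": 40000}
print(FILTER9.matches(packet))
```

Each key component uses its own condition type: masked matches for the
addresses, exact matches for DSCP and protocol, and range matches for
the ports.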

592	4.2.3.  Free-form Classifier

594	A Free-form classifier is made up of a set of user definable arbitrary
595	filters each made up of {bit-field size, offset (from head of packet),
596	mask}:

598	      Classifier2:
599	      Filter12:    OutputA
600	      Filter13:    OutputB
601	      Default:     OutputC

603	      Filter12:
604	      Type:        FreeForm
605	      SizeBits:    3 (bits)
606	      Offset:      16 (bytes)
607	      Value:       100 (binary)
608	      Mask:        101 (binary)

610	      Filter13:
611	      Type:        FreeForm
612	      SizeBits:    12 (bits)
613	      Offset:      16 (bytes)
614	      Value:       100100000000 (binary)
615	      Mask:        111111111111 (binary)

617	Free-form filters can be combined into filter groups to form very
618	powerful filters.
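
The following Python sketch shows one way the {size, offset, mask}
triple of a free-form filter could be applied to a raw packet. It
assumes that the bit-field starts at the most significant bit of the
byte at the given offset; the text above does not fix that convention,
and the function names are hypothetical.

```python
def extract_bits(packet, offset_bytes, size_bits):
    """Return size_bits bits starting at the top bit of byte offset_bytes."""
    n_bytes = (size_bits + 7) // 8
    chunk = packet[offset_bytes:offset_bytes + n_bytes]
    if len(chunk) < n_bytes:
        raise ValueError("packet too short for this filter")
    # Drop the unused low-order bits of the last byte read.
    return int.from_bytes(chunk, "big") >> (n_bytes * 8 - size_bits)


def freeform_match(packet, offset_bytes, size_bits, value, mask):
    """Masked comparison of the extracted field against the filter value."""
    field = extract_bits(packet, offset_bytes, size_bits)
    return (field & mask) == (value & mask)


# (offset_bytes, size_bits, value, mask) taken from the example filters.
FILTER12 = (16, 3, 0b100, 0b101)
FILTER13 = (16, 12, 0b100100000000, 0b111111111111)

packet = bytes(16) + bytes([0x90, 0x00])
print(freeform_match(packet, *FILTER12))
print(freeform_match(packet, *FILTER13))
```

Under the stated bit-ordering assumption this sample packet matches
both filters, so a classifier built from them still needs a precedence
rule of the kind discussed in section 4.1.2.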

620	4.2.4.  Other Possible Classifiers

622	Classification may also be performed based on information at the
623	datalink layer below IP (e.g. VLAN or datalink-layer priority) or
624	perhaps on the ingress or egress IP, logical or physical interface
625	identifier (e.g. the incoming channel number on a channelized
626	interface).  A classifier that filters based on IEEE 802.1p Priority and
627	on 802.1Q VLAN-ID might be represented as:

629	      Classifier3:
630	      Filter14 AND Filter15:  OutputA
631	      Default:                OutputB

633	      Filter14:                        -- priority 4 or 5
634	      Type:        Ieee8021pPriority
635	      Value:       100 (binary)
636	      Mask:        110 (binary)

638	      Filter15:                        -- VLAN 2304
639	      Type:        Ieee8021QVlan
640	      Value:       100100000000 (binary)
641	      Mask:        111111111111 (binary)

643	Such classifiers may be the subject of other standards or may be
644	enterprise-specific but are not discussed further here.

646	5.  Meters

648	Metering is defined in [DSARCH].  Diffserv network providers may
649	choose to offer services to customers based on a temporal (i.e., rate)
650	profile within which the customer submits traffic for the service. In
651	this event, a meter might be used to trigger real-time traffic
652	conditioning actions (e.g., marking) by routing a non-conforming packet
653	through an appropriate next-stage action element. Alternatively, it
654	might also be used for out-of-band management functions like statistics
655	monitoring for billing applications.

657	Meters are logically 1:N (fan-out) devices (although a multiplexer can
658	be used in front of a meter). Meters are parameterized by a temporal
659	profile and by conformance levels, each of which is associated with a
660	meter's output. Each output can be connected to another functional
661	element.

663	Note that this model of a meter differs slightly from that described in
664	[DSARCH]. In that description the meter is not a datapath element but is
665	instead used to monitor the traffic stream and send control signals to
666	action elements to dynamically modulate their behavior based on the
667	conformance of the packet.

669	The following diagram illustrates a meter with 3 levels of conformance:

671	      unmetered              metered
672	      traffic                traffic
673	                +---------+
674	                |         |--------> conformance A
675	      --------->|  meter  |--------> conformance B
676	                |         |--------> conformance C
677	                +---------+

679	      Figure 4. A Generic Meter

681	In some Diffserv examples, three levels of conformance are discussed in
682	terms of colors, with green representing conforming, yellow representing
683	partially conforming and red representing non-conforming [AF-PHB]. These
684	different conformance levels may be used to trigger different queueing,
685	marking or dropping treatment later on in the processing. Other example
686	meters use a binary notion of conformance; in the general case N levels
687	of conformance can be supported. In general there is no constraint on
688	the type of functional element following a meter output, but care must
689	be taken not to inadvertently configure a datapath that results in
690	packet reordering within an OA.

692	A meter, according to this model, measures the rate at which packets
693	making up a stream of traffic pass it, compares the rate to some set of
694	thresholds and produces some number (two or more) of potential results: a
695	given packet is said to "conform" to the meter if, at the time that the
696	packet is being looked at, the stream appears to be within the meter's
697	limit rate.

699	The concept of conformance to a meter bears comment. The concept applied
700	in several rate-control architectures, including ATM, Frame Relay,
701	Integrated Services and Differentiated Services, is variously described
702	as a "leaky bucket" or a "token bucket".

704	A leaky bucket algorithm is primarily used for traffic shaping (handled
705	under Queues and Schedulers in this model): traffic theoretically
706	departs from a device at a rate of one bit every so many time units but,
707	in fact, departs in multi-bit units (packets) at a rate approximating
708	that. It is also possible to build multi-rate leaky buckets, in which
709	traffic departs from the switch at varying rates depending on recent
710	activity or inactivity.

712	A simple token bucket is usually used in a Meter to measure the behavior
713	of a peer's leaky bucket, for verification purposes. It is, by
714	definition, a relationship between some defined burst_size, rate and
715	interval:

717	      interval = burst_size/rate
718	   or
719	      rate = burst_size/interval

721	Multi-rate token buckets (token buckets with both a peak and a mean rate
722	and sometimes more rates) are commonly used. In this case, the burst
723	size for the baseline traffic is conventionally referred to as the
724	"committed burst" and the time interval is as specified by

726	      interval = committed_burst/mean_rate

728	but additional burst sizes (each an increment over its predecessor) are
729	defined, which are conventionally referred to as "excess" burst sizes.
730	The peak rate therefore equals the sum of the burst sizes for any given
731	interval.

733	A data stream is said to conform to a simple token bucket if the switch
734	receives at most the "burst_size" of data in any time interval of length
735	"interval". In the multi-rate case, the traffic is said to conform at a
736	given level to the token bucket if its rate does not exceed the sum
737	of the relevant burst sizes in any given interval. Received traffic that
738	arrives pre-classified as one of the "excess" rates (e.g. AF12 or AF13
739	traffic for a device implementing the AF1x PHB) is only compared to the
740	relevant excess buckets.
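
For the simple (single-rate) case, the conformance definition above can
be checked directly against a recorded arrival pattern. The Python
sketch below is one reading of that definition, using half-open windows
that end at each arrival; the function name and the representation of
arrivals as (time, size) pairs are assumptions for this example.

```python
def conforms_to_simple_bucket(arrivals, burst_size, rate):
    """arrivals: list of (time, size) pairs sorted by time.

    Returns True if no window of length burst_size/rate ending at an
    arrival contains more than burst_size of data.
    """
    interval = burst_size / rate
    start = 0       # index of the oldest arrival still inside the window
    in_window = 0   # amount of data currently inside the window
    for t, size in arrivals:
        in_window += size
        # Slide the window: drop arrivals at or before t - interval.
        while arrivals[start][0] <= t - interval:
            in_window -= arrivals[start][1]
            start += 1
        if in_window > burst_size:
            return False
    return True


# burst_size of 1000 bytes at 500 bytes/s gives an interval of 2 seconds.
print(conforms_to_simple_bucket([(0, 600), (1, 400)], 1000, 500))
print(conforms_to_simple_bucket([(0, 600), (1, 500)], 1000, 500))
```

This is an after-the-fact check on a trace, not a per-packet meter; the
meters in section 5.1 make the decision packet by packet.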

742	<ed: the following paragraphs may need fixing when we can all agree on a
743	stricter vs. looser definition: for now we assume strict schedulers and
744	lenient meters.>
745	The fact that data is organized into variable length packets introduces
746	some uncertainty in this conformance decision. When used in a Scheduler,
747	a leaky bucket releases a packet only when all of its bits would have
748	been allowed: it does not borrow from future capacity. When used in a
749	Meter, a token bucket accepts a packet if any of its bits would have
750	been accepted and "borrows" any excess capacity required from that
751	allotted to equivalently classified traffic in a previous or subsequent
752	interval. Note that [SRTCM] and [TRTCM] insist on stricter behaviour
753	from a meter than the model here insists on.

755	Multiple classes of traffic, as identified by the classifier table, may
756	be presented to the same meter. Imagine, for example, that it is desired
757	to drop all traffic that uses any DSCP that has not been publicly
758	defined.  A classifier entry might exist for each such DSCP, shunting it
759	to an "accepts everything" meter and dropping all traffic that conforms
760	to only that meter.

762	It is necessary to identify what is to be done with packets that conform
763	to the meter and with packets that do not. It is also necessary for the
764	meter to be arbitrarily extensible as some PHBs require the successive
765	application of an arbitrary number of meters.  The approach taken in
766	this model is to have each meter indicate what action is to be taken for
767	conforming traffic and what meter is to be used for traffic which fails
768	to conform. With the definition of a special type of meter to which all
769	traffic conforms, this has the necessary flexibility.

771	Note that this definition of a simple token bucket meter requires that
772	the minimal bucket size be at least the MTU of the incoming link and it
773	should also be initialised with sufficient tokens to allow for at least
774	one MTU-sized packet to conform if it arrives at time zero.

776	5.1.  Examples

778	The following are some examples of possible meters.

780	5.1.1.  Average Rate Meter

782	An example of a very simple meter is an average rate meter. This type of
783	meter measures the average rate at which packets are submitted to it
784	over a specified averaging time.

786	An average rate profile may take the following form:

788	      Meter1:
789	      Type:                AverageRate
790	      Profile:             Profile1
791	      ConformingOutput:    Queue1
792	      NonConformingOutput: Counter1

794	      Profile1:
795	      Type:                AverageRate
796	      AverageRate:         120 kbps
797	      Delta:               100 msec

799	A meter measuring against this profile would continually maintain a
800	count that indicates the total number of bits arriving between time T
801	(now) and time T - 100 msecs. So long as an arriving packet does not
802	push the count over 12 kbits in the last 100 msec then the packet would
803	be deemed conforming. Any packet that pushes the count over 12 kbits
804	would be deemed non-conforming. Thus, this meter deems packets to
805	correspond to one of two conformance levels: conforming or non-
806	conforming and sends them on for the appropriate subsequent treatment.
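
A minimal Python sketch of such a meter is shown below, using Profile1
(120 kbps over 100 msec, i.e. a limit of 12 kbits). Times are integer
milliseconds to avoid rounding. The class and method names are
hypothetical, and the sketch assumes that a non-conforming packet is
not added to the count, a point the description above leaves open.

```python
from collections import deque


class AverageRateMeter:
    def __init__(self, average_rate_bps, delta_ms):
        self.delta_ms = delta_ms
        # Bits allowed in any window of length Delta.
        self.limit_bits = average_rate_bps * delta_ms // 1000
        self.window = deque()  # (arrival_ms, size_bits) of counted packets
        self.count_bits = 0

    def offer(self, now_ms, size_bits):
        """Return True if the packet is conforming."""
        # Forget arrivals that have aged out of the last Delta.
        while self.window and self.window[0][0] <= now_ms - self.delta_ms:
            self.count_bits -= self.window.popleft()[1]
        if self.count_bits + size_bits > self.limit_bits:
            return False  # this packet would push the count over the limit
        self.window.append((now_ms, size_bits))
        self.count_bits += size_bits
        return True


meter1 = AverageRateMeter(average_rate_bps=120_000, delta_ms=100)
print(meter1.limit_bits)
```

A conforming result would be routed to Queue1 and a non-conforming one
to Counter1, as in the Meter1 example.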

808	5.1.2.  Exponential Weighted Moving Average (EWMA) Meter

810	The EWMA form of meter is easy to implement in hardware and can be
811	parameterized as follows:

813	      avg_rate(t) = (1 - Gain) * avg_rate(t') +  Gain * rate(t)
814	      t = t' + Delta

816	For a packet arriving at time t:

818	      if (avg_rate(t) > AverageRate)
819	         non-conforming
820	      else
821	         conforming

823	"Gain" controls the time constant (e.g. frequency response) of what is
824	essentially a simple IIR low-pass filter. "rate(t)" measures the number
825	of incoming bytes in a small fixed sampling interval, Delta.  Any packet
826	that arrives and pushes the average rate over a predefined rate
827	AverageRate is deemed non-conforming. An EWMA meter profile might look
828	something like the following:

830	      Meter2:
831	      Type:                ExpWeightedMovingAvg
832	      Profile:             Profile2
833	      ConformingOutput:    Queue1
834	      NonConformingOutput: AbsoluteDropper1

836	      Profile2:
837	      Type:                ExpWeightedMovingAvg
838	      AverageRate:         25 kbps
839	      Delta:               10 usec
840	      Gain:                1/16
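
The update rule above can be sketched in Python as follows. The sketch
assumes that rate(t) is converted from bytes per sampling interval to
bits per second before averaging, so that it can be compared with
AverageRate; the class and method names are hypothetical.

```python
class EwmaMeter:
    def __init__(self, average_rate_bps, delta_s, gain):
        self.average_rate_bps = average_rate_bps
        self.delta_s = delta_s
        self.gain = gain
        self.avg_rate_bps = 0.0  # avg_rate(t'), starting from an idle link

    def sample(self, bytes_in_interval):
        """Fold in one sampling interval of length Delta.

        Returns True if packets arriving in this interval are conforming.
        """
        rate_bps = bytes_in_interval * 8 / self.delta_s
        # avg_rate(t) = (1 - Gain) * avg_rate(t') + Gain * rate(t)
        self.avg_rate_bps = ((1 - self.gain) * self.avg_rate_bps
                             + self.gain * rate_bps)
        return self.avg_rate_bps <= self.average_rate_bps


# Profile2: 25 kbps, Delta of 10 usec, Gain of 1/16.
meter2 = EwmaMeter(average_rate_bps=25_000, delta_s=10e-6, gain=1 / 16)
print(meter2.sample(0))
```

A Gain that is a power of two, as in Profile2, lets a hardware
implementation replace the multiplications with shifts.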

842	5.1.3.  Two-Parameter Token Bucket Meter

844	A more sophisticated meter might measure conformance to a token bucket
845	(TB) profile. A TB profile generally has two parameters, an average
846	token rate and a burst size. TB meters compare the arrival rate of
847	packets to the average rate specified by the TB profile.  Logically,
848	tokens accumulate in a bucket at the average rate, up to a maximum
849	credit which is the burst size. Packets of length L bytes are considered
850	conforming if any tokens are available in the bucket at the time of
851	packet arrival: up to L bytes may then be borrowed from future token
852	allocations. Packets are allowed to exceed the average rate in bursts up
853	to the burst size. Packets which arrive to find a bucket with no tokens
854	in it are deemed non-conforming. A two-parameter TB meter has exactly
855	two possible conformance levels (conforming, non-conforming). TB
856	implementation details are discussed in Appendix A. Note that this is a
857	"lenient" meter that allows some borrowing, as discussed above.

859	A two-parameter TB meter might appear as follows:

861	      Meter3:
862	      Type:                SimpleTokenBucket
863	      Profile:             Profile3
864	      ConformingOutput:    Queue1
865	      NonConformingOutput: AbsoluteDropper1

867	      Profile3:
868	      Type:                SimpleTokenBucket
869	      AverageRate:         200 kbps
870	      BurstSize:           100 kbytes
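
The lenient rule described above can be sketched in Python as follows.
This is an illustration under stated assumptions, not the Appendix A
algorithm: the bucket is assumed to start full, rates are in bytes per
second, borrowing is modelled as a negative token balance, and the
class and method names are hypothetical.

```python
class LenientTokenBucketMeter:
    """Two-parameter token bucket with the lenient (borrowing) rule."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # assumption: the bucket starts full
        self.last_s = 0

    def offer(self, now_s, length_bytes):
        """Return True if a packet of length_bytes arriving at now_s conforms."""
        # Credit tokens for the elapsed time, capped at the burst size.
        elapsed = now_s - self.last_s
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_s = now_s
        if self.tokens > 0:
            # Any available token admits the packet; a negative balance
            # records bytes borrowed from future allocations.
            self.tokens -= length_bytes
            return True
        return False


# Profile3 in bytes: 200 kbps is 25,000 bytes/s; 100 kbytes is taken
# here as 100,000 bytes.
meter3 = LenientTokenBucketMeter(rate_bytes_per_s=25_000, burst_bytes=100_000)
print(meter3.offer(0, 1500))
```

A strict meter would instead require the balance to be at least the
packet length before admitting the packet.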

872	5.1.4.  Multi-Stage Token Bucket Meter

874	More complicated TB meters might define two burst sizes and three
875	conformance levels. Packets found to exceed the larger burst size are
876	deemed non-conforming. Packets found to exceed the smaller burst size
877	are deemed partially conforming. Packets exceeding neither are deemed
878	conforming. Token bucket meters designed for Diffserv networks are
879	described in more detail in [SRTCM, TRTCM, GTC]; in some of these
880	references three levels of conformance are discussed in terms of colors,
881	with green representing conforming, yellow representing partially
882	conforming and red representing non-conforming. Often these multi-
883	conformance level meters can be implemented using an appropriate
884	configuration of multiple two-parameter TB meters.

886	A profile for a multi-stage TB meter with three levels of conformance
887	might look as follows:

889	      Meter4:
890	      Type:                TwoRateTokenBucket
891	      ProfileA:            Profile4
892	      ConformingOutputA:   Queue1
893	      ProfileB:            Profile5
894	      ConformingOutputB:   Marker1
895	      NonConformingOutput: AbsoluteDropper1

897	      Profile4:
898	      Type:                SimpleTokenBucket
899	      AverageRate:         100 kbps
900	      BurstSize:           20 kbytes

902	      Profile5:
903	      Type:                SimpleTokenBucket
904	      AverageRate:         100 kbps
905	      BurstSize:           100 kbytes
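
One possible configuration of Meter4 as a chain of two-parameter TB
meters is sketched below in Python: a packet that fails the first stage
is offered to the second, and a packet that fails both takes the
non-conforming output. The sketch assumes the lenient rule, buckets that
start full and independent debiting of each stage; other arrangements
(for example those of [SRTCM] and [TRTCM]) couple the buckets
differently. All names are hypothetical.

```python
class Bucket:
    """Minimal lenient token bucket (bytes and seconds)."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0

    def conforms(self, now, length):
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens > 0:
            self.tokens -= length
            return True
        return False


def meter4(stage_a, stage_b, now, length):
    """Try Profile4 first, then Profile5; return the output taken."""
    if stage_a.conforms(now, length):
        return "Queue1"            # conforming
    if stage_b.conforms(now, length):
        return "Marker1"           # partially conforming
    return "AbsoluteDropper1"      # non-conforming


profile4 = Bucket(rate=12_500, burst=20_000)    # 100 kbps, 20 kbytes
profile5 = Bucket(rate=12_500, burst=100_000)   # 100 kbps, 100 kbytes
print(meter4(profile4, profile5, 0, 1500))
```

The three return values correspond to the green, yellow and red
conformance levels mentioned above.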

907	5.1.5.  Null Meter

909	A null meter has only one output: always conforming, and no associated
910	temporal profile. Such a meter is useful to define in the event that the
911	configuration or management interface does not have the flexibility to
912	omit a meter in a datapath segment.

914	      Meter5:
915	      Type:                NullMeter
916	      Output:              Queue1

918	6.  Action Elements

920	The classifiers and meters described up to this point are fan-out
921	elements which are generally used to determine the appropriate action to
922	apply to a packet. The set of possible actions includes:

924	-    Marking

926	-    Absolute Dropping

928	-    Multiplexing

930	-    Counting

931	-    Null action - do nothing

933	The corresponding action elements are described in the following
934	sections.

936	Diffserv nodes may apply shaping, policing and/or marking to traffic
937	streams that exceed the bounds of their TCS in order to prevent a
938	traffic stream from seizing more than its share of resources from a
939	Diffserv network. Shaping, sometimes considered as a TC action, is
940	treated as a part of the queueing module in this model, as is the use of
941	Algorithmic Dropping techniques - see section 7.  Policing is modelled
942	as the combination of either a meter or a scheduler with either an
943	absolute dropper or an algorithmic dropper.  These elements will discard
944	packets which exceed the TCS.  Marking is performed by a marker, which
945	(in this context) alters the DSCP, and thus the PHB, of the packet to
946	give it a lower-grade treatment at subsequent Diffserv nodes.

948	6.1.  Marker

950	Markers are 1:1 elements which set a codepoint (e.g. the DSCP in an IP
951	header). Markers may also act on unmarked packets (e.g. those submitted
952	with DSCP of zero) or may re-mark previously marked packets. In
953	particular, the model supports the application of marking based on a
954	preceding classifier match. The mark set in a packet will determine its
955	subsequent treatment in downstream nodes of a network and possibly also
956	in subsequent processing stages within this router.

958	DSCP Markers for Diffserv are normally parameterized by a single
959	parameter: the 6-bit DSCP to be marked in the packet header.

961	      Marker1:
962	      Type:                DSCPMarker
963	      Mark:                010010
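
In an IPv4 header the DSCP occupies the six most significant bits of
the former TOS octet, so a DSCP marker can be sketched in Python as a
simple bit operation. The function name is hypothetical, and leaving
the two low-order bits untouched is an assumption of this sketch.

```python
def mark_dscp(tos_octet, dscp):
    """Return the octet with its upper six bits replaced by dscp."""
    if not 0 <= dscp <= 0b111111:
        raise ValueError("DSCP must fit in six bits")
    # Keep the two low-order bits, overwrite the DS field.
    return (dscp << 2) | (tos_octet & 0b11)


# Marker1 writes DSCP 010010 regardless of the previous marking.
print(format(mark_dscp(0b00000000, 0b010010), "08b"))
```

The same function covers both marking of unmarked packets (DSCP of
zero) and re-marking of previously marked packets.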

965	6.2.  Absolute Dropper

967	Absolute droppers simply discard packets. There are no parameters for
968	these droppers. Because this dropper is a terminating point of the
969	datapath and has no outputs, it is probably desirable to forward the
970	packet through a counter action first for instrumentation purposes.

972	      AbsoluteDropper1:
973	      Type:                AbsoluteDropper

975	Absolute droppers are not the only elements that can cause a packet to
976	be discarded: another element is an Algorithmic Dropper element (see
977	Section 7.1.3). However, since this element's behavior is closely tied
978	to the state of one or more queues, we choose to distinguish it as a
979	separate functional element.

981	6.3.  Multiplexer

983	It is occasionally necessary to multiplex traffic streams into a 1:1 or
984	1:N action element or classifier.  An M:1 (fan-in) multiplexer is a
985	simple logical device for merging traffic streams. It is parameterized
986	by its number of incoming ports.

988	      Mux1:
989	      Type:                Multiplexer
990	      Output:              Queue2

992	6.4.  Counter

994	One passive action is to account for the fact that a data packet was
995	processed. The statistics that result might be used later for customer
996	billing, service verification, or network engineering purposes. Counters
997	are 1:1 functional elements which update a byte counter by L and a
998	packet counter by 1 every time an L-byte sized packet passes through
999	them. Counters can be used to count packets about to be dropped by a
1000	dropper or a queueing element.

1002	      Counter1:
1003	      Type:                Counter
1004	      Output:              Queue1

1006	6.5.  Null Action

1008	A null action has one input and one output. The element performs no
1009	action on the packet. Such an element is useful to define in the event
1010	that the configuration or management interface does not have the
1011	flexibility to omit an action element in a datapath segment.

1013	      Null1:
1014	      Type:                Null
1015	      Output:              Queue1

1017	7.  Queueing Blocks

1019	Queueing blocks modulate the transmission of packets belonging to the
1020	different traffic streams and determine their ordering, possibly storing
1021	them temporarily or discarding them. Packets are usually stored either
1022	because there is a resource constraint (e.g., available bandwidth) which
1023	prevents immediate forwarding, or because the queueing block is being
1024	used to alter the temporal properties of a traffic stream (i.e.
1025	shaping). Packets are discarded either because of buffering limitations,
1026	because a buffer threshold is exceeded (including when shaping is
1027	performed), as a feedback control signal to reactive control protocols
1028	such as TCP, or because a meter exceeds a configured rate (i.e. policing).

1030	The queueing block in this model is a logical abstraction of a queueing
1031	system, which is used to configure PHB-related parameters.  There is no
1032	conformance to this model. The model can be used to represent a broad
1033	variety of possible implementations. However, it need not necessarily
1034	map one-to-one with physical queueing systems in a specific router
1035	implementation. Implementors should map the configurable parameters of
1036	the implementation's queueing systems to these queueing block parameters
1037	as appropriate to achieve equivalent behaviors.

1039	7.1.  Queueing Model

1041	Queueing is a function which lends itself to innovation. It must be
1042	modelled to allow a broad range of possible implementations to be
1043	represented using common structures and parameters. This model uses
1044	functional decomposition as a tool to permit the needed latitude.

1046	Queueing systems, such as the queueing block defined in this model,
1047	perform three distinct, but related, functions:  they store packets,
1048	they modulate the departure of packets belonging to various traffic
1049	streams and they selectively discard packets. This model decomposes the
1050	queueing block into the component elements that perform each of these
1051	functions. These elements may be connected together either
1052	dynamically or statically to construct queueing blocks. A queueing
1053	block is thus composed of one or more FIFOs, one or more Schedulers
1054	and zero or more Algorithmic Droppers.

1056	     <ed: should this be *one* or more? There are valid cases that do
1057	     not require a dropper but they are exceptional.>

1059	Note that the term FIFO has multiple different common usages: it is
1060	sometimes taken to mean, among other things, a data structure that
1061	permits items to be removed only in the order in which they were
1062	inserted or a service discipline which is non-reordering.

1064	7.1.1.  FIFO

1066	In this model, a FIFO element is a data structure which at any time may
1067	contain zero or more packets. It may have one or more thresholds
1068	associated with it. A FIFO has one or more inputs and exactly one
1069	output. It must support an enqueue operation to add a packet to the tail
1070	of the queue, and a dequeue operation to remove a packet from the head
1071	of the queue. Packets must be dequeued in the order in which they were
1072	enqueued. A FIFO has a current depth, which indicates the number of
1073	packets that it contains at a particular time. FIFOs in this model are
1074	modelled without inherent limits on their depth - obviously this does
1075	not reflect the reality of implementations: FIFO size limits are
1076	modelled here by an algorithmic dropper associated with the FIFO,
1077	typically at its input. It is quite likely that every FIFO will be
1078	preceded by an algorithmic dropper.  One exception might be the case
1079	where the packet stream has already been policed to a profile that can
1080	never exceed the scheduler bandwidth available at the FIFO's output -
1081	this would not need an algorithmic dropper at the input to the FIFO.

1083	This representation of a FIFO allows for one common type of depth limit,
1084	one that results from a FIFO supplied from a limited pool of buffers,
1085	shared between multiple FIFOs.

1087	     <ed: should we instead model a FIFO as having a single input and
1088	     use a "multiplexer" at its input if it needs to collect from
1089	     multiple input sources?>

1091	Typically, the FIFO element of this model will be implemented as a FIFO
1092	data structure. However, this does not preclude implementations which
1093	are not strictly FIFO, in that they also support operations that remove
1094	or examine packets (e.g., for use by discarders) other than at the head
1095	or tail. However, such operations MUST NOT have the effect of reordering
1096	packets belonging to the same microflow.

1098	In an implementation, packets are presumably stored in one or more
1099	buffers. Buffers are allocated from one or more free buffer pools. If
1100	there are multiple instances of a FIFO, their packet buffers may or may
1101	not be allocated out of the same free buffer pool. Free buffer pools may
1102	also have one or more thresholds associated with them, which may affect
1103	discarding and/or scheduling. Other than this, buffering mechanisms are
1104	implementation specific and not part of this model.

1106	A FIFO might be represented using the following parameters:

1108	     Fifo1:
1109	     Type:       FIFO
1110	     Output:     Scheduler1

1112	Note that a FIFO must provide triggers and/or current state information
1113	to other elements upstream and downstream from it: in particular, it is
1114	likely that the current depth will need to be used by Algorithmic
1115	Dropper elements placed before or after the FIFO. It will also likely
1116	need to provide an implicit "I have packets for you" signal to
1117	downstream Scheduler elements.
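
The FIFO element described in this section can be sketched in Python as
follows. The class, its method names and the tail_drop() helper are
hypothetical; the helper stands in for an algorithmic dropper placed at
the FIFO input, which is how depth limits are represented in this model.

```python
from collections import deque


class Fifo:
    """Unbounded FIFO element with enqueue, dequeue and current depth."""

    def __init__(self):
        self._packets = deque()

    def enqueue(self, packet):
        self._packets.append(packet)       # add at the tail

    def dequeue(self):
        return self._packets.popleft()     # remove from the head

    @property
    def depth(self):
        return len(self._packets)          # state exposed to droppers

    def has_packets(self):
        return bool(self._packets)         # "I have packets for you" signal


def tail_drop(fifo, packet, max_depth):
    """Hypothetical dropper at the FIFO input; True if the packet is queued."""
    if fifo.depth >= max_depth:
        return False
    fifo.enqueue(packet)
    return True


fifo1 = Fifo()
for name in ("p1", "p2", "p3"):
    tail_drop(fifo1, name, max_depth=2)
print(fifo1.depth, fifo1.dequeue())
```

Keeping the depth limit outside the Fifo class mirrors the
decomposition used here: the FIFO stores packets and reports its state,
while a separate element decides what to discard.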

1119	7.1.2.  Scheduler

1121	A scheduler is an element which gates the departure of each packet that
1122	arrives at one of its inputs, based on a service discipline. It has one
1123	or more inputs and exactly one output. Each input has an upstream element
1124	to which it is connected, and a set of parameters that affects the
1125	scheduling of packets received at that input.

1127	The service discipline (also known as a scheduling algorithm) is an
1128	algorithm which might take any of the following as its input(s):

1130	a)   static parameters such as relative priority associated with each of
1131	     the scheduler's inputs.

1133	b)   absolute token bucket parameters for maximum or minimum rates
1134	     associated with each of the scheduler's inputs.

1136	c)   parameters, such as packet length or DSCP, associated with the
1137	     packet currently present at its input.

1139	d)   absolute time and/or local state.

1141	Possible service disciplines fall into a number of categories, including
1142	(but not limited to) first come, first served (FCFS), strict priority,
1143	weighted fair bandwidth sharing (e.g., WFQ, WRR, etc.), rate-limited
1144	strict priority and rate-based. Service disciplines can be further
1145	distinguished by whether they are work-conserving or non-work-conserving
1146	(see Glossary). Non-work-conserving schedulers can be used to shape
1147	traffic streams to match some profile by delaying packets that might be
1148	deemed non-conforming by some downstream node: a packet is delayed until
1149	such time as it would conform to a downstream meter using the same
1150	profile.

1152	[DSARCH] defines PHBs without specifying required scheduling algorithms.
1153	However, PHBs such as the class selectors [DSFIELD], EF [EF-PHB] and AF
1154	[AF-PHB] have descriptions or configuration parameters which strongly
1155	suggest the sort of scheduling discipline needed to implement them. This
1156	memo discusses a minimal set of queue parameters to enable realization
1157	of these per-hop behaviors. It does not attempt to specify an all-
1158	embracing set of parameters to cover all possible implementation models.
1159	A minimal set includes:

1161	a)   a minimum service rate profile which allows rate guarantees for
1162	     each traffic stream as required by EF and AF without specifying the
1163	     details of how excess bandwidth between these traffic streams is
1164	     shared. Additional parameters to control this behavior should be
1165	     made available, but are dependent on the particular scheduling
1166	     algorithm implemented.

1168	b)   a service priority, used only after the minimum rate profiles of
1169	     all inputs have been satisfied, to decide how to allocate any
1170	     remaining bandwidth.

1172	c)   a maximum service rate profile, for use only with a non-work-
1173	     conserving service discipline.

1175	For an implementation of the EF PHB using a strict priority scheduling
1176	algorithm that assumes that the aggregate EF rate has been appropriately
1177	bounded to avoid starvation, the minimum rate profile would be reported
1178	as zero and the maximum service rate would be reported as line rate.
1179	Such an implementation, with multiple priority classes, could also be
1180	used for the Diffserv class selectors [DSFIELD].

1182	Alternatively, setting the service priority values for each input to the
1183	scheduler to the same value enables the scheduler to satisfy the minimum
1184	service rates for each input, so long as the sum of all minimum service
1185	rates is less than or equal to the line rate.

1187	For example, a non-work-conserving scheduler, allocating spare bandwidth
1188	equally between all its inputs, might be represented using the following
1189	parameters:

1191	     Scheduler1:
1192	     Type:           Scheduler2Input

1194	     Input1:
1195	     MaxRateProfile: Profile1
1196	     MinRateProfile: Profile2
1197	     Priority:       none

1199	     Input2:
1200	     MaxRateProfile: Profile3
1201	     MinRateProfile: Profile4
1202	     Priority:       none

1204	A work-conserving scheduler might be represented using the following
1205	parameters:

1207	     Scheduler2:
1208	     Type:           Scheduler3Input

1210	     Input1:
1211	     MaxRateProfile: WorkConserving
1212	     MinRateProfile: Profile5
1213	     Priority:       1

1215	     Input2:
1216	     MaxRateProfile: WorkConserving
1217	     MinRateProfile: Profile6
1218	     Priority:       2

1220	     Input3:
1221	     MaxRateProfile: WorkConserving
1222	     MinRateProfile: none
1223	     Priority:       3
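
As an informal illustration (not part of the model), the parameter sets
above could be held in a small data structure. This is a minimal sketch
which assumes that each rate profile reduces to a single rate in bits
per second; the names SchedulerInput, min_rates_satisfiable and
is_work_conserving, and all numeric rates, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchedulerInput:
    name: str
    # Assumption: a profile is reduced to one rate; 0.0 means "none".
    min_rate_bps: float = 0.0
    # None stands for "MaxRateProfile: WorkConserving" (no upper limit).
    max_rate_bps: Optional[float] = None
    priority: Optional[int] = None


def min_rates_satisfiable(inputs: List[SchedulerInput],
                          line_rate_bps: float) -> bool:
    """Minimum rates can all be met only if their sum fits the line rate."""
    return sum(i.min_rate_bps for i in inputs) <= line_rate_bps


def is_work_conserving(inputs: List[SchedulerInput]) -> bool:
    """A scheduler with no maximum rate on any input never idles a busy link."""
    return all(i.max_rate_bps is None for i in inputs)


# Scheduler2 from the text, with hypothetical rates for Profile5/Profile6.
scheduler2 = [
    SchedulerInput("Input1", min_rate_bps=2e6, priority=1),
    SchedulerInput("Input2", min_rate_bps=3e6, priority=2),
    SchedulerInput("Input3", priority=3),
]
print(is_work_conserving(scheduler2), min_rates_satisfiable(scheduler2, 10e6))
```

A real profile would normally carry at least a rate and a burst size;
the single-rate form is used here only to keep the check readable.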

1225	7.1.3.  Algorithmic Dropper

1227	An Algorithmic Dropper is an element which selectively discards packets
1228	that arrive at its input, based on a discarding algorithm. It has one
1229	data input and one output. In this model (but not necessarily in a real
1230	implementation), a packet enters the dropper at its input and either its
1231	buffer is returned to a free buffer pool or the packet exits the dropper
1232	at the output.

1234	Alternatively, an Algorithmic Dropper may invoke operations on a FIFO
1235	which selectively remove a packet, then return its buffer to the free
1236	buffer pool, based on a discarding algorithm. In this case, the
1237	operation is modelled as a side-effect on the FIFO upon which it
1238	operates, rather than as having a discrete input and output.  These two
1239	treatments are equivalent and we choose the former here.

1241	The Algorithmic Dropper is modelled as having a single input. However,
1242	it is likely that packets which were classified differently by a
1243	Classifier in this TCB will end up passing through the same dropper. The
1244	dropper's algorithm may need to apply different calculations based on
1245	characteristics of the incoming packet, e.g. its DSCP. So there is a
1246	need, in implementations of this model, to be able to relate information
1247	about which classifier element was matched by a packet from a Classifier
1248	to an Algorithmic Dropper.  This is modelled here as a reverse pointer
1249	from one of the drop probability calculation algorithms inside the
1250	dropper to the classifier element that selects this algorithm.

1252	There are many formulations of a model that could represent this
1253	linkage, other than the one described above: one way would have been to
1254	have multiple "inputs" fed from the preceding elements, leading
1255	eventually to the classifier elements that matched the packet. Another
1256	formulation might have been for the Classifier to (logically) include
1257	some sort of "classification identifier" along with the packet along its
1258	path, for use by any subsequent element. Yet another could have been to
1259	include a classifier inside the dropper, in order for it to pick out the
1260	drop algorithm to be applied. All of these other approaches were deemed
1261	to be more clumsy or less useful than the approach taken here.

1263	An Algorithmic Dropper, shown in Figure 5, has one or more triggers that
1264	cause it to make a decision whether or not to drop one (or possibly more
1265	than one) packet. A trigger may be internal (the arrival of a packet at
1266	the input to the dropper) or it may be external (resulting from one or
1267	more state changes at another element, such as a FIFO depth exceeding a
1268	threshold or a scheduling event). It is likely that an instantaneous
1269	FIFO depth will need to be smoothed over some averaging interval. Some
1270	dropping algorithms may require several trigger inputs feeding back from
1271	events elsewhere in the system e.g. smoothing functions that calculate
1272	averages over more than one time interval.  Smoothing functions are
1273	outside the scope of this document and are not modelled here; we merely
1274	indicate where they might be added in the model.

1276	A trigger may be a boolean combination of events (e.g. a FIFO depth
1277	exceeding a threshold OR a buffer pool depth falling below a threshold).

1279	The dropping algorithm makes a decision on whether to forward or to
1280	discard a packet. It takes as its parameters some set of dynamic
1281	parameters (e.g. averaged or instantaneous FIFO depth) and some set of
1282	static parameters (e.g. thresholds) and possibly parameters associated
1283	with the packet (e.g. its PHB, as determined by a classifier, which will
1284	determine on which of the dropper's inputs the packet arrives). It may
1285	also have internal state and is likely to keep counters regarding the
1286	dropped packets (there is no appropriate place here to include a Counter
1287	Action element).

1289	           +------------------+      +-----------+
1290	           | +-------+        |  n   |smoothing  |
1291	           | |trigger|<----------/---|function(s)|
1292	           | |calc.  |        |      |(optional) |
1293	           | +-------+        |      +-----------+
1294	           |     |            |          ^
1295	           |     v            |          |Depth
1296	  Input    | +-------+ no     |      ------------+   to Scheduler
1297	  ---------->|discard|-------------->    |x|x|x|x|------->
1298	           | |   ?   |        |      ------------+
1299	           | +-------+        |           FIFO
1300	           |    |yes          |
1301	           |  | | |           |
1302	           |  | v | count +   |
1303	           |  +---+ bit-bucket|
1304	           +------------------+
1305	           Algorithmic
1306	           Dropper

1308	      Figure 5. Algorithmic Dropper + Queue

1311	RED, RED-on-In-and-Out (RIO) and Drop-on-threshold are examples of
1312	dropping algorithms. Tail-dropping and head-dropping are effected by the
1313	location of the dropper relative to the FIFO.

1315	Note that, although an Algorithmic Dropper may require knowledge of data
1316	fields in a packet, as discovered by a Classifier in the same TCB, it
1317	may not modify the packet (i.e. it is not a marker).

1319	     <ed: have rearranged this example so as not to include a Classifier
1320	     in the Dropper - this leads to needing either multiple inputs or an
1321	     implicit classification stage to separate the in- and out-of-
1322	     profile traffic. We have chosen the former representation.>

1324	A dropper which uses a RIO algorithm might be represented using the
1325	following parameters:

1327	      AlgorithmicDropper1:
1328	      Type:                   AlgorithmicDropper
1329	      Discipline:             RIO
1330	      Trigger:                Internal
1331	      Output:                 Fifo1

1333	      InputA: (in profile)
1334	      MinThresh:              Fifo1.Depth > 20 kbyte
1335	      MaxThresh:              Fifo1.Depth > 30 kbyte

1337	      InputB: (out of profile)
1338	      MinThresh:              Fifo1.Depth > 10 kbyte
1339	      MaxThresh:              Fifo1.Depth > 20 kbyte

1341	      SampleWeight:           .002
1342	      MaxDropProb:            1%
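
To make the RIO parameters above concrete, the following is a minimal
sketch of a RED-style drop probability calculation applied with two
threshold sets. It assumes a single averaged FIFO depth shared by both
inputs (as the Fifo1.Depth references above suggest), a linear ramp
between the thresholds, and 1 kbyte = 1000 bytes; RedParams, ewma and
drop_probability are hypothetical names, not part of this model.

```python
from dataclasses import dataclass


@dataclass
class RedParams:
    min_thresh: float     # bytes; below this nothing is dropped
    max_thresh: float     # bytes; at or above this everything is dropped
    max_drop_prob: float  # drop probability just below max_thresh


def ewma(avg: float, sample: float, weight: float) -> float:
    """Smooth the instantaneous depth with an exponentially weighted average."""
    return (1.0 - weight) * avg + weight * sample


def drop_probability(avg_depth: float, p: RedParams) -> float:
    """Linear ramp from 0 at min_thresh to max_drop_prob at max_thresh."""
    if avg_depth < p.min_thresh:
        return 0.0
    if avg_depth >= p.max_thresh:
        return 1.0
    frac = (avg_depth - p.min_thresh) / (p.max_thresh - p.min_thresh)
    return p.max_drop_prob * frac


# Values taken from AlgorithmicDropper1 above.
IN_PROFILE = RedParams(20_000, 30_000, 0.01)
OUT_OF_PROFILE = RedParams(10_000, 20_000, 0.01)

avg = 15_000.0
print(drop_probability(avg, IN_PROFILE), drop_probability(avg, OUT_OF_PROFILE))
```

At the same averaged depth, out-of-profile packets see a non-zero drop
probability while in-profile packets are still forwarded, which is the
preferential treatment the two inputs are meant to express.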

1344	Another form of dropper, a threshold-dropper, might be represented using
1345	the following parameters:

1347	      AlgorithmicDropper2:
1348	      Type:                   AlgorithmicDropper
1349	      Discipline:             Drop-on-threshold
1350	      Trigger:                Fifo2.Depth > 20 kbyte
1351	      Output:                 Fifo1

1353	Yet another dropper which drops all out-of-profile packets whenever the
1354	FIFO depth exceeds a certain threshold (this dropper is not part of the
1355	larger TCB example) might be represented with the following parameters:

1357	      AlgorithmicDropper3:
1358	      Type:                   AlgorithmicDropper2Input
1359	      Discipline:             Drop-out-packets-on-threshold
1360	      Output:                 Fifo3

1362	      InputA: (in profile)
1363	      Trigger:                none
1364	      InputB: (out of profile)
1365	      Trigger:                Fifo3.Depth > 100 kbyte

1367	     <ed: this models the dropper without using an embedded Classifier
1368	     which seems a cleaner model than embedding a classifier here>

1370	7.1.4.  Constructing queueing blocks from the elements

1372	A queueing block is constructed by concatenation of these elements so as
1373	to meet the meta-policy objectives of the implementation, subject to the
1374	grammar rules specified in this section.

1376	Elements of the same type may appear more than once in a queueing block,
1377	either in parallel or in series. Typically, a queueing block will have
1378	relatively many elements in parallel and few in series.  Iteration and
1379	recursion are not supported constructs in this grammar. A queueing block
1380	must have at least one FIFO, at least one dropper, and at least one
1381	scheduler.  The following connections are allowed:

1383	1)   The input of a FIFO may be the input of the queueing block or it
1384	     may be connected to the output of a dropper or to an output of a
1385	     scheduler.

1387	2)   Each input of a scheduler may be connected to the output of a FIFO,
1388	     to the output of a dropper or to the output of another scheduler.

1390	3)   The input of a dropper which has a discrete input and output may be
1391	     the input of the queueing block or it may be connected to the
1392	     output of a FIFO (e.g., head dropping).

1394	4)   The output of the queueing block may be the output of a FIFO
1395	     element, a discarding element or a scheduling element.

1397	Note, in particular, that schedulers may operate in series such that a
1398	packet at the head of a FIFO feeding the concatenated schedulers is
1399	serviced only after all of the scheduling criteria are met. For example,
1400	a FIFO which carries EF traffic streams may be served first by a non-
1401	work-conserving scheduler to shape the stream to a maximum rate, then by
1402	a work-conserving scheduler to mix EF traffic streams with other traffic
1403	streams. Alternatively, there might be a FIFO and/or a dropper between
1404	the two schedulers.
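
The connection rules above lend themselves to a mechanical check. The
sketch below is one possible reading of rules 1) to 4) and of the
requirement for at least one FIFO, dropper and scheduler; it does not
check for iteration or recursion. The element names, the "IN"/"OUT"
pseudo-elements and the function check_queueing_block are assumptions
made for illustration only.

```python
# For each kind of destination, the kinds of source allowed to feed it.
ALLOWED_SOURCES = {
    "fifo": {"block_input", "dropper", "scheduler"},       # rule 1
    "scheduler": {"fifo", "dropper", "scheduler"},         # rule 2
    "dropper": {"block_input", "fifo"},                    # rule 3
    "block_output": {"fifo", "dropper", "scheduler"},      # rule 4
}


def check_queueing_block(kinds, links):
    """Return a list of rule violations (empty if the block is acceptable).

    kinds: element name -> kind; links: list of (source, destination).
    """
    errors = []
    present = set(kinds.values())
    for required in ("fifo", "dropper", "scheduler"):
        if required not in present:
            errors.append(f"missing element type: {required}")
    for src, dst in links:
        allowed = ALLOWED_SOURCES.get(kinds[dst], set())
        if kinds[src] not in allowed:
            errors.append(
                f"{src} ({kinds[src]}) may not feed {dst} ({kinds[dst]})")
    return errors


kinds = {"IN": "block_input", "Dropper1": "dropper", "Queue2": "fifo",
         "Scheduler1": "scheduler", "OUT": "block_output"}
links = [("IN", "Dropper1"), ("Dropper1", "Queue2"),
         ("Queue2", "Scheduler1"), ("Scheduler1", "OUT")]
print(check_queueing_block(kinds, links))
```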

1406	7.2.  Shaping

1408	Traffic shaping is often used to condition traffic such that packets
1409	arriving in a burst will be "smoothed" and deemed conforming by
1410	subsequent downstream meters in this or other nodes. Shaping may also be
1411	used to isolate certain traffic streams from the effects of other
1412	traffic streams of the same BA.

1414	In [DSARCH] a shaper is described as a queueing element controlled by a
1415	meter which defines its temporal profile. However, this representation
1416	of a shaper differs substantially from typical shaper implementations.

1418	In this conceptual model, a shaper is realized by using a non-work-
1419	conserving scheduler. Some implementations may elect to have queues
1420	whose sole purpose is shaping, while others may integrate the shaping
1421	function with other buffering, discarding and scheduling associated with
1422	access to a resource. Shapers operate by delaying the departure of
1423	packets that would be deemed non-conforming by a meter configured to the
1424	shaper's maximum service rate profile. The packet is scheduled to depart
1425	no sooner than such time that it would become conforming.
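
The delaying behaviour described above can be sketched as follows. This
is an illustration only: it assumes the shaper's maximum service rate
profile is a simple token bucket (a rate in bytes per second plus a
burst size in bytes), that packets leave in arrival order, and that a
packet departs only once the whole packet is conforming. The class name
Shaper and its methods are hypothetical.

```python
class Shaper:
    """Compute the earliest conforming departure time of each packet."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # the bucket starts full
        self.clock = 0.0            # time of the last token update

    def departure_time(self, arrival: float, size: int) -> float:
        if size > self.burst:
            raise ValueError("packet can never conform to this profile")
        # A packet cannot leave before it arrives or before its predecessor.
        start = max(arrival, self.clock)
        self.tokens = min(self.burst,
                          self.tokens + (start - self.clock) * self.rate)
        # Wait until enough tokens have accumulated for the whole packet.
        wait = max(0.0, (size - self.tokens) / self.rate)
        depart = start + wait
        self.tokens = min(self.burst, self.tokens + wait * self.rate) - size
        self.clock = depart
        return depart


shaper = Shaper(rate_bytes_per_s=1000.0, burst_bytes=1500.0)
for arrival in (0.0, 0.0, 0.1, 10.0):
    print(arrival, shaper.departure_time(arrival, 1000))
```

The first packet leaves at once because the bucket is full; the next
two are held back until they would conform, and the last one, arriving
after a long idle period, is again sent immediately.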

1427	8.  Traffic Conditioning Blocks (TCBs)

1429	The classifiers, meters, action elements and queueing elements described
1430	above can be combined into traffic conditioning blocks (TCBs). The TCB
1431	is an abstraction of a functional element that may be used to facilitate
1432	the definition of specific traffic conditioning functionality.

1434	A general TCB might consist of the following four stages:
1435	  - Classification stage
1436	  - Metering stage
1437	  - Action stage
1438	  - Queueing stage

1440	where each stage may consist of a set of parallel datapaths consisting
1441	of pipelined elements.

1443	Note that a classifier is a 1:N element, metering and actions are
1444	typically 1:1 elements and queueing is an N:1 element. The whole TCB
1445	should, however, result in a 1:1 abstract element.

1447	TCBs are constructed by connecting elements corresponding to these
1448	stages in any sensible order. It is possible to omit stages, to include
1449	null elements, or to concatenate multiple stages of the same type. TCB
1450	outputs may drive additional TCBs (on either the ingress or egress
1451	interfaces).

1453	8.1.  An Example TCB

1455	An SLS is presumed to have been negotiated between the customer and the
1456	provider which specifies the handling of the customer's traffic by the
1457	provider's network. The agreement might be of the following form:

1459	   DSCP     PHB   Profile     Treatment
1460	   ----     ---   -------     ----------------------
1461	   001001   EF    Profile4    Discard non-conforming.
1462	   001100   AF11  Profile5    Shape to profile, tail-drop when full.
1463	   001101   AF21  Profile3    Re-mark non-conforming to DSCP 001000,
1464	                                 tail-drop when full.
1465	   other    BE    none        Apply RED-like dropping.

1467	This SLS specifies that the customer may submit packets marked for DSCP
1468	001001 which will get EF treatment so long as they remain conforming to
1469	Profile4 and will be discarded if they exceed this profile. The
1470	discarded packets are counted in this example, perhaps for use by the
1471	provider's sales department in convincing the customer to buy a larger
1472	SLS.  Packets marked for DSCP 001100 will be shaped to Profile5 before
1473	forwarding. Packets marked for DSCP 001101 will be metered to Profile3
1474	with non-conforming packets "downgraded" by being re-marked with a DSCP
1475	of 001000.  It is implicit in this agreement that conforming packets are
1476	given the PHB originally indicated by the packets' DSCP field.

1478	Figures 6 and 7 illustrate a TCB that might be used to handle this SLS
1479	at an ingress interface at the customer/provider boundary.

1481	The Classification stage of this example consists of a single BA
1482	classifier. The BA classifier is used to separate traffic based on the
1483	Diffserv service level requested by the customer (as indicated by the
1484	DSCP in each submitted packet's IP header). We illustrate three DSCP
1485	filter values: A, B and C. The 'X' in the BA classifier is a wildcard
1486	filter that matches every packet not otherwise matched.

1488	The paths for DSCP 001001 and 001101 then include a metering stage.
1489	There is a separate meter for each set of packets corresponding to
1490	classifier outputs A and C. Each meter uses a specific profile, as
1491	specified in the TCS, for the corresponding Diffserv service level. The
1492	meters in this example each indicate one of two conforming levels,
1493	conforming or non-conforming.

1495	                          +-----+
1496	                          |    A|---------------------------> to Queue1
1497	                       +->|     |
1498	                       |  |    B|--+  +-----+    +-----+
1499	                       |  +-----+  |  |     |    |     |
1500	                       |  Meter1   +->|     |--->|     |
1501	                       |              |     |    |     |
1502	                       |              +-----+    +-----+
1503	                       |              Counter1   Absolute
1504	 submitted +-----+     |                         Dropper1
1505	 traffic   |    A|-----+
1506	 --------->|    B|----------------------------------------> to Dropper1
1507	           |    C|-----+
1508	           |    X|--+  |
1509	           +-----+  |  |  +-----+                +-----+
1510	         Classifier1|  |  |    A|--------------->|A    |
1511	            (BA)    |  +->|     |                |     |--> to Dropper2
1512	                    |     |    B|--+  +-----+ +->|B    |
1513	                    |     +-----+  |  |     | |  +-----+
1514	                    |     Meter2   +->|     |-+    Mux1
1515	                    |                 |     |
1516	                    |                 +-----+
1517	                    |                 Marker1
1518	                    +-------------------------------------> to Dropper3

1520	      Figure 6:  An Example Traffic Conditioning Block (Part 1)

1522	Following the Metering stage is the Action stage in the upper and lower
1523	branches. Packets submitted for DSCP 001001 that are deemed non-
1524	conforming are counted and discarded while packets that are conforming
1525	are passed on to Queue1. Packets submitted for DSCP 001101 that are
1526	deemed non-conforming are re-marked and then conforming and non-
1527	conforming packets are multiplexed together before being passed on to
1528	Dropper2/Queue3. Packets submitted for DSCP 001100 are passed straight
1529	on to Dropper1/Queue2.

1531	The Queueing stage is realised as follows, as shown in Figure 7.  The
1532	conforming 001001 packets are passed directly to Queue1: there is no
1533	way, with correct configuration of the scheduler, for these to overflow
1534	the depth of Queue1 so there is never a requirement for dropping.
1535	Packets marked for 001100 must be passed through a tail-dropper,
1536	Dropper1, which serves to limit the depth of the following queue,
1537	Queue2: packets that arrive to a full queue will be discarded - this is
1538	likely to be an error case: the customer is obviously not sticking to
1539	its agreed profile.  Similarly, packets from the 001101 stream are
1540	passed to Dropper2 and Queue3.  Packets marked for all other DSCPs are
1541	passed to Dropper3 which is a RED-like algorithmic dropper: based on
1542	feedback of the current depth of Queue4, this dropper is likely to
1543	discard enough packets from its input stream to keep the queue depth
1544	under control.

1546	These four queues are then serviced by a scheduling algorithm in
1547	Scheduler1 which has been configured to give each of the queues an
1548	appropriate priority and/or bandwidth share. Inputs A and C are given
1549	guarantees of bandwidth, as appropriate for the contracted profiles.
1550	Input B is given a limit on the bandwidth it can use i.e. a non-work-
1551	conserving discipline in order to achieve the desired shaping of this
1552	stream.  Input D is given no limits or guarantees but a lower priority
1553	than the other queues, appropriate for its best-effort status.  Traffic
1554	then exits the scheduler in a single orderly stream.

1556	    from Meter1                     +-----+
1557	    ------------------------------->|     |----+
1558	                                    |     |    |
1559	                                    +-----+    |
1560	                                    Queue1     |
1561	                                               |  +-----+
1562	    from Classifier1 +-----+        +-----+    +->|A    |
1563	    ---------------->|     |------->|     |------>|B    |------->
1564	                     |     |        |     |  +--->|C    |  exiting
1565	                     +-----+        +-----+  | +->|D    |  traffic
1566	                     Dropper1       Queue2   | |  +-----+
1567	                                             | |  Scheduler1
1568	    from Mux1        +-----+        +-----+  | |
1569	    ---------------->|     |------->|     |--+ |
1570	                     |     |        |     |    |
1571	                     +-----+        +-----+    |
1572	                     Dropper2       Queue3     |
1573	                                               |
1574	    from Classifier1 +-----+        +-----+    |
1575	    ---------------->|     |------->|     |----+
1576	                     |     |        |     |
1577	                     +-----+        +-----+
1578	                     Dropper3       Queue4

1580	      Figure 7: An Example Traffic Conditioning Block (Part 2)

1582	The interconnections of the TCB elements illustrated in Figures 6 and 7
1583	can be represented as follows:

1585	      TCB1:

1587	      Classifier1:
1588	      FilterA:             Meter1
1589	      FilterB:             Dropper1
1590	      FilterC:             Meter2
1591	      Default:             Dropper3

1593	      Meter1:
1594	      Type:                AverageRate
1595	      Profile:             Profile4
1596	      ConformingOutput:    Queue1
1597	      NonConformingOutput: Counter1

1599	      Counter1:
1600	      Output:              AbsoluteDropper1

1602	      Meter2:
1603	      Type:                AverageRate
1604	      Profile:             Profile3
1605	      ConformingOutput:    Mux1.InputA
1606	      NonConformingOutput: Marker1

1608	      Marker1:
1609	      Type:                DSCPMarker
1610	      Mark:                001000
1611	      Output:              Mux1.InputB

1613	      Mux1:
1614	      Output:              Dropper2

1616	      Dropper1:
1617	      Type:                AlgorithmicDropper
1618	      Discipline:          Drop-on-threshold
1619	      Trigger:             Queue2.Depth > 10 kbyte
1620	      Output:              Queue2

1622	      Dropper2:
1623	      Type:                AlgorithmicDropper
1624	      Discipline:          Drop-on-threshold
1625	      Trigger:             Queue3.Depth > 20 kbyte
1626	      Output:              Queue3

1628	      Dropper3:
1629	      Type:                AlgorithmicDropper
1630	      Discipline:          RED93
1631	      Trigger:             Internal
1632	      Output:              Queue4
1633	      MinThresh:           Queue4.Depth > 20 kbyte
1634	      MaxThresh:           Queue4.Depth > 40 kbyte
1636	         <other RED parms too>

1638	      Queue1:
1639	      Type:                FIFO
1640	      Output:              Scheduler1.InputA

1642	      Queue2:
1643	      Type:                FIFO
1644	      Output:              Scheduler1.InputB

1646	      Queue3:
1647	      Type:                FIFO
1648	      Output:              Scheduler1.InputC

1650	      Queue4:
1651	      Type:                FIFO
1652	      Output:              Scheduler1.InputD

1654	      Scheduler1:
1655	      Type:                Scheduler4Input
1656	      InputA:
1657	      MaxRateProfile:      none
1658	      MinRateProfile:      Profile4
1659	      Priority:            20
1660	      InputB:
1661	      MaxRateProfile:      Profile5
1662	      MinRateProfile:      none
1663	      Priority:            40
1664	      InputC:
1665	      MaxRateProfile:      none
1666	      MinRateProfile:      Profile3
1667	      Priority:            20
1668	      InputD:
1669	      MaxRateProfile:      none
1670	      MinRateProfile:      none
1671	      Priority:            10
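
The interconnection list above is in effect a directed graph, and a
packet's path through it can be traced mechanically. The sketch below
encodes only the "next element" pointers of TCB1 and follows them for a
given set of classifier and meter outcomes. It is an illustration under
stated assumptions: Dropper3 is wired to Queue4, following Figure 7;
droppers are treated as always forwarding; and the dictionary layout
and the function trace are hypothetical.

```python
TCB1 = {
    "Classifier1": {"FilterA": "Meter1", "FilterB": "Dropper1",
                    "FilterC": "Meter2", "Default": "Dropper3"},
    "Meter1": {"ConformingOutput": "Queue1",
               "NonConformingOutput": "Counter1"},
    "Counter1": {"Output": "AbsoluteDropper1"},
    "Meter2": {"ConformingOutput": "Mux1.InputA",
               "NonConformingOutput": "Marker1"},
    "Marker1": {"Output": "Mux1.InputB"},
    "Mux1": {"Output": "Dropper2"},
    "Dropper1": {"Output": "Queue2"},
    "Dropper2": {"Output": "Queue3"},
    "Dropper3": {"Output": "Queue4"},   # per Figure 7
    "Queue1": {"Output": "Scheduler1.InputA"},
    "Queue2": {"Output": "Scheduler1.InputB"},
    "Queue3": {"Output": "Scheduler1.InputC"},
    "Queue4": {"Output": "Scheduler1.InputD"},
}


def trace(graph, start, decisions):
    """Follow output pointers from 'start' until an element with none.

    decisions maps an element name to the output it selects for this
    packet; elements not listed use their single "Output".
    """
    path = [start]
    node = start
    while node in graph:
        port = decisions.get(node, "Output")
        target = graph[node][port]
        node = target.split(".")[0]   # "Mux1.InputA" -> "Mux1"
        path.append(node)
    return path


# A non-conforming packet carrying the DSCP matched by FilterC.
print(trace(TCB1, "Classifier1",
            {"Classifier1": "FilterC", "Meter2": "NonConformingOutput"}))
```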

1673	8.2.  An Example TCB to Support Multiple Customers

1675	The TCB described above can be installed on an ingress interface to
1676	implement a provider/customer TCS if the interface is dedicated to the
1677	customer. However, if a single interface is shared between multiple
1678	customers, then the TCB above will not suffice, since it does not
1679	differentiate among traffic from different customers. Its classification
1680	stage uses only BA classifiers.

1682	The TCB is readily extended to support the case of multiple customers
1683	per interface, as follows. First, a TCB is defined for each customer to
1684	reflect the TCS with that customer: TCB1, defined above, is the TCB for
1685	customer 1 and definitions are then added for TCB2 and for TCB3 which
1686	reflect the agreements with customers 2 and 3 respectively.

1688	Finally, a classifier is added to the front end to separate the traffic
1689	from the three different customers. This forms a new TCB, TCB4, which
1690	incorporates TCB1, TCB2, and TCB3 and is illustrated in Figure 8.

1692	      submitted +-----+
1693	      traffic   |    A|--------> TCB1
1694	            --->|    B|--------> TCB2
1695	                |    C|--------> TCB3
1696	                |    X|--------> AbsoluteDropper4
1697	                +-----+
1698	                Classifier4

1700	      Figure 8: An Example of a Multi-Customer TCB

1702	A formal representation of this multi-customer TCB might be:

1704	      TCB4:

1706	      Classifier4:
1707	      Filter1:     to TCB1
1708	      Filter2:     to TCB2
1709	      Filter3:     to TCB3
1710	      No Match:    AbsoluteDropper4

1712	      TCB1:
1713	      (as defined above)

1715	      TCB2:
1716	      (similar to TCB1, perhaps with different numeric parameters)

1718	      TCB3:
1719	      (similar to TCB1, perhaps with different numeric parameters)

1721	      TCB4:
1722	      (the total TCB)

1724	and the filters, based on each customer's source MAC address, could be
1725	defined as follows:

1727	      Filter1:
1728	      Type:        MacAddress
1729	      SrcValue:    01-02-03-04-05-06 (source MAC address of customer 1)
1730	      SrcMask:     FF-FF-FF-FF-FF-FF
1731	      DestValue:   00-00-00-00-00-00
1732	      DestMask:    00-00-00-00-00-00

1734	      Filter2:
1735	      (similar to Filter1 but with customer 2's source MAC address as
1736	      SrcValue)

1738	      Filter3:
1739	      (similar to Filter1 but with customer 3's source MAC address as
1740	      SrcValue)

1742	In this example, Classifier4 separates traffic submitted from different
1743	customers based on the source MAC address in submitted packets. Those
1744	packets with recognized source MAC addresses are passed to the TCB
1745	implementing the TCS with the corresponding customer. Those packets with
1746	unrecognized source MAC addresses are passed to a dropper.
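
The value/mask matching performed by Classifier4 might be sketched as
follows. Only Filter1 is spelled out in the text, so only Filter1
appears here; the second source address used in the usage lines is a
made-up example, and parse_mac, matches and classify are hypothetical
helper names.

```python
def parse_mac(text: str) -> int:
    """Convert 'AA-BB-CC-DD-EE-FF' to a 48-bit integer."""
    return int(text.replace("-", ""), 16)


def matches(filt: dict, src: str, dst: str) -> bool:
    """A field matches when it equals the filter value under the mask."""
    src_mask = parse_mac(filt["SrcMask"])
    dst_mask = parse_mac(filt["DestMask"])
    src_ok = (parse_mac(src) & src_mask) == \
             (parse_mac(filt["SrcValue"]) & src_mask)
    dst_ok = (parse_mac(dst) & dst_mask) == \
             (parse_mac(filt["DestValue"]) & dst_mask)
    return src_ok and dst_ok


def classify(filters, src: str, dst: str,
             no_match: str = "AbsoluteDropper4") -> str:
    """Return the target of the first matching filter, else the dropper."""
    for filt, target in filters:
        if matches(filt, src, dst):
            return target
    return no_match


FILTER1 = {
    "SrcValue": "01-02-03-04-05-06", "SrcMask": "FF-FF-FF-FF-FF-FF",
    "DestValue": "00-00-00-00-00-00", "DestMask": "00-00-00-00-00-00",
}
FILTERS = [(FILTER1, "TCB1")]

print(classify(FILTERS, "01-02-03-04-05-06", "0A-0B-0C-0D-0E-0F"))
print(classify(FILTERS, "01-02-03-04-05-07", "0A-0B-0C-0D-0E-0F"))
```

An all-zero DestMask makes the destination comparison always succeed,
which is how Filter1 expresses "any destination".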

1748	TCB4 has a Classifier stage and an Action element stage, which consists
1749	of either a dropper or another TCB.

1751	8.3.  TCBs Supporting Microflow-based Services

1753	The TCB illustrated above describes a configuration that might be
1754	suitable for enforcing an SLS at a router's ingress. It assumes that the
1755	customer marks its own traffic for the appropriate service level.  It
1756	then limits the rate of aggregate traffic submitted at each service
1757	level, thereby protecting the resources of the Diffserv network. It does
1758	not provide any isolation between the customer's individual microflows.

1760	A more complex example might be a TCB configuration that offers
1761	additional functionality to the customer. It recognizes individual
1762	customer microflows and marks each one independently. It also isolates
1763	the customer's individual microflows from each other in order to prevent
1764	a single microflow from seizing an unfair share of the resources
1765	available to the customer at a certain service level. This is
1766	illustrated in Figure 9.

1768	Suppose that the customer has an SLS which specifies 2 service levels,
1769	to be identified to the provider by DSCP A and DSCP B.  Traffic is first
1770	directed to an MF classifier which classifies traffic based on
1771	miscellaneous classification criteria, to a granularity sufficient to
1772	identify individual customer microflows. Each microflow can then be
1773	marked for a specific DSCP. The metering elements limit the contribution
1774	of each of the customer's microflows to the service level for which it
1775	was marked. Packets exceeding the allowable limit for the microflow are
1776	dropped.

1778	                     +-----+   +-----+
1779	    Classifier1      |     |   |     |---------------+
1780	        (MF)      +->|     |-->|     |     +-----+   |
1781	      +-----+     |  |     |   |     |---->|     |   |
1782	      |    A|------  +-----+   +-----+     +-----+   |
1783	  --->|    B|-----+  Marker1   Meter1      Absolute  |
1784	      |    C|---+ |                        Dropper1  |   +-----+
1785	      |    X|-+ | |  +-----+   +-----+               +-->|A    |
1786	      +-----+ | | |  |     |   |     |------------------>|B    |--->
1787	              | | +->|     |-->|     |     +-----+   +-->|C    | to TCB2
1788	              | |    |     |   |     |---->|     |   |   +-----+
1789	              | |    +-----+   +-----+     +-----+   |    Mux1
1790	              | |    Marker2   Meter2      Absolute  |
1791	              | |                          Dropper2  |
1792	              | |    +-----+   +-----+               |
1793	              | |    |     |   |     |---------------+
1794	              | |--->|     |-->|     |     +-----+
1795	              |      |     |   |     |---->|     |
1796	              |      +-----+   +-----+     +-----+
1797	              |      Marker3   Meter3      Absolute
1798	              |                            Dropper3
1799	              V etc.

1801	      Figure 9: An Example of a Marking and Traffic Isolation TCB

1803	This TCB could be formally specified as follows:

1804	      TCB1:
1805	      Classifier1: (MF)
1806	      FilterA:             Marker1
1807	      FilterB:             Marker2
1808	      FilterC:             Marker3
1809	      etc.

1811	      Marker1:
1812	      Output:              Meter1

1814	      Marker2:
1815	      Output:              Meter2

1817	      Marker3:
1818	      Output:              Meter3

1820	      Meter1:
1821	      ConformingOutput:    Mux1.InputA
1822	      NonConformingOutput: AbsoluteDropper1

1824	      Meter2:
1825	      ConformingOutput:    Mux1.InputB
1826	      NonConformingOutput: AbsoluteDropper2

1828	      Meter3:
1829	      ConformingOutput:    Mux1.InputC
1830	      NonConformingOutput: AbsoluteDropper3

1832	      etc.

1834	      Mux1:
1835	      Output:              to TCB2

1837	Note that the detailed traffic element declarations are not shown here.
1838	Traffic is either dropped by TCB1 or emerges marked for one of two
1839	DSCPs. This traffic is then passed to TCB2 which is illustrated in
1840	Figure 10.

1842	                     +-----+
1843	                     |     |---------------> to Queue1
1844	                  +->|     |     +-----+
1845	        +-----+   |  |     |---->|     |
1846	        |    A|---+  +-----+     +-----+
1847	      ->|     |       Meter5     AbsoluteDropper4
1848	        |    B|---+  +-----+
1849	        +-----+   |  |     |---------------> to Queue2
1850	      Classifier2 +->|     |     +-----+
1851	         (BA)        |     |---->|     |
1852	                     +-----+     +-----+
1853	                      Meter6     AbsoluteDropper5

1855	      Figure 10: Additional Example: TCB2

1857	TCB2 could then be specified as follows:

1859	      Classifier2: (BA)
1860	      FilterA:               Meter5
1861	      FilterB:               Meter6

1863	      Meter5:
1864	      ConformingOutput:      Queue1
1865	      NonConformingOutput:   AbsoluteDropper4

1867	      Meter6:
1868	      ConformingOutput:      Queue2
1869	      NonConformingOutput:   AbsoluteDropper5

1871	8.4.  Cascaded TCBs

1873	Nothing in this model prevents more complex scenarios in which one
1874	microflow TCB precedes another (e.g. for TCBs implementing separate TCSs
1875	for the source and for a set of destinations).

1877	9.  Open Issues

1879	<ed: this section to be deleted before WG last call and RFC publication.
1880	The current stance of this draft is supplied in parentheses.>

1882	(1)  FIFOs are modelled here as having infinite depth: it is up to any
1883	     preceding meter/dropper to make sure that they do not overflow - a
1884	     hard stop on the depth would be modelled, for example, by preceding
1885	     the FIFO with an Absolute Dropper. Is this appropriate? (Yes)

1887	(2)  We must allow algorithmic droppers that apply different dropping
1888	     behaviour to packets with different classifier matches, with these
1889	     possibly fed through different meters and actions. Should we model
1890	     the dropper as a single input element with implicit pointers back
1891	     to the matching classifier that selects different dropper
1892	     algorithms/treatments? Or as multiple droppers? Or as having
1893	     multiple logical inputs? (single input, implicit pointers).

1895	10.  Security Considerations

1897	Security vulnerabilities of Diffserv network operation are discussed in
1898	[DSARCH]. This document describes an abstract functional model of
1899	Diffserv router elements. Certain denial-of-service attacks such as
1900	those resulting from resource starvation may be mitigated by appropriate
1901	configuration of these router elements; for example, by rate limiting
1902	certain traffic streams or by authenticating traffic marked for higher
1903	quality-of-service.

1905	One particular theft- or denial-of-service issue may arise where a
1906	token-bucket meter, with an absolute dropper for non-conforming traffic,
1907	is used in a TCB to police a stream to a given TCS. The definition of
1908	the token-bucket meter in section 5 indicates that it should be lenient,
1909	accepting a packet whenever any bits of the packet would have been
1910	within the profile, whereas the definition of the leaky-bucket scheduler
1911	is conservative, in that a packet is to be transmitted only if the whole
1912	packet fits within the profile. This difference may be exploited by a
1913	malicious scheduler either to obtain QoS treatment for more octets than
1914	allowed in the TCS or to disrupt (perhaps only slightly) the QoS
1915	guarantees promised to other traffic streams.
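
The gap between the two conformance rules can be made concrete with a
small simulation. The Python sketch below is illustrative only: the
class TokenBucketMeter, the function octets_accepted, the integer
millisecond clock, and the choice to let the lenient meter's token count
go negative after it accepts a packet are all assumptions made for this
example, not definitions taken from this model.

```python
class TokenBucketMeter:
    """Token bucket with a selectable conformance rule (assumed design)."""

    def __init__(self, rate_per_ms, burst, lenient):
        self.rate = rate_per_ms   # octets of credit added per millisecond
        self.burst = burst        # bucket depth in octets
        self.lenient = lenient    # True: any credit suffices
        self.tokens = burst       # start with a full bucket
        self.last_ms = 0

    def conforms(self, now_ms, size):
        # Refill for the elapsed time, capped at the bucket depth.
        elapsed = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_ms = now_ms
        if self.lenient:
            ok = self.tokens > 0       # some bits of the packet fit
        else:
            ok = self.tokens >= size   # the whole packet must fit
        if ok:
            # Assumption: a lenient accept may leave a negative balance.
            self.tokens -= size
        return ok


def octets_accepted(meter, size, interval_ms, count):
    """Offer `count` packets of `size` octets, one every `interval_ms`."""
    total = 0
    for i in range(count):
        if meter.conforms(i * interval_ms, size):
            total += size
    return total


if __name__ == "__main__":
    # Profile: 1 octet/ms with a 1500-octet burst. Offered load: one
    # 1500-octet packet every 100 ms, 100 packets (t = 0 .. 9900 ms).
    strict = TokenBucketMeter(1, 1500, lenient=False)
    loose = TokenBucketMeter(1, 1500, lenient=True)
    print(octets_accepted(strict, 1500, 100, 100))  # prints 10500
    print(octets_accepted(loose, 1500, 100, 100))   # prints 12000
```

Over this interval the profile permits at most 1500 + 9900 = 11400
octets. Under these assumptions the strict rule stays within that bound,
while the lenient rule exceeds it by 600 octets, which is less than one
packet. This is consistent with the "perhaps only slightly" qualification
above.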

1917	11.  Acknowledgments

1919	Concepts, terminology, and text have been borrowed liberally from
1920	[POLTERM], [DSMIB] and [DSPIB].  We wish to thank the authors of those
1921	documents: Fred Baker, Michael Fine, Keith McCloghrie, John Seligson,
1922	Kwok Chan and Scott Hahn for their contributions.

1924	This document has benefitted from the comments and suggestions of
1925	several participants of the Diffserv working group.

1927	12.  References

1929	[AF-PHB]
1930	     J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski, "Assured
1931	     Forwarding PHB Group", RFC 2597, June 1999.

1933	[DSARCH]
1934	     M. Carlson, W. Weiss, S. Blake, Z. Wang, D. Black, and E. Davies,
1935	     "An Architecture for Differentiated Services", RFC 2475, December
1936	     1998.

1938	[DSFIELD]
1939	     K. Nichols, S. Blake, F. Baker, and D. Black, "Definition of the
1940	     Differentiated Services Field (DS Field) in the IPv4 and IPv6
1941	     Headers", RFC 2474, December 1998.

1943	[DSMIB]
1944	     F. Baker, A. Smith, and K. Chan, "Differentiated Services MIB",
1945	     Internet Draft <draft-ietf-diffserv-mib-03.txt>, May 2000.

1947	[DSPIB]
1948	     M. Fine, K. McCloghrie, J. Seligson, K. Chan, S. Hahn, and A.
1949	     Smith, "Quality of Service Policy Information Base", Internet Draft
1950	     <draft-ietf-diffserv-pib-00.txt>, March 2000.

1952	[DSTERMS]
1953	     D. Grossman, "New Terminology for Diffserv", Internet Draft <draft-
1954	     ietf-diffserv-new-terms-02.txt>, November 1999.

1956	[E2E]
1957	     Y. Bernet, R. Yavatkar, P. Ford, F. Baker, L. Zhang, M. Speer, K.
1958	     Nichols, R. Braden, B. Davie, J. Wroclawski, and E. Felstaine,
1959	     "Integrated Services Operation over Diffserv Networks", Internet
1960	     Draft <draft-ietf-issll-diffserv-rsvp-04.txt>, March 2000.

1962	[EF-PHB]
1963	     V. Jacobson, K. Nichols, and K. Poduri, "An Expedited Forwarding
1964	     PHB", RFC 2598, June 1999.

1966	[GTC]
1967	     L. Lin, J. Lo, and F. Ou, "A Generic Traffic Conditioner", Internet
1968	     Draft <draft-lin-diffserv-gtc-01.txt>, August 1999.

1970	[INTSERV]
1971	     R. Braden, D. Clark, and S. Shenker, "Integrated Services in the
1972	     Internet Architecture: an Overview", RFC 1633, June 1994.

1974	[POLTERM]
1975	     F. Reichmeyer, D. Grossman, J. Strassner, and M. Condell, "A Common
1976	     Terminology for Policy Management", Internet Draft <draft-
1977	     reichmeyer-polterm-terminology-00.txt>, March 2000.

1979	[QOSDEVMOD]
1980	     J. Strassner, W. Weiss, D. Durham, and A. Westerinen, "Information
1981	     Model for Describing Network Device QoS Mechanisms", Internet Draft
1982	     <draft-ietf-policy-qos-device-info-model-00.txt>, April 2000.

1984	[SRTCM]
1985	     J. Heinanen and R. Guerin, "A Single Rate Three Color Marker", RFC
1986	     2697, September 1999.

1988	[TRTCM]
1989	     J. Heinanen and R. Guerin, "A Two Rate Three Color Marker", RFC 2698,
1990	     September 1999.

1992	13.  Authors' Addresses

1994	   Yoram Bernet
1995	   Microsoft
1996	   One Microsoft Way
1997	   Redmond, WA  98052
1998	   Phone:  +1 425 936 9568
1999	   E-mail: yoramb@microsoft.com

2001	   Andrew Smith
2002	   Extreme Networks
2003	   3585 Monroe St.
2004	   Santa Clara, CA  95051
2005	   Phone:  +1 408 579 2821
2006	   E-mail: andrew@extremenetworks.com

2008	   Steven Blake
2009	   Ericsson
2010	   920 Main Campus Drive, Suite 500
2011	   Raleigh, NC  27606
2012	   Phone:  +1 919 472 9913
2013	   E-mail: slblake@torrentnet.com

2015	   Daniel Grossman
2016	   Motorola Inc.
2017	   20 Cabot Blvd.
2018	   Mansfield, MA  02048
2019	   Phone:  +1 508 261 5312
2020	   E-mail: dan@dma.isg.mot.com

2022	Table of Contents

2024	1 Introduction ....................................................    2
2025	2 Glossary ........................................................    3
2026	3 Conceptual Model ................................................    5
2027	3.1 Elements of a Diffserv Router .................................    5
2028	3.1.1 Datapath ....................................................    5
2029	3.1.2 Configuration and Management Interface ......................    6
2030	3.1.3 Optional QoS Agent Module ...................................    7
2031	3.2 Hierarchical Model of Diffserv Components .....................    7
2032	4 Classifiers .....................................................   10
2033	4.1 Definition ....................................................   10
2034	4.1.1 Filters .....................................................   11
2035	4.1.2 Overlapping Filters .........................................   11
2036	4.2 Examples ......................................................   12
2037	4.2.1 Behaviour Aggregate (BA) Classifier .........................   12
2038	4.2.2 Multi-Field (MF) Classifier .................................   13
2039	4.2.3 Free-form Classifier ........................................   13
2040	4.2.4 Other Possible Classifiers ..................................   14
2041	5 Meters ..........................................................   14
2042	5.1 Examples ......................................................   17
2043	5.1.1 Average Rate Meter ..........................................   17
2044	5.1.2 Exponential Weighted Moving Average (EWMA) Meter ............   18
2045	5.1.3 Two-Parameter Token Bucket Meter ............................   19
2046	5.1.4 Multi-Stage Token Bucket Meter ..............................   19
2047	5.1.5 Null Meter ..................................................   20
2048	6 Action Elements .................................................   20
2049	6.1 Marker ........................................................   21
2050	6.2 Absolute Dropper ..............................................   21
2051	6.3 Multiplexer ...................................................   22
2052	6.4 Counter .......................................................   22
2053	6.5 Null Action ...................................................   22
2054	7 Queueing Blocks .................................................   22
2055	7.1 Queueing Model ................................................   23
2056	7.1.1 FIFO ........................................................   23
2057	7.1.2 Scheduler ...................................................   25
2058	7.1.3 Algorithmic Dropper .........................................   27
2059	7.1.4 Constructing queueing blocks from the elements ..............   30
2060	7.2 Shaping .......................................................   31
2061	8 Traffic Conditioning Blocks (TCBs) ..............................   31
2062	8.1 An Example TCB ................................................   32
2063	8.2 An Example TCB to Support Multiple Customers ..................   37
2064	8.3 TCBs Supporting Microflow-based Services ......................   38
2065	8.4 Cascaded TCBs .................................................   41
2066	9 Open Issues .....................................................   41
2067	10 Security Considerations ........................................   42
2068	11 Acknowledgments ................................................   42
2069	12 References .....................................................   42
2070	13 Authors' Addresses .............................................   44
2071	14.  Full Copyright

2073	   Copyright (C) The Internet Society (2000). All Rights Reserved.

2075	   This document and translations of it may be copied and furnished to
2076	   others, and derivative works that comment on or otherwise explain it
2077	   or assist in its implementation may be prepared, copied, published and
2078	   distributed, in whole or in part, without restriction of any kind,
2079	   provided that the above copyright notice and this paragraph are
2080	   included on all such copies and derivative works. However, this
2081	   document itself may not be modified in any way, such as by removing
2082	   the copyright notice or references to the Internet Society or other
2083	   Internet organizations, except as needed for the purpose of
2084	   developing Internet standards in which case the procedures for
2085	   copyrights defined in the Internet Standards process must be
2086	   followed, or as required to translate it into languages other than
2087	   English.

2089	   The limited permissions granted above are perpetual and will not be
2090	   revoked by the Internet Society or its successors or assigns.

2092	   This document and the information contained herein is provided on an
2093	   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
2094	   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
2095	   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
2096	   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
2097	   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.