idnits 2.17.1 

draft-ietf-diffserv-model-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 9
     longer pages, the longest (page 1) being 59 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([DSARCH], [PIB], [DSMIB]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 2 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 195 has weird spacing: '...agement    pac...'

  == Line 232 has weird spacing: '...serving    not...'

  == Line 265 has weird spacing: '...tioning  other...'

  == Line 273 has weird spacing: '...serving    ser...'

  == Line 1217 has weird spacing: '...monitor    dro...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 1999) is 8954 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'GTC' is defined on line 1603, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational RFC: RFC 2475 (ref.
     'DSARCH')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DSTERMS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'E2E'

  ** Obsolete normative reference: RFC 2598 (ref. 'EF-PHB') (Obsoleted by RFC
     3246)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DSMIB'

  ** Downref: Normative reference to an Informational RFC: RFC 2697 (ref.
     'SRTCM')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'PIB'

  ** Downref: Normative reference to an Informational RFC: RFC 2698 (ref.
     'TRTCM')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GTC'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MPLSDS'


     Summary: 8 errors (**), 0 flaws (~~), 11 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                Y. Bernet
2	Diffserv Working Group                                         Microsoft
3	INTERNET-DRAFT                                                  A. Smith
4	Expires: April 2000                                     Extreme Networks
5	                                                                S. Blake
6	                                                                Ericsson
7	                                                            October 1999

9	                A Conceptual Model for Diffserv Routers

11	                    draft-ietf-diffserv-model-01.txt

13	Status of this Memo

15	   This document is an Internet-Draft and is in full conformance with
16	   all provisions of Section 10 of RFC2026.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This document is a product of the Diffserv working group.  Comments
35	   on this draft should be directed to the Diffserv mailing list
36	   <diffserv@ietf.org>.

38	   Distribution of this memo is unlimited.

40	Copyright Notice

42	   Copyright (C) The Internet Society (1999).  All Rights Reserved.

44	Abstract

46	   This draft proposes a conceptual model of Differentiated Services
47	   (Diffserv) routers for use in their management and configuration.
48	   This model defines the general functional datapath elements
49	   (classifiers, meters, markers, droppers, monitors, mirrors, muxes,
50	   queues), their possible configuration parameters, and how they might
51	   be interconnected to realize the range of classification, traffic
52	   conditioning, and per-hop behavior (PHB) functionalities described in
53	   [DSARCH].  The model is intended to be abstract and capable of
54	   representing the configuration parameters important to Diffserv
55	   functionality for a variety of specific router implementations.  It

57	Bernet, et. al.            Expires: April 2000                 [page  1]
58	   is not intended as a guide to hardware implementation.

60	   This model should serve as a rationale for the design of a Diffserv
61	   MIB [DSMIB], as well as for various configuration interfaces (such as
62	   [PIB]).  Since these documents are all evolving simultaneously, there
63	   are discrepancies between their current revisions; these should be
64	   resolved in a future revision of this draft.

66	Table of Contents

68	   1.  Introduction .................................................  3
69	   2.  Glossary  ....................................................  4
70	   3.  Conceptual Model .............................................  6
71	     3.1  Elements of a Diffserv Router .............................  6
72	       3.1.1  Datapath ..............................................  7
73	       3.1.2  Configuration and Management Interface ................  8
74	       3.1.3  Optional RSVP Module ..................................  8
75	     3.2  Hierarchical Model of Diffserv Components .................  8
76	   4.  Classifiers .................................................. 10
77	     4.1  Definition ................................................ 10
78	       4.1.1  Filters ............................................... 11
79	       4.1.2  Overlapping Filters ................................... 12
80	       4.1.3  Filter Groups ......................................... 12
81	     4.2  Examples .................................................. 12
82	       4.2.1  Behavior Aggregate (BA) Classifier .................... 12
83	       4.2.2  Multi-Field (MF) Classifier ........................... 13
84	       4.2.3  IEEE802 MAC Address Classifier ........................ 13
85	       4.2.4  Free-form Classifier .................................. 14
86	       4.2.5  Other Possible Classifiers ............................ 14
87	     4.3  MPLS ...................................................... 15
88	   5.  Meters ....................................................... 15
89	     5.1  Definition ................................................ 15
90	     5.2  Examples .................................................. 16
91	       5.2.1  Average Rate Meter .................................... 16
92	       5.2.2  Exponentially Weighted Moving Average (EWMA) Meter .... 17
93	       5.2.3  Two-Parameter Token Bucket Meter ...................... 17
94	       5.2.4  Multi-Stage Token Bucket Meter ........................ 18
95	       5.2.5  Null Meter ............................................ 19
96	   6.  Action Elements .............................................. 19
97	     6.1  Marker .................................................... 19
98	     6.2  Dropper ................................................... 20
99	     6.3  Shaper .................................................... 20
100	     6.4  Mirroring Element ......................................... 20
101	     6.5  Multiplexor ............................................... 20
102	     6.6  Enqueueing Element ........................................ 20
103	     6.7  Monitor ................................................... 21
104	     6.8  Null Action ............................................... 21
105	   7.  Queues ....................................................... 21
106	     7.1  Queue Sets and Scheduling ................................. 21
107	     7.2  Shaping ................................................... 23
108	   8.  Traffic Conditioning Blocks (TCBs) ........................... 23
109	     8.1  An Example TCB ............................................ 24

112	     8.2  An Example TCB to Support Multiple Customers .............. 27
113	     8.3  TCBs Supporting Microflow-based Services .................. 28
114	   9.  Open Issues .................................................. 31
115	  10.  Security Considerations ...................................... 31
116	  11.  Acknowledgments .............................................. 31
117	  12.  References ................................................... 32
118	  Appendix A.  Simple Token Bucket Definition ....................... 33

120	1. Introduction

122	   Differentiated Services (Diffserv) [DSARCH] is a set of technologies
123	   which allow network service providers to offer differing levels of
124	   network quality-of-service (QoS) to different customers and their
125	   traffic streams.  The premise of Diffserv networks is that routers
126	   within the core of the network handle packets in different traffic
127	   streams by forwarding them using different per-hop behaviors (PHBs).
128	   The PHB to be applied is indicated by a Diffserv codepoint (DSCP) in
129	   the IP header of each packet [DSFIELD].   Note that this document
130	   uses the terminology defined in [DSARCH, DSTERMS] and in Sec. 2.

132	   The advantage of such a scheme is that many traffic streams can be
133	   aggregated to one of a small number of behavior aggregates (BA)
134	   which are each forwarded using the same PHB at the router, thereby
135	   simplifying the processing and associated storage.  In addition,
136	   there is no signaling, other than what is carried in the DSCP of
137	   each packet, and no other related processing that is required in the
138	   core of the Diffserv network since QoS is invoked on a packet-by-
139	   packet basis.

141	   The Diffserv architecture enables a variety of possible services
142	   which could be deployed in a network.  These services are reflected
143	   to customers at the edges of the Diffserv network in the form of a
144	   Service Level Specification (SLS) [DSTERMS].  The ability to provide
145	   these services depends on the availability of cohesive management and
146	   configuration tools that can be used to provision and monitor a set
147	   of Diffserv routers in a coordinated manner.  To facilitate the
148	   development of such configuration and management tools it is helpful
149	   to define a conceptual model of a Diffserv router that abstracts
150	   away implementation details of particular Diffserv routers from the
151	   parameters of interest for configuration and management.  The purpose
152	   of this draft is to define such a model.

154	   The basic forwarding functionality of a Diffserv router is defined in
155	   other specifications; e.g., [DSARCH, DSFIELD, AF-PHB, EF-PHB].

157	   This document is not intended in any way to constrain or to dictate
158	   the implementation alternatives of Diffserv routers.  We expect that
159	   router vendors will demonstrate a great deal of variability in their
160	   implementations.  To the extent that vendors are able to model their
161	   implementations using the abstractions described in this draft,
162	   configuration and management tools will more readily be able to
163	   configure and manage networks incorporating Diffserv routers of

166	   various implementations.

167	   In Sec. 3 we start by describing the basic high-level functional
168	   elements of a Diffserv router and then describe the various
169	   components.  We then focus on the Diffserv-specific components of
170	   the router and describe a hierarchical management model for these.

172	   In Sec. 4 we describe classification elements and in Sec. 5, we
173	   discuss the meter elements.

175	   In Sec. 6 we discuss action elements.  In Sec. 7 we discuss the
176	   basic queueing elements and their functional behaviors (e.g.,
177	   shaping).

179	   In Sec. 8, we show how the basic classification, meter, action, and
180	   queueing elements can be combined to build modules called Traffic
181	   Conditioning Blocks (TCBs).

183	   In Sec. 9 we discuss open issues with this document and in Sec. 10 we
184	   discuss security concerns.

186	   Appendix A discusses token bucket implementation details.

188	2.  Glossary

190	   Some of the terms used in this draft are defined in [DSARCH] and in
191	   [DSTERMS].  We define a few of them here again only to provide
192	   additional detail.

194	   Buffer        An algorithm used to determine whether an arriving
195	   management    packet should be stored in a queue, or discarded.  This
196	   algorithm     decision is usually a function of the instantaneous or
197	                 average queue occupancy, but also may be a function of
198	                 the aggregate queue occupancy in a queue set, or of
199	                 other parameters.

201	   Classifier    A functional datapath element which consists of filters
202	                 which select packets based on the content of packet
203	                 headers or other packet data, and/or on implicit or
204	                 derived attributes associated with the packet, and
205	                 forwards the packet along a particular datapath within
206	                 the router.  A classifier splits a single incoming
207	                 traffic stream into multiple outgoing ones.

209	   Enqueueing    The process of executing a buffer management algorithm
210	                 to determine whether an arriving packet should be
211	                 stored in a queue.

213	   Filter        A set of (wildcard/prefix/masked/range/exact)
214	                 conditions on the components of a packet's
215	                 classification key.  A filter is said to match only if
216	                 each condition is satisfied.

219	   Mirroring     A functional datapath element which makes one or more
220	   element       copies of a packet and forwards them on distinct
221	                 datapaths; for example to a monitoring port.

223	   Monitor       A functional datapath element which increments an octet
224	                 and a packet counter for every packet which passes
225	                 through it.  Used for collecting statistics.

227	   Multiplexer   A functional datapath element that merges multiple
228	   (Mux)         traffic streams (datapaths) into a single traffic
229	                 stream (datapath).

231	   Non-work      A property of a scheduling algorithm such that it does
232	   conserving    not necessarily service a packet at every transmission
233	                 opportunity, even if a packet is available.

235	   Queue         A storage location for packets awaiting transmission or
236	                 processing by the next functional element in the data-
237	                 path.  The queues represented in this model are
238	                 abstract elements that may be implemented by multiple
239	                 physical queues in series and/or in parallel in a
240	                 specific implementation.  Note that we assume that a
241	                 queue is serviced such as to preserve the required
242	                 ordering constraint for each Ordering Aggregate (OA)
243	                 it queues [DSTERMS].  This can be achieved by a FIFO
244	                 (first in, first out) service policy or by other means
245	                 (e.g., multiple FIFOs exclusively servicing particular
246	                 OAs).

248	   Queue set     A set of queues which are serviced by a scheduling
249	                 algorithm and which may share a buffer management
250	                 algorithm.

252	   Scheduling    An algorithm which determines which queue of a queue
253	   algorithm     set to service next.  This may be based on the relative
254	                 priority of the queues, or on a weighted fair bandwidth
255	                 sharing policy, or some other policy.  A scheduling
256	                 algorithm may be either work-conserving or non-work-
257	                 conserving.

259	   Shaping       The process of delaying packets within a traffic stream
260	                 to cause it to conform to some defined traffic profile.
261	                 Shaping can be implemented using a queue serviced by a
262	                 non-work conserving scheduling algorithm.

264	   Traffic       A logical datapath element consisting of a number of
265	   Conditioning  other functional datapath elements interconnected in
266	   Block (TCB)   such a way as to perform a specific set of traffic
267	                 conditioning functions on an incoming traffic stream.
268	                 A TCB can be thought of as a "black box" with a single
269	                 input and output.

272	   Work          A property of a scheduling algorithm such that it
273	   conserving    services a packet at every transmission opportunity
274	                 for which a packet is available.
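
The difference between work-conserving and non-work-conserving
scheduling can be illustrated with a minimal Python sketch.  The class
names, the strict-priority policy, and the fixed release interval are
assumptions made for illustration only; they are not part of the model.

```python
from collections import deque


class StrictPriorityScheduler:
    """Work-conserving: returns a packet whenever any queue holds one."""

    def __init__(self, num_queues):
        # In this sketch, index 0 is the highest-priority queue.
        self.queues = [deque() for _ in range(num_queues)]

    def enqueue(self, queue_index, packet):
        self.queues[queue_index].append(packet)

    def dequeue(self, now):
        # 'now' is unused; it is kept so both schedulers share one interface.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


class PacedScheduler:
    """Non-work-conserving: may stay idle although a packet is waiting."""

    def __init__(self, min_interval):
        self.queue = deque()
        self.min_interval = min_interval  # hypothetical spacing between packets
        self.next_release = 0.0

    def enqueue(self, packet):
        self.queue.append(packet)

    def dequeue(self, now):
        # Called once per transmission opportunity at time 'now'.
        if not self.queue or now < self.next_release:
            return None  # idle, even if the queue is non-empty
        self.next_release = now + self.min_interval
        return self.queue.popleft()
```

The paced scheduler is one simple way to realize shaping as defined
above: packets are delayed until the configured spacing has elapsed.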

276	3.  Conceptual Model

278	   In this section we introduce a block diagram of a Diffserv router and
279	   describe the various components illustrated.  The model we present
280	   here is intended to cover both Diffserv edge and core routers; a
281	   Diffserv core router is assumed to include only a subset of these
282	   components.

284	3.1  Elements of a Diffserv Router

286	   The conceptual model we define includes abstract definitions for the
287	   following:

289	   o  The basic traffic classification components.

291	   o  The basic traffic conditioning components.

293	   o  Certain combinations of traffic classification and conditioning
294	      components.

296	   o  Queueing components.

298	   The components and combinations of components described in this
299	   document form building blocks that need to be manageable by Diffserv
300	   configuration and management tools.  One of the goals of this
301	   document is to show how a model of a Diffserv device can be built
302	   using these component blocks.  This model is in the form of a
303	   connected directed acyclic graph (DAG) of functional datapath
304	   elements that describes the traffic conditioning and queueing
305	   behaviors that any particular packet will experience when forwarded
306	   to the Diffserv router.
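
As a rough illustration of the DAG view, the following Python sketch
represents a datapath as a mapping from each element to the elements it
feeds, checks that the graph is acyclic, and lists the element sequences
a packet could traverse.  The element names and the particular wiring
are hypothetical; the sketch assumes Python 3.9 or later for the
standard graphlib module.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical datapath: each element maps to the elements it feeds.
DATAPATH = {
    "classifier": ["meter", "queue"],
    "meter": ["marker", "dropper"],
    "marker": ["queue"],
    "dropper": [],
    "queue": [],
}


def is_acyclic(graph):
    """Return True if the element graph contains no cycle."""
    try:
        # static_order() walks the whole graph and raises CycleError on a cycle.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


def paths(graph, start):
    """Enumerate every element sequence a packet could follow from 'start'."""
    successors = graph[start]
    if not successors:
        return [[start]]
    return [[start] + rest for nxt in successors for rest in paths(graph, nxt)]
```

Because the graph is acyclic, the recursion in paths() always
terminates, and each returned list is one possible treatment of a
packet entering at the given element.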

308	   The following diagram illustrates the major functional blocks of a
309	   Diffserv router:

312	               +---------------+
313	               |  Diffserv     |
314	        Mgmt   | configuration |
315	      <----+-->| & management  |------------------+
316	      SNMP,|   |  interface    |                  |
317	      COPS |   +---------------+                  |
318	      etc. |        |                             |
319	           |        |                             |
320	           |        v                             v
321	           |   +-------------+   +---------+   +-------------+
322	      data |   | ingress i/f |   |         |   | egress i/f  |
323	      -------->|   class.,   |-->| routing |-->|   class.,   |---->
324	           |   |     TC,     |   |  core   |   |     TC,     |
325	           |   |   queueing  |   |         |   |   queueing  |
326	           |   +-------------+   +---------+   +-------------+
327	           |        ^                             ^
328	           |        |                             |
329	           |        |                             |
330	           |   +------------+                     |
331	           +-->|    RSVP    |                     |
332	      -------->| (optional) |---------------------+
333	        RSVP   +------------+
334	        cntl
335	        msgs

337	      Figure 1:  Diffserv Router Major Functional Blocks

339	3.1.1  Datapath

341	   An ingress interface, routing core, and egress interface are
342	   illustrated at the center of the diagram.  In actual router
343	   implementations, there may be an arbitrary number of ingress and
344	   egress interfaces interconnected by the routing core.  The routing
345	   core element serves as an abstraction of a router's normal routing
346	   and switching functionality.  The routing core moves packets between
347	   interfaces according to policies outside the scope of Diffserv.  The
348	   actual queueing delay and packet loss behavior of a specific router's
349	   switching fabric/backplane is not modeled by the routing core; these
350	   should be modeled using the functional elements described later.  The
351	   routing core should be thought of as an infinite bandwidth, zero-
352	   delay backplane connecting ingress and egress interfaces.

354	   The components of interest on the ingress/egress interfaces are the
355	   traffic classifiers, traffic conditioning (TC) components, and the
356	   queueing components that support Diffserv traffic conditioning and
357	   per-hop behaviors [DSARCH].  These are the fundamental components
358	   comprising a Diffserv router and will be the focal point of our
359	   conceptual model.

362	3.1.2  Configuration and Management Interface

364	   Diffserv operating parameters are monitored and provisioned through
365	   this interface.  Monitored parameters include statistics regarding
366	   traffic carried at various Diffserv service levels.  These statistics
367	   may be important for accounting purposes and/or for tracking
368	   compliance to traffic conditioning specifications (TCSs) [DSTERMS]
369	   negotiated with customers.  Provisioned parameters are primarily
370	   classification rules, TC and PHB configuration parameters.  The
371	   network administrator interacts with the Diffserv configuration and
372	   management interface via one or more management protocols, such as
373	   SNMP or COPS, or through other router configuration tools such as
374	   serial terminal or telnet consoles.

376	3.1.3  Optional RSVP Module

378	   Diffserv routers may snoop or participate in either per-microflow or
379	   per-flow-aggregate signaling of QoS requirements [E2E].  The example
380	   discussed here uses the RSVP protocol.  Snooping of RSVP messages may
381	   be used, for example, to learn how to classify traffic without
382	   actually participating as an RSVP peer.  Diffserv routers may
383	   reject or admit RSVP reservation requests to provide a means of
384	   admission control to Diffserv-based services or they may use these
385	   requests to trigger provisioning changes for a flow-aggregation in
386	   the Diffserv network.  A flow-aggregation in this context might be
387	   equivalent to a Diffserv BA or it may be more fine-grained, relying
388	   on an MF classifier [DSARCH].  Note that the conceptual model of such
389	   a router starts to look the same as an Integrated Services (intserv)
390	   router in its component makeup [E2E].

392	   Note that an RSVP component of a Diffserv router, if present, might
393	   be active only in the control plane and not in the data plane.  In
394	   this scenario, RSVP is used strictly as a signaling protocol.  The
395	   data plane of such a Diffserv router can still act purely on Diffserv
396	   DSCPs and PHBs in handling data traffic.

398	3.2  Hierarchical Model of Diffserv Components

400	   We focus on the Diffserv-specific functional components of the
401	   router: the classification, traffic conditioning, and queueing
402	   functionality.  The diagram below is based on the larger block
403	   diagram shown above:

406	             Interface A                        Interface B
407	          +-------------+     +---------+     +-------------+
408	          | ingress i/f |     |         |     | egress i/f  |
409	          |   class.,   |     |         |     |   class.,   |
410	      --->|   meter,    |---->|         |---->|   meter,    |--->
411	          |   action,   |     |         |     |   action,   |
412	          |   queueing  |     |         |     |   queueing  |
413	          +-------------+     | routing |     +-------------+
414	                              |  core   |
415	          +-------------+     |         |     +-------------+
416	          | egress i/f  |     |         |     | ingress i/f |
417	          |   class.,   |     |         |     |   class.,   |
418	      <---|   meter,    |<----|         |<----|   meter,    |<---
419	          |   action,   |     |         |     |   action,   |
420	          |   queueing  |     +---------+     |   queueing  |
421	          +-------------+                     +-------------+

423	      Figure 2.  Traffic Conditioning and Queueing Elements

425	   This diagram illustrates two Diffserv router interfaces, each having
426	   an ingress and an egress component.  It shows classification, meter,
427	   action, and queueing elements which might be instantiated on each
428	   interface's ingress and egress component.  The TC functionality is
429	   implemented by a combination of classification, action, meter, and
430	   queueing elements.  We show equivalent functional elements on both
431	   the ingress and egress components of an interface because we expect
432	   an N-port router to display the same Diffserv capabilities as a
433	   network of 2-port routers interconnected by LAN media [DSMIB].  Note
434	   that it is not mandatory that each of these functional elements be
435	   implemented on both ingress and egress components; it is dependent on
436	   the service requirements on a particular interface on a particular
437	   router.  Further, we wish to point out that by showing these elements
438	   on both ingress and egress components we do not mean to imply that
439	   they must be implemented in this way in a specific router.  For
440	   example, a router may implement all shaping and PHB queueing on the
441	   interface egress component, or may instead implement it only on the
442	   ingress component.  Further, the classification needed to map a
443	   packet to an egress component queue (if present) need not be
444	   implemented on the egress component but instead may be implemented on
445	   the ingress component, with the packet passed through the routing
446	   core with in-band control information to allow for egress queue
447	   selection.

449	   From a configuration and management perspective, the following
450	   hierarchy exists:

452	   At the top level, the network administrator manages interfaces.  Each
453	   interface consists of an ingress component and an egress component.
454	   Each component may contain classifier, action, meter, and queueing
455	   elements.

458	   At the next level, the network administrator manages groups of
459	   functional elements interconnected in a DAG.  These elements are
460	   organized in self-contained Traffic Conditioning Blocks (TCBs) which
461	   are used to implement some desired network policy (see Sec. 8).  One
462	   or more TCBs may be instantiated on each ingress or egress component,
463	   may be connected in series, and/or may be connected in a
464	   parallel configuration on the multiple outputs of a classifier.
465	   We define the TCB to optionally include classification and queueing
466	   elements so as to allow for rich functionality.  A TCB can be thought
467	   of as a "black box" with a single input and a single output (on the
468	   main data path).  TCBs can be constructed out of a DAG of other TCBs,
469	   recursively.  We do not assume the same TCB configuration on every
470	   interface (ingress or egress).

472	   At the lowest level are individual functional elements, each with
473	   its own configuration parameters and management counters and flags.
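
The three management levels described above might be captured in a data
structure along the following lines.  This is a sketch only: the class
and field names are our own assumptions, and a real management
interface (e.g., a MIB or PIB) would define its own representation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """Lowest level: one functional element with its own parameters and counters."""
    kind: str  # e.g. "classifier", "meter", "queue"
    params: dict = field(default_factory=dict)
    counters: dict = field(default_factory=dict)


@dataclass
class TCB:
    """Middle level: a block built from elements and, recursively, other TCBs."""
    name: str
    members: List[Union["TCB", Element]] = field(default_factory=list)

    def elements(self):
        """Flatten the block into its individual functional elements."""
        for member in self.members:
            if isinstance(member, TCB):
                yield from member.elements()
            else:
                yield member


@dataclass
class Component:
    """The ingress or egress part of an interface; holds zero or more TCBs."""
    tcbs: List[TCB] = field(default_factory=list)


@dataclass
class Interface:
    """Top level: the object the network administrator manages."""
    name: str
    ingress: Component = field(default_factory=Component)
    egress: Component = field(default_factory=Component)
```

Note that the sketch records only containment; the interconnection of
elements within a TCB (the DAG) would need to be represented separately.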

475	4.  Classifiers

477	4.1  Definition

479	   Classification is performed by a classifier element.  Classifiers are
480	   1:N (fan-out) devices: they take a single traffic stream as input and
481	   generate N logically separate traffic streams as output.  Classifiers
482	   are parameterized by filters and output streams.  Packets from the
483	   input stream are sorted into various output streams by filters which
484	   match the contents of the packet or possibly match other attributes
485	   associated with the packet.  Various types of classifiers are
486	   described in the following sections.

488	   We use the following diagram to illustrate a classifier, where the
489	   outputs connect to succeeding functional elements:

491	      unclassified              classified
492	      traffic                   traffic
493	              +------------+
494	              |            |--> match Filter1 --> output A
495	      ------->| classifier |--> match Filter2 --> output B
496	              |            |--> no match      --> output C
497	              +------------+

499	      Figure 3.  An Example Classifier

501	   Note that we allow a mux (see Sec. 6.5) before the classifier to
502	   allow input from multiple traffic streams.  For example, if multiple
503	   ingress sub-interfaces feed through a single classifier then the
504	   interface number can be considered by the classifier as a packet
505	   attribute and be included in the packet's classification key.  This
506	   optimization may be important for scalability in the management
507	   plane.  Another possible packet attribute could be an integer
508	   representing the BGP community string associated with the packet's
509	   best-matching route.

511	   The following classifier separates traffic into one of three output
512	   streams based on three filters:

514	      Filter Matched        Output Stream
515	      --------------       ---------------
516	      Filter1                    A
517	      Filter2                    B
518	      Filter3 (no match)         C

520	   Where Filter1, Filter2 and Filter3 are defined to be the following BA
521	   filters ([DSARCH], see Sec. 4.2.1):

523	      Filter        DSCP
524	      ------       ------
525	        1           101010
526	        2           111111
527	        3           ****** (wildcard)

529	4.1.1  Filters

531	   A filter consists of a set of conditions on the component values of
532	   a packet's classification key (the header values, contents, and
533	   attributes relevant for classification).  In the BA classifier
534	   example above, the classification key consists of one packet header
535	   field, the DSCP, and both Filter1 and Filter2 specify exact-match
536	   conditions on the value of the DSCP.  Filter3 is a wildcard default
537	   filter which matches every packet, but which is only selected in the
538	   event that no other more specific filter matches.

540	   In general there is a set of possible component conditions including
541	   exact, prefix, range, masked, and wildcard matches.  Note that ranges
542	   can be represented (with less efficiency) as a set of prefixes and
543	   that prefix matches are just a special case of both masked and range
544	   matches.

546	   In the case of an MF classifier [DSARCH], the classification key
547	   consists of a number of packet header fields.  The filter may
548	   specify a different condition for each key component, as illustrated
549	   in the example below for an IPv4/TCP classifier:

551	      Filter   IP Src Addr    IP Dest Addr   TCP SrcPort TCP DestPort
552	      ------   -------------  -------------  -----------  ------------
553	      Filter4  172.31.8.1/32  172.31.3.X/24       X          5003

555	   In this example, the fourth octet of the destination IPv4 address
556	   and the source TCP port are wildcard or "don't cares".
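
	   As a non-normative illustration, the Filter4 conditions above could
	   be evaluated as in the following Python sketch.  The MfFilter class
	   and its field names are hypothetical and not part of this model; a
	   port of None stands for the "don't care" condition, and the /24
	   destination is written as 172.31.3.0/24.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MfFilter:
    """Hypothetical IPv4/TCP multi-field filter (names are illustrative)."""
    src: ipaddress.IPv4Network       # prefix match on source address
    dst: ipaddress.IPv4Network       # prefix match on destination address
    src_port: Optional[int]          # None = wildcard
    dst_port: Optional[int]          # None = wildcard

    def matches(self, src: str, dst: str, src_port: int, dst_port: int) -> bool:
        # Every key component must satisfy its own condition.
        return (ipaddress.IPv4Address(src) in self.src
                and ipaddress.IPv4Address(dst) in self.dst
                and self.src_port in (None, src_port)
                and self.dst_port in (None, dst_port))


# Filter4 from the table above.
FILTER4 = MfFilter(
    src=ipaddress.IPv4Network("172.31.8.1/32"),
    dst=ipaddress.IPv4Network("172.31.3.0/24"),
    src_port=None,
    dst_port=5003,
)
```

	   Any source port and any value of the fourth destination octet match,
	   while the source address and destination port must match exactly.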

558	4.1.2  Overlapping Filters

560	   Note that it is easy to define sets of overlapping filters in a
561	   classifier.  For example:

563	      Filter5:              Filter6:
564	      Type:   Masked-DSCP   Type:   Masked-DSCP
565	      Value:  111000        Value:  000111 (binary)
566	      Mask:   111000        Mask:   000111 (binary)

568	   A packet containing DSCP = 111111 cannot be uniquely classified by
569	   this pair of filters and so a precedence must be established between
570	   Filter5 and Filter6 in order to break the tie.  This precedence must
571	   be established either (a) by a manager which knows that the router
572	   can accomplish this particular ordering (e.g., by means of reported
573	   capabilities), or (b) by the router, along with a mechanism to report
574	   to a manager which precedence is being used.  These ordering
575	   mechanisms must be supported by the configuration and management
576	   protocols although further discussion of this is outside the scope of
577	   this document.

579	   An unambiguous classifier requires that every possible classification
580	   key match at least one filter (including the wildcard default), and
581	   that any ambiguity between overlapping filters be resolved by
582	   precedence.
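
	   The following Python sketch illustrates one way such a precedence
	   could break the tie between Filter5 and Filter6.  It is only a
	   sketch: the precedence field, the output assignments (A, B, C) and
	   the rule that the higher precedence wins are assumptions made for
	   illustration, not something this model prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskedDscpFilter:
    """Hypothetical masked-DSCP filter with an explicit precedence."""
    name: str
    value: int        # 6-bit pattern
    mask: int         # DSCP bits that must equal the pattern
    precedence: int   # assumption: the higher value wins on overlap
    output: str

    def matches(self, dscp: int) -> bool:
        return (dscp & self.mask) == (self.value & self.mask)


def classify(filters, dscp, default_output="C"):
    """Return the output of the highest-precedence matching filter.

    The default output plays the role of the wildcard filter, so every
    possible DSCP value is classified.
    """
    hits = [f for f in filters if f.matches(dscp)]
    if not hits:
        return default_output
    return max(hits, key=lambda f: f.precedence).output


FILTER5 = MaskedDscpFilter("Filter5", 0b111000, 0b111000, precedence=2, output="A")
FILTER6 = MaskedDscpFilter("Filter6", 0b000111, 0b000111, precedence=1, output="B")
```

	   With these assumed precedences, DSCP 111111 matches both filters
	   and is directed to the output of Filter5.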

584	4.1.3  Filter Groups

586	   Filters may be logically combined.  For example, consider the
587	   following DestMacAddress filter:

589	      Filter7:
590	      Type:        DestMacAddress
591	      Value:       01-02-03-04-05-06
592	      Mask:        FF-FF-FF-FF-FF-FF

594	   Classifier0 could then be declared as:

596	      Classifier0:
597	      Filter1 and Filter7:         output A
598	      Filter2 and Filter7:         output B
599	      Default (wildcard) filter:   output C

601	4.2  Examples

603	4.2.1  Behavior Aggregate (BA) Classifier

605	   The simplest Diffserv classifier is a behavior aggregate (BA)
606	   classifier [DSARCH].  A BA classifier uses only the Diffserv
607	   codepoint (DSCP) in a packet's IP header to determine the logical
608	   output stream to which the packet should be directed.  We allow only
609	   an exact-match condition on this field because the assigned DSCP
610	   values have no structure, and therefore no subset of DSCP bits is
611	   significant.

613	   The following defines a possible BA filter:

615	      Filter8:
616	      Type:   BA
617	      Value:  111000

619	4.2.2  Multi-Field (MF) Classifier

621	   Another type of classifier is a multi-field (MF) classifier [DSARCH].
622	   This classifies packets based on one or more fields in the packet
623	   header (including the DSCP).  A common type of MF classifier is a 6-
624	   tuple classifier that classifies based on six IP header fields
625	   (destination address, source address, IP protocol, source port,
626	   destination port, and DSCP).  MF classifiers may classify on other
627	   fields such as MAC addresses, VLAN tags, link-layer traffic class
628	   fields or other higher-layer protocol fields.

630	   The following defines a possible MF filter:

632	      Filter9:
633	      Type:              IPv4-6-tuple
634	      IPv4DestAddrValue: 0
635	      IPv4DestAddrMask:  0.0.0.0
636	      IPv4SrcAddrValue:  172.31.8.0
637	      IPv4SrcAddrMask:   255.255.255.0
638	      IPv4DSCP:          28
639	      IPv4Protocol:      6
640	      IPv4DestL4PortMin: 0
641	      IPv4DestL4PortMax: 65535
642	      IPv4SrcL4PortMin:  20
643	      IPv4SrcL4PortMax:  20

645	   A similar type of classifier can be defined for IPv6.

647	4.2.3  IEEE802 MAC Address Classifier

649	   A MacAddress filter is parameterized by a 6-byte {value, mask} pair
650	   for either source or destination MAC address.  For example, the
651	   following classifier sends packets matching either DA =
652	   01-02-03-04-05-06 or SA = 00-E0-2B-XX-XX-XX to output A:

654	      Classifier1:
655	      Filter10:     output A
656	      Filter11:     output A
657	      Default:      output B

658	      Filter10:
659	      Type:        DestMacAddress
660	      Value:       01-02-03-04-05-06 (hex)
661	      Mask:        FF-FF-FF-FF-FF-FF (hex)

663	      Filter11:
664	      Type:        SrcMacAddress
665	      Value:       00-E0-2B-00-00-00 (hex)
666	      Mask:        FF-FF-FF-00-00-00 (hex)

668	4.2.4  Free-form Classifier

670	   A Free-form classifier is made up of a set of user-definable
671	   arbitrary filters each made up of {bit-field size, offset (from head
672	   of packet), value, mask}:

674	      Classifier2:
675	      Filter12:     output A
676	      Filter13:     output B
677	      Default:      output C

679	      Filter12:
680	      Type:        FreeForm
681	      SizeBits:    3 (bits)
682	      Offset:      16 (bytes)
683	      Value:       100 (binary)
684	      Mask:        101 (binary)

686	      Filter13:
687	      Type:        FreeForm
688	      SizeBits:    12 (bits)
689	      Offset:      16 (bytes)
690	      Value:       100100000000 (binary)
691	      Mask:        111111111111 (binary)

693	   Free-form filters can be combined into filter groups to form very
694	   powerful filters.

696	4.2.5  Other Possible Classifiers

698	      Classifier3:
699	      Filter14:     output A
700	      Filter15:     output B
701	      Default:      output C

703	      Filter14:
704	      Type:        IEEEPriority
705	      Value:       100 (binary)
706	      Mask:        101 (binary)

707	      Filter15:
708	      Type:        IEEEVLAN
709	      Value:       100100000000 (binary)
710	      Mask:        111111111111 (binary)

712	   Classification may be performed based on implicit information
713	   associated with a packet (e.g. the incoming channel number on a
714	   channelized interface) or on information derived from a different
715	   non-Diffserv classification operation (e.g. the outgoing interface
716	   determined by the route lookup operation).  Other vendor-specific
717	   filter formats are possible.  We do not discuss these further here.

719	4.3  MPLS

721	   It is possible for an MPLS label-switched router (LSR) to function as
722	   a Diffserv router [MPLSDS].  In this case the IP header is not
723	   visible for inspection and all header classification must be
724	   performed on the MPLS label, and in the event of shim encapsulation,
725	   on the 3-bit EXP field in addition.  In general an MPLS classification
726	   filter may specify either wildcard- or exact-match conditions for
727	   either field (but not both wildcard at once).  The distinction to be
728	   drawn here is that MPLS labels are dynamically established and torn
729	   down.  An EXP-only classifier may be statically configured but a
730	   label or label + EXP classifier must be established dynamically along
731	   with the LSP.  In all other respects (except marking) the labeled
732	   packet can be treated identically to an unlabeled packet.

734	5.  Meters

736	5.1  Definition

738	   Metering is the function of monitoring the arrival times of packets
739	   of a traffic stream and determining the level of conformance of each
740	   packet to a pre-established traffic profile.  Diffserv network
741	   providers may choose to offer services to customers based on a
742	   temporal (i.e., rate) profile within which the customer submits
743	   traffic for the service.  In this event, a meter might be used to
744	   trigger real-time traffic conditioning actions (e.g., marking) by
745	   routing a non-conforming packet through an appropriate next-stage
746	   action element.  Alternatively, it might also be used for out-of-band
747	   management functions like statistics monitoring for billing
748	   applications.

750	   Meters are logically 1:N (fan-out) devices (although a mux can be
751	   used in front of a meter).  Meters are parameterized by a temporal
752	   profile and by conformance levels, each of which is associated with
753	   a meter's output.  Each output can be connected to another functional
754	   element.

756	   Note that this model of a meter differs from that described in
757	   [DSARCH].  In that description the meter is not a datapath element
758	   but is instead used to monitor the traffic stream and send control
759	   signals to action elements to dynamically modulate their behavior
760	   based on the conformance of the packet.  We find the description here
761	   more powerful.

763	   We use the following diagram to illustrate a meter with 3 levels of
764	   conformance:

766	      unmetered              metered
767	      traffic                traffic

769	                +---------+
770	                |         |--------> conformanceA
771	      --------->|  meter  |--------> conformanceB
772	                |         |--------> conformanceC
773	                +---------+

775	      Figure 4.  An Example Meter

777	   In some Diffserv examples, three levels of conformance are discussed
778	   in terms of colors, with green representing conforming, yellow
779	   representing partially conforming, and red representing non-
780	   conforming [AF-PHB].  These different conformance levels are used to
781	   trigger different buffer management actions.  Other example meters
782	   use a binary notion of conformance; in the general case N levels of
783	   conformance can be supported.  In general there is no constraint on
784	   the type of functional element following a meter output, but care
785	   must be taken not to inadvertently configure a datapath that results
786	   in packet reordering within an OA.

788	5.2  Examples

790	   The following is a non-exhaustive list of possible meters.

792	5.2.1  Average Rate Meter

794	   An example of a very simple meter is an average rate meter.  This
795	   type of meter measures the average rate at which packets are
796	   submitted to it over a specified averaging time.

798	   An average rate profile may take the following form:

800	      Meter1:
801	      Type:                AverageRate
802	      Profile1:            output A
803	      NonConforming:       output B

805	      Profile1:
806	      Type:                AverageRate
807	      AverageRate:         120 KBps
808	      Delta:               1.0 msec

810	   A meter measuring against this profile would continually maintain a
811	   count that indicates the total number of bytes arriving between
812	   time T (now) and time T - 1.0 msecs.  So long as an arriving packet
813	   does not push the count over 120 bytes (120 KBps x 1.0 msec), the
814	   packet would be deemed conforming.  Any packet that pushes the count
815	   over 120 bytes would be deemed non-conforming.  Thus, this meter
816	   deems packets to correspond to one of two conformance levels:
	   conforming or non-conforming.
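
	   A minimal Python sketch of such a meter follows.  It is not a
	   definitive implementation.  It assumes that the count is kept in
	   bytes, that 1 KBps means 1000 bytes per second, that every arriving
	   packet is counted whether or not it conforms, and that times are
	   integer microseconds.  The class and method names are hypothetical.

```python
from collections import deque


class AverageRateMeter:
    """Sliding-window byte counter over the last `delta_usec` microseconds."""

    def __init__(self, rate_bytes_per_sec: int, delta_usec: int):
        self.delta_usec = delta_usec
        # Bytes allowed per window, e.g. 120000 B/s * 1000 us = 120 bytes.
        self.budget = rate_bytes_per_sec * delta_usec // 1_000_000
        self.window = deque()   # (arrival time in usec, length in bytes)
        self.count = 0          # bytes currently inside the window

    def meter(self, now_usec: int, length: int) -> str:
        # Forget arrivals that have fallen out of the averaging window.
        cutoff = now_usec - self.delta_usec
        while self.window and self.window[0][0] <= cutoff:
            _, old_length = self.window.popleft()
            self.count -= old_length
        # Count the arriving packet, then test the count against the budget.
        self.window.append((now_usec, length))
        self.count += length
        return "conforming" if self.count <= self.budget else "non-conforming"
```

	   With Profile1, two 60-byte packets arriving within 1.0 msec conform,
	   and a further packet inside the same window does not.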

818	5.2.2  Exponential Weighted Moving Average (EWMA) Meter

820	   The EWMA form of meter is easy to implement in hardware and can be
821	   parameterized as follows:

823	      avg_rate(t) = (1 - Gain) * avg_rate(t') +  Gain * rate(t)
824	      t = t' + Delta

826	   For a packet arriving at time t:

828	      if (avg_rate(t) > AverageRate)
829	         non-conforming
830	      else
831	         conforming

833	   Gain controls the time constant (e.g. frequency response) of what is
834	   essentially a simple IIR low-pass filter.  rate(t) measures the
835	   number of incoming bytes in a small fixed sampling interval, Delta.
836	   Any packet that arrives and pushes the average rate over a predefined
837	   rate AverageRate is deemed non-conforming.  An EWMA meter profile
838	   might look as follows:

840	      Meter2:
841	      Type:                ExpWeightedMovingAvg
842	      Profile2:            output A
843	      NonConforming:       output B

845	      Profile2:
846	      Type:                ExpWeightedMovingAvg
847	      AverageRate:         25 KBps
848	      Delta:               10.0 usec
849	      Gain:                1/16
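
	   The update rule above can be sketched in Python as follows.  This is
	   one possible reading, not a definitive implementation: it assumes
	   that rate(t) is the byte count of a Delta interval divided by Delta,
	   that the arriving packet is included in the sample before the test,
	   and that times are integer microseconds.  The class and method names
	   are hypothetical.

```python
class EwmaMeter:
    """EWMA meter sampled once per `delta_usec` interval."""

    def __init__(self, average_rate: float, delta_usec: int, gain: float):
        self.average_rate = average_rate   # threshold in bytes per second
        self.delta_usec = delta_usec
        self.gain = gain
        self.avg_rate = 0.0                # avg_rate(t') at the last boundary
        self.index = 0                     # number of the current interval
        self.interval_bytes = 0            # bytes seen in the current interval

    def _blend(self, sample_bytes: int) -> float:
        # avg_rate(t) = (1 - Gain) * avg_rate(t') + Gain * rate(t)
        rate = sample_bytes * 1_000_000 / self.delta_usec
        return (1.0 - self.gain) * self.avg_rate + self.gain * rate

    def meter(self, now_usec: int, length: int) -> str:
        index = now_usec // self.delta_usec
        if index > self.index:
            # Close the finished interval, then decay over any idle ones.
            self.avg_rate = self._blend(self.interval_bytes)
            self.avg_rate *= (1.0 - self.gain) ** (index - self.index - 1)
            self.index = index
            self.interval_bytes = 0
        self.interval_bytes += length
        if self._blend(self.interval_bytes) > self.average_rate:
            return "non-conforming"
        return "conforming"
```

	   For example, with the purely illustrative parameters AverageRate =
	   25000 bytes per second, Delta = 10 msec and Gain = 1/16, 2000 bytes
	   arriving in one interval conform, while a further 3000 bytes in the
	   same interval push the average over the threshold.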

851	5.2.3  Two-Parameter Token Bucket Meter

853	   A more sophisticated meter might measure conformance to a token
854	   bucket (TB) profile.  A TB profile generally has two parameters: an
855	   average token rate and a burst size.  TB meters compare the arrival
856	   rate of packets to the average rate specified by the TB profile.
857	   Logically, byte tokens accumulate in a bucket at the average rate,
858	   up to a maximum credit which is the burst size.  Packets of length
859	   L bytes are considered conforming if L tokens are available in the
860	   bucket at the time of packet arrival.  Packets are allowed to
861	   exceed the average rate in bursts up to the burst size.  Packets
862	   which arrive to find a bucket with insufficient tokens in it are
863	   deemed non-conforming.  A two-parameter TB meter has exactly two
864	   possible conformance levels (conforming, non-conforming).  TB
865	   implementation details are discussed in Appendix A.

867	   A two-parameter TB meter profile might look as follows:

869	      Meter3:
870	      Type:                SimpleTokenBucket
871	      Profile3:            output A
872	      NonConforming:       output B

874	      Profile3:
875	      Type:                SimpleTokenBucket
876	      AverageRate:         100 KBps
877	      BurstSize:           100 KB
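
	   The token bucket logic described above can be sketched in Python as
	   follows.  This is a minimal sketch under stated assumptions: the
	   bucket starts full, 1 KBps means 1000 bytes per second and 1 KB
	   means 1000 bytes, times are integer microseconds, and a
	   non-conforming packet consumes no tokens.  The class and method
	   names are hypothetical.

```python
class TokenBucketMeter:
    """Two-parameter token bucket meter (average rate, burst size)."""

    def __init__(self, rate_bytes_per_sec: int, burst_bytes: int):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)   # assumption: bucket starts full
        self.last_usec = 0

    def meter(self, now_usec: int, length: int) -> str:
        # Accumulate byte tokens at the average rate, capped at the burst size.
        elapsed = now_usec - self.last_usec
        self.last_usec = now_usec
        self.tokens = min(self.burst,
                          self.tokens + self.rate * elapsed / 1_000_000)
        # A packet of L bytes conforms if L tokens are available on arrival.
        if length <= self.tokens:
            self.tokens -= length
            return "conforming"
        return "non-conforming"
```

	   With Profile3, a back-to-back burst of sixty-six 1500-byte packets
	   conforms; the sixty-seventh finds only 1000 tokens and does not.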

879	5.2.4  Multi-Stage Token Bucket Meter

881	   More complicated TB meters might define two burst sizes and three
882	   conformance levels.  Packets found to exceed the larger burst size
883	   are deemed non-conforming.  Packets found to exceed the smaller
884	   burst size are deemed partially conforming.  Packets exceeding
885	   neither are deemed conforming.  Token bucket meters designed for
886	   Diffserv networks are described in more detail in [SRTCM, TRTCM,
887	   GTC]; in some of these references three levels of conformance are
888	   discussed in terms of colors, with green representing conforming,
889	   yellow representing partially conforming and red representing non-
890	   conforming.  Often these multi-conformance level meters can be
891	   implemented using an appropriate configuration of multiple two-
892	   parameter TB meters.

894	   A profile for a multi-stage TB meter with three levels of conformance
895	   might look as follows:

897	      Meter4:
898	      Type:                MultiTokenBucket
899	      Profile4:            output A
900	      Profile5:            output B
901	      NonConforming:       output C

903	      Profile4:
904	      Type:                SimpleTokenBucket
905	      AverageRate:         100 KBps
906	      BurstSize:           20 KB

908	      Profile5:
909	      Type:                SimpleTokenBucket
910	      AverageRate:         100 KBps
911	      BurstSize:           100 KB

913	5.2.5  Null Meter

915	   A null meter has only one output: always conforming, and no
916	   associated temporal profile.  Such a meter is useful to define in the
917	   event that the configuration or management interface does not have
918	   the flexibility to omit a meter in a datapath segment.

920	6.  Action Elements

922	   Classifiers and meters are fan-out elements which are generally used
923	   to determine the appropriate action to apply to a packet.  The set of
924	   possible actions includes:

926	   1) Marking
927	   2) Dropping
928	   3) Shaping
929	   4) Mirroring
930	   5) Monitoring

932	   The corresponding action elements are described in the following
933	   paragraphs.

935	   Policing is a general term for the process of preventing a traffic
936	   stream from seizing more than its share of resources from a Diffserv
937	   network.  Each of the first three actions described above may be used
938	   to police traffic.  Markers do so by re-marking non-conforming
939	   packets to a DSCP value that is entitled to fewer network resources.
940	   Shapers and droppers do so by limiting the rate at which a particular
941	   traffic stream is submitted to the network.

943	6.1  Marker

945	   Markers are 1:1 elements which set the DSCP in an IP header (in
946	   the case of unlabeled packets).  Markers may act on unmarked packets
947	   (submitted with DSCP of zero) or may re-mark previously marked
948	   packets.  In particular, the model supports the application of
949	   marking based on a preceding classifier match.  The DSCP set in a
950	   packet will determine its subsequent treatment in downstream nodes
951	   of a network, and possibly in subsequent processing stages within the
952	   router (depending on configuration).

954	   Markers are normally parameterized by a single parameter: the 6-bit
955	   DSCP to be marked in the packet header.

957	      ActionElement1:
958	      Type:                Marker
959	      Mark:                010010

961	   In the case of an MPLS-labeled packet, the marker is parameterized
962	   by a 3-bit EXP value to be marked in the MPLS shim header.

964	6.2  Dropper

966	   Droppers simply discard packets. There are no parameters for
967	   droppers.  Because a dropper is a terminating point of the datapath,
968	   it may be desirable to forward the packet through a monitor first
969	   for instrumentation purposes.

971	   Droppers are not the only elements that can cause a packet to be
972	   discarded.  The other element is an enqueueing element (see Sec.
973	   6.6).  However, since the enqueueing element's behavior is closely
974	   tied to the state of one or more queues, we choose to distinguish
975	   them as separate functional elements.

977	6.3  Shaper

979	   Shapers are used to shape traffic streams to a certain temporal
980	   profile.  For example, a shaper can be used to smooth traffic
981	   arriving in bursts.  In [DSARCH] a shaper is described as a
982	   queueing element controlled by a meter which defines its temporal
983	   profile.  This model of a shaper differs substantially from typical
984	   shaper implementations.  Further, with the inclusion of queueing
985	   elements in the model a separate shaping element becomes confusing.
986	   Therefore, the function of a shaper is embedded in a queue and is
987	   covered in Sec. 7.

989	6.4  Mirroring Element

991	   It is occasionally desirable to mirror data traffic on one or more
992	   additional interfaces for data collection purposes.  A mirroring
993	   element is a 1:N (fan-out) element.  However, each and every packet
994	   follows each output path simultaneously.  A mirroring element is
995	   parameterized by the number of outputs it supports.

997	6.5  Mux

999	   It is occasionally necessary to multiplex traffic streams into a 1:1
1000	   or 1:N action element or classifier.  An M:1 (fan-in) mux is a simple
1001	   logical device for merging traffic streams.  It is parameterized by
1002	   its number of incoming ports.

1004	6.6  Enqueueing Element

1006	   Queueing elements (discussed in Sec. 7) require an action element to
1007	   execute the appropriate buffer management algorithm and store or
1008	   discard a packet.  This is performed by an enqueueing element, which
1009	   is an M:1 (fan-in) element.  An enqueueing element executes the
1010	   buffer management algorithm appropriate for the queue it is feeding.
1011	   This may include a deterministic discard behavior if the queue size
1012	   exceeds a threshold, it may include a random discard behavior that
1013	   is a function of the average queue size [AF-PHB], or it may include
1014	   a more complex policy which is a function of the state of several
1015	   queues in a queue set (see Sec. 7).  The particular parameters to
1016	   apply to a packet may depend on the particular input port the element
1017	   receives it on; this allows packets which are classified into
1018	   different colors to follow different datapaths and be processed
1019	   appropriately at the enqueueing element.

1021	   The configuration parameters for an enqueueing element will depend on
1022	   the details of the algorithm it is executing.  For an algorithm such
1023	   as the one recommended in [AF-PHB], the parameters would include
1024	   separate RED min_th, max_th, and max_p parameters per-element input
1025	   port.
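
	   As an illustration of per-input-port parameters, the following
	   Python sketch computes a drop probability from the average queue
	   size using the linear ramp of the classic RED algorithm.  The
	   details are assumptions made for illustration: the RedParams name,
	   the port labels, the numeric thresholds, and the choice of the
	   basic RED ramp are not specified by this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedParams:
    """RED parameters for one input port of an enqueueing element."""
    min_th: float   # below this average queue size nothing is dropped
    max_th: float   # at or above this average queue size everything is dropped
    max_p: float    # drop probability reached just below max_th


def drop_probability(avg_queue: float, params: RedParams) -> float:
    """Linear ramp from 0 at min_th to max_p at max_th."""
    if avg_queue < params.min_th:
        return 0.0
    if avg_queue >= params.max_th:
        return 1.0
    span = params.max_th - params.min_th
    return params.max_p * (avg_queue - params.min_th) / span


# Hypothetical per-input-port configuration (queue sizes in packets).
PORT_PARAMS = {
    "green": RedParams(min_th=40, max_th=80, max_p=0.02),
    "yellow": RedParams(min_th=20, max_th=60, max_p=0.05),
    "red": RedParams(min_th=10, max_th=40, max_p=0.10),
}
```

	   With these values, the same average queue size yields a different
	   discard behavior depending on the input port a packet arrives on.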

1027	   An enqueueing element must maintain octet/packet counters for both
1028	   the forwarded and discarded packets received at each element input
1029	   port.  Counters should be provided to distinguish between losses due
1030	   to the normal operation of the algorithm (e.g., random drop) and
1031	   those due to resource exhaustion (e.g., tail drop) [DSMIB].

1033	6.7  Monitor

1035	   One passive action is to account for the fact that a data packet was
1036	   processed.  The statistics that result might be used later for
1037	   customer billing, service verification, or network engineering
1038	   purposes.  Monitors are 1:1 functional elements which increment an
1039	   octet counter by L and a packet counter by 1 every time an L-byte
1040	   packet passes through them.  Monitors can also be used to count
1041	   packets on the verge of being dropped by a dropper.

1043	6.8  Null Action

1045	   A null action has one input and one output.  The element performs no
1046	   action on the packet.  Such an element is useful to define in the
1047	   event that the configuration or management interface does not have
1048	   the flexibility to omit an action element in a datapath segment.

1050	7.  Queues

1052	7.1  Queue Sets and Scheduling

1054	   Queues are used to store packets prior to transmission or prior to
1055	   forwarding to the next functional element.  Packets are usually
1056	   stored either because there is a resource constraint (e.g., available
1057	   bandwidth) which prevents immediate forwarding, or because the queue
1058	   is being used to alter the temporal properties of a traffic stream
1059	   (shaping).  Queues may be organized into queue sets, which are
1060	   serviced using a common scheduling algorithm (although each queue may
1061	   be individually parameterized).  Queue sets can be treated as
1062	   functional elements and organized hierarchically in queue supersets,
1063	   using an n-th order scheduling algorithm.  Such a queue set may be
1064	   used to implement the entire range of PHBs on an egress interface,
1065	   for instance.

1067	   Possible queue scheduling algorithms fall into a number of
1068	   categories, including strict priority, weighted fair bandwidth
1069	   sharing (e.g., WFQ, WRR, etc.), rate-limited strict priority, etc.
1070	   Scheduling algorithms can be further distinguished by whether they
1071	   are work conserving or non-work conserving.  A work conserving
1072	   algorithm will always transmit an available packet at every
1073	   transmission opportunity, while a non-work conserving algorithm will
1074	   not.  Non-work conserving schedulers can be used to shape traffic
1075	   streams by delaying packets that would be deemed non-conforming by
1076	   some traffic profile.  The packet is delayed until such time that it
1077	   would conform to a meter using the same profile.

1079	   [DSARCH] defines PHBs without specifying required queueing
1080	   algorithms.  However, PHBs such as EF [EF-PHB] and AF [AF-PHB] have
1081	   configuration parameters which strongly suggest the sort of queue
1082	   scheduling algorithm needed to implement them.  We have selected a
1083	   minimal set of queue parameters to enable realization of these per-
1084	   hop behaviors.  These include a minimum service rate and a strict
1085	   service priority along with an optional maximum service rate profile
1086	   (depending on whether the queue is meant to be non-work conserving).
1087	   The minimum service rate allows throughput guarantees for each queue
1088	   as required by EF and AF without specifying the details of how excess
1089	   bandwidth between these queues is shared (additional parameters to
1090	   control this behavior should be made available, but are dependent on
1091	   the particular scheduling algorithm implemented).  The strict service
1092	   priority is useful for implementing EF on some links (assuming that
1093	   the aggregate EF rate has been appropriately bounded to avoid
1094	   starvation).  Setting the service priority of each queue in a queue
1095	   set to the same value enables the scheduler to satisfy the minimum
1096	   service rate for each queue.  Queue sets can be serviced like
1097	   individual queues in a queue superset using the same scheduling
1098	   parameters.

1100	   It should be noted that the queues in this model are logical
1101	   abstractions used to configure PHB-related parameters.  They are not
1102	   expected to map one-to-one with physical queues in a specific router
1103	   implementation.  An implementor should map the configurable
1104	   parameters of the physical queues to these queue parameters as
1105	   appropriate to achieve equivalent behaviors.

1107	   Other queue parameters such as maximum capacity are assumed to be
1108	   mapped to the buffer management algorithm used by the enqueueing
1109	   element feeding the queue.

1111	   A queue set might be represented using the following parameters:

1113	      QueueSet1:
1114	      Type:        QueueSet
1115	      MaxProfile:  WorkConserving
1116	      MinGuarRate: 20 MBps
1117	      Interface:   ifIndex

1118	      QueueA:
1119	      Type:        Queue
1120	      QueueSet:    QueueSet1
1121	      MaxProfile:  Profile1
1122	      MinGuarRate: 2 MBps
1123	      Priority:    3

1125	      QueueB:
1126	      Type:        Queue
1127	      QueueSet:    QueueSet1
1128	      MaxProfile:  WorkConserving
1129	      MinGuarRate: 8 MBps
1130	      Priority:    3

1132	7.2  Shaping

1134	   Shapers are often used to pre-condition traffic such that packets
1135	   are deemed conforming by subsequent meters, e.g., in downstream
1136	   Diffserv nodes.  Shapers may also be used to isolate certain traffic
1137	   streams from the effects of other traffic streams of the same BA.

1139	   A shaper action element is implemented in this model by using a non-
1140	   work conserving queue.  Shapers operate by delaying packets that
1141	   would be deemed non-conforming by a meter configured to the shaper's
1142	   maximum service rate profile.  The packet is delayed until such
1143	   time that it would become conforming.

1145	   Profile definitions are identical in format to those described for
1146	   meters.  The use of a meter algorithm to control shaping is further
1147	   discussed in Appendix A.  Average, EWMA, and TB profiles are all
1148	   feasible for shaping.  Because a shaper is implemented as a queue it
1149	   can also utilize a variety of buffer management algorithms
1150	   (implemented in an enqueueing element).

1152	   A shaping queue might be represented using the following parameters:

1154	      QueueA:
1155	      Type:        Queue
1156	      QueueSet:    QueueSet1
1157	      MaxProfile:  Profile1
1158	      MinGuarRate: 2 MBps
1159	      Priority:    3

1161	      Profile1:
1162	      Type:                SimpleTokenBucket
1163	      AverageRate:         3 MBps
1164	      BurstSize:           8 KB
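
	   The delaying behavior of such a shaping queue can be sketched in
	   Python as follows.  This is only a sketch under stated assumptions:
	   the queue is served in FIFO order, the bucket starts full, 1 MBps
	   means 1000000 bytes per second and 1 KB means 1000 bytes, and times
	   are integer microseconds.  The function name and the scaled integer
	   credit are hypothetical implementation choices.

```python
def shape(arrivals, rate_bytes_per_sec, burst_bytes):
    """Return the release time (usec) of each packet of a FIFO shaper.

    `arrivals` is a list of (arrival_usec, length_bytes) in arrival order.
    Credit is kept in units of 1/1000000 byte so that all arithmetic
    stays in integers.
    """
    scale = 1_000_000
    cap = burst_bytes * scale
    credit = cap                    # assumption: bucket starts full
    clock = 0                       # time of the last credit update
    releases = []
    for arrival, length in arrivals:
        need = length * scale
        if need > cap:
            raise ValueError("packet longer than BurstSize can never conform")
        # A packet cannot leave before its predecessor in the queue.
        start = max(arrival, clock)
        credit = min(cap, credit + rate_bytes_per_sec * (start - clock))
        clock = start
        if credit < need:
            # Delay until a meter with the same profile would find the
            # packet conforming (ceiling division on integers).
            wait = -(-(need - credit) // rate_bytes_per_sec)
            clock += wait
            credit = min(cap, credit + rate_bytes_per_sec * wait)
        credit -= need
        releases.append(clock)
    return releases
```

	   With Profile1 above, three 4000-byte packets arriving together are
	   released at 0, 0 and 1334 usec: the first two use up the 8 KB
	   burst and the third waits for 4000 bytes of credit at 3 MBps.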

1166	8.  Traffic Conditioning Blocks (TCBs)

1168	   The classifiers, meters, action elements, and queueing elements
1169	   described above can be combined into traffic conditioning blocks
1170	   (TCBs).  The TCB is an abstraction of a functional element that may
1171	   be used to facilitate the definition of specific traffic conditioning
1172	   functionality.

1174	   One of the simplest possible TCBs would consist of the following
1175	   stages:

1177	   1.  Classifier stage
1178	   2.  Enqueueing stage
1179	   3.  Queueing stage

1181	   Note that a classifier is a 1:N element, while an enqueueing stage is
1182	   an N:1 element and a queue is a 1:1 element.  If the classifier splits
1183	   traffic across multiple enqueueing elements then the queueing stage
1184	   may consist of a hierarchy of queue sets, all resulting in a 1:1
1185	   abstract element.

1187	   A more general TCB might consist of the following four stages:

1189	   1. Classifier stage
1190	   2. Metering stage
1191	   3. Action stage
1192	   4. Queueing stage

1194	   where each stage may consist of a set of parallel datapaths
1195	   consisting of pipelined elements.

1197	   TCBs are constructed by connecting elements corresponding to these
1198	   stages in any sensible order.  It is possible to omit stages, to
1199	   include null elements, or to concatenate multiple stages of the same
1200	   type.  TCB outputs may drive additional TCBs (on either the ingress
1201	   or egress interfaces).  Classifiers and meters are fan-out elements;
1202	   muxes and enqueueing elements are fan-in elements.

1204	8.1  An Example TCB

1206	   The following diagram illustrates an example TCB:

1208	                                       +------------> to Queue A
1209	                              +-----+  |              (not shown)
1210	                              |     |--+
1211	                           +->|     |
1212	                           |  |     |--+  +-----+    +-----+
1213	                           |  +-----+  |  |     |    |     |
1214	                           |   meter   +->|     |--->|     |
1215	                           |              |     |    |     |
1216	                           |              +-----+    +-----+
1217	                           |              monitor    dropper
1218	                           |
1219	                           |
1220	                           |
1221	     submitted +-----+     |  +-----+     +-----+
1222	     traffic   |  A  |-----+  |     |     |     |
1223	           --->|  B  |------->|     |---->|     |---> to Queue B
1224	               |  C  |-----+  |     |     |     |     (not shown)
1225	               |  X  |--+  |  +-----+     +-----+
1226	               +-----+  |  |   marker     shaper
1227	                 BA     |  |              queue
1228	              classifier|  |
1229	                        |  |
1230	                        |  |
1231	                        |  |
1232	                        |  |
1233	                        |  |  +-----+                +-----+
1234	                        |  |  |     |--------------->|     |  to Queue C
1235	                        |  +->|     |                |     |->
1236	                        |     |     |--+  +-----+ +->|     | (not shown)
1237	                        |     +-----+  |  |     | |  +-----+
1238	                        |      meter   +->|     |-+    mux
1239	                        |                 |     |
1240	                        |                 +-----+
1241	                        |                 marker
1242	                        |
1243	                        +---------------------------> to Queue D
1244	                                                      (not shown)
1245	      Figure 5:  An Example Traffic Conditioning Block

1247	   This sample TCB might be suitable for an ingress interface at a
1248	   customer/provider boundary.  An SLS is presumed to have been
1249	   negotiated between the customer and the provider which specifies the
1250	   handling of the customer's traffic by the provider's network.  The
1251	   agreement might be of the following form:

1253	      DSCP         PHB       Profile       Non-Conforming Packets
1254	      ----         ---       -------       ----------------------
1255	      001001       PHB1      Profile1      Discard
1256	      001100       PHB2      Profile2      Wait in shaper queue
1257	      001101       PHB3      Profile3      Re-mark to DSCP 001000

1259	   It is implicit in this agreement that conforming packets are given
1260	   the PHB originally indicated by the packets' DSCP field.  It
1261	   specifies that the customer may submit packets marked for DSCP
1262	   001001 which will get PHB1 treatment so long as they remain
1263	   conforming to Profile1 and will be discarded if they exceed this
1264	   profile.  Similar contract rules are applied for 001100 and 001101
1265	   traffic.
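
   The agreement above can be written down as a small lookup table.
   The following Python fragment is only an illustrative sketch: the
   names SlsEntry, SLS, and treatment are hypothetical and do not come
   from this model, and the action strings are informal labels for the
   "Non-Conforming Packets" column.

```python
from typing import NamedTuple, Optional


class SlsEntry(NamedTuple):
    phb: str                    # PHB given to conforming packets
    profile: str                # traffic profile the meter checks against
    non_conforming_action: str  # informal label for the out-of-profile action


# One entry per row of the example agreement, keyed by the DSCP bit string.
SLS = {
    "001001": SlsEntry("PHB1", "Profile1", "discard"),
    "001100": SlsEntry("PHB2", "Profile2", "wait in shaper queue"),
    "001101": SlsEntry("PHB3", "Profile3", "re-mark to DSCP 001000"),
}


def treatment(dscp: str, conforming: bool) -> Optional[str]:
    """Return the handling for a packet, or None if the DSCP is not in the SLS."""
    entry = SLS.get(dscp)
    if entry is None:
        return None
    return entry.phb if conforming else entry.non_conforming_action
```

   For example, treatment("001001", True) yields "PHB1", while
   treatment("001001", False) yields "discard".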

1267	   In this example, the classification stage consists of a single BA
1268	   classifier.  The BA classifier is used to separate traffic based on
1269	   the Diffserv service level requested by the customer (as indicated
1270	   by the DSCP in each submitted packet's IP header).  We illustrate
1271	   three DSCP filter values: A, B and C.  The 'X' in the BA classifier
1272	   is the default wildcard filter that matches every packet.

1274	   A metering stage is next in the upper and lower branches.  There is a
1275	   separate meter for each set of packets corresponding to DSCPs A and
1276	   C.  Each meter uses a specific profile as specified in the TCS for
1277	   the corresponding Diffserv service level.  The meters in this
1278	   example indicate one of two conformance levels, conforming or
1279	   non-conforming.  The middle branch has a marker which re-marks all
1280	   packets received with DSCP B.

1282	   Following the metering stage is the action stage in the upper and
1283	   lower branches.  Packets submitted for DSCP A that are deemed
1284	   non-conforming are counted and discarded.  Packets that are
1285	   conforming are passed on to Queue A.  Packets submitted for DSCP C
1286	   that are deemed non-conforming are re-marked, and then conforming and
1287	   non-conforming packets are muxed together before being forwarded to
1288	   Queue C.  Packets submitted for DSCP B are shaped to Profile2 before
1289	   being forwarded to Queue B.

1291	   The interconnections of the TCB elements illustrated in Figure 5 can
1292	   be represented as follows:

1294	      TCB1:

1296	      Classifier1:
1297	      Output A --> Meter1
1298	      Output B --> Marker1
1299	      Output C --> Meter2
1300	      Output X --> QueueD

1302	      Meter1:
1303	      Output A --> QueueA
1304	      Output B --> Monitor1

1306	      Monitor1:
1307	      Output A --> Dropper1

1309	      Marker1:
1310	      Output A --> Shaper1

1311	      Shaper1:
1312	      Output A --> Queue B

1314	      Meter2:
1315	      Output A --> Mux1
1316	      Output B --> Marker2

1318	      Marker2:
1319	      Output A --> Mux1

1321	      Mux1:
1322	      Output A --> Queue C
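
   The interconnection listing for TCB1 maps naturally onto a directed
   graph.  The Python sketch below is one possible encoding, not part
   of the model: the dictionary layout and the trace helper are
   hypothetical, queue names are written without spaces, and it assumes
   that a meter's Output A carries conforming packets and Output B
   carries non-conforming packets, as the description of Figure 5
   implies.

```python
# Connections of TCB1, transcribed from the listing above:
# element name -> {output label: next element}
TCB1 = {
    "Classifier1": {"A": "Meter1", "B": "Marker1", "C": "Meter2", "X": "QueueD"},
    "Meter1": {"A": "QueueA", "B": "Monitor1"},
    "Monitor1": {"A": "Dropper1"},
    "Marker1": {"A": "Shaper1"},
    "Shaper1": {"A": "QueueB"},
    "Meter2": {"A": "Mux1", "B": "Marker2"},
    "Marker2": {"A": "Mux1"},
    "Mux1": {"A": "QueueC"},
}


def trace(graph, start, choices):
    """Follow one packet through the graph and return the elements visited.

    `choices` names the output taken at each fan-out element (classifier
    or meter).  Elements with a single output need no entry.  Elements
    that are not keys of `graph` (queues, droppers) end the path.
    """
    path = [start]
    node = start
    while node in graph:
        outputs = graph[node]
        label = choices[node] if len(outputs) > 1 else next(iter(outputs))
        node = outputs[label]
        path.append(node)
    return path
```

   A non-conforming packet submitted for DSCP C would be traced with
   trace(TCB1, "Classifier1", {"Classifier1": "C", "Meter2": "B"}),
   which visits Classifier1, Meter2, Marker2, Mux1, and QueueC in that
   order.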

1324	8.2  An Example TCB to Support Multiple Customers

1326	   The TCB described above can be installed on an ingress interface to
1327	   implement a provider/customer TCS if the interface is dedicated to
1328	   the customer.  However, if a single interface is shared between
1329	   multiple customers, then the TCB above will not suffice, since it
1330	   does not differentiate among traffic from different customers.  Its
1331	   classification stage uses only BA classifiers.

1333	   The TCB is readily extended to support the case of multiple customers
1334	   per interface, as follows.  First, we define a TCB for each customer
1335	   to reflect the TCS with that customer.  TCB1, defined above, is the
1336	   TCB for customer 1.  We add definitions for TCB2 and for TCB3 which
1337	   reflect the agreements with customers 2 and 3 respectively.

1339	   Finally, we add a classifier which provides a front end to separate
1340	   the traffic from the three different customers.  This forms a new
1341	   TCB which incorporates TCB1, TCB2, and TCB3, and can be illustrated
1342	   as follows:

1344	      submitted +-----+
1345	      traffic   |  A  |--------> TCB1
1346	            --->|  B  |--------> TCB2
1347	                |  C  |--------> TCB3
1348	                |  X  |--------> Dropper4
1349	                +-----+
1350	                Classifier4

1352	      Figure 6: An Example of a Multi-Customer TCB

1354	   A formal representation of this multi-customer TCB might be:

1356	      TCB1:
1357	      (as defined above)

1359	      TCB2:
1360	      (similar to TCB1, perhaps with different numeric parameters)

1361	      TCB3:
1362	      (similar to TCB1, perhaps with different numeric parameters)

1364	      TCB4:
1365	      (the total TCB)

1367	      Classifier4:
1368	      Output A --> TCB1
1369	      Output B --> TCB2
1370	      Output C --> TCB3
1371	      Output X --> Dropper4

1373	   Where Classifier4 is defined as follows:

1375	      Classifier4:
1376	      Filter1:     Output A
1377	      Filter2:     Output B
1378	      Filter3:     Output C
1379	      No Match:    Output X

1381	   and the filters, based on each customer's source MAC address, are
1382	   defined as follows:

1384	      Filter1:
1385	      Type:        MacAddress
1386	      SrcValue:    01-02-03-04-05-06 (source MAC address of customer 1)
1387	      SrcMask:     FF-FF-FF-FF-FF-FF
1388	      DestValue:   00-00-00-00-00-00
1389	      DestMask:    00-00-00-00-00-00

1391	      Filter2:
1392	      (similar to Filter1 but with customer 2's source MAC address as
1393	      SrcValue)

1395	      Filter3:
1396	      (similar to Filter1 but with customer 3's source MAC address as
1397	      SrcValue)

1399	   In this example, Classifier4 separates traffic submitted from
1400	   different customers based on the source MAC address in submitted
1401	   packets.  Those packets with recognized source MAC addresses are
1402	   passed to the TCB implementing the TCS with the corresponding
1403	   customer.  Those packets with unrecognized source MAC addresses are
1404	   passed to a dropper.
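
   The filter matching performed by Classifier4 can be sketched in
   Python as follows.  This is an illustration under stated
   assumptions, not a definition: it assumes that a filter matches when
   the packet's address equals the filter value on every bit set in the
   mask (so an all-zero mask matches any address), and that filters are
   tried in order with unmatched packets taking Output X.  The names
   parse_mac, MacFilter, and classify are hypothetical.  Only Filter1
   is populated, because the addresses of customers 2 and 3 are not
   given here.

```python
def parse_mac(text: str) -> int:
    """Convert 'AA-BB-CC-DD-EE-FF' to a 48-bit integer."""
    return int(text.replace("-", ""), 16)


class MacFilter:
    def __init__(self, src_value, src_mask,
                 dest_value="00-00-00-00-00-00",
                 dest_mask="00-00-00-00-00-00"):
        self.src_value = parse_mac(src_value)
        self.src_mask = parse_mac(src_mask)
        self.dest_value = parse_mac(dest_value)
        self.dest_mask = parse_mac(dest_mask)

    def matches(self, src: str, dest: str) -> bool:
        # Assumed semantics: compare only the bits selected by each mask.
        src_ok = (parse_mac(src) & self.src_mask) == (self.src_value & self.src_mask)
        dest_ok = (parse_mac(dest) & self.dest_mask) == (self.dest_value & self.dest_mask)
        return src_ok and dest_ok


# Filter1 from the text; Filter2 and Filter3 would be appended the same way.
CLASSIFIER4 = [
    (MacFilter("01-02-03-04-05-06", "FF-FF-FF-FF-FF-FF"), "A"),
]


def classify(filters, src: str, dest: str) -> str:
    """Return the output label of the first matching filter, else 'X'."""
    for flt, output in filters:
        if flt.matches(src, dest):
            return output
    return "X"
```

   With this sketch, a packet from source 01-02-03-04-05-06 is sent to
   Output A (TCB1) regardless of its destination address, and a packet
   from any unlisted source is sent to Output X (the dropper).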

1406	   TCB4 has a classification stage and an action element stage, which
1407	   consists of either a dropper or another TCB.

1409	8.3 TCBs Supporting Microflow-based Services

1411	   The TCB illustrated above describes a configuration that might be
1412	   suitable for enforcing an SLS at a router's ingress.  It assumes that
1413	   the customer marks its own traffic for the appropriate service level.
1414	   It then limits the rate of aggregate traffic submitted at each
1415	   service level, thereby protecting the resources of the Diffserv
1416	   network.  It does not provide any isolation between the customer's
1417	   individual microflows (other than from separated queueing).

1419	   Next we present a TCB configuration that offers additional
1420	   functionality to the customer.  It recognizes individual customer
1421	   microflows and marks each one independently.  It also isolates the
1422	   customer's individual microflows from each other in order to prevent
1423	   a single microflow from seizing an unfair share of the resources
1424	   available to the customer at a certain service level.  This is
1425	   illustrated in Figure 7 below:

1427	                     +-----+   +-----+
1428	                     |     |   |     |---------------+
1429	                  +->|     |-->|     |     +-----+   |
1430	        +-----+   |  |     |   |     |---->|     |   |
1431	        |     |----  +-----+   +-----+     +-----+   |
1432	      ->|     |----  marker     meter      dropper   |   +-----+   to
1433	        |     |-+ |  +-----+   +-----+               +-->|     |
1434	        +-----+ | |  |     |   |     |------------------>|     |--->
1435	          MF    | +->|     |-->|     |     +-----+   +-->|     |
1436	        class.  |    |     |   |     |---->|     |   |   +-----+  TCB2
1437	                |    +-----+   +-----+     +-----+   |    mux
1438	                |    marker     meter      dropper   |
1439	                |    +-----+   +-----+               |
1440	                |    |     |   |     |---------------+
1441	                |--->|     |-->|     |     +-----+
1442	                |    |     |   |     |---->|     |
1443	                |    +-----+   +-----+     +-----+
1444	                |    marker     meter      dropper
1445	                |       .         .     .
1446	                V       V         V     V

1448	      Figure 7: An Example of a Marking and Traffic Isolation TCB

1450	   Traffic is first directed to an MF classifier which classifies traffic
1451	   based on miscellaneous classification criteria, to a granularity
1452	   sufficient to identify individual customer microflows.  Each
1453	   microflow can then be marked for a specific DSCP (in this particular
1454	   example we assume that one of two different DSCPs is marked).  The
1455	   metering stage limits the contribution of each of the customer's
1456	   microflows to the service level for which it was marked.  Packets
1457	   exceeding the allowable limit for the microflow are dropped.

1459	   The TCB could be formally specified as follows:

1461	      TCB1:
1462	      Classifier1: (MF)
1463	      Output A --> Marker1
1464	      Output B --> Marker2
1465	      Output C --> Marker3
1466	      . . .

1468	      Marker1 --> Meter1
1469	      Marker2 --> Meter2
1470	      Marker3 --> Meter3

1472	      Meter1:
1473	      Output A --> TCB2
1474	      Output B --> ActionElement1 (dropper)

1476	      Meter2:
1477	      Output A --> TCB2
1478	      Output B --> ActionElement2 (dropper)

1480	      Meter3:
1481	      Output A --> TCB2
1482	      Output B --> ActionElement3 (dropper)

1484	   The actual traffic element declarations are not shown here.

1486	   Traffic is either dropped by TCB1 or emerges marked for one of two
1487	   DSCPs.  This traffic is then passed to TCB2, illustrated below:

1489	                     +-----+
1490	                     |     |--------------->
1491	                  +->|     |     +-----+
1492	        +-----+   |  |     |---->|     |
1493	        |     |---+  +-----+     +-----+
1494	      ->|     |       meter      dropper
1495	        |     |---+  +-----+
1496	        +-----+   |  |     |--------------->
1497	          BA      +->|     |     +-----+
1498	        classifier   |     |---->|     |
1499	                     +-----+     +-----+
1500	                      meter      dropper

1502	      Figure 8: Additional Example TCB

1504	   TCB2 would be formally specified as follows:

1506	      Classifier2: (BA)
1507	      Output A --> Meter10
1508	      Output B --> Meter11

1509	      Meter10:
1510	      Output A --> PHBQueueA
1511	      Output B --> Dropper10

1513	      Meter11:
1514	      Output A --> PHBQueueB
1515	      Output B --> Dropper11

1517	9.  Open Issues

1519	  o  There is a difference in interpretation of token bucket behavior
1520	     between this document (Appendix A) and [DSMIB].  Specifically,
1521	     [DSMIB] allows a packet to conform if any smaller packet would
1522	     conform.

1524	  o  The meter in [SRTCM] cannot be precisely modeled using two
1525	     two-parameter token buckets because its two buckets do not
1526	     accumulate credits independently.  We intended to demonstrate how
1527	     the [TRTCM] meter could be implemented but ran out of time.

1529	  o  Are the queue parameters defined (scheduling and buffer
1530	     management) sufficient?

1532	  o  Do Queue and Queue Set really belong in the model (and the MIB
1533	     and PIB?), or should the model stick to the abstract PHB
1534	     representation and leave the implementation details to the MIB and
1535	     PIB?

1537	  o  Should a classifier be part of a TCB?  We argue yes.  This allows a
1538	     TCB to be a one input/one output black box element.

1540	  o  Is the description of a shaper sufficient?  Is it overbroad?

1542	10. Security Considerations

1544	   Security vulnerabilities of Diffserv network operation are discussed
1545	   in [DSARCH].  This document describes an abstract functional model of
1546	   Diffserv router elements.  Certain denial-of-service attacks such as
1547	   those resulting from resource starvation may be mitigated by
1548	   appropriate configuration of these router elements; for example, by
1549	   rate limiting certain traffic streams or by authenticating traffic
1550	   marked for higher quality-of-service.

1552	11.  Acknowledgments

1554	   Concepts, terminology, and text have been borrowed liberally from
1555	   [DSMIB] and [PIB].  We wish to thank the authors: Fred Baker,
1556	   Michael Fine, Keith McCloghrie, John Seligson, Kwok Chan, and
1557	   Scott Hahn, for their permission.

1559	   This document has benefitted from the comments and suggestions of
1560	   several participants of the Diffserv working group.

1562	12. References

1564	   [DSARCH]   M. Carlson, W. Weiss, S. Blake, Z. Wang, D. Black, and
1565	              E. Davies, "An Architecture for Differentiated Services",
1566	              RFC 2475, December 1998.

1568	   [DSTERMS]  D. Grossman, "New Terminology for Diffserv", Internet
1569	              Draft <draft-ietf-diffserv-new-terms-00.txt>, October
1570	              1999.

1572	   [E2E]      Y. Bernet, R. Yavatkar, P. Ford, F. Baker, L. Zhang,
1573	              M. Speer, K. Nichols, R. Braden, B. Davie, J. Wroclawski,
1574	              and E. Felstaine, "Integrated Services Operation over
1575	              Diffserv Networks", Internet Draft
1576	              <draft-ietf-issll-diffserv-rsvp-02.txt>, September 1999.

1578	   [DSFIELD]  K. Nichols, S. Blake, F. Baker, and D. Black,
1579	              "Definition of the Differentiated Services Field (DS
1580	              Field) in the IPv4 and IPv6 Headers", RFC 2474, December
1581	              1998.

1583	   [EF-PHB]   V. Jacobson, K. Nichols, and K. Poduri, "An Expedited
1584	              Forwarding PHB", RFC 2598, June 1999.

1586	   [AF-PHB]   J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski,
1587	              "Assured Forwarding PHB Group", RFC 2597, June 1999.

1589	   [DSMIB]    F. Baker, "Differentiated Services MIB", Internet Draft
1590	              <draft-ietf-diffserv-mib-00.txt>, June 1999.

1592	   [SRTCM]    J. Heinanen and R. Guerin, "A Single Rate Three Color
1593	              Marker", RFC 2697, September 1999.

1595	   [PIB]      M. Fine, K. McCloghrie, J. Seligson, K. Chan, S. Hahn,
1596	              and A. Smith, "Quality of Service Policy Information
1597	              Base", Internet Draft <draft-mfine-cops-pib-01.txt>,
1598	              June 1999.

1600	   [TRTCM]    J. Heinanen and R. Guerin, "A Two Rate Three Color Marker",
1601	              RFC 2698, September 1999.

1603	   [GTC]      L. Lin, J. Lo, and F. Ou, "A Generic Traffic Conditioner",
1604	              Internet Draft <draft-lin-diffserv-gtc-01.txt>, August
1605	              1999.

1607	   [MPLSDS]   J. Heinanen, "Differentiated Services in MPLS Networks",
1608	              Internet Draft <draft-heinanen-diffserv-mpls-00.txt>,
1609	              June 1999.

1611	Appendix A.  Simple Token Bucket Definition

1613	  [DSMIB] presents a fairly detailed exposition on the operation of
1614	  two-parameter token buckets for metering.  However, the behavior
1615	  described does not appear to be consistent with the behavior defined
1616	  in [SRTCM] and [TRTCM].  Specifically, under the definition in
1617	  [DSMIB], a packet is assumed to conform to the meter if any of its
1618	  bytes would have been accepted, while in [SRTCM] and [TRTCM], a packet
1619	  is assumed to conform only if sufficient tokens are available for
1620	  every byte in the packet.  Further, a packet has no effect on the
1621	  token occupancy if it does not conform (no tokens are decremented).

1623	  The behavior defined in [SRTCM] and [TRTCM] is not mandatory for
1624	  compliance, but we give here a mathematical definition of two-
1625	  parameter token bucket operation which is consistent with these
1626	  documents, and which can be used to define a shaping profile.

1628	  Define a token bucket with bucket size BS, token accumulation rate
1629	  R, and instantaneous token occupancy T(t).  Assume that T(0) = BS.

1631	  Then after an arbitrary interval with no packet arrivals, T(t) will
1632	  not change since the bucket is already full of tokens.  Assume a
1633	  packet of size B bytes arrives at time t'.  The bucket occupancy
1634	  T(t'-) = BS still.  Then, as long as B <= BS, the packet conforms to
1635	  the meter, and

1637	     T(t') = BS - B.

1639	  Assume an interval v = t - t' elapses before the next packet, of
1640	  size C <= BS, arrives.  T(t-) is given by the following equation:

1642	    T(t-) = min { BS, T(t') + v*R }

1644	  (the bucket has accumulated v*R tokens over the interval, up to a
1645	  maximum of BS tokens).

1647	  If T(t-) - C >= 0, the packet conforms and T(t) = T(t-) - C.
1648	  Otherwise, the packet does not conform and T(t) = T(t-).
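
  The metering rule above can be sketched in Python as follows.  This
  is an illustrative sketch rather than a normative definition: the
  class and method names are hypothetical, time is assumed to be in
  seconds and sizes in bytes, and the refill is capped at BS, as the
  phrase "up to a maximum of BS tokens" requires.

```python
class TokenBucketMeter:
    """Two-parameter token bucket meter with bucket size BS and rate R."""

    def __init__(self, bucket_size: float, rate: float):
        self.bs = bucket_size
        self.rate = rate
        self.tokens = bucket_size  # T(0) = BS: the bucket starts full
        self.last = 0.0            # time of the previous packet arrival

    def offer(self, now: float, size: float) -> bool:
        """Return True if a packet of `size` bytes arriving at `now` conforms."""
        # T(t-): tokens accumulated since the last arrival, capped at BS.
        elapsed = now - self.last
        self.tokens = min(self.bs, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens - size >= 0:
            self.tokens -= size    # conforming: T(t) = T(t-) - C
            return True
        return False               # non-conforming: T(t) = T(t-), no tokens taken
```

  As a worked case, take BS = 1000 bytes and R = 100 bytes per second.
  A 600-byte packet at t = 0 conforms and leaves 400 tokens.  A second
  600-byte packet at t = 1 finds 500 tokens, does not conform, and
  leaves the occupancy at 500.  A third at t = 2 finds 600 tokens and
  conforms, leaving 0.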

1650	  This function can be used to define a shaping profile.  If a packet of
1651	  size C arrives at time t, it will be eligible for transmission at time
1652	  te given as follows (we still assume C <= BS):

1654	     te = max { t, t" }

1656	  where

1658	     t" = (C - T(t') + t'*R)/R.

1660	  t" is the time at which T(t") = C, i.e., when C credits have
1661	  accumulated in the bucket and the packet would conform if the token
1662	  bucket were a meter.  te != t" only if t > t".
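
  The eligibility time can be computed directly from these two
  expressions.  The Python function below is a minimal sketch with
  hypothetical names.  It takes the token occupancy T(t') left after
  the previous packet at time t' as given, assumes C <= BS as the text
  does, and does not update any bucket state.

```python
def eligible_time(t: float, size: float, tokens_prev: float,
                  t_prev: float, rate: float) -> float:
    """Return te, the earliest transmission time for a packet arriving at t.

    tokens_prev is T(t'), the occupancy just after the previous packet,
    and t_prev is t'.  size is C and rate is R.
    """
    # t" = (C - T(t') + t'*R) / R: when C tokens have accumulated.
    t_conform = (size - tokens_prev + t_prev * rate) / rate
    # te = max { t, t" }: never earlier than the arrival itself.
    return max(t, t_conform)
```

  With T(t') = 400 at t' = 0 and R = 100 bytes per second, a 600-byte
  packet arriving at t = 1 becomes eligible at te = 2, while the same
  packet arriving at t = 3 is eligible immediately (te = 3).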

1664	Authors' Addresses

1666	   Yoram Bernet
1667	   Microsoft
1668	   One Microsoft Way
1669	   Redmond, WA  98052
1670	   Phone:  +1 425 936 9568
1671	   E-mail: yoramb@microsoft.com

1673	   Andrew Smith
1674	   Extreme Networks
1675	   3585 Monroe St.
1676	   Santa Clara, CA  95051
1677	   Phone:  +1 408 579 2821
1678	   E-mail: andrew@extremenetworks.com

1680	   Steven Blake
1681	   Ericsson
1682	   3000 Aerial Center Parkway, Suite 140
1683	   Morrisville, NC  27560
1684	   Phone:  +1 919 468 8466 x232
1685	   E-mail: slblake@torrentnet.com