idnits 2.17.1 

draft-finn-detnet-bounded-latency-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 639 has weird spacing: '...N queue    non...'

  -- The document date (March 11, 2019) is 1871 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '2' on line 301

  -- Looks like a reference, but probably isn't: '4' on line 301

  == Unused Reference: 'NetCalBook' is defined on line 1393, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-13) exists of
     draft-ietf-detnet-architecture-08

  == Outdated reference: A later version (-02) exists of
     draft-ietf-detnet-dp-sol-ip-00

  == Outdated reference: A later version (-02) exists of
     draft-ietf-detnet-dp-sol-mpls-00


     Summary: 2 errors (**), 0 flaws (~~), 7 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	DetNet                                                           N. Finn
3	Internet-Draft                               Huawei Technologies Co. Ltd
4	Intended status: Informational                            J-Y. Le Boudec
5	Expires: September 12, 2019                              E. Mohammadpour
6	                                                                    EPFL
7	                                                                J. Zhang
8	                                             Huawei Technologies Co. Ltd
9	                                                                B. Varga
10	                                                               J. Farkas
11	                                                                Ericsson
12	                                                          March 11, 2019

14	                         DetNet Bounded Latency
15	                  draft-finn-detnet-bounded-latency-03

17	Abstract

19	   This document presents a parameterized timing model for Deterministic
20	   Networking (DetNet), so that existing and future standards can
21	   achieve the DetNet quality of service features of bounded latency and
22	   zero congestion loss.  It defines requirements for resource
23	   reservation protocols or servers.  It calls out queuing mechanisms,
24	   defined in other documents, that can provide the DetNet quality of
25	   service.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on September 12, 2019.

44	Copyright Notice

46	   Copyright (c) 2019 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (https://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
62	   2.  Terminology and Definitions . . . . . . . . . . . . . . . . .   4
63	   3.  DetNet bounded latency model  . . . . . . . . . . . . . . . .   4
64	     3.1.  Flow creation . . . . . . . . . . . . . . . . . . . . . .   4
65	       3.1.1.  Static flow creation  . . . . . . . . . . . . . . . .   4
66	       3.1.2.  Dynamic flow creation . . . . . . . . . . . . . . . .   5
67	     3.2.  Relay node model  . . . . . . . . . . . . . . . . . . . .   6
68	   4.  Computing End-to-end Latency Bounds . . . . . . . . . . . . .   9
69	     4.1.  Non-queuing delay bound . . . . . . . . . . . . . . . . .   9
70	     4.2.  Queuing delay bound . . . . . . . . . . . . . . . . . . .   9
71	       4.2.1.  Per-flow queuing mechanisms . . . . . . . . . . . . .  10
72	       4.2.2.  Per-class queuing mechanisms  . . . . . . . . . . . .  10
73	     4.3.  Ingress considerations  . . . . . . . . . . . . . . . . .  11
74	     4.4.  Interspersed non-DetNet transit nodes . . . . . . . . . .  11
75	   5.  Achieving zero congestion loss  . . . . . . . . . . . . . . .  12
76	     5.1.  A General Formula . . . . . . . . . . . . . . . . . . . .  12
77	   6.  Queuing model . . . . . . . . . . . . . . . . . . . . . . . .  13
78	     6.1.  Queuing data model  . . . . . . . . . . . . . . . . . . .  13
79	     6.2.  Preemption  . . . . . . . . . . . . . . . . . . . . . . .  15
80	     6.3.  Time-scheduled queuing  . . . . . . . . . . . . . . . . .  15
81	     6.4.  Time-Sensitive Networking with Asynchronous Traffic
82	           Shaping . . . . . . . . . . . . . . . . . . . . . . . . .  16
83	       6.4.1.  Flow Admission  . . . . . . . . . . . . . . . . . . .  19
84	     6.5.  IntServ . . . . . . . . . . . . . . . . . . . . . . . . .  20
85	   7.  Time-based DetNet QoS . . . . . . . . . . . . . . . . . . . .  22
86	     7.1.  Cyclic Queuing and Forwarding . . . . . . . . . . . . . .  22
87	       7.1.1.  CQF timing sequence . . . . . . . . . . . . . . . . .  23
88	       7.1.2.  Dead time computation . . . . . . . . . . . . . . . .  24
89	       7.1.3.  Tc computation  . . . . . . . . . . . . . . . . . . .  24
90	       7.1.4.  CQF latency calculation . . . . . . . . . . . . . . .  24
91	       7.1.5.  CQF parameterization  . . . . . . . . . . . . . . . .  25
92	       7.1.6.  Ingress conditioning for CQF  . . . . . . . . . . . .  25
93	       7.1.7.  CQF ingress conditioning timing model . . . . . . . .  27
94	     7.2.  Time-scheduled queuing  . . . . . . . . . . . . . . . . .  28
95	   8.  Parameters for the bounded latency model  . . . . . . . . . .  29
96	   9.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  29
97	     9.1.  Normative References  . . . . . . . . . . . . . . . . . .  29
98	     9.2.  Informative References  . . . . . . . . . . . . . . . . .  30
99	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  31

101	1.  Introduction

103	   The ability for IETF Deterministic Networking (DetNet) or IEEE 802.1
104	   Time-Sensitive Networking (TSN, [IEEE8021TSN]) to provide the DetNet
105	   services of bounded latency and zero congestion loss depends upon A)
106	   configuring and allocating network resources for the exclusive use of
107	   DetNet/TSN flows; B) identifying, in the data plane, the resources to
108	   be utilized by any given packet, and C) the detailed behavior of
109	   those resources, especially transmission queue selection, so that
110	   latency bounds can be reliably assured.  Thus, DetNet is an example
111	   of an IntServ Guaranteed Quality of Service [RFC2212]

113	   As explained in [I-D.ietf-detnet-architecture], DetNet flows are
114	   characterized by 1) a maximum bandwidth, guaranteed either by the
115	   transmitter or by strict input metering; and 2) a requirement for a
116	   guaranteed worst-case end-to-end latency.  That latency guarantee, in
117	   turn, provides the opportunity for the network to supply enough
118	   buffer space to guarantee zero congestion loss.

120	   To be of use to the applications identified in
121	   [I-D.ietf-detnet-use-cases], it must be possible to calculate, before
122	   the transmission of a DetNet flow commences, both the worst-case end-
123	   to-end network latency, and the amount of buffer space required at
124	   each hop to ensure against congestion loss.

126	   This document references specific queuing mechanisms, defined in
127	   other documents, that can be used to control packet transmission at
128	   each output port and achieve the DetNet qualities of service.  This
129	   document presents a timing model for sources, destinations, and the
130	   DetNet transit nodes that relay packets that is applicable to all of
131	   those referenced queuing mechanisms.  The parameters specified in
132	   this model:

134	   o  Characterize a DetNet flow in a way that provides externally
135	      measurable verification that the sender is conforming to its
136	      promised maximum, can be implemented reasonably easily by a
137	      sending device, and does not require excessive over-allocation of
138	      resources by the network.

140	   o  Enable reasonably accurate computation of worst-case end-to-end
141	      latency, in a way that requires as little detailed knowledge as
142	      possible of the behavior of the Quality of Service (QoS)
143	      algorithms implemented in each device, including queuing, shaping,
144	      metering, policing, and transmission selection techniques.

146	   Using the model presented in this document, it should be possible for
147	   an implementor, user, or standards development organization to select
148	   a particular set of queuing mechanisms for each device in a DetNet
149	   network, and to select a resource reservation algorithm for that
150	   network, so that those elements can work together to provide the
151	   DetNet service.

153	   This document does not specify any resource reservation protocol or
154	   server.  It does not describe all of the requirements for that
155	   protocol or server.  It does describe requirements for such resource
156	   reservation methods, and for queuing mechanisms that, if met, will
157	   enable them to work together.

159	2.  Terminology and Definitions

161	   This document uses the terms defined in
162	   [I-D.ietf-detnet-architecture].

164	3.  DetNet bounded latency model

166	3.1.  Flow creation

168	   There are two models for flow creation, static (Section 3.1.1) and
169	   dynamic (Section 3.1.2).  Most of the mathematical analysis provided
170	   in this document is applicable to either flow creation model; any
171	   dependencies on the choice of flow creation model are pointed out in
172	   the text.

174	3.1.1.  Static flow creation

176	   The static problem:
177	           Given a network and a set of DetNet flows, compute an end-to-
178	           end latency bound (if computable) for each flow, and compute
179	           the resources, particularly buffer space, required in each
180	           DetNet transit node to achieve zero congestion loss.

182	   In this model, all of the DetNet flows are known before the
183	   calculation commences.  This problem is of interest to relatively
184	   static networks, or static parts of larger networks.  It gives the
185	   best possible worst-case behavior.  The calculations can be extended
186	   to provide global optimizations, such as altering the path of one
187	   DetNet flow in order to make resources available to another DetNet
188	   flow with tighter constraints.

190	   This calculation may be more difficult to perform than that of the
191	   dynamic model (Section 3.1.2), because the flows passing through one
192	   port on a DetNet transit node affect each others' latency.  The
193	   effects can even be circular, from Flow A to B to C and back to A.

195	   On the other hand, the static calculation can often accommodate
196	   queuing methods, such as transmission selection by strict priority,
197	   that are unsuitable for the dynamic calculation.

199	   The static flow creation model is not limited only to static
200	   networks; the entire calculation for all flows can be repeated each
201	   time a new DetNet flow is created or deleted.  If some already-
202	   established flow would be pushed beyond its latency requirements by
203	   the new flow, then either the new flow is refused, or some other
204	   suitable action taken.

206	3.1.2.  Dynamic flow creation

208	   The dynamic problem:
209	           Given a network whose maximum capacity for DetNet flows is
210	           bounded by a set of static configuration parameters applied
211	           to the DetNet transit nodes, and given just one DetNet flow,
212	           compute the worst-case end-to-end latency that can be
213	           experienced by that flow, no matter what other DetNet flows
214	           (within the network's configured parameters) might be created
215	           or deleted in the future.  Also, compute the resources,
216	           particularly buffer space, required in each DetNet transit
217	           node to achieve zero congestion loss.

219	   This model is dynamic, in the sense that flows can be added or
220	   deleted at any time, with a minimum of computation effort, and
221	   without affecting the guarantees already given to other flows.

223	   The choice of queuing methods is critical to the applicability of the
224	   dynamic model.  Some queuing methods (e.g.  CQF, Section 7.1) make it
225	   easy to configure bounds on the network's capacity, and to make
226	   independent calculations for each flow.  Other queuing methods (e.g.,
227	   transmission selection by strict priority), make this calculation
228	   impossible, because the worst case for one flow cannot be computed
229	   without complete knowledge of all other flows.  Other queuing methods
230	   (e.g. the credit-based shaper defined in [IEEE8021Q] section 8.6.8.2)
231	   can be used for dynamic flow creation, but yield poorer latency and
232	   buffer space guarantees than when that same queuing method is used
233	   for static flow creation (Section 3.1.1).

235	   The dynamic flow creation model assumes the use of the following
236	   paradigm for provisioning DetNet flows:

238	   1.  Perform any configuration required by the DetNet transit nodes in
239	       the network for the classes of service to be offered, including
240	       one or more classes of DetNet service.  This configuration is
241	       done beforehand, and not tied to any particular flow.

243	   2.  Characterize the new DetNet flow in IntServ terms (Section 8).

245	   3.  Establish the path that the DetNet flow will take through the
246	       network from the source to the destination(s).  This can be a
247	       point-to-point or a point-to-multipoint path.

249	   4.  Select one of the DetNet classes of service for the DetNet flow.

251	   5.  Compute the worst-case end-to-end latency for the DetNet flow.
252	       In the process, determine whether sufficient resources are
253	       available for that flow to guarantee the required latency and to
254	       provide zero congestion loss.

256	   6.  Assuming that the resources are available, commit those resources
257	       to the flow.  This may or may not require adjusting the
258	       parameters that control the queuing mechanisms at each hop along
259	       the flow's path.

261	   This paradigm can be static and/or dynamic, and can be implemented
262	   using peer-to-peer protocols or using a central server model.  In
263	   some situations, backtracking and recursing through this list may be
264	   necessary.

266	   Issues such as un-provisioning a DetNet flow in favor of another when
267	   resources are scarce are not considered, but are left to the static
268	   flow creation model (Section 3.1.1).  How the path to be taken by a
269	   DetNet flow is chosen is not considered in this document.

271	3.2.  Relay node model

273	   A model for the operation of a DetNet transit node is required, in
274	   order to define the latency and buffer calculations.  In Figure 1 we
275	   see a breakdown of the per-hop latency experienced by a packet
276	   passing through a DetNet transit node, in terms that are suitable for
277	   computing both hop-by-hop latency and per-hop buffer requirements.

279	         DetNet transit node A            DetNet transit node B
280	      +-------------------------+       +------------------------+
281	      |              Queuing    |       |              Queuing   |
282	      |   Regulator subsystem   |       |   Regulator subsystem  |
283	      |   +-+-+-+-+ +-+-+-+-+   |       |   +-+-+-+-+ +-+-+-+-+  |
284	   -->+   | | | | | | | | | +   +------>+   | | | | | | | | | +  +--->
285	      |   +-+-+-+-+ +-+-+-+-+   |       |   +-+-+-+-+ +-+-+-+-+  |
286	      |                         |       |                        |
287	      +-------------------------+       +------------------------+
288	      |<->|<------>|<------->|<->|<---->|<->|<------>|<------>|<->|<--
289	   2,3  4      5        6      1    2,3   4      5        6     1   2,3
290	                   1: Output delay       4: Processing delay
291	                   2: Link delay         5: Regulation delay
292	                   3: Preemption delay   6: Queuing delay.

294	                 Figure 1: Timing model for DetNet or TSN

296	   In Figure 1, we see two DetNet transit nodes (typically, bridges or
297	   routers), with a wired link between them.  In this model, the only
298	   queues we deal with explicitly are attached to the output port; other
299	   queues are modeled as variations in the other delay times.  (E.g., an
300	   input queue could be modeled as either a variation in the link delay
301	   [2] or the processing delay [4].)  There are six delays that a packet
302	   can experience from hop to hop.

304	   1.  Output delay
305	      The time taken from the selection of a packet for output from a
306	      queue to the transmission of the first bit of the packet on the
307	      physical link.  If the queue is directly attached to the physical
308	      port, output delay can be a constant.  But, in many
309	      implementations, the queuing mechanism in a forwarding ASIC is
310	      separated from a multi-port MAC/PHY, in a second ASIC, by a
311	      multiplexed connection.  This causes variations in the output
312	      delay that are hard for the forwarding node to predict or control.

314	   2.  Link delay
315	      The time taken from the transmission of the first bit of the
316	      packet to the reception of the last bit, assuming that the
317	      transmission is not suspended by a preemption event.  This delay
318	      has two components, the first-bit-out to first-bit-in delay and
319	      the first-bit-in to last-bit-in delay that varies with packet
320	      size.  The former is typically measured by the Precision Time
321	      Protocol and is constant (see [I-D.ietf-detnet-architecture]).
322	      However, a virtual "link" could exhibit a variable link delay.

324	   3.  Preemption delay
325	      If the packet is interrupted in order to transmit another packet
326	      or packets, (e.g.  [IEEE8023] clause 99 frame preemption) an
327	      arbitrary delay can result.

329	   4.  Processing delay
330	      This delay covers the time from the reception of the last bit of
331	      the packet to the time the packet is enqueued in the regulator
332	      (Queuing subsystem, if there is no regulation).  This delay can be
333	      variable, and depends on the details of the operation of the
334	      forwarding node.

336	   5.  Regulator delay
337	      This is the time spent from the insertion of the last bit of a
338	      packet into a regulation queue until the time the packet is
339	      declared eligible according to its regulation constraints.  We
340	      assume that this time can be calculated based on the details of
341	      regulation policy.  If there is no regulation, this time is zero.

343	   6.  Queuing subsystem delay
344	      This is the time spent for a packet from being declared eligible
345	      until being selected for output on the next link.  We assume that
346	      this time is calculable based on the details of the queuing
347	      mechanism.  If there is no regulation, this time is from the
348	      insertion of the packet into a queue until it is selected for
349	      output on the next link.

351	   Not shown in Figure 1 are the other output queues that we presume are
352	   also attached to that same output port as the queue shown, and
353	   against which this shown queue competes for transmission
354	   opportunities.

356	   The initial and final measurement point in this analysis (that is,
357	   the definition of a "hop") is the point at which a packet is selected
358	   for output.  In general, any queue selection method that is suitable
359	   for use in a DetNet network includes a detailed specification as to
360	   exactly when packets are selected for transmission.  Any variations
361	   in any of the delay times 1-4 result in a need for additional buffers
362	   in the queue.  If all delays 1-4 are constant, then any variation in
363	   the time at which packets are inserted into a queue depends entirely
364	   on the timing of packet selection in the previous node.  If the
365	   delays 1-4 are not constant, then additional buffers are required in
366	   the queue to absorb these variations.  Thus:

368	   o  Variations in output delay (1) require buffers to absorb that
369	      variation in the next hop, so the output delay variations of the
370	      previous hop (on each input port) must be known in order to
371	      calculate the buffer space required on this hop.

373	   o  Variations in processing delay (4) require additional output
374	      buffers in the queues of that same DetNet transit node.  Depending
375	      on the details of the queueing subsystem delay (6) calculations,
376	      these variations need not be visible outside the DetNet transit
377	      node.

379	4.  Computing End-to-end Latency Bounds

381	4.1.  Non-queuing delay bound

383	   End-to-end latency bounds can be computed using the delay model in
384	   Section 3.2.  Here it is important to be aware that for several
385	   queuing mechanisms, the worst-case end-to-end delay is less than the
386	   sum of the per-hop worst-case delays.  An end-to-end latency bound
387	   for one DetNet flow can be computed as

389	      end_to_end_latency_bound = non_queuing_latency + queuing_latency

391	   The two terms in the above formula are computed as follows.  First,
392	   at the h-th hop along the path of this DetNet flow, obtain an upper
393	   bound per-hop_non_queuing_latency[h] on the sum of delays 1,2,3,4 of
394	   Figure 1.  These upper-bounds are expected to depend on the specific
395	   technology of the DetNet transit node at the h-th hop but not on the
396	   T-SPEC of this DetNet flow.  Then set non_queuing_latency = the sum
397	   of per-hop_non_queuing_latency[h] over all hops h.

399	4.2.  Queuing delay bound

401	   Second, compute queuing_latency as an upper bound to the sum of the
402	   queuing delays along the path.  The value of queuing_latency depends
403	   on the T-SPEC of this flow and possibly of other flows in the
404	   network, as well as the specifics of the queuing mechanisms deployed
405	   along the path of this flow.

407	   For several queuing mechanisms, queuing_latency is less than the sum
408	   of upper bounds on the queuing delays (5,6) at every hop.  This
409	   occurs with (1) per-flow queuing, and (2) per-class queuing with
410	   regulators, as explained in Section 4.2.1, Section 4.2.2, and
411	   Section 6.

413	   For other queuing mechanisms the only available value of
414	   queuing_latency is the sum of the per-hop queuing delay bounds.  In
415	   such cases, the computation of per-hop queuing delay bounds must
416	   account for the fact that the T-SPEC of a DetNet flow is no longer
417	   satisfied at the ingress of a hop, since burstiness increases as one
418	   flow traverses one DetNet transit node.

420	4.2.1.  Per-flow queuing mechanisms

422	   With such mechanisms, each flow uses a separate queue inside every
423	   node.  The service for each queue is abstracted with a guaranteed
424	   rate and a delay.  For every flow the per-node delay bound as well as
425	   end-to-end delay bound can be computed from the traffic specification
426	   of this flow at its source and from the values of rates and latencies
427	   at all nodes along its path.  Details of calculation for IntServ are
428	   described in Section 6.5.

430	4.2.2.  Per-class queuing mechanisms

432	   With such mechanisms, the flows that have the same class share the
433	   same queue.  A practical example is the queuing mechanism in Time
434	   Sensitive Networking.  One key issue in this context is how to deal
435	   with the burstiness cascade: individual flows that share a resource
436	   dedicated to a class may see their burstiness increase, which may in
437	   turn cause increased burstiness to other flows downstream of this
438	   resource.  Computing latency upper bounds for such cases is
439	   difficult, and in some conditions impossible
440	   [charny2000delay][bennett2002delay].  Also, when bounds are obtained,
441	   they depend on the complete configuration, and must be recomputed
442	   when one flow is added.

444	   A solution to deal with this issue is to reshape the flows at every
445	   hop.  This can be done with per-flow regulators (e.g. leaky bucket
446	   shapers), but this requires per-flow queuing and defeats the purpose
447	   of per-class queuing.  An alternative is the interleaved regulator,
448	   which reshapes individual flows without per-flow queuing
449	   ([Specht2016UBS], [IEEE8021Qcr]).  With an interleaved regulator, the
450	   packet at the head of the queue is regulated based on its (flow)
451	   regulation constraints; it is released at the earliest time at which
452	   this is possible without violating the constraint.  One key feature
453	   of per-flow or interleaved regulator is that, it does not increase
454	   worst-case latency bounds [le_boudec_theory_2018].  Specifically,
455	   when an interleaved regulator is appended to a FIFO subsystem, it
456	   does not increase the worst-case delay of the latter.

458	   Figure 2 shows an example of a network with 5 nodes, per-class
459	   queuing mechanism and interleaved regulators as in Figure 1.  An end-
460	   to-end delay bound for flow f, traversing nodes 1 to 5, is calculated
461	   as follows:

463	      end_to_end_latency_bound_of_flow_f = C12 + C23 + C34 + S4

465	   In the above formula, Cij is a bound on the aggregate response time
466	   of queuing subsystem in node i and interleaved regulator of node j,
467	   and S4 is a bound on the response time of the queuing subsystem in
468	   node 4 for flow f.  In fact, using the delay definitions in
469	   Section 3.2, Cij is a bound on sum of the delays 1,2,3,6 of node i
470	   and 4,5 of node j.  Similarly, S4 is a bound on sum of the delays
471	   1,2,3,6 of node 4.  A practical example of queuing model and delay
472	   calculation is presented Section 6.4.

474	                               f
475	                     ----------------------------->
476	                   +---+   +---+   +---+   +---+   +---+
477	                   | 1 |---| 2 |---| 3 |---| 4 |---| 5 |
478	                   +---+   +---+   +---+   +---+   +---+
479	                      \__C12_/\__C23_/\__C34_/\_S4_/

481	             Figure 2: End-to-end latency computation example

483	   REMARK: The end-to-end delay bound calculation provided here gives a
484	   much better upper bound in comparison with end-to-end delay bound
485	   computation by adding the delay bounds of each node in the path of a
486	   flow [TSNwithATS].

488	4.3.  Ingress considerations

490	   A sender can be a DetNet node which uses exactly the same queuing
491	   methods as its adjacent DetNet transit node, so that the latency and
492	   buffer calculations at the first hop are indistinguishable from those
493	   at a later hop within the DetNet domain.  On the other hand, the
494	   sender may be DetNet unaware, in which case some conditioning of the
495	   flow may be necessary at the ingress DetNet transit node.

497	   This ingress conditioning typically consists of a FIFO with an output
498	   regulator that is compatible with the queuing employed by the DetNet
499	   transit node on its output port(s).  For some queuing methods, simply
500	   requires added extra buffer space in the queuing subsystem.  Ingress
501	   conditioning requirements for different queuing methods are mentioned
502	   in the sections, below, describing those queuing methods.

504	4.4.  Interspersed non-DetNet transit nodes

506	   It is sometimes desirable to build a network that has both DetNet
507	   aware transit nodes and DetNet non-aware transit nodes, and for a
508	   DetNet flow to traverse an island of non-DetNet transit nodes, while
509	   still allowing the network to offer latency and congestion loss
510	   guarantees.  This is possible under certain conditions.

512	   In general, when passing through a non-DetNet island, the island
513	   causes delay variation in excess of what would be caused by DetNet
514	   nodes.  That is, the DetNet flow is "lumpier" after traversing the
515	   non-DetNet island.  DetNet guarantees for latency and buffer
516	   requirements can still be calculated and met if and only if the
517	   following are true:

519	   1.  The latency variation across the non-DetNet island must be
520	       bounded and calculable.

522	   2.  An ingress conditioning function (Section 4.3) may be required at
523	       the re-entry to the DetNet-aware domain.  This will, at least,
524	       require some extra buffering to accommodate the additional delay
525	       variation, and thus further increases the worst-case latency.

527	   The ingress conditioning is exactly the same problem as that of a
528	   sender at the edge of the DetNet domain.  The requirement for bounds
529	   on the latency variation across the non-DetNet island is typically
530	   the most difficult to achieve.  Without such a bound, it is obvious
531	   that DetNet cannot deliver its guarantees, so a non-DetNet island
532	   that cannot offer bounded latency variation cannot be used to carry a
533	   DetNet flow.

535	5.  Achieving zero congestion loss

537	   When the input rate to an output queue exceeds the output rate for a
538	   sufficient length of time, the queue must overflow.  This is
539	   congestion loss, and this is what deterministic networking seeks to
540	   avoid.

542	5.1.  A General Formula

544	   To avoid congestion losses, an upper bound on the backlog present in
545	   the regulator and queuing subsystem of Figure 1 must be computed
546	   during resource reservation.  This bound depends on the set of flows
547	   that use these queues, the details of the specific queuing mechanism
548	   and an upper bound on the processing delay (4).  The queue must
549	   contain the packet in transmission plus all other packets that are
550	   waiting to be selected for output.

552	   A conservative backlog bound, that applies to all systems, can be
553	   derived as follows.

555	   The backlog bound is counted in data units (bytes, or words of
556	   multiple bytes) that are relevant for buffer allocation.  For every
557	   class we need one buffer space for the packet in transmission, plus
558	   space for the packets that are waiting to be selected for output.
559	   Excluding transmission and preemption times, the packets are waiting
560	   in the queue since reception of the last bit, for a duration equal to
561	   the processing delay (4) plus the queuing delays (5,6).

563	   Let
564	   o  nb_classes be the number of classes of traffic that may use this
565	      output port

567	   o  total_in_rate be the sum of the line rates of all input ports that
568	      send traffic of any class to this output port.  The value of
569	      total_in_rate is in data units (e.g. bytes) per second.

571	   o  nb_input_ports be the number input ports that send traffic of any
572	      class to this output port

574	   o  max_packet_length be the maximum packet size for packets of any
575	      class that may be sent to this output port.  This is counted in
576	      data units.

578	   o  max_delay45 be an upper bound, in seconds, on the sum of the
579	      processing delay (4) and the queuing delays (5,6) for a packet of
580	      any class at this output port.

582	   Then a bound on the backlog of traffic of all classes in the queue at
583	   this output port is

585	      backlog_bound = ( nb_classes + nb_input_ports ) *
586	      max_packet_length + total_in_rate* max_delay45

588	6.  Queuing model

590	6.1.  Queuing data model

592	   Sophisticated queuing mechanisms are available in Layer 3 (L3, see,
593	   e.g., [RFC7806] for an overview).  In general, we assume that "Layer
594	   3" queues, shapers, meters, etc., are precisely the "regulators"
595	   shown in Figure 1.  The "queuing subsystems" in this figure are not
596	   the province solely of bridges; they are an essential part of any
597	   DetNet transit node.  As illustrated by numerous implementation
598	   examples, some of the "Layer 3" mechanisms described in documents
599	   such as [RFC7806] are often integrated, in an implementation, with
600	   the "Layer 2" mechanisms also implemented in the same node.  An
601	   integrated model is needed in order to successfully predict the
602	   interactions among the different queuing mechanisms needed in a
603	   network carrying both DetNet flows and non-DetNet flows.

605	   Figure 3 shows the general model for the flow of packets through the
606	   queues of a DetNet transit node.  Packets are assigned to a class of
607	   service.  The classes of service are mapped to some number of
608	   regulator queues.  Only DetNet/TSN packets pass through regulators.
609	   Queues compete for the selection of packets to be passed to queues in
610	   the queuing subsystem.  Packets again are selected for output from
611	   the queuing subsystem.

613	                                    |
614	   +--------------------------------V----------------------------------+
615	   |                    Class of Service Assignment                    |
616	   +--+------+----------+---------+-----------+-----+-------+-------+--+
617	      |      |          |         |           |     |       |       |
618	   +--V-+ +--V-+     +--V--+   +--V--+     +--V--+  |       |       |
619	   |Flow| |Flow|     |Flow |   |Flow |     |Flow |  |       |       |
620	   |  0 | |  1 | ... |  i  |   | i+1 | ... |  n  |  |       |       |
621	   | reg| | reg|     | reg |   | reg |     | reg |  |       |       |
622	   +--+-+ +--+-+     +--+--+   +--+--+     +--+--+  |       |       |
623	      |      |          |         |           |     |       |       |
624	   +--V------V----------V--+   +--V-----------V--+  |       |       |
625	   |  Trans.  selection    |   | Trans. select.  |  |       |       |
626	   +----------+------------+   +-----+-----------+  |       |       |
627	              |                      |              |       |       |
628	           +--V--+                +--V--+        +--V--+ +--V--+ +--V--+
629	           | out |                | out |        | out | | out | | out |
630	           |queue|                |queue|        |queue| |queue| |queue|
631	           |  1  |                |  2  |        |  3  | |  4  | |  5  |
632	           +--+--+                +--+--+        +--+--+ +--+--+ +--+--+
633	              |                      |              |       |       |
634	   +----------V----------------------V--------------V-------V-------V--+
635	   |                      Transmission selection                       |
636	   +----------+----------------------+--------------+-------+-------+--+
637	              |                      |              |       |       |
638	              V                      V              V       V       V
639	        DetNet/TSN queue       DetNet/TSN queue    non-DetNet/TSN queues

641	              Figure 3: IEEE 802.1Q Queuing Model: Data flow

643	   Some relevant mechanisms are hidden in this figure, and are performed
644	   in the queue boxes:

646	   o  Discarding packets because a queue is full.

648	   o  Discarding packets marked "yellow" by a metering function, in
649	      preference to discarding "green" packets.

651	   Ideally, neither of these actions are performed on DetNet packets.
652	   Full queues for DetNet packets should occur only when a flow is
653	   misbehaving, and the DetNet QoS does not include "yellow" service for
654	   packets in excess of committed rate.

656	   The Class of Service Assignment function can be quite complex, even
657	   in a bridge [IEEE8021Q], since the introduction of per-stream
658	   filtering and policing ([IEEE8021Q] clause 8.6.5.1).  In addition to
659	   the Layer 2 priority expressed in the 802.1Q VLAN tag, a DetNet
660	   transit node can utilize any of the following information to assign a
661	   packet to a particular class of service (queue):

663	   o  Input port.

665	   o  Selector based on a rotating schedule that starts at regular,
666	      time-synchronized intervals and has nanosecond precision.

668	   o  MAC addresses, VLAN ID, IP addresses, Layer 4 port numbers, DSCP.
669	      ([I-D.ietf-detnet-dp-sol-ip], [I-D.ietf-detnet-dp-sol-mpls]) (Work
670	      items are expected to add MPC and other indicators.)

672	   o  The Class of Service Assignment function can contain metering and
673	      policing functions.

675	   o  MPLS and/or pseudowire ([RFC6658]) labels.

677	   The "Transmission selection" function decides which queue is to
678	   transfer its oldest packet to the output port when a transmission
679	   opportunity arises.

681	6.2.  Preemption

683	   In [IEEE8021Q] and [IEEE8023], the transmission of a frame can be
684	   interrupted by one or more "express" frames, and then the interrupted
685	   frame can continue transmission.  This frame preemption is modeled as
686	   consisting of two MAC/PHY stacks, one for packets that can be
687	   interrupted, and one for packets that can interrupt the interruptible
688	   packets.  The Class of Service (queue) determines which packets are
689	   which.  Only one layer of preemption is supported -- a transmitter
690	   cannot have more than one interrupted frame in progress.  DetNet
691	   flows typically pass through the interrupting MAC.  Best-effort
692	   queues pass through the interruptible MAC, and can thus be preempted.

694	6.3.  Time-scheduled queuing

696	   In [IEEE8021Q], the notion of time-scheduling queue gates is
697	   described in section 8.6.8.4.  Below every output queue (the lower
698	   row of queues in Figure 3) is a gate that permits or denies the queue
699	   to present data for transmission selection.  The gates are controlled
700	   by a rotating schedule that can be locked to a clock that is
701	   synchronized with other DetNet transit nodes.  The DetNet class of
702	   service can be supplied by queuing mechanisms based on time, rather
703	   than the regulator model in Figure 3.  These queuing mechanisms are
704	   discussed in Section 7, below.

706	6.4.  Time-Sensitive Networking with Asynchronous Traffic Shaping

708	   Consider a network with a set of nodes (DetNet transit nodes and
709	   hosts) along with a set of flows between hosts.  Hosts are sources or
710	   destinations of flows.  There are four types of flows, namely,
711	   control-data traffic (CDT), class A, class B, and best effort (BE) in
712	   decreasing order of priority.  Flows of classes A and B are together
713	   referred to AVB flows.  It is assumed a subset of TSN functions as
714	   described next.

716	   It is also assumed that contention occurs only at the output port of
717	   a TSN node.  Each node output port performs per-class scheduling with
718	   eight classes: one for CDT, one for class A traffic, one for class B
719	   traffic, and five for BE traffic denoted as BE0-BE4 (according to TSN
720	   standard).  In addition, each node output port also performs per-flow
721	   regulation for AVB flows using an interleaved regulator (IR), called
722	   Asynchronous Traffic Shaper (ATS) in TSN.  Thus, at each output port
723	   of a node, there is one interleaved regulator per-input port and per-
724	   class.  The detailed picture of scheduling and regulation
725	   architecture at a node output port is given by Figure 4.  The packets
726	   received at a node input port for a given class are enqueued in the
727	   respective interleaved regulator at the output port.  Then, the
728	   packets from all the flows, including CDT and BE flows, are enqueued
729	   in a class based FIFO system (CBFS) [TSNwithATS].

731	         +--+   +--+ +--+   +--+
732	         |  |   |  | |  |   |  |
733	         |IR|   |IR| |IR|   |IR|
734	         |  |   |  | |  |   |  |
735	         +-++XXX++-+ +-++XXX++-+
736	           |     |     |     |
737	           |     |     |     |
738	   +---+ +-v-XXX-v-+ +-v-XXX-v-+ +-----+ +-----+ +-----+ +-----+ +-----+
739	   |   | |         | |         | |Class| |Class| |Class| |Class| |Class|
740	   |CDT| | Class A | | Class B | | BE4 | | BE3 | | BE2 | | BE1 | | BE0 |
741	   |   | |         | |         | |     | |     | |     | |     | |     |
742	   +-+-+ +----+----+ +----+----+ +--+--+ +--+--+ +--+--+ +--+--+ +--+--+
743	     |        |           |         |       |       |       |       |
744	     |      +-v-+       +-v-+       |       |       |       |       |
745	     |      |CBS|       |CBS|       |       |       |       |       |
746	     |      +-+-+       +-+-+       |       |       |       |       |
747	     |        |           |         |       |       |       |       |
748	   +-v--------v-----------v---------v-------V-------v-------v-------v--+
749	   |                     Strict Priority selection                     |
750	   +--------------------------------+----------------------------------+
751	                                    |
752	                                    V

754	     Figure 4: Architecture of a TSN node output port with interleaved
755	                             regulators (IRs)

757	   The CBFS includes two Credit-Based Shaper (CBS) subsystems, one for
758	   each class A and B.  The CBS serves a packet from a class according
759	   to the available credit for that class.  The credit for each class A
760	   or B increases based on the idle slope, and decreases based on the
761	   send slope, both of which are parameters of the CBS.  The CDT and
762	   BE0-BE4 flows in the CBFS are served by separate FIFO subsystems.
763	   Then, packets from all flows are served by a transmission selection
764	   subsystem that serves packets from each class based on its priority.
765	   All subsystems are non-preemptive.  Guarantees for AVB traffic can be
766	   provided only if CDT traffic is bounded; it is assumed that the CDT
767	   traffic has leaky bucket arrival curve with two parameters r_h as
768	   rate and b_h as bucket size, i.e., the amount of bits entering a node
769	   within a time interval t is bounded by r_h t + b_h.

771	   Additionally, it is assumed that the AVB flows are also regulated at
772	   their source according to leaky bucket arrival curve.  At the source
773	   hosts, the traffic satisfies its regulation constraint, i.e. the
774	   delay due to interleaved regulator at hosts is ignored.

776	   At each DetNet transit node implementing an interleaved regulator,
777	   packets of multiple flows are processed in one FIFO queue; the packet
778	   at the head of the queue is regulated based on its leaky bucket
779	   parameters; it is released at the earliest time at which this is
780	   possible without violating the constraint.  The regulation parameters
781	   for a flow (leaky bucket rate and bucket size) are the same at its
782	   source and at all DetNet transit nodes along its path.  A delay bound
783	   of CBFS for an AVB flow f of class A or B can be computed if the
784	   following condition holds:

786	      sum of leaky bucket rates of all flows of this class at this node
787	      <= R, where R is given below for every class.

789	   If the condition holds, the delay bound is:

791	      d_f = T + (b_t-L_min_f)/R - L_min_f/c

793	   where L_min_f is the minimum packet length of flow f; c is the output
794	   link transmission rate; b_t is the sum of the b term (bucket size)
795	   for all the flows having the same class as flow f at this node.
796	   Parameters R and T are calculated as follows for class A and class B,
797	   separately:

799	   If f is of class A:

801	      R = I_A (c-r_h)/ c

803	      T = L_nA + b_h + r_h L_n/c)/(c-r_h)

805	   where L_nA is the maximum packet length of class B and BE packets;
806	   L_n is the maximum packet length of classes A,B, and BE.

808	   If f is of class B:

810	      R = I_B (c-r_h)/ c

812	      T = (L_BE + L_A + L_nA I_A/(c_h-I_A) + b_h + r_h L_n/c)/(c-r_h)

814	   where L_A is the maximum packet length of class A; L_BE is the
815	   maximum packet length of class BE.

817	   Then, an end-to-end delay bound is calculated by the formula
818	   Section 4.2.2, where for Cij:

820	      Cij = max(d_f')

822	   where f' is any flow that shares the same CBFS class with flow f at
823	   node i and the same interleaved regulator as flow f at node j.

825	   More information of delay analysis in such a DetNet transit node is
826	   described in [TSNwithATS].

828	6.4.1.  Flow Admission

830	   The delay calculation requires some information about each node.  For
831	   each node, it is required to know the idle slope of CBS for each
832	   class A and B (I_A and I_B), as well as the transmission rate of the
833	   output link (c).  Besides, it is necessary to have the information on
834	   each class, i.e. maximum packet length of classes A, B, and BE.
835	   Moreover, the leaky bucket parameters of CDT (r_h,b_h) should be
836	   known.  To admit a flow/flows, their delay requirements should be
837	   guaranteed not to be violated.  As described in Section 3.1, the two
838	   problems static and dynamic are addressed separately.  In either of
839	   the problems, the rate and delay should be guaranteed.  Thus,

841	   The static admission control:
842	           The leaky bucket parameters of all flows are known,
843	           therefore, for each flow a delay bound can be calculated.
844	           The computed delay bound for every flow should not be more
845	           than its delay requirement.  Moreover, the sum of the rate of
846	           each flow (r_f) should not be more than the rate allocated to
847	           each class (R).  If these two conditions hold, the
848	           configuration is declared admissible.

850	   The dynamic admission control:
851	           For dynamic admission control, we allocate to every node and
852	           class A or B, static value for rate (R) and maximum
853	           burstiness (b_t).  In addition, for every node and every
854	           class A and B, two counters are maintained:

856	              R_acc is equal to the sum of the leaky-bucket rates of all
857	              flows of this class already admitted at this node; At all
858	              times, we must have:

860	                 R_acc <=R, (Eq. 1)

862	              b_acc is equal to the sum of the bucket sizes of all flows
863	              of this class already admitted at this node; At all times,
864	              we must have:

866	                 b_acc <=b_t.  (Eq. 2)

868	           A new flow is admitted at this node, if Eqs. (1) and (2)
869	           continue to be satisfied after adding its leaky bucket rate
870	           and bucket size to R_acc and b_acc.  A flow is admitted in
871	           the network, if it is admitted at all nodes along its path.
872	           When this happens, all variables R_acc and b_acc along its
873	           path must be incremented to reflect the addition of the flow.
874	           Similarly, when a flow leaves the network, all variables
875	           R_acc and b_acc along its path must be decremented to reflect
876	           the removal of the flow.

878	   The choice of the static values of R and b_t at all nodes and classes
879	   must be done in a prior configuration phase; R controls the bandwidth
880	   allocated to this class at this node, b_t affects the delay bound and
881	   the buffer requirement.  R must satisfy the constraints given in
882	   Annex L.1 of [IEEE8021Q].

884	6.5.  IntServ

886	   Integrated service (IntServ) is an architecture that specifies the
887	   elements to guarantee quality of service (QoS) on networks.  To
888	   satisfied guaranteed service, a flow must conform to a traffic
889	   specification (T-spec), and reservation is made along a path, only if
890	   routers are able to guarantee the required bandwidth and buffer.

892	   Consider the traffic model which conforms to token bucket regulator
893	   (r, b), with

895	   o  Token bucket depth (b).

897	   o  Token bucket rate (r).

899	   The traffic specification can be described as an arrival curve:

901	      alpha(t) = b + rt

903	   This token bucket regulator requires that, during any time window t,
904	   the number of bit for the flow is limited by alpha(t) = b + rt.

906	   If resource reservation on a path is applied, IntServ model of a
907	   router can be described as a rate-latency service curve beta(t).

909	      beta(t) = max(0, R(t-T))

911	   It describes that bits might have to wait up to T before being served
912	   with a rate greater or equal to R.

914	   It should be noted that, the guaranteed service rate R is a share of
915	   link's bandwidth.  The choice of R is related to the specification of
916	   flows which will transmit on this node.  For example, in strict
917	   priority policy, considering a flow with priority j, its share of
918	   bandwidth may be R=c-sum(r_i), i<j, where c is the link bandwidth,
919	   r_i is the token bucket rate for the flows with priority higher than
920	   j.  The choice of T is also related to the specification of all the
921	   flows traversing this node.  For example, in a generalized processor
922	   sharing (GPS) node, T = L / R + L_max/c, where L is the maximum
923	   packet size for the flow, L_max is the maximum packet size in the
924	   node across all flows.  Other choice of R and T are also supported,
925	   according to the specific scheduling of the node and flows traversing
926	   this node.

928	   As mentioned previously in this section, delay bound and backlog
929	   bound can be easily obtained by comparing arrival curve and service
930	   curve.  Backlog bound, or buffer bound, is the maximum vertical
931	   derivation between curves alpha(t) and beta(t), which is v=b+rT.
932	   Delay bound is the maximum horizontal derivation between curves
933	   alpha(t) and beta(t), which is h = T+b/R.  Graphical illustration of
934	   the IntServ model is shown in Figure 5.

936	                    + bit              .        *
937	                    |                 .     *
938	                    |                .  *
939	                    |               *
940	                    |           *  .
941	                    |       *     .
942	                    |   *   |    .        ..  Service curve
943	                    *-----h-|---.         **  Arrival curve
944	                    |       v  .           h  Delay_bound
945	                    |       | .            v  Backlog_bound
946	                    |       |.
947	                    +-------.--------------------+ time

949	    Figure 5: Computation of backlog bound and delay bound.  Note that
950	        arrival and service curves are not necessary to be linear.

952	   The output bound, or the next-hop arrival curve, is alpha_out(t) = b
953	   + rT + rt, where burstiness of the flow is increased by rT, compared
954	   with the arrival curve.

956	   We can calculate the end-to-end delay bound for a path including N
957	   nodes, among which the i-th node offers service curve beta_i(t),

959	      beta_i(t) = max(0, R_i(t-T_i)), i=1,...,N

961	   By concatenating these IntServ nodes, an end-to-end service curve can
962	   be computed as

964	      beta_e2e (t) = max(0, R_e2e(t-T_e2e) )

966	   where

968	      R_e2e = min(R_1,..., R_N)

970	      T_e2e = T_1 + ... + T_N

972	   Similarly, delay bound, backlog bound and output bound can be
973	   computed by using the original arrival curve alpha(t) and
974	   concatenated service curve beta_e2e(t).

976	7.  Time-based DetNet QoS

978	7.1.  Cyclic Queuing and Forwarding

980	   Annex T of [IEEE8021Q] describes Cyclic Queuing and Forwarding (CQF),
981	   which provides bounded latency and zero congestion loss using the
982	   time-scheduled gates of [IEEE8021Q] section 8.6.8.4.  For a given
983	   DetNet class of service, a set of two or three buffers is provided at
984	   the output queue layer of Figure 3.  A cycle time Tc is configured
985	   for each class c, and all of the buffer sets in a class swap buffers
986	   simultaneously throughout the DetNet domain at that cycle rate, all
987	   in phase.

989	   0 time -->  0.7     1   (units of Tc)   2                   3
990	                           DetNet transit node A out port 1
991	   |      a      <-DT->|        b          |          c        |       d
992	   +------------+------+-------------------+-------------------+--------
993	    \_____              \_____
994	          \_____              \_____  queue-to-queue delay = 1.3 Tc
995	                \_____              \_____
996	                      \_____              \_____  DetNet transit node B
997	                            \_                  \_ queue assignment, in
998	          |                   |            |<-DT->|  port 2 to out 3  |
999	   -------+-------------------+------------+------+-------------------+-
1000	         0.3  time-->        1.3          2.0    2.3                 3.3

1002	         window to transfer
1003	            to buffer c  --->  VVVVVVVVVVVV
1004	          if dead time not                         window to transfer
1005	             excessive         VVVVVVVVVVVVVVVVVVV <--- to buffer d
1006	                           DetNet transit node B out port 3
1007	   |         a         |         b         |         c         |       d
1008	   +-------------------+-------------------+-------------------+--------
1009	   0    time-->        1                   2                   3

1011	          Figure 6: CQF timing diagram and dead time computation

1013	   Figure 6 shows two DetNet transit nodes A and B, including three
1014	   timelines for:

1016	   1.  The output queues on port 1 in node A.

1018	   2.  The input gate function ([IEEE8021Q], 8.6.5.1) that assigns
1019	       packets received on port 1 of transit node B to output queues on
1020	       port 2 of transit node B.

1022	   3.  The output queues on port 2 of node B.

1024	   In this figure, the output ports on the two nodes are synchronized,
1025	   and a new buffer starts transmitting at each tick, shown as 0, 1, 2,
1026	   ...  The output times shown for timelines 1 and 3 are the times at
1027	   which packets are selected for output, which is the start point of
1028	   the output time (1) of Figure 1.  The queue assignments times on
1029	   timeline 3 take place at the beginning of the queuing delay (6) of
1030	   Figure 1.  Time-based CQF, as described here, does not require any
1031	   regulator queues.  In the shown in the figure, the total time for
1032	   delays 1 through 6 of Figure 1 is 1.3Tc.  Of course, any value is
1033	   possible.

1035	7.1.1.  CQF timing sequence

1037	   In general, as shown in Figure 6, the windows for buffer assignment
1038	   do not align perfectly with the windows for buffer transmission.  The
1039	   input gates (the center timeline in Figure 6) must switch from using
1040	   one buffer to using another buffer in sync with the (delayed)
1041	   received data, at times offset by the dead time from the output
1042	   buffer switching (the bottom timeline in Figure 6).

1044	   If the dead time DT in Figure 6 is not excessive, then it is feasible
1045	   to subtract the dead time from the cycle time Tc, and use the
1046	   remainder as the input window.  In the example in Figure 6, packets
1047	   from node A buffer a can be transferred from the input port to node
1048	   B's buffer "c" during the window shown by the upper row "VVVV...".
1049	   Input must cease by time = 2.0, because that is when transit node B
1050	   starts transmitting the contents of buffer c.  In this case, only two
1051	   output buffers are in use, one filling and one outputting.

1053	   If the dead time is too large (e.g., if the delays placed the middle
1054	   timeline's switching points at n+0.9, instead of n+0.3), three
1055	   buffers are used by node B.  This case is shown by the lower row
1056	   "VVVV..." in Figure 6.  In this case, node B places the data received
1057	   from node A buffer a into node B buffer d between the times 1.3 and
1058	   2.3 in Figure 6.  Buffer b starts outputting at time = 2.0, while
1059	   buffer d is filling.  Thus, three buffers are in use, one filling,
1060	   one waiting, and one emptying.

1062	7.1.2.  Dead time computation

1064	   The time for switching input packet buffer assignments is equal to
1065	   the minimum possible offset from transmission selection in node A to
1066	   buffer assignment in node B, which is the sum of the minimum values
1067	   for all of the delays 1 through 5 in Figure 1 (the queue-to-queue
1068	   delay).  All packets must be received and assigned to an output
1069	   buffer before the next switching point, which means that all must be
1070	   transmitted in time for them to arrive at buffer assignment even if
1071	   worst-case (longest delay) is encountered for the queue-to-queue
1072	   delay.  Thus, the minimum dead time for the 3-buffer case is the sum
1073	   of the worst-case variation in the queue-to-queue delay, plus the
1074	   worst-case difference between the two transit nodes' buffer switching
1075	   clocks.

1077	   For the 2-buffer case, we must add the offset (shown as "DT" in
1078	   Figure 6) from the end of node B's output switch to the end of node
1079	   B's input switch.

1081	7.1.3.  Tc computation

1083	   Given the dead time DT, there remains a transmit window of (Tc - DT -
1084	   Int).  The DT was explained in (Section 7.1.2).  "Int" is the worst-
1085	   case interference with the start of transmission, when the output
1086	   buffers switch, caused by lower-priority traffic.  This is equal to
1087	   one worst-case transmission time, which means that the size of the
1088	   packets in all lower-priority queues must be bounded.  If Ethernet
1089	   preemption ([IEEE8023] clause 99) is employed for lower-priority
1090	   queues, then this worst-case interference is reduced to the size of
1091	   the largest unfragmentable Ethernet frame.

1093	   The bandwidth requirement of any given DetNet flow has to be
1094	   translated to CQF terms, in order to determine whether that flow can
1095	   be accommodated at each port.  A flow has to be characterized as
1096	   using a maximum number of bit times on the wire per cycle time Tc.
1097	   For Ethernet, for example ([IEEE8023]), this includes the preamble (8
1098	   bytes), destination MAC address through CRC (minimum 64 bytes) and
1099	   the inter-packet gap (12 bytes).  The total bit times per cycle Tc
1100	   required by all of the DetNet flows passing through a given port
1101	   cannot exceed the available transmit window (Tc - DT - Int).

1103	7.1.4.  CQF latency calculation

1105	   The per-hop latency is trivially determined by the wire delay plus
1106	   the queuing delay.  Since the wire delay is either absorbed into the
1107	   queueing delay (dead time is small and two buffers are used) or
1108	   padded out to a whole cycle time Tc (three buffers are used) the per-
1109	   hop latency is always an integral number of cycle times Tc, with a
1110	   latency variation at the output of the final hop of Tc.

1112	   Ingress conditioning may be required if the source of a DetNet flow
1113	   does not, itself, employ CQF.  See Section 7.1.6.

1115	7.1.5.  CQF parameterization

1117	   The transmit window for a given DetNet transit node running CQF, for
1118	   example the transmit window for node A in Figure 6, depends on the
1119	   interference (Int in section Section 7.1.3), the value of Tc, and the
1120	   dead time required by the following node B.  The size of the transmit
1121	   window determines how many total bits can be reserved per period Tc
1122	   by DetNet flows.

1124	   Part of the dead time derives from delays and delay variations such
1125	   as output delay (1) and link delay (2) and preemption delay (3) of
1126	   Figure 1, all of which are known to node A.  However, the dead time
1127	   also depends on the processing delay (4) of node B and the upon
1128	   whether node B is using 2 or 3 output buffers, which is not
1129	   necessarily known to node A.

1131	   The information in DetNet transit node B necessary to compute the
1132	   dead time to be observed by transit node A must be known to the
1133	   entity responsible for making reservation decisions, whether that is
1134	   node A itself, or a central controller.  A decision can be made, by
1135	   the controller or by the node, whether to use the dead time and two
1136	   buffers, in order to reduce the per-hop latency by one cycle time, or
1137	   to use three buffers and eliminate the dead time and increase the
1138	   total allocable bandwidth.

1140	   If the packet sizes of a DetNet flow are variable, or perhaps even
1141	   unknown beyond the imposition of a maximum size, then some degree of
1142	   overprovisioning is required.  The measurement used to allocate
1143	   bandwidth to a given DetNet flow is bit times in one cycle time Tc.
1144	   Therefore, one extra maximum packet time (less one bit) has to be
1145	   allocated to a flow per cycle time Tc in order to ensure that, no
1146	   matter what mix of packet sizes are presented, the flow will get its
1147	   guaranteed latency.

1149	7.1.6.  Ingress conditioning for CQF

1151	   Assuming that a DetNet domain is using CQF, it is always possible
1152	   that the previous node (or sender) may not support the queuing method
1153	   of CQF, or may support CQF but not use the same configuration in
1154	   accordance of the current DetNet domain.  In this case, ingress
1155	   conditioning is helpful to shape the flow according to the TSPEC in
1156	   the current DetNet domain and transmission cycle of CQF, thus control
1157	   the burstiness and reduce overprovisioning.

1159	   A DetNet node running CQF satisfies that a maximum number of
1160	   max_number_packet_per_cycle packets, each with length no larger than
1161	   max_packet_size, can be transmitted during a CQF cycle Tc.  Also, the
1162	   dead time Dt, during which the former node cannot transmit to the
1163	   later node (See Section 7.1.1), should be considered.  Here we use
1164	   the notation max_number_packet_per_cycle, max_packet_size to describe
1165	   the maximum amount of transmitting data during the available transmit
1166	   window Tc-Dt.

1168	   An ingress conditioner typically consists with a FIFO queue with an
1169	   output regulator.  Every incoming packet enters the FIFO queue, which
1170	   passes it on if and only if the packet conforms to the CQF
1171	   requirement.  For this purpose, two criteria below is suggested to
1172	   follow.

1174	   o  The incoming flow's average rate would be no more than the average
1175	      output queue rate at ingress conditioner, to avoid overflow and
1176	      congestion loss.

1178	   o  The queueing buffer at ingress conditioner is suggested to be
1179	      large enough, to cover burstiness and jitter of bit rate from the
1180	      previous node (or sender).

1182	   The output regulator controls the transmitting of packets, with
1183	   credit function for instance.  At the start of a CQF cycle, credit is
1184	   set to the maximum bits to transmit in a cycle, max_bit_per_cycle =
1185	   max_number_packet_per_cycle * max_packet_size.  When a packet is
1186	   transmitted from the node, the credit is reduced by length(packet).
1187	   The operation of the output regulator can be described as below.

1189	        credit = max_bit_per_cycle; % Initial credit
1190	        t = 0; % Initial time offset within a cycle

1192	        outputRegulate(packet){
1193	        if (t>=0 && t< Tc-Dt) % during a transmit window
1194	        {
1195	            if ~isEmpty(queue) % if queue is not empty
1196	            {
1197	                if credit >= length(packet);
1198	                {
1199	                    dequeue(packet);
1200	                    credit = credit - length(packet);
1201	                }
1202	            }
1203	        }
1204	        elseif (t >= Tc) % when a cycle end, reset t and credit.
1205	        {
1206	            t = t - Tc; % reset t to [0, Tc-Dt)
1207	            credit = max_bit_per_cycle; % credit is refilled
1208	        }}

1210	   Other instance of output regulator may also meet the CQF requirement.

1212	7.1.7.  CQF ingress conditioning timing model

1214	   Consider the input traffic conforms to variable bit rate (VBR)
1215	   constraint (p, M, r, b), which can be modeled as arrival curve:

1217	      alpha(t) = min(pt+M, rt+b)

1219	   where p is the peak rate, M is the maximum size of a packet, (r, b)
1220	   stand for token bucket rate and burst.  Note that, if N flows enter
1221	   the ingress conditioner, each flow f_i conforms to token bucket
1222	   (r_i,b_i), the arrival curve parameters are of the superposition
1223	   form: r = sum(r_i), b = sum(b_i), i=1,...,N.

1225	   Ingress conditioner, which transmit packets as soon as possible if
1226	   conformance is satisfied, does not increase worst-case queuing
1227	   latency, if we consider the latency from the input of ingress
1228	   conditioner to the first output of CQF node.  The CQF node, which
1229	   transmits at most max_bit_per_cycle bits during a cycle Tc, offers a
1230	   service curve

1232	      beta(t) = max_bit_per_cycle * floor(t-Dt) + min(p*mod(t-Dt, Tc),
1233	      max_bit_per_cycle)

1235	   where Dt is the dead time (see Section 7.1.1), p is the peak rate,
1236	   max_bit_per_cycle = max_number_packet_per_cycle * max_packet_size is
1237	   the maximum transmitting data in a CQF cycle, floor(x) calculates the
1238	   nearest integer less than or equal to x, and mod(x,y)=x-floor(x/y).

1240	   According to network calculus, the worst-case queuing delay for
1241	   ingress conditioner and CQF node, denoted delay_bound, is the maximum
1242	   horizontal distance between arrival curve and service curve.

1244	   Specifically, in general assumptions, ingress conditioner's average
1245	   output rate r2 = max_bit_per_cycle/Tc is always no less than input
1246	   traffic rate r (or else congestion loss would happen).  In such
1247	   conditions, the queuing delay up-bound for ingress conditioner and
1248	   first CQF node is derived as ingress_delay_bound = max(t : beta(t) =
1249	   (pb-rm) / (p-r)).  Therefore, considering a path traverses ingress
1250	   conditioner and H CQF nodes, the total delay up-bound is H*Tc +
1251	   ingress_delay_bound.

1253	7.2.  Time-scheduled queuing

1255	   [IEEE8021Q] section 8.6.8.4 specifies a time-aware queue-draining
1256	   procedure for transmission selection at egress port of a DetNet
1257	   transit node, which supports up to eight traffic classes.  Each
1258	   traffic class has a separate queue, frame transmission from each
1259	   queue is allowed or prevented by a time gate.  This time gate
1260	   controlled scheduling allows time-sensitive traffic classes to
1261	   transmit on dedicate time slots.  Within the time slots, the
1262	   transmitting flows can be granted exclusive use of the transmission
1263	   medium.  Generally, this time-aware scheduling is a layer 2 time
1264	   division multiplexing (TDM) technique.

1266	   Consider the static configuration of a deterministic network.  To
1267	   provide end-to-end latency guaranteed service, network nodes can
1268	   support time-based behavior, which is determined by gate control list
1269	   (GCL).  GCL defines the gate operation, in open or closed state, with
1270	   associated timing for each traffic class queue.  A time slice with
1271	   gate state "open" is called transmission window.  The time-based
1272	   traffic scheduling must be coordinated among the DetNet transit nodes
1273	   along the path from sender to receiver, to control the transmission
1274	   of time-sensitive traffic.

1276	   Ideally all network devices are time synchronized and static GCL
1277	   configurations on all devices along the routed path are coordinated
1278	   to ensure that length of transmission window fits the assigned
1279	   frames, and no two time windows for DetNet traffic on the same port
1280	   overlap.  (DetNet flows' windows can overlap with best-effort
1281	   windows, so that unused DetNet bandwidth is available to best-effort
1282	   traffic.)  The processing delay, link delay and output delay in
1283	   transmitting are considered in GCL computation.  Transmission window
1284	   for a certain flow may require that a time offset on consecutive hops
1285	   be selected to reduce queueing delay as much as possible.  In this
1286	   case, TSN/DetNet frames transmit at the assigned transmission window
1287	   at every node through the routed path, with zero congestion loss and
1288	   bounded end-to-end latency.  Then, the worst-case end-to-end latency
1289	   of the flow can be derived from GCL configuration.  For a TSN or
1290	   DetNet frame, denote the transmission window on last hop closes at
1291	   gate_close_time_last_hop.  Assuming talker supports scheduled traffic
1292	   behavior, it starts the transmission at gate_open_time_on_talker.
1293	   Then worst case end-to-end delay of this flow is bounded by
1294	   gate_close_time_last_hop - gate_open_time_on_talker +
1295	   link_delay_last_hop.

1297	   It should be noted that scheduled traffic service relies on a
1298	   synchronized network and coordinated GCL configuration.  Synthesis of
1299	   GCL on multiple nodes in network is a scheduling problem considering
1300	   all TSN/DetNet flows traversing the network, which is a non-
1301	   deterministic polynomial-time hard (NP-hard) problem.  Also, at this
1302	   writing, scheduled traffic service supports no more than eight
1303	   traffic classes, typically using up to seven priority classes and at
1304	   least one best effort class.

1306	8.  Parameters for the bounded latency model

1308	   The use of the TSPEC parameters defined in [RFC2212] and related
1309	   documents for IntServ are well-established and adequate for DetNet
1310	   purposes.  The parameterization used by [IEEE8021Q] are somewhat
1311	   different, as discussed above (Section 7.1.6).  These parameters are
1312	   maximum number of frames per interval, interval size, and maximum
1313	   frame size.  They are more suitable for the physical determination of
1314	   compliance by a sender than for resource reservation purposes.

1316	9.  References

1318	9.1.  Normative References

1320	   [I-D.ietf-detnet-architecture]
1321	              Finn, N., Thubert, P., Varga, B., and J. Farkas,
1322	              "Deterministic Networking Architecture", draft-ietf-
1323	              detnet-architecture-08 (work in progress), September 2018.

1325	   [I-D.ietf-detnet-dp-sol-ip]
1326	              Korhonen, J. and B. Varga, "DetNet IP Data Plane
1327	              Encapsulation", draft-ietf-detnet-dp-sol-ip-00 (work in
1328	              progress), July 2018.

1330	   [I-D.ietf-detnet-dp-sol-mpls]
1331	              Korhonen, J. and B. Varga, "DetNet MPLS Data Plane
1332	              Encapsulation", draft-ietf-detnet-dp-sol-mpls-00 (work in
1333	              progress), July 2018.

1335	   [I-D.ietf-detnet-use-cases]
1336	              Grossman, E., "Deterministic Networking Use Cases", draft-
1337	              ietf-detnet-use-cases-20 (work in progress), December
1338	              2018.

1340	   [RFC2212]  Shenker, S., Partridge, C., and R. Guerin, "Specification
1341	              of Guaranteed Quality of Service", RFC 2212,
1342	              DOI 10.17487/RFC2212, September 1997,
1343	              <https://www.rfc-editor.org/info/rfc2212>.

1345	   [RFC6658]  Bryant, S., Ed., Martini, L., Swallow, G., and A. Malis,
1346	              "Packet Pseudowire Encapsulation over an MPLS PSN",
1347	              RFC 6658, DOI 10.17487/RFC6658, July 2012,
1348	              <https://www.rfc-editor.org/info/rfc6658>.

1350	   [RFC7806]  Baker, F. and R. Pan, "On Queuing, Marking, and Dropping",
1351	              RFC 7806, DOI 10.17487/RFC7806, April 2016,
1352	              <https://www.rfc-editor.org/info/rfc7806>.

1354	9.2.  Informative References

1356	   [bennett2002delay]
1357	              J.C.R. Bennett, K. Benson, A. Charny, W.F. Courtney, and
1358	              J.-Y. Le Boudec, "Delay Jitter Bounds and Packet Scale
1359	              Rate Guarantee for Expedited Forwarding",
1360	              <https://dl.acm.org/citation.cfm?id=581870>.

1362	   [charny2000delay]
1363	              A. Charny and J.-Y. Le Boudec, "Delay Bounds in a Network
1364	              with Aggregate Scheduling", <https://link.springer.com/
1365	              chapter/10.1007/3-540-39939-9_1>.

1367	   [IEEE8021Q]
1368	              IEEE 802.1, "IEEE Std 802.1Q-2018: IEEE Standard for Local
1369	              and metropolitan area networks - Bridges and Bridged
1370	              Networks", 2018,
1371	              <http://ieeexplore.ieee.org/document/8403927>.

1373	   [IEEE8021Qcr]
1374	              IEEE 802.1, "IEEE P802.1Qcr: IEEE Draft Standard for Local
1375	              and metropolitan area networks - Bridges and Bridged
1376	              Networks - Amendment: Asynchronous Traffic Shaping", 2017,
1377	              <http://www.ieee802.org/1/files/private/cr-drafts/>.

1379	   [IEEE8021TSN]
1380	              IEEE 802.1, "IEEE 802.1 Time-Sensitive Networking (TSN)
1381	              Task Group", <http://www.ieee802.org/1/>.

1383	   [IEEE8023]
1384	              IEEE 802.3, "IEEE Std 802.3-2018: IEEE Standard for
1385	              Ethernet", 2018,
1386	              <http://ieeexplore.ieee.org/document/8457469>.

1388	   [le_boudec_theory_2018]
1389	              J.-Y. Le Boudec, "A Theory of Traffic Regulators for
1390	              Deterministic Networks with Application to Interleaved
1391	              Regulators", <http://arxiv.org/abs/1801.08477/>.

1393	   [NetCalBook]
1394	              Le Boudec, Jean-Yves, and Patrick Thiran, "Network
1395	              calculus: a theory of deterministic queuing systems for
1396	              the internet", 2001, <https://arxiv.org/abs/1804.10608/>.

1398	   [Specht2016UBS]
1399	              J. Specht and S. Samii, "Urgency-Based Scheduler for Time-
1400	              Sensitive Switched Ethernet Networks",
1401	              <https://ieeexplore.ieee.org/abstract/document/7557870>.

1403	   [TSNwithATS]
1404	              E. Mohammadpour, E. Stai, M. Mohiuddin, and J.-Y. Le
1405	              Boudec, "End-to-end Latency and Backlog Bounds in Time-
1406	              Sensitive Networking with Credit Based Shapers and
1407	              Asynchronous Traffic Shaping",
1408	              <https://arxiv.org/abs/1804.10608/>.

1410	Authors' Addresses

1412	   Norman Finn
1413	   Huawei Technologies Co. Ltd
1414	   3101 Rio Way
1415	   Spring Valley, California  91977
1416	   US

1418	   Phone: +1 925 980 6430
1419	   Email: norman.finn@mail01.huawei.com
1420	   Jean-Yves Le Boudec
1421	   EPFL
1422	   IC Station 14
1423	   Lausanne EPFL  1015
1424	   Switzerland

1426	   Email: jean-yves.leboudec@epfl.ch

1428	   Ehsan Mohammadpour
1429	   EPFL
1430	   IC Station 14
1431	   Lausanne EPFL  1015
1432	   Switzerland

1434	   Email: ehsan.mohammadpour@epfl.ch

1436	   Jiayi Zhang
1437	   Huawei Technologies Co. Ltd
1438	   Q22, No.156 Beiqing Road
1439	   Beijing  100095
1440	   China

1442	   Email: zhangjiayi11@huawei.com

1444	   Balazs Varga
1445	   Ericsson
1446	   Konyves Kalman krt. 11/B
1447	   Budapest  1097
1448	   Hungary

1450	   Email: balazs.a.varga@ericsson.com

1452	   Janos Farkas
1453	   Ericsson
1454	   Konyves Kalman krt. 11/B
1455	   Budapest  1097
1456	   Hungary

1458	   Email: janos.farkas@ericsson.com