idnits 2.17.1

draft-ietf-detnet-bounded-latency-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 609 has weird spacing: '...N queue non...'

  -- The document date (November 2, 2020) is 1268 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '2' on line 264

  -- Looks like a reference, but probably isn't: '4' on line 774

  == Outdated reference: A later version (-07) exists of
     draft-ietf-detnet-ip-05

  == Outdated reference: A later version (-13) exists of
     draft-ietf-detnet-mpls-05

     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   DetNet                                                           N. Finn
3   Internet-Draft                               Huawei Technologies Co. Ltd
4   Intended status: Informational                            J-Y. Le Boudec
5   Expires: May 6, 2021                                     E. Mohammadpour
6                                                                       EPFL
7                                                                   J. Zhang
8                                                Huawei Technologies Co. Ltd
9                                                                   B. Varga
10                                                                 J. Farkas
11                                                                  Ericsson
12                                                          November 2, 2020

14                           DetNet Bounded Latency
15                    draft-ietf-detnet-bounded-latency-02

17  Abstract

19     This document presents a timing model for Deterministic Networking
20     (DetNet), so that existing and future standards can achieve the
21     DetNet quality of service features of bounded latency and zero
22     congestion loss.  It defines requirements for resource reservation
23     protocols or servers.  It calls out queuing mechanisms, defined in
24     other documents, that can provide the DetNet quality of service.

26  Status of This Memo

28     This Internet-Draft is submitted in full conformance with the
29     provisions of BCP 78 and BCP 79.

31     Internet-Drafts are working documents of the Internet Engineering
32     Task Force (IETF).  Note that other groups may also distribute
33     working documents as Internet-Drafts.  The list of current Internet-
34     Drafts is at https://datatracker.ietf.org/drafts/current/.

36     Internet-Drafts are draft documents valid for a maximum of six months
37     and may be updated, replaced, or obsoleted by other documents at any
38     time.  It is inappropriate to use Internet-Drafts as reference
39     material or to cite them other than as "work in progress."

41     This Internet-Draft will expire on May 6, 2021.

43  Copyright Notice

45     Copyright (c) 2020 IETF Trust and the persons identified as the
46     document authors.  All rights reserved.

48     This document is subject to BCP 78 and the IETF Trust's Legal
49     Provisions Relating to IETF Documents
50     (https://trustee.ietf.org/license-info) in effect on the date of
51     publication of this document.  Please review these documents
52     carefully, as they describe your rights and restrictions with respect
53     to this document.  Code Components extracted from this document must
54     include Simplified BSD License text as described in Section 4.e of
55     the Trust Legal Provisions and are provided without warranty as
56     described in the Simplified BSD License.

58  Table of Contents

60     1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
61     2.  Terminology and Definitions . . . . . . . . . . . . . . . . .   3
62     3.  DetNet bounded latency model  . . . . . . . . . . . . . . . .   3
63       3.1.  Flow creation . . . . . . . . . . . . . . . . . . . . . .   4
64         3.1.1.  Static flow latency calculation . . . . . . . . . . .   4
65         3.1.2.  Dynamic flow latency calculation  . . . . . . . . . .   5
66       3.2.  Relay node model  . . . . . . . . . . . . . . . . . . . .   6
67     4.  Computing End-to-end Delay Bounds . . . . . . . . . . . . . .   8
68       4.1.  Non-queuing delay bound . . . . . . . . . . . . . . . . .   8
69       4.2.  Queuing delay bound . . . . . . . . . . . . . . . . . . .   8
70         4.2.1.  Per-flow queuing mechanisms . . . . . . . . . . . . .   9
71         4.2.2.  Per-class queuing mechanisms  . . . . . . . . . . . .   9
72       4.3.  Ingress considerations  . . . . . . . . . . . . . . . . .  10
73       4.4.  Interspersed non-DetNet transit nodes . . . . . . . . . .  11
74     5.  Achieving zero congestion loss  . . . . . . . . . . . . . . .  11
75     6.  Queuing techniques  . . . . . . . . . . . . . . . . . . . . .  12
76       6.1.  Queuing data model  . . . . . . . . . . . . . . . . . . .  12
77       6.2.  Preemption  . . . . . . . . . . . . . . . . . . . . . . .  14
78       6.3.  Time Aware Shaper . . . . . . . . . . . . . . . . . . . .  15
79       6.4.  Credit-Based Shaper with Asynchronous Traffic Shaping . .  15
80         6.4.1.  Delay Bound Calculation . . . . . . . . . . . . . . .  17
81         6.4.2.  Flow Admission  . . . . . . . . . . . . . . . . . . .  18
82       6.5.  IntServ . . . . . . . . . . . . . . . . . . . . . . . . .  19
83       6.6.  Cyclic Queuing and Forwarding . . . . . . . . . . . . . .  20
84     7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  21
85       7.1.  Normative References  . . . . . . . . . . . . . . . . . .  21
86       7.2.  Informative References  . . . . . . . . . . . . . . . . .  22
87     Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  23

89  1.  Introduction

91     The ability for IETF Deterministic Networking (DetNet) or IEEE 802.1
92     Time-Sensitive Networking (TSN, [IEEE8021TSN]) to provide the DetNet
93     services of bounded latency and zero congestion loss depends upon A)
94     configuring and allocating network resources for the exclusive use of
95     DetNet/TSN flows; B) identifying, in the data plane, the resources to
96     be utilized by any given packet; and C) the detailed behavior of
97     those resources, especially transmission queue selection, so that
98     latency bounds can be reliably assured.  Thus, DetNet is an example
99     of an IntServ Guaranteed Quality of Service [RFC2212].

101    As explained in [RFC8655], DetNet flows are characterized by 1) a
102    maximum bandwidth, guaranteed either by the transmitter or by strict
103    input metering; and 2) a requirement for a guaranteed worst-case end-
104    to-end latency.  That latency guarantee, in turn, provides the
105    opportunity for the network to supply enough buffer space to
106    guarantee zero congestion loss.

108    To be of use to the applications identified in [RFC8578], it must be
109    possible to calculate, before the transmission of a DetNet flow
110    commences, both the worst-case end-to-end network latency, and the
111    amount of buffer space required at each hop to ensure against
112    congestion loss.

114    This document references specific queuing mechanisms, defined in
115    other documents, that can be used to control packet transmission at
116    each output port and achieve the DetNet qualities of service.  This
117    document presents a timing model for sources, destinations, and the
118    DetNet transit nodes that relay packets that is applicable to all of
119    those referenced queuing mechanisms.

121    Using the model presented in this document, it should be possible for
122    an implementor, user, or standards development organization to select
123    a particular set of queuing mechanisms for each device in a DetNet
124    network, and to select a resource reservation algorithm for that
125    network, so that those elements can work together to provide the
126    DetNet service.

128    This document does not specify any resource reservation protocol or
129    server.  It does not describe all of the requirements for that
130    protocol or server.  It does describe requirements for such resource
131    reservation methods, and for queuing mechanisms that, if met, will
132    enable them to work together.

134 2.  Terminology and Definitions

136    This document uses the terms defined in [RFC8655].

138 3.  DetNet bounded latency model
139 3.1.  Flow creation

141    This document assumes that the following paradigm is used for
142    provisioning DetNet flows:

144    1.  Perform any configuration required by the DetNet transit nodes in
145        the network for the classes of service to be offered, including
146        one or more classes of DetNet service.  This configuration is
147        done beforehand, and not tied to any particular flow.

149    2.  Characterize the new DetNet flow, particularly in terms of
150        required bandwidth.

152    3.  Establish the path that the DetNet flow will take through the
153        network from the source to the destination(s).  This can be a
154        point-to-point or a point-to-multipoint path.

156    4.  Select one of the DetNet classes of service for the DetNet flow.

158    5.  Compute the worst-case end-to-end latency for the DetNet flow,
159        using one of the methods below (Section 3.1.1, Section 3.1.2).
160        In the process, determine whether sufficient resources are
161        available for that flow to guarantee the required latency and to
162        provide zero congestion loss.

164    6.  Assuming that the resources are available, commit those resources
165        to the flow.  This may or may not require adjusting the
166        parameters that control the filtering and/or queuing mechanisms
167        at each hop along the flow's path.

169    This paradigm can be implemented using peer-to-peer protocols or
170    using a central server.  In some situations, a lack of resources can
171    require backtracking and recursing through this list.

173    Issues such as un-provisioning a DetNet flow in favor of another,
174    when resources are scarce, are not considered here.  Also not
175    addressed is the question of how to choose the path to be taken by a
176    DetNet flow.

178 3.1.1.  Static flow latency calculation

180    The static problem:
181            Given a network and a set of DetNet flows, compute an end-to-
182            end latency bound (if computable) for each flow, and compute
183            the resources, particularly buffer space, required in each
184            DetNet transit node to achieve zero congestion loss.

186    In this calculation, all of the DetNet flows are known before the
187    calculation commences.  This problem is of interest to relatively
188    static networks, or static parts of larger networks.  It provides
189    bounds on delay and buffer size.  The calculations can be extended to
190    provide global optimizations, such as altering the path of one DetNet
191    flow in order to make resources available to another DetNet flow with
192    tighter constraints.

194    The static flow calculation is not limited only to static networks;
195    the entire calculation for all flows can be repeated each time a new
196    DetNet flow is created or deleted.  If some already-established flow
197    would be pushed beyond its latency requirements by the new flow, then
198    the new flow can be refused, or some other suitable action taken.

200    This calculation may be more difficult to perform than that of the
201    dynamic calculation (Section 3.1.2), because the flows passing
202    through one port on a DetNet transit node affect each other's
203    latency.  The effects can even be circular, from Flow A to B to C and
204    back to A.  On the other hand, the static calculation can often
205    accommodate queuing methods, such as transmission selection by strict
206    priority, that are unsuitable for the dynamic calculation.

208 3.1.2.  Dynamic flow latency calculation

210    The dynamic problem:
211            Given a network whose maximum capacity for DetNet flows is
212            bounded by a set of static configuration parameters applied
213            to the DetNet transit nodes, and given just one DetNet flow,
214            compute the worst-case end-to-end latency that can be
215            experienced by that flow, no matter what other DetNet flows
216            (within the network's configured parameters) might be created
217            or deleted in the future.  Also, compute the resources,
218            particularly buffer space, required in each DetNet transit
219            node to achieve zero congestion loss.

221    This calculation is dynamic, in the sense that flows can be added or
222    deleted at any time, with a minimum of computation effort, and
223    without affecting the guarantees already given to other flows.

225    The choice of queuing methods is critical to the applicability of the
226    dynamic calculation.  Some queuing methods (e.g., CQF, Section 6.6)
227    make it easy to configure bounds on the network's capacity, and to
228    make independent calculations for each flow.  Some other queuing
229    methods (e.g., strict priority with the credit-based shaper defined in
230    [IEEE8021Q] section 8.6.8.2) can be used for dynamic flow creation,
231    but yield poorer latency and buffer space guarantees than when that
232    same queuing method is used for static flow creation (Section 3.1.1).

234 3.2.  Relay node model

236    A model for the operation of a DetNet transit node is required, in
237    order to define the latency and buffer calculations.  In Figure 1 we
238    see a breakdown of the per-hop latency experienced by a packet
239    passing through a DetNet transit node, in terms that are suitable for
240    computing both hop-by-hop latency and per-hop buffer requirements.

242        DetNet transit node A            DetNet transit node B
243       +-------------------------+       +------------------------+
244       |              Queuing    |       |              Queuing   |
245       |   Regulator subsystem   |       |   Regulator subsystem  |
246       |   +-+-+-+-+ +-+-+-+-+   |       |   +-+-+-+-+ +-+-+-+-+  |
247    -->+   | | | | | | | | | +   +------>+   | | | | | | | | | +  +--->
248       |   +-+-+-+-+ +-+-+-+-+   |       |   +-+-+-+-+ +-+-+-+-+  |
249       |                         |       |                        |
250       +-------------------------+       +------------------------+
251       |<->|<------>|<------->|<->|<---->|<->|<------>|<------>|<->|<--
252   2,3  4      5        6      1    2,3   4      5        6     1   2,3
253                 1: Output delay             4: Processing delay
254                 2: Link delay               5: Regulation delay
255                 3: Preemption delay         6: Queuing delay.

257                 Figure 1: Timing model for DetNet or TSN

259    In Figure 1, we see two DetNet transit nodes (typically, bridges or
260    routers), with a wired link between them.  In this model, the only
261    queues that we deal with explicitly are attached to the output
262    port; other queues are modeled as variations in the other delay
263    times.  (E.g., an input queue could be modeled as either a variation
264    in the link delay [2] or the processing delay [4].)  There are six
265    delays that a packet can experience from hop to hop.

267    1.  Output delay
268        The time taken from the selection of a packet for output from a
269        queue to the transmission of the first bit of the packet on the
270        physical link.  If the queue is directly attached to the physical
271        port, output delay can be a constant.  But, in many
272        implementations, the queuing mechanism in a forwarding ASIC is
273        separated from a multi-port MAC/PHY, in a second ASIC, by a
274        multiplexed connection.  This causes variations in the output
275        delay that are hard for the forwarding node to predict or control.

277    2.  Link delay
278        The time taken from the transmission of the first bit of the
279        packet to the reception of the last bit, assuming that the
280        transmission is not suspended by a preemption event.  This delay
281        has two components, the first-bit-out to first-bit-in delay and
282        the first-bit-in to last-bit-in delay that varies with packet
283        size.  The former is typically measured by the Precision Time
284        Protocol and is constant (see [RFC8655]).  However, a virtual
285        "link" could exhibit a variable link delay.

287    3.  Preemption delay
288        If the packet is interrupted in order to transmit another packet
289        or packets (e.g., [IEEE8023] clause 99 frame preemption), an
290        arbitrary delay can result.

292    4.  Processing delay
293        This delay covers the time from the reception of the last bit of
294        the packet to the time the packet is enqueued in the regulator
295        (or in the queuing subsystem, if there is no regulation).  This
296        delay can be variable, and depends on the details of the
297        operation of the forwarding node.

299    5.  Regulator delay
300        This is the time spent from the insertion of the last bit of a
301        packet into a regulation queue until the time the packet is
302        declared eligible according to its regulation constraints.  We
303        assume that this time can be calculated based on the details of
304        regulation policy.  If there is no regulation, this time is zero.

306    6.  Queuing subsystem delay
307        This is the time spent for a packet from being declared eligible
308        until being selected for output on the next link.  We assume that
309        this time is calculable based on the details of the queuing
310        mechanism.  If there is no regulation, this time is from the
311        insertion of the packet into a queue until it is selected for
312        output on the next link.

314    Not shown in Figure 1 are the other output queues that we presume are
315    also attached to that same output port as the queue shown, and
316    against which this shown queue competes for transmission
317    opportunities.

319    The initial and final measurement point in this analysis (that is,
320    the definition of a "hop") is the point at which a packet is selected
321    for output.  In general, any queue selection method that is suitable
322    for use in a DetNet network includes a detailed specification as to
323    exactly when packets are selected for transmission.  Any variations
324    in any of the delay times 1-4 result in a need for additional buffers
325    in the queue.  If all delays 1-4 are constant, then any variation in
326    the time at which packets are inserted into a queue depends entirely
327    on the timing of packet selection in the previous node.  If the
328    delays 1-4 are not constant, then additional buffers are required in
329    the queue to absorb these variations.  Thus:

331    o  Variations in output delay (1) require buffers to absorb that
332       variation in the next hop, so the output delay variations of the
333       previous hop (on each input port) must be known in order to
334       calculate the buffer space required on this hop.

336    o  Variations in processing delay (4) require additional output
337       buffers in the queues of that same DetNet transit node.  Depending
338       on the details of the queuing subsystem delay (6) calculations,
339       these variations need not be visible outside the DetNet transit
340       node.

342 4.  Computing End-to-end Delay Bounds

344 4.1.  Non-queuing delay bound

346    End-to-end delay bounds can be computed using the delay model in
347    Section 3.2.  Here, it is important to be aware that for several
348    queuing mechanisms, the end-to-end delay bound is less than the sum
349    of the per-hop delay bounds.  An end-to-end delay bound for one
350    DetNet flow can be computed as

352       end_to_end_delay_bound = non_queuing_delay_bound +
353                                queuing_delay_bound

355    The two terms in the above formula are computed as follows.

357    First, at the h-th hop along the path of this DetNet flow, obtain an
358    upper bound per-hop_non_queuing_delay_bound[h] on the sum of the
359    bounds over the delays 1,2,3,4 of Figure 1.  These upper bounds are
360    expected to depend on the specific technology of the DetNet transit
361    node at the h-th hop but not on the T-SPEC of this DetNet flow.  Then
362    set non_queuing_delay_bound = the sum of
363    per-hop_non_queuing_delay_bound[h] over all hops h.

365    Second, compute queuing_delay_bound as an upper bound to the sum of
366    the queuing delays along the path.  The value of queuing_delay_bound
367    depends on the T-SPEC of this flow and possibly of other flows in the
368    network, as well as the specifics of the queuing mechanisms deployed
369    along the path of this flow.  The computation of queuing_delay_bound
370    is described separately in Section 4.2.

372 4.2.  Queuing delay bound

374    For several queuing mechanisms, queuing_delay_bound is less than the
375    sum of upper bounds on the queuing delays (5,6) at every hop.  This
376    occurs with (1) per-flow queuing, and (2) per-class queuing with
377    regulators, as explained in Section 4.2.1, Section 4.2.2, and
378    Section 6.

380    For other queuing mechanisms the only available value of
381    queuing_delay_bound is the sum of the per-hop queuing delay bounds.
382    In such cases, the computation of per-hop queuing delay bounds must
383    account for the fact that the T-SPEC of a DetNet flow is no longer
384    satisfied at the ingress of a hop, since burstiness increases as a
385    flow traverses a DetNet transit node.

387 4.2.1.  Per-flow queuing mechanisms

389    With such mechanisms, each flow uses a separate queue inside every
390    node.  The service for each queue is abstracted with a guaranteed
391    rate and a latency.  For every flow, a per-node delay bound as well
392    as an end-to-end delay bound can be computed from the traffic
393    specification of this flow at its source and from the values of rates
394    and latencies at all nodes along its path.  Per-flow queuing is
395    used in IntServ.  Details of the calculation for IntServ are
396    described in Section 6.5.

398 4.2.2.  Per-class queuing mechanisms

400    With such mechanisms, the flows that have the same class share the
401    same queue.  A practical example is the credit-based shaper defined
402    in section 8.6.8.2 of [IEEE8021Q].  One key issue in this context is
403    how to deal with the burstiness cascade: individual flows that share
404    a resource dedicated to a class may see their burstiness increase,
405    which may in turn cause increased burstiness to other flows
406    downstream of this resource.  Computing delay upper bounds for such
407    cases is difficult, and in some conditions impossible
408    [charny2000delay][bennett2002delay].  Also, when bounds are obtained,
409    they depend on the complete configuration, and must be recomputed
410    when one flow is added.  (The dynamic calculation, Section 3.1.2.)

412    A solution to deal with this issue is to reshape the flows at every
413    hop.  This can be done with per-flow regulators (e.g., leaky bucket
414    shapers), but this requires per-flow queuing and defeats the purpose
415    of per-class queuing.  An alternative is the interleaved regulator,
416    which reshapes individual flows without per-flow queuing
417    ([Specht2016UBS], [IEEE8021Qcr]).  With an interleaved regulator, the
418    packet at the head of the queue is regulated based on its (flow)
419    regulation constraints; it is released at the earliest time at which
420    this is possible without violating the constraint.  One key feature
421    of a per-flow or interleaved regulator is that it does not increase
422    worst-case latency bounds [le_boudec_theory_2018].  Specifically,
423    when an interleaved regulator is appended to a FIFO subsystem, it
424    does not increase the worst-case delay of the latter.

426    Figure 2 shows an example of a network with 5 nodes, a per-class
427    queuing mechanism, and interleaved regulators as in Figure 1.  An
428    end-to-end delay bound for flow f, traversing nodes 1 to 5, is
429    calculated as follows:

431       end_to_end_latency_bound_of_flow_f = C12 + C23 + C34 + S4

433    In the above formula, Cij is a bound on the delay of the queuing
434    subsystem in node i and interleaved regulator of node j, and S4 is a
435    bound on the delay of the queuing subsystem in node 4 for flow f.  In
436    fact, using the delay definitions in Section 3.2, Cij is a bound on
437    the sum of the delays 1,2,3,6 of node i and 4,5 of node j.
438    Similarly, S4 is a bound on the sum of the delays 1,2,3,6 of node 4.
439    A practical example of a queuing model and delay calculation is
440    presented in Section 6.4.

442                                 f
443                  ----------------------------->
444                +---+   +---+   +---+   +---+   +---+
445                | 1 |---| 2 |---| 3 |---| 4 |---| 5 |
446                +---+   +---+   +---+   +---+   +---+
447                   \__C12_/\__C23_/\__C34_/\_S4_/

449               Figure 2: End-to-end delay computation example

451    REMARK: The end-to-end delay bound calculation provided here gives a
452    much better upper bound in comparison with end-to-end delay bound
453    computation by adding the delay bounds of each node in the path of a
454    flow [TSNwithATS].

456 4.3.  Ingress considerations

458    A sender can be a DetNet node which uses exactly the same queuing
459    methods as its adjacent DetNet transit node, so that the delay and
460    buffer bounds calculations at the first hop are indistinguishable
461    from those at a later hop within the DetNet domain.  On the other
462    hand, the sender may be DetNet-unaware, in which case some
463    conditioning of the flow may be necessary at the ingress DetNet
464    transit node.

466    This ingress conditioning typically consists of a FIFO with an output
467    regulator that is compatible with the queuing employed by the DetNet
468    transit node on its output port(s).  For some queuing methods, this
469    simply requires adding extra buffer space in the queuing subsystem.
470    Ingress conditioning requirements for different queuing methods are
471    mentioned in the sections below that describe those queuing methods.

473 4.4.  Interspersed non-DetNet transit nodes

475    It is sometimes desirable to build a network that has both
476    DetNet-aware and DetNet-unaware transit nodes, and for a
477    DetNet flow to traverse an island of non-DetNet transit nodes, while
478    still allowing the network to offer delay and congestion loss
479    guarantees.  This is possible under certain conditions.

481    In general, when passing through a non-DetNet island, the island
482    causes delay variation in excess of what would be caused by DetNet
483    nodes.  That is, the DetNet flow is "lumpier" after traversing the
484    non-DetNet island.  DetNet guarantees for delay and buffer
485    requirements can still be calculated and met if and only if the
486    following are true:

488    1.  The latency variation across the non-DetNet island must be
489        bounded and calculable.

491    2.  An ingress conditioning function (Section 4.3) may be required at
492        the re-entry to the DetNet-aware domain.  This will, at least,
493        require some extra buffering to accommodate the additional delay
494        variation, and thus further increases the delay bound.

496    The ingress conditioning is exactly the same problem as that of a
497    sender at the edge of the DetNet domain.  The requirement for bounds
498    on the latency variation across the non-DetNet island is typically
499    the most difficult to achieve.  Without such a bound, it is obvious
500    that DetNet cannot deliver its guarantees, so a non-DetNet island
501    that cannot offer bounded latency variation cannot be used to carry a
502    DetNet flow.

504 5.  Achieving zero congestion loss

506    When the input rate to an output queue exceeds the output rate for a
507    sufficient length of time, the queue must overflow.  This is
508    congestion loss, and this is what deterministic networking seeks to
509    avoid.
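
As a rough illustration of the overflow condition just described, the following Python sketch estimates how long a queue survives when its input rate exceeds its output rate. It assumes a simple fluid model with constant rates; the function name time_to_overflow and its parameters are hypothetical and are not taken from any DetNet or TSN specification. Real traffic is packetized and bursty, so this is only an order-of-magnitude aid.

```python
def time_to_overflow(buffer_bytes, in_rate, out_rate, backlog=0.0):
    """Seconds until a queue overflows under constant rates (fluid model).

    buffer_bytes: total buffer space of the queue, in bytes (assumed)
    in_rate, out_rate: constant input and output rates, in bytes per second
    backlog: bytes already queued at time zero
    Returns None when the input rate never exceeds the output rate.
    """
    if in_rate <= out_rate:
        return None  # the queue drains or stays level; no overflow
    return (buffer_bytes - backlog) / (in_rate - out_rate)


# Hypothetical figures: 1 MB of buffer, 150 MB/s arriving, 125 MB/s leaving.
print(time_to_overflow(1_000_000, 150e6, 125e6))
```

The bound computed during resource reservation serves the opposite purpose: it sizes the buffer so that, for conformant flows, this overflow time is never reached.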

511    To avoid congestion losses, an upper bound on the backlog present in
512    the regulator and queuing subsystem of Figure 1 must be computed
513    during resource reservation.  This bound depends on the set of flows
514    that use these queues, the details of the specific queuing mechanism
515    and an upper bound on the processing delay (4).  The queue must
516    contain the packet in transmission plus all other packets that are
517    waiting to be selected for output.

519    A conservative backlog bound, that applies to all systems, can be
520    derived as follows.

522    The backlog bound is counted in data units (bytes, or words of
523    multiple bytes) that are relevant for buffer allocation.  For every
524    class we need one buffer space for the packet in transmission, plus
525    space for the packets that are waiting to be selected for output.
526    Excluding transmission and preemption times, the packets are waiting
527    in the queue since reception of the last bit, for a duration equal to
528    the processing delay (4) plus the queuing delays (5,6).

530    Let

532    o  total_in_rate be the sum of the line rates of all input ports that
533       send traffic of any class to this output port.  The value of
534       total_in_rate is in data units (e.g., bytes) per second.

536    o  nb_input_ports be the number of input ports that send traffic of
537       any class to this output port.

539    o  max_packet_length be the maximum packet size for packets of any
540       class that may be sent to this output port.  This is counted in
541       data units.

543    o  max_delay456 be an upper bound, in seconds, on the sum of the
544       processing delay (4) and the queuing delays (5,6) for a packet of
545       any class at this output port.

547    Then a bound on the backlog of traffic of all classes in the queue at
548    this output port is

550       backlog_bound = nb_input_ports * max_packet_length +
551                       total_in_rate * max_delay456

553 6.  Queuing techniques

555    In this section, for simplicity of delay computation, we assume that
556    the T-SPEC or arrival curve [NetCalBook] for each flow at source is
557    leaky bucket.  Also, at each relay node, the service for each queue
558    is abstracted with a guaranteed rate and a latency.

560 6.1.  Queuing data model

562    Sophisticated queuing mechanisms are available in Layer 3 (L3, see,
563    e.g., [RFC7806] for an overview).  In general, we assume that "Layer
564    3" queues, shapers, meters, etc., are precisely the "regulators"
565    shown in Figure 1.  The "queuing subsystems" in this figure are not
566    the province solely of bridges; they are an essential part of any
567    DetNet transit node.  As illustrated by numerous implementation
568    examples, some of the "Layer 3" mechanisms described in documents
569    such as [RFC7806] are often integrated, in an implementation, with
570    the "Layer 2" mechanisms also implemented in the same node.  An
571    integrated model is needed in order to successfully predict the
572    interactions among the different queuing mechanisms needed in a
573    network carrying both DetNet flows and non-DetNet flows.

575    Figure 3 shows the general model for the flow of packets through the
576    queues of a DetNet transit node.  Packets are assigned to a class of
577    service.  The classes of service are mapped to some number of
578    regulator queues.  Only DetNet/TSN packets pass through regulators.
579    Queues compete for the selection of packets to be passed to queues in
580    the queuing subsystem.  Packets again are selected for output from
581    the queuing subsystem.

583                                     |
584    +--------------------------------V----------------------------------+
585    |                    Class of Service Assignment                    |
586    +--+------+----------+---------+-----------+-----+-------+-------+--+
587       |      |          |         |           |     |       |       |
588    +--V-+ +--V-+     +--V--+   +--V--+     +--V--+  |       |       |
589    |Flow| |Flow|     |Flow |   |Flow |     |Flow |  |       |       |
590    |  0 | |  1 | ... |  i  |   | i+1 | ... |  n  |  |       |       |
591    | reg| | reg|     | reg |   | reg |     | reg |  |       |       |
592    +--+-+ +--+-+     +--+--+   +--+--+     +--+--+  |       |       |
593       |      |          |         |           |     |       |       |
594    +--V------V----------V--+   +--V-----------V--+  |       |       |
595    |   Trans. selection    |   | Trans. select.  |  |       |       |
596    +----------+------------+   +-----+-----------+  |       |       |
597               |                      |              |       |       |
598            +--V--+                +--V--+        +--V--+ +--V--+ +--V--+
599            | out |                | out |        | out | | out | | out |
600            |queue|                |queue|        |queue| |queue| |queue|
601            |  1  |                |  2  |        |  3  | |  4  | |  5  |
602            +--+--+                +--+--+        +--+--+ +--+--+ +--+--+
603               |                      |              |       |       |
604    +----------V----------------------V--------------V-------V-------V--+
605    |                      Transmission selection                       |
606    +----------+----------------------+--------------+-------+-------+--+
607               |                      |              |       |       |
608               V                      V              V       V       V
609         DetNet/TSN queue       DetNet/TSN queue    non-DetNet/TSN queues

611              Figure 3: IEEE 802.1Q Queuing Model: Data flow

613    Some relevant mechanisms are hidden in this figure, and are performed
614    in the queue boxes:

616    o  Discarding packets because a queue is full.

618    o  Discarding packets marked "yellow" by a metering function, in
619       preference to discarding "green" packets.

621    Ideally, neither of these actions is performed on DetNet packets.
622    Full queues for DetNet packets should occur only when a flow is
623    misbehaving, and the DetNet QoS does not include "yellow" service for
624    packets in excess of committed rate.

626    The Class of Service Assignment function can be quite complex, even
627    in a bridge [IEEE8021Q], since the introduction of per-stream
628    filtering and policing ([IEEE8021Q] clause 8.6.5.1).  In addition to
629    the Layer 2 priority expressed in the 802.1Q VLAN tag, a DetNet
630    transit node can utilize any of the following information to assign a
631    packet to a particular class of service (queue):

633    o  Input port.

635    o  Selector based on a rotating schedule that starts at regular,
636       time-synchronized intervals and has nanosecond precision.

638    o  MAC addresses, VLAN ID, IP addresses, Layer 4 port numbers, DSCP.
639 ([I-D.ietf-detnet-ip], [I-D.ietf-detnet-mpls]) (Work items are 640 expected to add MPC and other indicators.) 642 o The Class of Service Assignment function can contain metering and 643 policing functions. 645 o MPLS and/or pseudowire ([RFC6658]) labels. 647 The "Transmission selection" function decides which queue is to 648 transfer its oldest packet to the output port when a transmission 649 opportunity arises. 651 6.2. Preemption 653 In [IEEE8021Q] and [IEEE8023], the transmission of a frame can be 654 interrupted by one or more "express" frames, and then the interrupted 655 frame can continue transmission. This frame preemption is modeled as 656 consisting of two MAC/PHY stacks, one for packets that can be 657 interrupted, and one for packets that can interrupt the interruptible 658 packets. The Class of Service (queue) determines which packets are 659 which. Only one layer of preemption is supported -- a transmitter 660 cannot have more than one interrupted frame in progress. DetNet 661 flows typically pass through the interrupting MAC. For those DetNet 662 flows with T-SPEC, latency bound can be calculated by the methods 663 provided in the following sections that accounts for the affect of 664 preemption, according to the specific queuing mechanism that is used 665 in DetNet nodes. Best-effort queues pass through the interruptible 666 MAC, and can thus be preempted. 668 6.3. Time Aware Shaper 670 In [IEEE8021Q], the notion of time-scheduling queue gates is 671 described in section 8.6.8.4. On each node, the transmission 672 selection for packets is controlled by time-synchronized gates; each 673 output queue is associated with a gate. The gates can be either open 674 or close. The states of the gates are determined by the gate control 675 list (GCL). The GCL specifies the opening and closing times of the 676 gates. 
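
   As an illustration only, a GCL can be pictured as a repeating list of
   entries, each holding a duration and the set of gates that are open
   for that duration.  The sketch below is not taken from [IEEE8021Q];
   the names GclEntry, cycle_time_ns, gate_is_open and EXAMPLE_GCL, the
   1 ms cycle, and the queue numbering are all assumptions made for this
   example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GclEntry:
    """One GCL row: the listed gates stay open for duration_ns."""
    duration_ns: int
    open_gates: frozenset  # indices of the queues whose gate is open


def cycle_time_ns(gcl: List[GclEntry]) -> int:
    """Length of one full pass through the list."""
    return sum(entry.duration_ns for entry in gcl)


def gate_is_open(gcl: List[GclEntry], queue: int, t_ns: int) -> bool:
    """True if the gate of `queue` is open at synchronized time t_ns."""
    offset = t_ns % cycle_time_ns(gcl)  # the list repeats every cycle
    for entry in gcl:
        if offset < entry.duration_ns:
            return queue in entry.open_gates
        offset -= entry.duration_ns
    raise AssertionError("unreachable: offset always falls inside a row")


# Hypothetical 1 ms cycle: queue 7 owns the first 200 us, and queues
# 0-6 share the remaining 800 us.
EXAMPLE_GCL = [
    GclEntry(200_000, frozenset({7})),
    GclEntry(800_000, frozenset(range(7))),
]
```

   With this hypothetical schedule, queue 7 can be selected for
   transmission only in the first 200 microseconds of each cycle, which
   is the kind of exclusive window a scheduled flow relies on.
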
   Since the design of the GCL should satisfy the latency upper bounds
   of all time-sensitive flows, flows traversing a network should have
   bounded latency, if the traffic and nodes are conformant.

   It should be noted that scheduled traffic service relies on a
   synchronized network and coordinated GCL configuration.  Synthesis of
   the GCL on multiple nodes in a network is a scheduling problem
   considering all TSN/DetNet flows traversing the network, which is a
   non-deterministic polynomial-time hard (NP-hard) problem.  Also, at
   this writing, scheduled traffic service supports no more than eight
   traffic classes, typically using up to seven priority classes and at
   least one best effort class.

6.4.  Credit-Based Shaper with Asynchronous Traffic Shaping

   In the considered queuing model, there are four types of flows,
   namely, control-data traffic (CDT), class A, class B, and best effort
   (BE) in decreasing order of priority.  Flows of classes A and B are
   together referred to as AVB flows.  This model is a subset of
   Time-Sensitive Networking as described next.

   Based on the timing model described in Figure 1, the contention
   occurs only at the output port of a relay node; therefore, the focus
   of the rest of this subsection is on the regulator and queuing
   subsystem in the output port of a relay node.  The output port
   performs per-class scheduling with eight classes (queuing
   subsystems): one for CDT, one for class A traffic, one for class B
   traffic, and five for BE traffic denoted as BE0-BE4.  The queuing
   policy for each queuing subsystem is FIFO.  In addition, each node
   output port also performs per-flow regulation for AVB flows using an
   interleaved regulator (IR), called Asynchronous Traffic Shaper
   [IEEE8021Qcr].
   Thus, at each output port of a node, there is one interleaved
   regulator per input port and per class; the interleaved regulator is
   mapped to the regulator depicted in Figure 1.  The detailed picture
   of the scheduling and regulation architecture at a node output port
   is given by Figure 4.  The packets received at a node input port for
   a given class are enqueued in the respective interleaved regulator at
   the output port.  Then, the packets from all the flows, including CDT
   and BE flows, are enqueued in the queuing subsystem; there is no
   regulator for the CDT and BE classes.

         +--+   +--+ +--+   +--+
         |  |   |  | |  |   |  |
         |IR|   |IR| |IR|   |IR|
         |  |   |  | |  |   |  |
         +-++XXX++-+ +-++XXX++-+
           |     |     |     |
           |     |     |     |
   +---+ +-v-XXX-v-+ +-v-XXX-v-+ +-----+ +-----+ +-----+ +-----+ +-----+
   |   | |         | |         | |Class| |Class| |Class| |Class| |Class|
   |CDT| | Class A | | Class B | | BE4 | | BE3 | | BE2 | | BE1 | | BE0 |
   |   | |         | |         | |     | |     | |     | |     | |     |
   +-+-+ +----+----+ +----+----+ +--+--+ +--+--+ +--+--+ +--+--+ +--+--+
     |        |           |         |       |       |       |       |
     |      +-v-+       +-v-+       |       |       |       |       |
     |      |CBS|       |CBS|       |       |       |       |       |
     |      +-+-+       +-+-+       |       |       |       |       |
     |        |           |         |       |       |       |       |
   +-v--------v-----------v---------v-------V-------v-------v-------v--+
   |                     Strict Priority selection                     |
   +--------------------------------+----------------------------------+
                                    |
                                    V

   Figure 4: The architecture of an output port inside a relay node with
         interleaved regulators (IRs) and credit-based shaper (CBS)

   Each of the queuing subsystems for classes A and B contains a
   Credit-Based Shaper (CBS).  The CBS serves a packet from a class
   according to the available credit for that class.  The credit for
   each class A or B increases based on the idle slope, and decreases
   based on the send slope, both of which are parameters of the CBS
   (Section 8.6.8.2 of [IEEE8021Q]).  The CDT and BE0-BE4 flows are
   served by separate queuing subsystems.
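
   The credit evolution can be sketched roughly as follows.  The sketch
   assumes the usual reading of the CBS, in which the send slope equals
   the idle slope minus the link rate and a class may start a frame only
   when its credit is not negative; these rules, the function names, and
   the numeric values in the note after the code are assumptions for
   illustration, and [IEEE8021Q] remains authoritative.

```python
def send_slope(idle_slope_bps: float, link_rate_bps: float) -> float:
    """Rate of credit change while the class transmits (negative)."""
    return idle_slope_bps - link_rate_bps


def credit_after_frame(credit_bits: float, frame_bits: float,
                       idle_slope_bps: float, link_rate_bps: float) -> float:
    """Credit left after sending one frame at the link rate."""
    tx_time_s = frame_bits / link_rate_bps
    return credit_bits + send_slope(idle_slope_bps, link_rate_bps) * tx_time_s


def wait_for_credit(credit_bits: float, idle_slope_bps: float) -> float:
    """Seconds until a negative credit climbs back to zero."""
    return max(0.0, -credit_bits / idle_slope_bps)
```

   For instance, with an idle slope of 25 Mbit/s on a 100 Mbit/s link, a
   12000-bit frame sent at zero credit leaves about -9000 bits of
   credit, and the class then waits about 360 microseconds before its
   next frame, which averages out to the idle slope.
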
   Then, packets from all flows are served by a transmission selection
   subsystem that serves packets from each class based on its priority.
   All subsystems are non-preemptive.  Guarantees for AVB traffic can be
   provided only if CDT traffic is bounded; it is assumed that the CDT
   traffic has a leaky bucket arrival curve with two parameters, r_h as
   rate and b_h as bucket size, i.e., the amount of bits entering a node
   within a time interval t is bounded by r_h t + b_h.

   Additionally, it is assumed that the AVB flows are also regulated at
   their source according to a leaky bucket arrival curve.  At the
   source, the traffic satisfies its regulation constraint, i.e., the
   delay due to the interleaved regulator at the source is ignored.

   At each DetNet transit node implementing an interleaved regulator,
   packets of multiple flows are processed in one FIFO queue; the packet
   at the head of the queue is regulated based on its leaky bucket
   parameters; it is released at the earliest time at which this is
   possible without violating the constraint.  The regulation parameters
   for a flow (leaky bucket rate and bucket size) are the same at its
   source and at all DetNet transit nodes along its path.

6.4.1.  Delay Bound Calculation

   A delay bound of the queuing subsystem ([4] in Figure 1) for an AVB
   flow of class A or B can be computed if the following condition
   holds:

      sum of leaky bucket rates of all flows of this class at this
      transit node <= R, where R is given below for every class.

   If the condition holds, the delay bound for a flow of class X (A or
   B) is d_X, calculated as:

      d_X = T_X + (b_t_X-L_min_X)/R_X - L_min_X/c

   where L_min_X is the minimum packet length of class X (A or B); c is
   the output link transmission rate; b_t_X is the sum of the b term
   (bucket size) for all the flows of the class X.
   Parameters R_X and T_X are calculated as follows for class A and
   class B, separately:

   If the flow is of class A:

      R_A = I_A (c-r_h)/ c

      T_A = (L_nA + b_h + r_h L_n/c)/(c-r_h)

   where L_nA is the maximum packet length of class B and BE packets;
   L_n is the maximum packet length of classes A, B, and BE.

   If the flow is of class B:

      R_B = I_B (c-r_h)/ c

      T_B = (L_BE + L_A + L_nA I_A/(c_h-I_A) + b_h + r_h L_n/c)/(c-r_h)

   where L_A is the maximum packet length of class A; L_BE is the
   maximum packet length of class BE.

   Then, an end-to-end delay bound of class X (A or B) is calculated by
   the formula in Section 4.2.2, where for Cij:

      Cij = d_X

   More information on delay analysis in such a DetNet transit node is
   given in [TSNwithATS].

6.4.2.  Flow Admission

   The delay bound calculation requires some information about each
   node.  For each node, it is required to know the idle slope of the
   CBS for each class A and B (I_A and I_B), as well as the transmission
   rate of the output link (c).  Besides, it is necessary to have the
   information on each class, i.e., the maximum packet length of classes
   A, B, and BE.  Moreover, the leaky bucket parameters of CDT (r_h,
   b_h) should be known.  To admit one or more flows, their delay
   requirements should be guaranteed not to be violated.  As described
   in Section 3.1, the two problems, static and dynamic, are addressed
   separately.  In either of the problems, the rate and delay should be
   guaranteed.  Thus,

   The static admission control:
      The leaky bucket parameters of all flows are known; therefore, for
      each flow f, a delay bound can be calculated.  The computed delay
      bound for every flow should not be more than its delay
      requirement.  Moreover, the sum of the rates (r_f) of the flows of
      a class should not be more than the rate allocated to that class
      (R).  If these two conditions hold, the configuration is declared
      admissible.
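
   The per-node delay bound and the static check can be sketched as
   follows.  This is a minimal illustration restricted to a single
   output port: it compares d_X with each flow's requirement at that
   node rather than applying the end-to-end formula of Section 4.2.2.
   The names Port, rate_latency, delay_bound and statically_admissible
   are hypothetical.  The term printed as c_h in T_B is not defined in
   the text; the sketch reads it as the link rate c, which is an
   assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Port:
    c: float     # output link rate, bit/s
    r_h: float   # CDT leaky bucket rate, bit/s
    b_h: float   # CDT bucket size, bits
    i_a: float   # CBS idle slope of class A (I_A), bit/s
    i_b: float   # CBS idle slope of class B (I_B), bit/s
    l_a: float   # maximum packet length of class A, bits
    l_be: float  # maximum packet length of BE, bits
    l_na: float  # maximum packet length of class B and BE, bits
    l_n: float   # maximum packet length of classes A, B and BE, bits


def rate_latency(p: Port, cls: str) -> Tuple[float, float]:
    """Return (R_X, T_X) for class 'A' or 'B' at this output port."""
    if cls == "A":
        r = p.i_a * (p.c - p.r_h) / p.c
        # T_A = (L_nA + b_h + r_h L_n/c)/(c - r_h)
        t = (p.l_na + p.b_h + p.r_h * p.l_n / p.c) / (p.c - p.r_h)
    elif cls == "B":
        r = p.i_b * (p.c - p.r_h) / p.c
        # Assumption: the printed (c_h - I_A) is read as (c - I_A).
        t = (p.l_be + p.l_a + p.l_na * p.i_a / (p.c - p.i_a)
             + p.b_h + p.r_h * p.l_n / p.c) / (p.c - p.r_h)
    else:
        raise ValueError("class must be 'A' or 'B'")
    return r, t


def delay_bound(p: Port, cls: str, b_t: float, l_min: float) -> float:
    """d_X = T_X + (b_t_X - L_min_X)/R_X - L_min_X/c, in seconds."""
    r, t = rate_latency(p, cls)
    return t + (b_t - l_min) / r - l_min / p.c


def statically_admissible(p: Port, cls: str,
                          flows: List[Tuple[float, float, float]],
                          l_min: float) -> bool:
    """flows holds (rate r_f, bucket size, delay requirement) per flow."""
    r, _ = rate_latency(p, cls)
    if sum(f[0] for f in flows) > r:       # rate condition
        return False
    d = delay_bound(p, cls, sum(f[1] for f in flows), l_min)
    return all(d <= f[2] for f in flows)   # delay condition
```

   A controller with full knowledge of the flows could run such a check
   once per node and class before declaring a configuration admissible.
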

   The dynamic admission control:
      For dynamic admission control, we allocate to every node and
      class A or B a static value for the rate (R) and the maximum
      burstiness (b_t).  In addition, for every node and every class A
      and B, two counters are maintained:

         R_acc is equal to the sum of the leaky-bucket rates of all
         flows of this class already admitted at this node; at all
         times, we must have:

         R_acc <= R, (Eq. 1)

         b_acc is equal to the sum of the bucket sizes of all flows of
         this class already admitted at this node; at all times, we
         must have:

         b_acc <= b_t. (Eq. 2)

      A new flow is admitted at this node if Eqs. (1) and (2) continue
      to be satisfied after adding its leaky bucket rate and bucket size
      to R_acc and b_acc.  A flow is admitted in the network if it is
      admitted at all nodes along its path.  When this happens, all
      variables R_acc and b_acc along its path must be incremented to
      reflect the addition of the flow.  Similarly, when a flow leaves
      the network, all variables R_acc and b_acc along its path must be
      decremented to reflect the removal of the flow.

   The choice of the static values of R and b_t at all nodes and classes
   must be done in a prior configuration phase; R controls the bandwidth
   allocated to this class at this node, b_t affects the delay bound and
   the buffer requirement.  R must satisfy the constraints given in
   Annex L.1 of [IEEE8021Q].

6.5.  IntServ

   Integrated service (IntServ) is an architecture that specifies the
   elements to guarantee quality of service (QoS) on networks.

   The flow, at the source, has a leaky bucket arrival curve with two
   parameters, r as rate and b as bucket size, i.e., the amount of bits
   entering a node within a time interval t is bounded by r t + b.

   If a resource reservation on a path is applied, a node provides a
   guaranteed rate R and maximum service latency of T.
   This can be interpreted in a way that the bits might have to wait up
   to T before being served with a rate greater than or equal to R.  The
   delay bound of the flow traversing the node is T + b / R.

   Consider an IntServ path including a sequence of nodes, where the
   i-th node provides a guaranteed rate R_i and maximum service latency
   of T_i.  Then, the end-to-end delay bound for a flow on this path can
   be calculated as sum(T_i) + b / min(R_i).

   If more information about the flow is known, e.g., the peak rate, the
   delay bound is more complicated; the detail is available in
   Section 1.4.1 of [NetCalBook].

6.6.  Cyclic Queuing and Forwarding

   Annex T of [IEEE8021Q] describes Cyclic Queuing and Forwarding (CQF),
   which provides bounded latency and zero congestion loss using the
   time-scheduled gates of [IEEE8021Q] section 8.6.8.4.  For a given
   DetNet class of service, a set of two or more buffers is provided at
   the output queue layer of Figure 3.  A cycle time T_c is configured
   for each class c, and all of the buffer sets in a class swap buffers
   simultaneously throughout the DetNet domain at that cycle rate, all
   in phase.  In such a mechanism, the regulator shown in Figure 1 is
   not required.

   In the case of two-buffer CQF, each class c has two buffers, namely
   buffer1 and buffer2.  In a cycle (i) when buffer1 accumulates
   received packets from the node's reception ports, buffer2 transmits
   the already stored packets from the previous cycle (i-1).  In the
   next cycle (i+1), buffer2 stores the received packets and buffer1
   transmits the packets received in cycle (i).  The duration of each
   cycle is T_c.

   The per-hop latency is trivially determined by the cycle time T_c:
   the packet transmitted from a node at a cycle (i) is transmitted
   from the next node at cycle (i+1).
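
   The buffer swapping just described can be sketched with a toy model.
   It rests on strong assumptions: all nodes swap in phase, dead time is
   ignored, and everything a node transmits in one cycle is received by
   the next node within that same cycle.  The class TwoBufferCqf and the
   function departure_cycles are hypothetical names used only here.

```python
class TwoBufferCqf:
    """Two-buffer CQF for one class at one output port (toy model)."""

    def __init__(self):
        self.buffers = ([], [])
        self.cycle = 0

    def receive(self, packet):
        # The buffer selected by the cycle parity accumulates packets.
        self.buffers[self.cycle % 2].append(packet)

    def next_cycle(self):
        """Swap buffers; return the packets sent during the new cycle."""
        self.cycle += 1
        sending = self.buffers[(self.cycle + 1) % 2]  # filled last cycle
        sent = list(sending)
        sending.clear()
        return sent


def departure_cycles(hops):
    """Cycle in which each of `hops` nodes sends a packet that the
    first node received during cycle 0."""
    nodes = [TwoBufferCqf() for _ in range(hops)]
    nodes[0].receive("pkt")
    departures = []
    while len(departures) < hops:
        # All nodes swap at the same instant, then outputs are delivered.
        outputs = [node.next_cycle() for node in nodes]
        for k, sent in enumerate(outputs):
            if sent:
                departures.append(nodes[k].cycle)
                if k + 1 < hops:
                    for pkt in sent:
                        nodes[k + 1].receive(pkt)
    return departures
```

   In this model a packet leaves each successive node exactly one cycle
   after it left the previous one, whatever its position inside the
   cycle.
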
   Hence, the maximum delay experienced by a given packet is from the
   beginning of cycle (i) to the end of cycle (i+1), or 2T_c; also, the
   minimum delay is from the end of cycle (i) to the beginning of cycle
   (i+1), i.e., zero.  Then, if the packet traverses h hops, the maximum
   delay is:

      (h+1) T_c

   and the minimum delay is:

      (h-1) T_c

   which gives a latency variation of 2T_c.

   The cycle length T_c should be carefully chosen; it needs to be large
   enough to accommodate all the DetNet traffic, plus at least one
   maximum interfering packet, that can be received within one cycle.
   Also, the value of T_c includes a time interval, called dead time
   (DT), which is the sum of the delays 1,2,3,4 defined in Figure 1.
   The value of DT guarantees that the last packet of one cycle in a
   node is fully delivered to a buffer of the next node in the same
   cycle.  A two-buffer CQF is recommended if DT is small compared to
   T_c.  For a large DT, CQF with more buffers can be used.

   Ingress conditioning (Section 4.3) may be required if the source of a
   DetNet flow does not, itself, employ CQF.  Since there are no
   per-flow parameters in the CQF technique, per-hop configuration is
   not required in the CQF forwarding nodes.

7.  References

7.1.  Normative References

   [I-D.ietf-detnet-ip]
              Varga, B., Farkas, J., Berger, L., Fedyk, D., Malis, A.,
              and S. Bryant, "DetNet Data Plane: IP", draft-ietf-detnet-
              ip-05 (work in progress), February 2020.

   [I-D.ietf-detnet-mpls]
              Varga, B., Farkas, J., Berger, L., Fedyk, D., Malis, A.,
              Bryant, S., and J. Korhonen, "DetNet Data Plane: MPLS",
              draft-ietf-detnet-mpls-05 (work in progress), February
              2020.

   [RFC2212]  Shenker, S., Partridge, C., and R. Guerin, "Specification
              of Guaranteed Quality of Service", RFC 2212,
              DOI 10.17487/RFC2212, September 1997.

   [RFC6658]  Bryant, S., Ed., Martini, L., Swallow, G., and A. Malis,
              "Packet Pseudowire Encapsulation over an MPLS PSN",
              RFC 6658, DOI 10.17487/RFC6658, July 2012.

   [RFC7806]  Baker, F. and R. Pan, "On Queuing, Marking, and Dropping",
              RFC 7806, DOI 10.17487/RFC7806, April 2016.

   [RFC8578]  Grossman, E., Ed., "Deterministic Networking Use Cases",
              RFC 8578, DOI 10.17487/RFC8578, May 2019.

   [RFC8655]  Finn, N., Thubert, P., Varga, B., and J. Farkas,
              "Deterministic Networking Architecture", RFC 8655,
              DOI 10.17487/RFC8655, October 2019.

7.2.  Informative References

   [bennett2002delay]
              J.C.R. Bennett, K. Benson, A. Charny, W.F. Courtney, and
              J.-Y. Le Boudec, "Delay Jitter Bounds and Packet Scale
              Rate Guarantee for Expedited Forwarding".

   [charny2000delay]
              A. Charny and J.-Y. Le Boudec, "Delay Bounds in a Network
              with Aggregate Scheduling".

   [IEEE8021Q]
              IEEE 802.1, "IEEE Std 802.1Q-2018: IEEE Standard for Local
              and metropolitan area networks - Bridges and Bridged
              Networks", 2018.

   [IEEE8021Qcr]
              IEEE 802.1, "IEEE P802.1Qcr: IEEE Draft Standard for Local
              and metropolitan area networks - Bridges and Bridged
              Networks - Amendment: Asynchronous Traffic Shaping", 2017.

   [IEEE8021TSN]
              IEEE 802.1, "IEEE 802.1 Time-Sensitive Networking (TSN)
              Task Group".

   [IEEE8023]
              IEEE 802.3, "IEEE Std 802.3-2018: IEEE Standard for
              Ethernet", 2018.

   [le_boudec_theory_2018]
              J.-Y. Le Boudec, "A Theory of Traffic Regulators for
              Deterministic Networks with Application to Interleaved
              Regulators".

   [NetCalBook]
              J.-Y. Le Boudec and P. Thiran, "Network calculus: a theory
              of deterministic queuing systems for the internet", 2001.

   [Specht2016UBS]
              J. Specht and S. Samii, "Urgency-Based Scheduler for Time-
              Sensitive Switched Ethernet Networks".

   [TSNwithATS]
              E. Mohammadpour, E. Stai, M. Mohiuddin, and J.-Y. Le
              Boudec, "End-to-end Latency and Backlog Bounds in Time-
              Sensitive Networking with Credit Based Shapers and
              Asynchronous Traffic Shaping".

Authors' Addresses

   Norman Finn
   Huawei Technologies Co. Ltd
   3101 Rio Way
   Spring Valley, California  91977
   US

   Phone: +1 925 980 6430
   Email: nfinn@nfinnconsulting.com


   Jean-Yves Le Boudec
   EPFL
   IC Station 14
   Lausanne EPFL  1015
   Switzerland

   Email: jean-yves.leboudec@epfl.ch


   Ehsan Mohammadpour
   EPFL
   IC Station 14
   Lausanne EPFL  1015
   Switzerland

   Email: ehsan.mohammadpour@epfl.ch


   Jiayi Zhang
   Huawei Technologies Co. Ltd
   Q27, No.156 Beiqing Road
   Beijing  100095
   China

   Email: zhangjiayi11@huawei.com


   Balazs Varga
   Ericsson
   Konyves Kalman krt. 11/B
   Budapest  1097
   Hungary

   Email: balazs.a.varga@ericsson.com


   Janos Farkas
   Ericsson
   Konyves Kalman krt. 11/B
   Budapest  1097
   Hungary

   Email: janos.farkas@ericsson.com