Network Working Group                                      L. Qiang, Ed.
Internet-Draft                                                   X. Geng
Intended status: Informational                                    B. Liu
Expires: March 6, 2020                                    T. Eckert, Ed.
                                                                  Huawei
                                                                 L. Geng
                                                            China Mobile
                                                                   G. Li
                                                       September 3, 2019
              Large-Scale Deterministic IP Network
            draft-qiang-detnet-large-scale-detnet-05
Abstract

   This document presents the overall framework and key method for
   Large-scale Deterministic Network (LDN).  LDN can provide bounded
   latency and delay variation (jitter) without requiring precise time
   synchronization among nodes or per-flow state in transit nodes.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 6, 2020.
Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.
Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Terminology & Abbreviations
   2.  Overview
     2.1.  Summary
     2.2.  Background
       2.2.1.  Deterministic End-to-End Latency
       2.2.2.  Hop-by-Hop Delay
       2.2.3.  Cyclic Forwarding
       2.2.4.  Co-Existence with Non-Deterministic Traffic
     2.3.  System Components
   3.  LDN Forwarding Mechanism
     3.1.  Cyclic Queues
     3.2.  Cycle Mapping
   4.  Performance Analysis
     4.1.  Queueing Delay
     4.2.  Jitter
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  Normative References
   Authors' Addresses
1.  Introduction

   This document explores DetNet forwarding over large-scale networks.
   In contrast to TSN, which is deployed in LANs, DetNet is expected to
   be deployed in larger networks with the following features:

   o  a large number of network devices

   o  long distances between network devices

   o  a large number of deterministic flows on the network

   These features bring the following challenges to DetNet forwarding:

   o  precise time synchronization among all nodes is difficult to
      achieve

   o  long link propagation delays may introduce larger jitter

   o  per-flow state does not scale

   Motivated by these challenges, this document presents a Large-scale
   Deterministic Network (LDN) mechanism.  As
   [draft-ietf-detnet-problem-statement] indicates, deterministic
   forwarding can only be applied to flows with well-defined traffic
   characteristics.  The traffic characteristics of DetNet flows are
   discussed in [draft-ietf-detnet-architecture]; they can be enforced
   through shaping at the ingress node or through an up-front
   commitment by the application.  LDN accordingly assumes that DetNet
   flows follow specific traffic patterns.
1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.
1.2.  Terminology & Abbreviations

   This document uses the terminology defined in
   [draft-ietf-detnet-architecture].

   TSN:  Time-Sensitive Networking

   PQ:  Priority Queuing

   CQF:  Cyclic Queuing and Forwarding

   LDN:  Large-scale Deterministic Network

   DSCP:  Differentiated Services Code Point

   EXP:  Experimental

   TC:  Traffic Class

   T:  the length of a cycle

   H:  the number of hops
2.  Overview

2.1.  Summary

   In LDN, nodes (network devices) have synchronized frequency, and
   each node forwards packets in a slotted fashion based on a cycle
   identifier carried in packets.  Ingress nodes or senders have a
   function called a gate that shapes/conditions traffic flows.  Except
   for this gate function, LDN has no awareness of individual flows.
2.2.  Background

   This section motivates the design choices taken by the proposed
   solution and gives the necessary background for forwarding-plane
   designs based on deterministic delay.

2.2.1.  Deterministic End-to-End Latency

   Bounded delay is delay that has a deterministic upper and lower
   bound.

   The delay of packets that need to be forwarded with deterministic
   delay must be deterministic on every hop.  If any hop in the network
   introduces non-deterministic delay, then the network can no longer
   deliver a deterministic delay service.
2.2.2.  Hop-by-Hop Delay

   Consider the simple example shown in Figure 1, where Node X has 10
   receiving interfaces and one outgoing interface I, all of the same
   speed.  There are 10 deterministic traffic flows, each consuming 5%
   of a link's bandwidth, one from each receiving interface to the
   outgoing interface.

   Node X sends 'only' 50% deterministic traffic to interface I, so
   there is no ongoing congestion, but there is added delay.  If the
   arrival time of packets for these 10 flows into X is uncontrolled,
   then the worst case is for them all to arrive at the same time.  One
   packet has to wait in X until the other 9 packets are sent out on I,
   resulting in a worst-case deterministic delay of 9 packet
   serialization times.  On the next-hop node Y, downstream of X, this
   problem can become worse.  Assume Y has 10 upstream nodes like X;
   the worst-case simultaneous burst is now 100 packets, or a 99-packet
   serialization delay as the worst-case upper-bounded delay incurred
   on this hop.
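   The arithmetic above can be sketched as follows; this is an
   illustrative example, not part of the draft, with the numbers taken
   from the Figure 1 scenario (10 inputs per node, one flow or burst
   per input):

```python
# Sketch of how the worst-case burst, and hence the worst-case
# queueing delay, grows hop by hop when arrival times are
# uncontrolled.  All simultaneously arriving packets head to the
# same output, so one packet waits behind all the others.

def worst_case_wait(packets_per_input: int, inputs: int) -> int:
    """Packets a single packet may have to wait behind."""
    burst = packets_per_input * inputs   # packets arriving at once
    return burst - 1                     # all others serialize first

# Node X: 10 inputs, 1 packet each -> wait behind 9 packets.
print(worst_case_wait(1, 10))    # 9

# Node Y: 10 upstream nodes like X, each delivering a 10-packet
# burst -> 100 simultaneous packets, wait behind 99.
print(worst_case_wait(10, 10))   # 99
```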
   To avoid this problem of a high upper bound on the end-to-end delay,
   traffic needs to be conditioned/interleaved on every hop.  This
   allows solutions where the per-hop delay is bounded purely by the
   physics of the forwarding plane across the node, and not by the
   accumulated characteristics of prior-hop traffic profiles.
   +--+ +--+              ---                     ---
   |A1| |A0|            -     -                 -     -
   +--+ +--+           -       -               -       -
   ---------------->  -         -             -         -
   +--+ +--+          -         -             -         -
   |B1| |B0|          -         -             -         -
   +--+ +--+          -         - Interface I -         -
   ---------------->  - Node X  ------------> - Node Y  - ----->
   +--++--+           -         -             -         -
   |C1||C0|            -       -               -       -
   +--++--+           -         -             -         -
   ---------------->  -         -             -         -
       .               -       -               -       -
       .                -     -                 -     -
       .                  ---                     ---

          Figure 1: Micro-burst and micro-burst iteration
2.2.3.  Cyclic Forwarding

   The common approach to solving this problem is a cyclic hop-by-hop
   forwarding mechanism.  Assume packets are forwarded from N1 via N2
   to N3 as shown in Figure 2.  When N1 sends a packet P to interface
   I1 in a cycle X, the forwarding mechanism must guarantee that N2
   will forward P via I2 to N3 in a cycle Y.

   The cycle of a packet can either be deduced by a receiving node from
   the exact time it was received, as is done in SDN/TDMA systems,
   and/or it can be indicated in the packet.  The solution in this
   document relies on such markings because they reduce the need for
   synchronous hop-by-hop transmission timing of packets.

   In a packet-marking-based slotted forwarding model, node N1 needs to
   send packets for cycle X before the latest possible time that still
   allows N2 to forward them in cycle Y to N3.  Because of the marking,
   N1 could even transmit packets for cycle X before all packets for
   the previous cycle (X-1) have been sent, reducing the
   synchronization requirements across nodes.
    P sent in          P sent in           P sent in
   cycle(N1,I1,X)     cycle(N2,I2,Y)      cycle(N3,I3,Z)
   +--------+         +--------+          +--------+
   | Node N1|-------->| Node N2|--------->| Node N3|------>
   +--------+I1       +--------+I2        +--------+I3

                 Figure 2: Cyclic Forwarding
2.2.4.  Co-Existence with Non-Deterministic Traffic

   Traffic with deterministic delay requirements can co-exist with
   traffic requiring only non-deterministic delay by using packet
   scheduling in which the delay that non-deterministic packets impose
   on the deterministic traffic is itself deterministic (and low).  If
   LDN is deployed together with such non-deterministic delay traffic,
   then such a scheme must be supported by the forwarding plane.  A
   simple approach for the delay incurred on the sending interface of a
   deterministic node due to non-deterministic traffic is to serve
   deterministic traffic via a strict, highest-priority queue and to
   include the worst-case delay of a currently serialized non-
   deterministic packet in the deterministic delay budget of the node.
   Similar considerations apply to the internal processing delays in a
   node.
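   A minimal sketch of that delay-budget term, assuming strict priority
   without frame preemption (an illustration, not part of the draft): a
   deterministic packet can be blocked by at most one already-
   serializing non-deterministic packet, so one maximum-size frame's
   serialization time is added to the node's budget.

```python
# Worst-case blocking of a deterministic packet by one in-flight
# low-priority packet: the serialization time of a maximum-size frame.

def max_blocking_delay_us(mtu_bytes: int, link_rate_bps: float) -> float:
    """Worst-case blocking delay in microseconds."""
    return mtu_bytes * 8 / link_rate_bps * 1e6

# Example (hypothetical numbers): 1500-byte MTU on a 1 Gb/s link
# adds 12 us to the deterministic delay budget.
print(max_blocking_delay_us(1500, 1e9))   # 12.0
```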
2.3.  System Components

   Figure 3 shows an overview of the components considered in this
   document's system and how they interact.

   A network topology of Ingress, Core, and Egress nodes supports a
   method of cyclic forwarding to enable LDN.  This forwarding requires
   no per-flow state on the nodes and tolerates the absence of time
   synchronization.

   Ingress edge nodes may support the (G)ate function to shape traffic
   from sources into the desired traffic characteristics, unless the
   source itself has such a function.  Per-flow state is required on
   the ingress edge node.  LDN should work with resource reservation
   methods, which are not discussed in this document.
   /--\      +--+        +--+      +--+        +--+      /--\
   |(G)+-----+GS+--------+ S+------+ S+--------+ S+-----+    |
   \--/      +--+        +--+      +--+        +--+      \--/

   Sender   Ingress      Core      Core       Egress   Receiver
           Edge Node     Node      Node      Edge Node

                    Figure 3: Overview of LDN
3.  LDN Forwarding Mechanism

   DetNet aims at providing deterministic service over large-scale
   networks.  In such networks, it is difficult to achieve precise time
   synchronization among numerous devices.  To reduce requirements, the
   forwarding mechanism described in this document assumes only
   frequency synchronization, not time synchronization, across nodes:
   nodes maintain the same clock frequency 1/T but need not share the
   same time, as shown in Figure 4.
          <-----T----->                     <-----T----->
          |     |     |                     |     |     |
   Node A +-----------+-----------+  Node A +-----------+-----------+
          T0                                T0

          |     |     |                        |     |     |
   Node B +-----------+-----------+  Node B    +-----------+-----------+
          T0                                   T0

    (i) time synchronization        (ii) frequency synchronization

   T: length of a cycle
   T0: timestamp

     Figure 4: Time Synchronization & Frequency Synchronization
   IEEE 802.1 CQF is an efficient forwarding mechanism in TSN that
   guarantees bounded end-to-end latency.  CQF is designed for limited-
   scale networks: time synchronization is required, and the link
   propagation delay must be smaller than a cycle length T.  To suit
   large-scale network deployment, the proposed LDN forwarding
   mechanism requires only frequency synchronization and permits link
   propagation delays exceeding T.  Apart from these two points, CQF
   and the asynchronous forwarding of LDN are very similar.

   Figure 5 compares CQF and LDN through an example.  Suppose Node A is
   the upstream node of Node B.  In CQF, packets sent from Node A in
   cycle x will be received by Node B in the same cycle and then be
   sent to the downstream node by Node B in cycle x+1.

   In LDN, due to long link propagation delay and mere frequency
   synchronization, Node B will receive packets from Node A in a
   different cycle, denoted y, and re-send them in cycle y+1.  A cycle
   mapping relationship (e.g., x -> y+1) exists between any pair of
   neighbor nodes.  With this kind of cycle mapping, the receiving node
   can easily determine when received packets should be sent out; the
   only requirement is to carry the cycle identifier of the sending
   node in the packets.
          | cycle x  | cycle x+1 |          | cycle x  | cycle x+1 |
   Node A +----------+-----------+   Node A +----------+-----------+
           \                                 \
            \ packet                          \ packet
             \ receiving                       \ receiving
              \                                 \
          |    V      | cycle x+1|          |    V     | cycle y+1|
   Node B +-----------+----------+   Node B +----------+----------+
           cycle x     \ packets             cycle y    \ packets
                        \ sending                        \ sending
                         \                                \
                          \                                \
                           V                                V

              (i) CQF                           (ii) LDN

                         Figure 5: CQF & LDN
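   How a mapping such as x -> y+1 comes about can be sketched as
   follows.  This is an illustration with hypothetical numbers, not
   part of the draft: with frequency-only synchronization, B's clock
   runs at the same rate as A's but with an unknown fixed offset, and
   a packet A sends during its cycle x arrives during some cycle y of
   B's local clock.

```python
# Range of B-cycles in which packets of A's cycle x can arrive.
# offset is B's clock minus A's clock; prop_delay is the link delay.

def arrival_cycles(x: int, T: float, prop_delay: float, offset: float):
    earliest = x * T + prop_delay + offset        # sent at cycle start
    latest = (x + 1) * T + prop_delay + offset    # sent at cycle end
    return int(earliest // T), int(latest // T)

# Example: T = 10 us, 25 us propagation delay, 3 us clock offset.
# Packets of A's cycle 0 arrive within B's cycles 2 and 3, so two
# receiving queues suffice to sort them by upstream cycle.
print(arrival_cycles(0, 10.0, 25.0, 3.0))   # (2, 3)
```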
3.1.  Cyclic Queues

   In CQF, each port needs to maintain 2 (or 3) queues: one receiving
   queue buffers newly received packets, one sending queue stores the
   packets that are about to be sent out, and one more queue may be
   needed to avoid output starvation [scheduled-queues].

   In LDN, at least 3 cyclic queues (2 receiving queues and 1 sending
   queue) are maintained for each port on a node.  A cyclic queue
   corresponds to a cycle.  As Figure 6 illustrates, the downstream
   Node B may receive packets sent in two different cycles from Node A
   due to the absence of time synchronization.  Following the cycle
   mapping (i.e., x -> y+1), packets that carry cycle identifier x
   should be sent out by Node B in cycle y+1, and packets that carry
   cycle identifier x+1 should be sent out by Node B in cycle y+2.
   Therefore, 2 receiving queues are needed to store the received
   packets: one for the packets that carry cycle identifier x, another
   for the packets that carry cycle identifier x+1.  Together with one
   sending queue, each port needs at least 3 cyclic queues in LDN.  In
   order to absorb more link delay variation (such as on a radio
   interface), more queues may be necessary.
          | cycle x  | cycle x+1 |
   Node A +----------+-----------+
           \           \
            \           \ packet
             \           \ receiving
          |   V           V |            |
   Node B +-----------------+-----------+
             cycle y          cycle y+1

       Figure 6: An example of the 2 receiving queues in LDN
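   The queue rotation described above can be sketched as follows.
   This is an illustrative assumption, not a mechanism specified by
   the draft: with the minimum of 3 cyclic queues per port, one queue
   drains as the sending queue while the other two receive packets
   carrying the two upstream cycle identifiers that can still be in
   flight.

```python
# Enqueueing with 3 cyclic queues per port.  Taking the carried cycle
# identifier modulo the queue count keeps the two in-flight upstream
# cycles and the currently draining queue apart.

NUM_QUEUES = 3

def queue_index(upstream_cycle_id: int) -> int:
    """Queue for a packet carrying this upstream cycle identifier."""
    return upstream_cycle_id % NUM_QUEUES

# Packets of upstream cycles x and x+1 land in different queues, and
# neither collides with the queue draining packets of cycle x-1.
print([queue_index(c) for c in (4, 5, 6)])   # [1, 2, 0]
```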
3.2.  Cycle Mapping

   The cycle mapping relationship (e.g., x -> y+1) exists between any
   pair of neighbor nodes; it can be configured through the control
   plane or learned in the data plane.  As Figure 7 shows, the cycle
   mapping relationship directs packet forwarding in one of two modes:
   swap mode or stack mode.

   o  In swap mode, a node stores the cycle mapping relationship
      locally.  After receiving a packet carrying a cycle identifier,
      the node checks its cycle mapping table, swaps the cycle
      identifier for a new one, and then puts the packet into the
      appropriate queue.  A path with dedicated resources needs to be
      established first; packets are then forwarded along the path in
      swap mode.

   o  In stack mode, a central controller computes the cycle identifier
      for every node, ensuring that there are no flow conflicts along
      the path and that the end-to-end latency requirement is
      satisfied.  The cycle identifiers are encapsulated into the
      packet at the ingress.  No other state needs to be maintained in
      the intermediate nodes.
                 LDN Packet
   +------+---+      +-----------------------+      +------+---+
   |      | x |      |                       |      |      |y+1|
   +------+---+      |    Swap Mode Node     |      +------+---+
   ----------------->|                       |----------------->
                     |       (x->y+1)        |
                     |                       |
                     +-----------------------+

                 LDN Packet
   +------+---=====  +-----------------------+      +------=====---+
   |      |y+1= x =  |                       |      |      =y+1= x |
   +------+---=====  |                       |      +------=====---+
   ----------------->|    Stack Mode Node    |----------------->
                     |                       |
                     |                       |
                     +-----------------------+

   =====
   =   =  Current Cycle Identifier
   =====

                        Figure 7: Two Modes
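   The two modes can be sketched as follows; this is an illustration
   with hypothetical identifiers and table contents, not part of the
   draft.  Swap mode looks up the outgoing identifier in a locally
   stored table, while stack mode pops the next precomputed identifier
   carried in the packet itself.

```python
# Two ways a node obtains the outgoing cycle identifier.

def swap_mode(cycle_id: int, mapping: dict) -> int:
    """Replace the carried identifier using the node's local table,
    i.e., the x -> y+1 relationship learned for this neighbor."""
    return mapping[cycle_id]

def stack_mode(cycle_stack: list) -> int:
    """Pop the identifier the controller precomputed for this hop;
    no per-node mapping state is needed."""
    return cycle_stack.pop(0)

# Hypothetical swap table: cycle 2 maps to cycle 0 (the identifier
# wraps modulo the queue count).  The stack carries one id per hop.
print(swap_mode(2, {0: 1, 1: 2, 2: 0}))   # 0
print(stack_mode([3, 1, 2]))              # 3
```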
   As Section 3.1 illustrates, there are 3 (or 4) different queues at
   each port.  Therefore, the cycle identifier must be able to express
   3 (or 4) different values, each value corresponding to a queue.
   That means a minimum of 2 bits is needed to identify different
   cycles between a pair of neighboring nodes.  This document does not
   yet aim to propose an encoding, but gives an (incomplete) list of
   ideas:

   o  DSCP of the IPv4 header

   o  Traffic Class of the IPv6 header

   o  TC of the MPLS header (formerly EXP)

   o  IPv6 extension header

   o  UDP option

   o  SID of SRv6

   o  Reserved field of the SRH

   o  TLV of SRv6

   o  TC of the SR-MPLS header (formerly EXP)

   o  3 (or 4) labels/adjacency SIDs for SR-MPLS
4.  Performance Analysis

4.1.  Queueing Delay

   Figure 8 describes the one-hop packet forwarding delay, which
   mainly consists of the A->B link propagation delay and the queueing
   delay in Node B.
          |cycle x |
   Node A +-------\+
                   \
                    \
                     \
          | \ cycle y | cycle y+1 |
   Node B +--V--------+----------\+
          :                       \
          :     Queueing Delay    :\
          :...= 2*T ..............: V

            Figure 8: Single-Hop Queueing Delay
   As Figure 8 shows, cycle x of Node A will be mapped to cycle y+1 of
   Node B as long as the last packet sent from A to B is received
   within cycle y.  If the last packet is re-sent by B at the end of
   cycle y+1, then the largest single-hop queueing delay is 2*T.
   Therefore the upper bound on the end-to-end queueing delay is 2*T*H,
   where H is the number of hops.

   If A did not forward the LDN packet from a prior LDN forwarder but
   is the actual traffic source, then the packet may have been delayed
   by a gate function before it was sent to B.  The delay of this
   function is outside the scope of the LDN delay considerations.  If
   B is not forwarding the LDN packet but is the final receiver, then
   the packet may not need to be queued and released to the receiver in
   the same fashion as it would be queued/released to a downstream LDN
   node.  Thus, if a path has one source followed by N LDN forwarders
   followed by one receiver, it should be considered a path with N-1
   LDN hops for the purpose of latency and jitter calculations.
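   The bound above can be sketched as follows, with hypothetical
   numbers (the formula 2*T*H and the N-1 hop count are from the text;
   the example values are not):

```python
# End-to-end queueing-delay bound of Section 4.1: 2*T per LDN hop,
# with H = N - 1 hops for a path of N LDN forwarders between one
# source and one receiver.

def queueing_delay_bound_us(T_us: float, num_forwarders: int) -> float:
    """Upper bound on end-to-end queueing delay in microseconds."""
    H = num_forwarders - 1
    return 2 * T_us * H

# Example: cycle length T = 10 us, 6 LDN forwarders on the path
# -> H = 5 hops -> at most 100 us of queueing delay.
print(queueing_delay_bound_us(10.0, 6))   # 100.0
```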
4.2.  Jitter

   Consider first the simplest scenario, one-hop forwarding.  Suppose
   Node A is the upstream node of Node B, and the packet sent from Node
   A in cycle x is received by Node B in cycle y, as Figure 9 shows.

   -  The best case is that Node A sends the packet at the end of
      cycle x and Node B receives it at the beginning of cycle y; this
      delay is denoted w.

   -  The worst case is that Node A sends the packet at the beginning
      of cycle x and Node B receives it at the end of cycle y; then the
      delay = w + length of cycle x + length of cycle y = w + 2*T.

   -  Hence the jitter's upper bound for this simplest scenario =
      worst case - best case = 2*T.
          |cycle x |                      |cycle x |
   Node A +-------\+               Node A +\-------+
          :\                               \       :
          : \                               -------------\
          :  \                             :              \
          :w  |\        |                  :w |            \        |
   Node B :   +V--------+           Node B :  +-------------V------+
               cycle y                         cycle y

      (a) best situation              (b) worst situation

        Figure 9: Jitter Analysis for One-Hop Forwarding
   Next, consider two-hop forwarding, as Figure 10 shows.

   -  The best case is that Node A sends the packet at the end of
      cycle x and Node C receives it at the beginning of cycle z; this
      delay is denoted w'.

   -  The worst case is that Node A sends the packet at the beginning
      of cycle x and Node C receives it at the end of cycle z; then the
      delay = w' + length of cycle x + length of cycle z = w' + 2*T.

   -  Hence the jitter's upper bound = worst case - best case = 2*T.
          |cycle x |
   Node A +-------\+
                   \
           :\ |  cycle y  |
   Node B : \-------------+
           :  \
           :   \----------\
           :               \          |
   Node C  ......w'......+-V---------+
                           cycle z

          (a) best situation

          |cycle x |
   Node A +\-------+
            \      :
             \     :  | cycle y |
   Node B     \    :  +---------+
               \   :
                ---:--------------------\
           :       |                    \ |
   Node C  :......w'.....+---------------V+
                          cycle z

          (b) worst situation

       Figure 10: Jitter Analysis for Two-Hop Forwarding
   And so on: for multi-hop forwarding, the end-to-end delay increases
   as the number of hops increases, while the delay variation (jitter)
   still does not exceed 2*T.
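   The hop-independence of the bound can be sketched as follows, with
   hypothetical numbers (an illustration, not part of the draft): best
   and worst cases differ by one full cycle on the sending side and one
   on the receiving side, regardless of what happens in between.

```python
# Jitter bound of Section 4.2: worst-case minus best-case delay.
# base_delay stands for the path's best-case delay (w, w', ...).

def jitter_bound(T: float, base_delay: float) -> float:
    best = base_delay              # sent at cycle end, received at start
    worst = base_delay + 2 * T     # sent at cycle start, received at end
    return worst - best

# The bound is 2*T no matter how large the base delay (i.e., how many
# hops the path has):
print(jitter_bound(10.0, 37.5))    # 20.0
print(jitter_bound(10.0, 999.0))   # 20.0
```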
5.  IANA Considerations

   This document makes no request of IANA.

6.  Security Considerations

   Security issues have been carefully considered in
   [draft-ietf-detnet-security].  More discussion is TBD.

7.  Acknowledgements

   TBD.
8.  Normative References

   [draft-ietf-detnet-architecture]
              "DetNet Architecture".

   [draft-ietf-detnet-dp-sol]
              "DetNet Data Plane Encapsulation".

   [draft-ietf-detnet-problem-statement]
              "DetNet Problem Statement".

   [draft-ietf-detnet-security]
              "DetNet Security Considerations".

   [draft-ietf-detnet-use-cases]
              "DetNet Use Cases".

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [scheduled-queues]
              "Scheduled queues, UBS, CQF, and Input Gates".
Authors' Addresses

   Li Qiang (editor)
   Huawei
   Beijing
   China

   Email: qiangli3@huawei.com

   Xuesong Geng
   Huawei
   Beijing
   China

   Email: gengxuesong@huawei.com

   Bingyang Liu
   Huawei
   Beijing
   China

   Email: liubingyang@huawei.com

   Toerless Eckert (editor)
   Huawei USA - Futurewei Technologies Inc.
   2330 Central Expy
   Santa Clara 95050
   USA

   Email: tte+ietf@cs.fau.de

   Liang Geng
   China Mobile
   Beijing
   China

   Email: gengliang@chinamobile.com

   Guangpeng Li