Network Working Group                                      L. Qiang, Ed.
Internet-Draft                                                   X. Geng
Intended status: Informational                                    B. Liu
Expires: September 9, 2019                                T. Eckert, Ed.
                                                                  Huawei
                                                                 L. Geng
                                                            China Mobile
                                                           March 8, 2019


                 Large-Scale Deterministic IP Network
               draft-qiang-detnet-large-scale-detnet-04
Abstract

   This document presents the overall framework and key mechanisms for
   Large-scale Deterministic Network (LDN).  LDN can provide bounded
   latency and delay variation (jitter) without requiring precise time
   synchronization among nodes or per-flow state in transit nodes.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 9, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.
Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Terminology & Abbreviations
   2.  Overview
     2.1.  Summary
     2.2.  Background
       2.2.1.  Deterministic End-to-End Latency
       2.2.2.  Hop-by-Hop Delay
       2.2.3.  Cyclic Forwarding
       2.2.4.  Co-Existence with Non-Deterministic Traffic
     2.3.  System Components
   3.  LDN Forwarding Mechanism
     3.1.  Cyclic Queues
     3.2.  Cycle Mapping
   4.  Performance Analysis
     4.1.  Queueing Delay
     4.2.  Jitter
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  Normative References
   Authors' Addresses
1.  Introduction

   This document explores DetNet forwarding over large-scale networks.
   In contrast to TSN, which is deployed in LANs, DetNet is expected to
   be deployed in larger-scale networks with the following features:

   o  a large number of network devices

   o  long distances between network devices

   o  a large number of deterministic flows in the network

   These features bring the following challenges to DetNet forwarding:

   o  precise time synchronization among all nodes is difficult to
      achieve

   o  long link propagation delays may introduce larger jitter

   o  per-flow state does not scale

   Motivated by these challenges, this document presents a Large-scale
   Deterministic Network (LDN) mechanism.  As
   [draft-ietf-detnet-problem-statement] indicates, deterministic
   forwarding can only be applied to flows with well-defined traffic
   characteristics.  The traffic characteristics of DetNet flows are
   discussed in [draft-ietf-detnet-architecture]; they can be enforced
   through shaping at the ingress node or through an up-front commitment
   by the application.  LDN accordingly assumes that DetNet flows follow
   specific traffic patterns.
1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.
1.2.  Terminology & Abbreviations

   This document uses the terminology defined in
   [draft-ietf-detnet-architecture].

   TSN:  Time-Sensitive Networking

   PQ:  Priority Queuing

   CQF:  Cyclic Queuing and Forwarding

   LDN:  Large-scale Deterministic Network

   DSCP:  Differentiated Services Code Point

   EXP:  Experimental

   TC:  Traffic Class

   T:  the length of a cycle

   H:  the number of hops
2.  Overview

2.1.  Summary

   In LDN, nodes (network devices) have synchronized frequency, and each
   node forwards packets in a slotted fashion based on a cycle
   identifier carried in packets.  Ingress nodes or senders have a
   function called a gate to shape/condition traffic flows.  Except for
   this gate function, LDN has no awareness of individual flows.
2.2.  Background

   This section motivates the design choices taken by the proposed
   solution and gives the necessary background for forwarding plane
   designs based on deterministic delay.

2.2.1.  Deterministic End-to-End Latency

   Bounded delay is delay that has a deterministic upper and lower
   bound.

   The delay of packets that need to be forwarded with deterministic
   delay needs to be deterministic on every hop.  If any hop in the
   network introduces non-deterministic delay, then the network itself
   cannot deliver a deterministic delay service anymore.
2.2.2.  Hop-by-Hop Delay

   Consider the simple example shown in Figure 1, where Node X has 10
   receiving interfaces and one outgoing interface I, all of the same
   speed.  There are 10 deterministic traffic flows, each consuming 5%
   of a link's bandwidth, one from each receiving interface to the
   outgoing interface.

   Node X sends 'only' 50% deterministic traffic to interface I, so
   there is no ongoing congestion, but there is added delay.  If the
   arrival time of packets for these 10 flows into X is uncontrolled,
   then the worst case is for them all to arrive at the same time.  One
   packet has to wait in X until the other 9 packets are sent out on I,
   resulting in a worst-case deterministic delay of 9 packet
   serialization times.  On the next-hop node Y downstream from X, this
   problem can become worse.  Assume Y has 10 upstream nodes like X; the
   worst-case simultaneous burst of packets is now 100 packets, or a
   99-packet serialization delay as the worst-case upper-bound delay
   incurred on this hop.

   To avoid the problem of a high upper bound on end-to-end delay,
   traffic needs to be conditioned/interleaved on every hop.  This makes
   it possible to create solutions where the per-hop delay is bounded
   purely by the physics of the forwarding plane across the node, and
   not by the accumulated characteristics of prior-hop traffic profiles.
    +--+  +--+              ---                        ---
    |A1|  |A0|            -     -                    -     -
    +--+  +--+           -       -                  -       -
   ---------------->    -         -                -         -
    +--+  +--+          -         -                -         -
    |B1|  |B0|          -         -                -         -
    +--+  +--+          -         -Interface I     -         -
   ---------------->    -  Node X ---------------> -  Node Y  ----->
    +--++--+            -         -                -         -
    |C1||C0|             -       -                  -       -
    +--++--+              -     -                    -     -
   ---------------->        ---                        ---
          .
          .
          .

           Figure 1: Micro-burst and micro-burst iteration
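The worst-case delay arithmetic in this example can be sketched in a few lines (an illustrative Python sketch, not part of the mechanism; the packet counts and unit serialization time mirror the example above):

```python
# Sketch: worst-case serialization delay when bursts from many upstream
# interfaces collide on one outgoing interface, as in Figure 1.
# Numbers are the assumptions from the text, not protocol constants.

def worst_case_delay(packets_in_burst: int, serialization_time: float) -> float:
    """A packet may have to wait behind every other packet of a
    simultaneous burst before it is serialized."""
    return (packets_in_burst - 1) * serialization_time

T_SER = 1.0  # serialization time of one packet (arbitrary unit)

# Node X: 10 flows, one packet each, arriving simultaneously.
delay_x = worst_case_delay(10, T_SER)        # 9 packet times

# Node Y: 10 upstream nodes like X -> a burst of 100 packets.
delay_y = worst_case_delay(10 * 10, T_SER)   # 99 packet times

print(delay_x, delay_y)
```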
2.2.3.  Cyclic Forwarding

   The common approach to solving this problem is a cyclic hop-by-hop
   forwarding mechanism.  Assume packets are forwarded from N1 via N2 to
   N3 as shown in Figure 2.  When N1 sends a packet P to interface I1
   with a cycle X, the forwarding mechanism must guarantee that N2 will
   forward P via I2 to N3 in a cycle Y.

   The cycle of a packet can either be deduced by a receiving node from
   the exact time it was received, as is done in SDN/TDMA systems,
   and/or it can be indicated in the packet.  The solution in this
   document relies on such markings because they reduce the need for
   synchronous hop-by-hop transmission timing of packets.

   In a packet-marking-based slotted forwarding model, node N1 needs to
   send packets for cycle X before the latest possible time that will
   allow N2 to further forward them in cycle Y to N3.  Because of the
   marking, N1 could even transmit packets for cycle X before all
   packets for the previous cycle (X-1) have been sent, reducing the
   synchronization requirements across nodes.

     P sent in          P sent in           P sent in
   cycle(N1,I1,X)     cycle(N2,I2,Y)      cycle(N3,I3,Z)
    +--------+          +--------+          +--------+
    | Node N1|--------->| Node N2|--------->| Node N3|------>
    +--------+I1        +--------+I2        +--------+I3

                   Figure 2: Cyclic Forwarding
2.2.4.  Co-Existence with Non-Deterministic Traffic

   Traffic with deterministic delay requirements can co-exist with
   traffic that only requires non-deterministic delay by using packet
   scheduling in which the delay incurred by non-deterministic packets
   is deterministic (and low) for the deterministic traffic.  If LDN is
   deployed together with such non-deterministic delay traffic, then
   such a scheme must be supported by the forwarding plane.  A simple
   approach for the delay incurred on the sending interface of a
   deterministic node due to non-deterministic traffic is to serve
   deterministic traffic via a strict, highest-priority queue and
   include the worst-case delay of a currently serialized non-
   deterministic packet into the deterministic delay budget of the node.
   Similar considerations apply to the internal processing delays in a
   node.
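This delay-budget accounting can be sketched as follows (the function name, link rate, and packet size are assumed example values, not taken from this document):

```python
# Sketch: per-node delay budget for deterministic traffic under strict
# priority queuing, where at most one already-serializing
# non-deterministic packet can delay a deterministic packet
# (non-preemptive link).

def node_delay_budget(det_delay: float, max_nondet_packet_bits: int,
                      link_rate_bps: float) -> float:
    """Deterministic budget plus worst-case residual serialization time
    of one non-deterministic packet."""
    return det_delay + max_nondet_packet_bits / link_rate_bps

# Example: 10 us deterministic delay, 1500-byte packet on a 1 Gb/s link
# adds up to 12 us of residual serialization, i.e., a 22 us budget.
budget = node_delay_budget(10e-6, 1500 * 8, 1e9)
print(budget)
```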
2.3.  System Components

   Figure 3 shows an overview of the components considered in this
   document and how they interact.

   A network topology of Ingress, Core, and Egress nodes supports a
   method of cyclic forwarding to enable LDN.  This forwarding requires
   no per-flow state on the nodes and tolerates the lack of time
   synchronization.

   Ingress edge nodes may support the (G)ate function to shape traffic
   from sources into the desired traffic characteristics, unless the
   source itself has such a function.  Per-flow state is required on the
   ingress edge node.  LDN is expected to work with a resource
   reservation method, which is not discussed in this document.

    /--\     +--+        +--+      +--+        +--+      /--\
   | (G)+-----+GS+--------+ S+------+ S+--------+ S+-----+  |
    \--/     +--+        +--+      +--+        +--+      \--/

   Sender   Ingress      Core      Core       Egress   Receiver
           Edge Node     Node      Node      Edge Node

                     Figure 3: Overview of LDN
3.  LDN Forwarding Mechanism

   DetNet aims at providing deterministic service over large-scale
   networks.  In such a large-scale network, it is difficult to achieve
   precise time synchronization among numerous devices.  To reduce
   requirements, the forwarding mechanism described in this document
   assumes only frequency synchronization, not time synchronization,
   across nodes: nodes maintain the same clock frequency 1/T, but need
   not share the same time, as shown in Figure 4.

          <-----T----->                        <-----T----->
          |     |     |                        |     |     |
   Node A +-----------+-----------+     Node A +-----------+-----------+
          T0                                   T0

          |     |     |                        |     |     |
   Node B +-----------+-----------+     Node B +-----------+-----------+
          T0                                   T0

    (i) time synchronization        (ii) frequency synchronization

   T: length of a cycle
   T0: timestamp

        Figure 4: Time Synchronization & Frequency Synchronization
   IEEE 802.1 CQF is an efficient forwarding mechanism in TSN that
   guarantees bounded end-to-end latency.  CQF is designed for limited-
   scale networks: time synchronization is required, and the link
   propagation delay must be smaller than a cycle length T.  Considering
   large-scale network deployment, the proposed LDN forwarding mechanism
   requires only frequency synchronization and permits the link
   propagation delay to exceed T.  Apart from these two points, CQF and
   the asynchronous forwarding of LDN are very similar.

   Figure 5 compares CQF and LDN through an example.  Suppose Node A is
   the upstream node of Node B.  In CQF, packets sent from Node A at
   cycle x will be received by Node B in the same cycle, then further
   sent to the downstream node by Node B at cycle x+1.

   In LDN, due to long link propagation delay and frequency
   synchronization, Node B will receive packets from Node A in a
   different cycle, denoted y, and re-send them at cycle y+1.  A cycle
   mapping relationship (e.g., x -> y+1) exists between any pair of
   neighbor nodes.  With this kind of cycle mapping, the receiving node
   can easily figure out when received packets should be sent out; the
   only requirement is to carry the cycle identifier of the sending node
   in the packets.
          | cycle x   | cycle x+1 |          | cycle x   | cycle x+1 |
   Node A +-----------+-----------+   Node A +-----------+-----------+
            \                                  \
             \packet                            \packet
              \receiving                         \receiving
               \                                  \
          |     V     | cycle x+1 |          |     V     | cycle y+1 |
   Node B +-----------+-----------+   Node B +-----------+-----------+
            cycle x     \packets               cycle y      \packets
                         \sending                            \sending
                          \                                   \
                           \                                   \
                            V                                   V

              (i) CQF                           (ii) LDN

                         Figure 5: CQF & LDN
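The fixed per-link cycle mapping described above can be illustrated with a small model (a Python sketch; the cycle length, receiver phase offset, and propagation delay are arbitrary assumed values):

```python
# Sketch: with frequency synchronization only, the offset between a
# sender cycle x and the receiver cycle y in which its packets arrive
# is a fixed per-link constant, so the mapping (x -> y+1) can be
# configured or learned once per link.

T = 10.0            # cycle length, identical on all nodes (frequency sync)
PHASE_OFFSET = 3.0  # receiver clock phase relative to sender (unknown a priori)
PROP_DELAY = 25.0   # link propagation delay; may exceed T (unlike CQF)

def receive_cycle(send_cycle: int, send_offset: float) -> int:
    """Receiver-local cycle index in which a packet arrives, given the
    sender cycle and the offset within that cycle at which it was sent."""
    arrival = send_cycle * T + send_offset + PROP_DELAY - PHASE_OFFSET
    return int(arrival // T)

# Packets of sender cycle 5 arrive in receiver cycle y (packets sent
# late in the cycle may straddle into y+1, hence the two receiving
# queues of Section 3.1), and are re-sent in cycle y + 1.
y = receive_cycle(send_cycle=5, send_offset=0.0)
print(y, y + 1)  # → 7 8
```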
3.1.  Cyclic Queues

   In CQF, each port needs to maintain 2 (or 3) queues: one receiving
   queue is used to buffer newly received packets, one sending queue is
   used to store the packets that are going to be sent out, and one more
   queue may be needed to avoid output starvation [scheduled-queues].

   In LDN, at least 3 cyclic queues (2 receiving queues and 1 sending
   queue) are maintained for each port on a node.  A cyclic queue
   corresponds to a cycle.  As Figure 6 illustrates, the downstream Node
   B may receive packets sent in two different cycles from Node A due to
   the absence of time synchronization.  Following the cycle mapping
   (i.e., x -> y+1), packets that carry cycle identifier x should be
   sent out by Node B at cycle y+1, and packets that carry cycle
   identifier x+1 should be sent out by Node B at cycle y+2.  Therefore,
   2 receiving queues are needed to store the received packets: one for
   the packets that carry cycle identifier x, and another for the
   packets that carry cycle identifier x+1.  Together with one sending
   queue, each port needs at least 3 cyclic queues in LDN.  In order to
   absorb more link delay variation (such as on a radio interface), more
   queues may be necessary.
          | cycle x   | cycle x+1 |
   Node A +-----------+-----------+
            \           \
             \           \packet
              \           \receiving
          |    V           V    |           |
   Node B +-----------+-----------+
            cycle y     cycle y+1

      Figure 6: An example of the two receiving queues in LDN
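A minimal, non-normative sketch of the per-port queue rotation (the class and method names are invented for illustration):

```python
# Sketch: three cyclic queues per port, rotated once per cycle.
# A packet marked with sender cycle id x goes into the queue that will
# become the sending queue at receiver cycle y+1 (mapping x -> y+1);
# packets marked x+1 go one queue further (sent at y+2).
from collections import deque

class CyclicQueues:
    def __init__(self, n_queues: int = 3):
        # index 0 = current sending queue; 1 and 2 = receiving queues
        self.queues = [deque() for _ in range(n_queues)]

    def enqueue(self, packet, cycles_until_send: int):
        """cycles_until_send (1 or 2) is derived from the carried
        cycle identifier via the per-link cycle mapping."""
        self.queues[cycles_until_send].append(packet)

    def end_of_cycle(self):
        """Drain the sending queue, then rotate: the next queue
        becomes the sending queue for the new cycle."""
        sent = list(self.queues[0])
        self.queues[0].clear()
        self.queues = self.queues[1:] + self.queues[:1]
        return sent

q = CyclicQueues()
q.enqueue("p1", 1)       # carries cycle id x     -> send next cycle
q.enqueue("p2", 2)       # carries cycle id x + 1 -> send in two cycles
print(q.end_of_cycle())  # → []  (current sending queue was empty)
print(q.end_of_cycle())  # → ['p1']
print(q.end_of_cycle())  # → ['p2']
```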
3.2.  Cycle Mapping

   A cycle mapping relationship (e.g., x -> y+1) exists between any pair
   of neighbor nodes; it can be configured through the control plane or
   learned in the data plane.  As Figure 7 shows, the cycle mapping
   relationship drives packet forwarding in one of two modes -- swap
   mode or stack mode.

   o  In swap mode, a node stores the cycle mapping relationship
      locally.  After receiving a packet carrying a cycle identifier,
      the node checks its cycle mapping table, swaps the cycle
      identifier with a new cycle identifier, then puts the packet into
      an appropriate queue.  A path with dedicated resources needs to be
      established first; the packet is then forwarded along the path in
      swap mode.

   o  In stack mode, a central controller computes the cycle identifier
      for every node, ensuring that there are no flow conflicts along
      the path and that the end-to-end latency requirement is satisfied.
      The cycle identifiers are encapsulated into the packet at the
      ingress.  No other state needs to be maintained in the
      intermediate nodes.
    LDN Packet
   +------+---+      +-----------------------+      +------+---+
   |      | x |      |                       |      |      |y+1|
   +------+---+      |    Swap Mode Node     |      +------+---+
   ----------->      |                       |      ----------->
                     |       (x->y+1)        |
                     |                       |
                     +-----------------------+

    LDN Packet
   +------+---=====  +-----------------------+      +------=====---+
   |      |y+1= x =  |                       |      |      =y+1= x |
   +------+---=====  |                       |      +------=====---+
   ----------->      |    Stack Mode Node    |      ----------->
                     |                       |
                     |                       |
                     +-----------------------+

   =====
   =   =  Current Cycle Identifier
   =====

                        Figure 7: Two Modes
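The two modes can be sketched as follows (an illustrative Python sketch; the packet representation and field names are assumptions, not a defined encapsulation):

```python
# Swap mode sketch: each node keeps a per-link cycle mapping table and
# rewrites the single cycle identifier carried in the packet.
def swap_mode_forward(packet: dict, cycle_map: dict) -> dict:
    packet["cycle_id"] = cycle_map[packet["cycle_id"]]
    return packet

pkt = {"cycle_id": 7, "payload": b"..."}
out = swap_mode_forward(pkt, cycle_map={7: 3})  # x -> y+1, configured/learned
print(out["cycle_id"])  # → 3

# Stack mode sketch: a controller precomputes one cycle identifier per
# hop; the ingress pushes the whole stack and each node pops its own
# entry, keeping no other state.
def stack_mode_forward(packet: dict) -> dict:
    packet["current_cycle"] = packet["cycle_stack"].pop(0)
    return packet

pkt2 = {"cycle_stack": [3, 6, 2], "payload": b"..."}
out2 = stack_mode_forward(pkt2)
print(out2["current_cycle"], out2["cycle_stack"])  # → 3 [6, 2]
```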
   As Section 3.1 illustrates, there are 3 (or 4) different queues at
   each port.  Therefore, the cycle identifier should be able to express
   3 (or 4) different values, each value corresponding to a queue.  That
   means a minimum of 2 bits is needed to identify different cycles
   between a pair of neighboring nodes.  This document does not yet aim
   to propose a specific encoding, but gives an (incomplete) list of
   candidate fields:

   o  DSCP of the IPv4 header

   o  Traffic Class of the IPv6 header

   o  TC of the MPLS header (formerly EXP)

   o  IPv6 extension header

   o  UDP option

   o  SID of SRv6

   o  Reserved field of the SRH

   o  TLV of SRv6

   o  TC of the SR-MPLS header (formerly EXP)

   o  3 (or 4) labels/adjacency SIDs for SR-MPLS
4.  Performance Analysis

4.1.  Queueing Delay

   Figure 8 depicts the one-hop packet forwarding delay, which mainly
   consists of the A->B link propagation delay and the queuing delay in
   Node B.

          |cycle x |
   Node A +-------\+
                   \
                    \
                     \
          |\  cycle y | cycle y+1 |
   Node B +V----------+----------\+
          :                       \
          :    Queueing Delay     :\
          :...= 2*T ..............: V

               Figure 8: Single-Hop Queueing Delay

   As Figure 8 shows, cycle x of Node A will be mapped into cycle y+1 of
   Node B as long as the last packet sent from A to B is received within
   cycle y.  If the last packet is re-sent by B at the end of cycle y+1,
   then the largest single-hop queueing delay is 2*T.  Therefore the
   upper bound of the end-to-end queueing delay is 2*T*H, where H is the
   number of hops.
   If A did not forward the LDN packet from a prior LDN forwarder but is
   the actual traffic source, then the packet may have been delayed by a
   gate function before it was sent to B.  The delay of this function is
   outside the scope of the LDN delay considerations.  If B is not
   forwarding the LDN packet but is the final receiver, then the packet
   may not need to be queued and released to the receiver in the same
   fashion as it would be queued/released to a downstream LDN node.  So,
   if a path has one source followed by N LDN forwarders followed by one
   receiver, it should be considered a path with N-1 LDN hops for the
   purpose of latency and jitter calculations.
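The bounds above reduce to simple arithmetic, sketched here (the cycle length and hop counts are assumed example values):

```python
# Sketch of the queueing-delay bound stated in Section 4.1.

def max_e2e_queueing_delay(T: float, hops: int) -> float:
    """Per-hop queueing delay is bounded by 2*T, so end to end <= 2*T*H."""
    return 2 * T * hops

def ldn_hops(n_forwarders: int) -> int:
    """The source gate and the final receiver sit outside the LDN delay
    budget: a path with N LDN forwarders counts as N - 1 LDN hops."""
    return n_forwarders - 1

T = 10e-6  # assumed 10 us cycle length
# A path with 6 LDN forwarders -> 5 LDN hops -> at most 2 * T * 5.
print(max_e2e_queueing_delay(T, ldn_hops(6)))
```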
4.2.  Jitter

   Consider first the simplest scenario, one-hop forwarding.  Suppose
   Node A is the upstream node of Node B; the packet sent from Node A at
   cycle x will be received by Node B at cycle y, as Figure 9 shows.

   -  The best situation is that Node A sends the packet at the end of
      cycle x, and Node B receives the packet at the beginning of cycle
      y; the delay is denoted by w.

   -  The worst situation is that Node A sends the packet at the
      beginning of cycle x, and Node B receives the packet at the end of
      cycle y; the delay = w + length of cycle x + length of cycle y =
      w + 2*T.

   -  Hence the upper bound on the jitter in this simplest scenario =
      worst case - best case = 2*T.
          |cycle x |                       |cycle x |
   Node A +-------\+                Node A +\-------+
           :\                                \      :
           : \                     -----------\     :
           :  \                               :     \
           :w |\        |                     :w|    \        |
   Node B  :  +V--------+           Node B    :  +---------V+
               cycle y                             cycle y

      (a) best situation               (b) worst situation

        Figure 9: Jitter Analysis for One-Hop Forwarding
   Next, consider two-hop forwarding, as Figure 10 shows.

   -  The best situation is that Node A sends the packet at the end of
      cycle x, and Node C receives the packet at the beginning of cycle
      z; the delay is denoted by w'.

   -  The worst situation is that Node A sends the packet at the
      beginning of cycle x, and Node C receives the packet at the end of
      cycle z; the delay = w' + length of cycle x + length of cycle z =
      w' + 2*T.

   -  Hence the upper bound on the jitter = worst case - best case =
      2*T.
          |cycle x |
   Node A +-------\+
                   \
           :\       | cycle y  |
   Node B  : \------------+
           :   \
           :    \--------\
           :              \        |
   Node C  ......w'......+V--------+
                           cycle z

      (a) best situation

          |cycle x |
   Node A +\-------+
            \      :
             \     :   | cycle y  |
   Node B     \    :   +----------+
               \   :
                ---:--------------------\
                   :          |          \        |
   Node C          :......w'.....+--------V------+
                                    cycle z

      (b) worst situation

        Figure 10: Jitter Analysis for Two-Hop Forwarding
   The same reasoning extends to multi-hop forwarding: the end-to-end
   delay increases as the number of hops increases, while the delay
   variation (jitter) still does not exceed 2*T.
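The jitter bound can be checked numerically (an illustrative sketch with assumed values for T and the fixed path delay w):

```python
# Sketch of the jitter derivation above:
#   best case:  sent at the very end of cycle x, received at the start
#               of the arrival cycle            -> delay = w
#   worst case: sent at the start of cycle x, received at the end of
#               the arrival cycle               -> delay = w + 2*T
T = 10.0   # assumed cycle length
w = 42.0   # assumed fixed path component (propagation etc.)

best = w
worst = w + 2 * T
jitter = worst - best  # independent of the hop count: always 2*T
print(jitter)  # → 20.0
```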
5.  IANA Considerations

   This document makes no request of IANA.

6.  Security Considerations

   Security issues are considered in [draft-ietf-detnet-security].
   Further discussion is TBD.

7.  Acknowledgements

   TBD.
8.  Normative References

   [draft-ietf-detnet-architecture]
              "DetNet Architecture", .

   [draft-ietf-detnet-dp-sol]
              "DetNet Data Plane Encapsulation", .

   [draft-ietf-detnet-problem-statement]
              "DetNet Problem Statement", .

   [draft-ietf-detnet-security]
              "DetNet Security Considerations", .

   [draft-ietf-detnet-use-cases]
              "DetNet Use Cases", .

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997, .

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, .

   [scheduled-queues]
              "Scheduled queues, UBS, CQF, and Input Gates", .
Authors' Addresses

   Li Qiang (editor)
   Huawei
   Beijing
   China

   Email: qiangli3@huawei.com

   Xuesong Geng
   Huawei
   Beijing
   China

   Email: gengxuesong@huawei.com

   Bingyang Liu
   Huawei
   Beijing
   China

   Email: liubingyang@huawei.com

   Toerless Eckert (editor)
   Huawei USA - Futurewei Technologies Inc.
   2330 Central Expy
   Santa Clara 95050
   USA

   Email: tte+ietf@cs.fau.de

   Liang Geng
   China Mobile
   Beijing
   China

   Email: gengliang@chinamobile.com