Network working group                                          L. Dunbar
Internet Draft                                                    Huawei
Category: Informational
Expires: November 2019
                                                         August 30, 2018

              Architectural View of E2E Latency and Gaps

          draft-dunbar-e2e-latency-arch-view-and-gaps-02.txt

Abstract

   Ultra-low latency is a highly desired property for many types of
   services, such as 5G MTC (Machine Type Communication), which
   requires the E2E connection for V2V to be less than 2 ms; AR/VR,
   which requires delay of less than 5 ms; and V2X, which requires
   less than 20 ms.

   This draft examines E2E latency from an architectural perspective:
   how different OSI layers contribute to E2E latency, how different
   domains (which can be different operators' domains or
   administrative domains) contribute to E2E latency, and what gaps
   remain in recent technology advancements for reducing latency.

   By studying the contributing factors to E2E latency from various
   angles, the draft identifies some gaps in recent technology
   advancements for E2E services that traverse multiple domains and
   involve multiple layers. The discussion might touch upon multiple
   IETF areas.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at
   https://datatracker.ietf.org/drafts/current/.

Internet-Draft      E2E Over Internet Latency Taxonomy

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on February 23, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. AR/VR Use Case
   4. Contributing Factors to E2E Latency
   5. Application Layer Initiative in reducing E2E latency
      5.1. Content Placement mechanisms need visibility to Network
   6. Transport Layer Initiatives in reducing Latency and gaps
      6.1. TCP Layer Latency Improvement Alone is not enough
      6.2. LTE Latency Impact on TCP Performance
      6.3. Low Latency via Multipath TCP Extension
   7. Network and Link Layer Initiatives in reducing E2E Latency
   8. Radio Channel Quality Impact to flows with High QoS
   9. E2E Latency Contributed by multiple domains
   10. Conclusion
   11. Security Considerations
   12. IANA Considerations
   13. Acknowledgements
   14. References
      14.1. Normative References
      14.2. Informative References
   15. Appendix:
      15.1. Example: multi-Segments Latency for services via
            Cellular Access
      15.2. Latency contributed by multiple nodes
      15.3. Latency through the Data Center that hosts S-GW & P-GW
   Authors' Addresses

1. Introduction

   Ultra-low latency is a highly desired property for many types of
   services, such as 5G MTC (Machine Type Communication), which
   requires the E2E connection for V2V to be less than 2 ms; AR/VR,
   which requires delay of less than 5 ms; and V2X, which requires
   less than 20 ms.

   This draft examines E2E latency from an architectural perspective:
   how different OSI layers contribute to E2E latency, how different
   domains (which can be different operators' domains or
   administrative domains) contribute to E2E latency, and what gaps
   remain in recent technology advancements for reducing latency.

   The primary purpose of studying E2E latency from an architectural
   perspective is to help the IETF community identify potential work
   areas for reducing the E2E latency of services over the Internet.

   In recent years, the Internet industry has been exploring
   technologies and innovations at all layers of the OSI stack to
   reduce latency. At the upper (application) layer, more content is
   being distributed to edges closer to the end points, and more
   progress has been made in Mobile Edge Computing (MEC). At the
   transport layer, there are the QUIC and L4S initiatives. At the
   network layer, there are IP/MPLS Hardened pipe (RFC 7625),
   latency-optimized router design, and BBF's Broadband Assured
   Services (BAS). At the link layer, there are IETF DETNET, IEEE
   802.1 TSN (Time Sensitive Networking), and Flex Ethernet (OIF).

   By studying the contributing factors to E2E latency from various
   angles, the draft identifies some gaps in recent technology
   advancements for E2E services that traverse multiple domains and
   involve multiple layers. The discussion might touch upon multiple
   IETF areas.

2. Terminology

   DA:   Destination Address

   DC:   Data Center

   E2E:  End To End

   GTP:  GPRS Tunneling Protocol (GTP) is a group of IP-based
         communications protocols used to carry general packet
         radio service (GPRS) within GSM, UMTS and LTE networks.
         In 3GPP architectures, GTP can be decomposed into
         separate protocols: GTP-C, GTP-U and GTP'. GTP-C is
         used for signaling. GTP-U is used for carrying user
         data.

   LTE:  Long Term Evolution

   TS:   Tenant System

   VM:   Virtual Machines

   VN:   Virtual Network

3. AR/VR Use Case

   The E2E delay of an AR/VR system is the sum of the delays of
   multiple subsystems:

   - Tracking delay
   - Application delay
   - Rendering delay
   - Display delay

   For human beings not to feel dizzy when viewing AR/VR images, the
   oculus delay should be less than 19.3 ms, which includes display
   delay, computing delay, transport delay, and sensing delay. That
   leaves a "Network Delay" budget of 5 ms at the most.

4. Contributing Factors to E2E Latency

   Internet data is packaged and transported in small pieces
   (packets). The flow of these packets directly affects a user's
   Internet experience. When data packets arrive in a smooth and
   timely manner, the user sees a continuous flow of data; if data
   packets arrive with large and variable delays between them, the
   user's experience is degraded.
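
   The difference between a smooth and a variable packet flow can be
   made concrete with a small sketch. The code below is illustrative
   only: the function names and the two arrival-time traces are
   hypothetical and are not taken from any measurement. It shows two
   flows with the same average gap between packets but very
   different variation, which is the case the paragraph above
   describes as degrading the user's experience.

```python
from statistics import mean, pstdev


def inter_arrival_gaps_ms(arrival_times_ms):
    """Gaps between consecutive packet arrivals, in milliseconds."""
    return [later - earlier
            for earlier, later in zip(arrival_times_ms, arrival_times_ms[1:])]


def gap_summary(arrival_times_ms):
    """Mean, spread, and worst-case gap for one packet trace."""
    gaps = inter_arrival_gaps_ms(arrival_times_ms)
    return {
        "mean_ms": mean(gaps),
        "stdev_ms": pstdev(gaps),
        "max_ms": max(gaps),
    }


# Hypothetical arrival times (ms) for five packets of two flows.
smooth = [0, 20, 40, 60, 80]   # evenly spaced arrivals
bursty = [0, 5, 60, 62, 80]    # same span, uneven spacing

print(gap_summary(smooth))
print(gap_summary(bursty))
```

   Both traces have the same mean gap, so an average alone would not
   distinguish them; the spread and the worst-case gap do.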

   Key contributing factors to E2E latency:

   - Generation: delay between a physical event and the availability
     of data
   - Transmission: signal propagation, initial signal encoding
   - Processing: forwarding, encap/decap, NAT, encryption,
     authentication, compression, error coding, signal translation
   - Multiplexing: delays needed to support sharing; shared channel
     acquisition, output queuing, connection establishment
   - Grouping: reduces the frequency of control information and
     processing; packetization, message aggregation

   The 2013 ISOC Workshop [Latency-ISOC] on Internet Latency
   concluded that:

   o Bandwidth alone is not enough to reduce latency.
   o Bufferbloat is one of the main causes of high latency in the
     Internet.

   Figure 1 of the 2013 ISOC workshop report showed the download
   timing of an apparently uncluttered example Web page
   (ieeexplore.ieee.org), which actually comprised over one hundred
   objects, transferred over 23 connections and needing 10 different
   DNS look-ups. This further shows that reducing E2E latency
   requires coordination and interaction across multiple layers.

5. Application Layer Initiative in reducing E2E latency

   More and more end-to-end services over the Internet run from end
   users/devices to applications hosted in data centers.

   As most content today is distributed, E2E services usually do not
   traverse the globe. More often than not, the network segments
   that an E2E service traverses run from end users to regional data
   centers. The practice of distributing content to the edge has
   transformed the pursuit of low latency from fighting against the
   speed of light to optimizing communication between end users and
   their desired content.
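
   As a rough illustration of the speed-of-light point above, the
   sketch below estimates round-trip propagation delay over fiber
   for a regional and an intercontinental path. The propagation
   speed (about 200 km per millisecond, roughly two thirds of the
   speed of light in vacuum) and the two distances are assumptions
   chosen for illustration; real paths are longer than the
   geographic distance and add queuing and processing delay.

```python
# Assumed propagation speed in optical fiber: roughly 2/3 of c.
FIBER_SPEED_KM_PER_MS = 200.0


def round_trip_propagation_ms(distance_km,
                              speed_km_per_ms=FIBER_SPEED_KM_PER_MS):
    """Round-trip propagation delay only; ignores queuing and processing."""
    return 2.0 * distance_km / speed_km_per_ms


# Hypothetical distances: a regional data center and a distant one.
for label, km in (("regional DC", 100), ("intercontinental DC", 10_000)):
    print(f"{label}: {round_trip_propagation_ms(km):.1f} ms round trip")
```

   Under these assumptions the regional path costs about 1 ms of
   round-trip propagation and the intercontinental path about
   100 ms, so once content sits at the edge the remaining latency
   is dominated by factors other than distance.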

   However, without awareness of the latency characteristics of
   network segments, content distribution mechanisms and algorithms
   might not achieve their intended optimal result.

5.1. Content Placement mechanisms need visibility to Network

   To be added.

6. Transport Layer Initiatives in reducing Latency and gaps

   IETF QUIC and L4S are some of the initiatives for reducing E2E
   latency at the transport layer.

   IETF QUIC focuses on improvements at the end points. It does not
   take into consideration the latency of the network that the data
   packets traverse.

   IETF L4S uses AQM so that network nodes purposely drop packets or
   send an indication to the end points when their queues are above
   certain thresholds. The goal is for the end nodes to reduce their
   transmission rate when the buffers of intermediate nodes are
   almost full. It has the following issue:

      As the network aggregates many flows from many different end
      points and most flows have variable data rates, the buffer of
      an intermediate node's port being almost full at one specific
      time does not mean that the same amount of traffic will
      traverse the same port a few microseconds later. If all end
      (source) points reduce their transmission rate upon receiving
      the AQM indication (or experiencing packet drops), traffic
      through the network can be greatly reduced (i.e. leaving no
      queue in the buffer). Then all end points can increase their
      rate, causing traffic pattern oscillation and buffer
      congestion again.

6.1. TCP Layer Latency Improvement Alone is not enough

   The following example shows why optimizing the transport layer
   alone is not enough. More details can be found at
   https://www.w3.org/Protocols/HTTP/Performance/Pipeline.html.

      Typical web pages today contain a HyperText Markup Language
      (HTML) document and many embedded images. Twenty or more
      embedded images are quite common.
      Each of these images is an independent object in the Web,
      retrieved (or validated for change) separately. The common
      behavior for a web client, therefore, is to fetch the base
      HTML document, and then immediately fetch the embedded
      objects, which are typically located on the same server.

      The large number of embedded objects represents a change from
      the environment in which the Web transfer protocol, the
      Hypertext Transfer Protocol (HTTP), was designed. As a result,
      HTTP/1.0 handles multiple requests from the same server
      inefficiently, creating a separate TCP connection for each
      object.

6.2. LTE Latency Impact on TCP Performance

   HTTP/TCP is the dominant application and transport layer protocol
   suite used on the Internet today. According to HTTP Archive
   (http://httparchive.org/trends.php), the typical size of an
   HTTP-based transaction over the Internet is in the range of a few
   tens of Kbytes up to 1 Mbyte. In this size range, the TCP slow
   start period is a significant part of the total transport period
   of the packet stream.

   During TCP slow start, TCP exponentially increases its congestion
   window, i.e. the number of segments it brings into flight, until
   it fully utilizes the throughput that LTE (Radio + EPC) can
   offer. The incremental increases are driven by TCP ACKs, which
   are received after one round-trip delay in the LTE system. Thus,
   during TCP slow start, performance is limited by the latency of
   the radio network (LTE). Hence, improved latency in LTE can
   improve the perceived data rate for TCP-based data transactions,
   which in turn reduces the time it takes to complete a data
   download or upload.
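
   The dependence of transfer time on round-trip time during slow
   start can be sketched with an idealized model. The code below is
   a simplification under stated assumptions, not a model of any
   particular TCP implementation: it assumes a 1460-byte segment
   payload, an initial congestion window of 10 segments, a window
   that doubles every round trip, no packet loss, and no bandwidth
   cap. The round-trip times of 50 ms and 40 ms are hypothetical
   values chosen only to show the effect.

```python
import math

MSS_BYTES = 1460       # assumed TCP payload per segment
INITIAL_WINDOW = 10    # assumed initial congestion window, in segments


def slow_start_rounds(transfer_bytes, mss=MSS_BYTES,
                      initial_window=INITIAL_WINDOW):
    """Round trips needed to deliver transfer_bytes if cwnd doubles each RTT."""
    segments_left = math.ceil(transfer_bytes / mss)
    cwnd = initial_window
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd   # one full window is sent per round trip
        cwnd *= 2               # exponential growth during slow start
        rounds += 1
    return rounds


def transfer_time_ms(transfer_bytes, rtt_ms):
    """Idealized transfer time: one RTT per slow-start round."""
    return slow_start_rounds(transfer_bytes) * rtt_ms


# Hypothetical RTTs before and after a radio latency improvement.
for size in (100_000, 1_000_000):
    before = transfer_time_ms(size, rtt_ms=50)
    after = transfer_time_ms(size, rtt_ms=40)
    print(f"{size} bytes: {slow_start_rounds(size)} rounds, "
          f"{before} ms -> {after} ms")
```

   In this model every millisecond removed from the round-trip time
   is saved once per slow-start round, so the saving for a transfer
   grows with the number of rounds it needs.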

   Although the improvements that can be achieved in the radio
   round-trip time are rather small (a matter of milliseconds), the
   total increase in perceived throughput and the delay savings when
   downloading an item below 1 MB are significant, because the LTE
   latency improvement is gained again on every round trip of TCP
   slow start [LTE-latency].

6.3. Low Latency via Multipath TCP Extension

   There is some research on how to use multipath TCP to reduce E2E
   latency, such as
   http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7510787. The
   paper proposes an MPTCP extension that sends data redundantly
   over multiple paths in the network, which basically exchanges
   bandwidth for latency. The integration into the MPTCP protocol
   provides benefits such as transparent end-to-end connection
   establishment, multipath-enabled congestion control, and the
   prevention of head-of-line blocking. The research paper claims
   that the proposed Multipath TCP extension can halve the average
   round-trip time and reduce its standard deviation by a factor of
   19 for a real-world mobile scenario in a stressed environment.

   Researchers doing this kind of work should be invited to the
   "Reducing latency over Internet Deep-Dive" workshop or cross-area
   BOF (to be organized by IAB).

7. Network and Link Layer Initiatives in reducing E2E Latency

   Several industry initiatives already exist for improving latency
   at the link and network layers:

   - Link layer: IEEE 802.1 TSN (Time Sensitive Networking), and
     Flex Ethernet (OIF).
   - Network layer: IETF DETNET, IP/MPLS Hardened pipe (RFC 7625).

   Gaps:

   IEEE 802.1 TSN (Time Sensitive Networking) requires stringent
   synchronous timing among all the nodes. This is suitable for a
   network of small scope, but not for the Internet, because most
   routers and switches in the network do not support synchronous
   timing.

   IP/MPLS hardened pipe can guarantee no congestion and no
   buffering on all nodes along the path, and therefore ensures the
   lowest latency along the path. The hardened pipe is ideal for
   flows with a steady bandwidth requirement.

   But for applications that do not have a steady flow size, the
   hardened pipe requires reserving dedicated channels at the peak
   rate, which, like TDM, wastes bandwidth whenever application
   traffic goes below the peak rate.

   Traffic Engineering is one of the most commonly used methods to
   reduce congestion at the network layer. However, it does not
   completely prevent transient congestion. Depending on the tunnel
   sizing, there could be momentary traffic bursts that exceed the
   tunnel size, causing congestion if there is not adequate headroom
   on the trunk carrying the tunnel to absorb the burst. A link or
   node outage that reroutes the tunnel onto a secondary path that
   becomes overloaded could also cause congestion.

8. Radio Channel Quality Impact to flows with High QoS

   QoS is one of the key methods employed by fixed IP networks to
   reduce latency for some flows. However, in a radio network, if a
   UE's channel condition is poor, the eNB may schedule more frames
   to other UEs whose flows are marked with much lower QoS.

   There are many studies showing how radio quality negatively
   impacts TCP performance.

   It would be beneficial to the whole industry to have a workshop
   that brings together people or SDOs working on different layers
   of Internet service to showcase their work or their pain points.

   The IESG can make much more informed decisions on creating useful
   initiatives when the community is aware of other work and
   obstacles.

9. E2E Latency Contributed by multiple domains

   All of the latency improvement initiatives at the link layer,
   such as IETF DETNET, IEEE 802.1 TSN (Time Sensitive Networking),
   and Flex Ethernet (OIF), have been within a single domain. The
   network layer latency improvements, such as IP/MPLS Hardened pipe
   (RFC 7625), are also within a single domain.

   But E2E services usually traverse more than one domain, which can
   be administrative domains or multiple operators' networks.

   Yet today, there is no interface between domains to:

   - Inquire about the latency characteristics or capabilities of
     another domain
   - Negotiate or reserve latency capabilities from another domain
   - Characterize latency in a standardized way

   IETF/IAB is an ideal organization to tackle those issues because
   IETF has the expertise.

10. Conclusion

   As end-to-end services traverse multiple types of network
   segments and domains, and involve multiple layers, it is
   important that the technological improvements in each layer be
   made with better information about the others. In particular:

   - Cross-domain coordination is needed
   - Cross-layer coordination is needed

11. Security Considerations

   As the trend moves toward more encryption, it is getting more
   difficult for the various network segments to detect application
   sessions. Therefore, it is more important to create ways for
   better coordination among different layers, for improved latency,
   troubleshooting, restoration, etc.

12. IANA Considerations

   This section gives IANA allocation and registry considerations.

13. Acknowledgements

   Special thanks to Jari Arkko for encouraging the writing of this
   draft. Many thanks to Andy Malis, Jim Guichard, Spencer Dawkins,
   and Donald Eastlake for suggestions and comments on this draft.

14. References

14.1. Normative References

14.2. Informative References

   [LTE-latency]  https://www.ericsson.com/research-blog/lte/
                  lte-latency-improvement-gains/

   [Latency-ISOC] 2013 ISOC organized Latency over Internet workshop
                  report

15. Appendix:

15.1. Example: multi-Segments Latency for services via Cellular
      Access

   In a cellular network, there are User Plane latency and Control
   Plane latency. The control plane deals with signaling and control
   functions, while the user plane deals with actual user data
   transmission.

   The User Plane latency can be measured as the time it takes for a
   small IP packet to travel from the terminal through the network
   to the Internet server, and back. The Control Plane latency is
   measured as the time required for the UE (User Equipment) to
   transition from idle state to active state.

   User Plane latency is relevant for the performance of many
   applications. This document mainly focuses on the User Plane
   latency. The following diagram depicts a logical path from an end
   user (smart phone) application to the application controller
   hosted in a data center via a 4G mobile network, which utilizes
   the Evolved Packet Core (EPC).

   +------+             +---------+
   |DC    |             |   EPC   |           +----+
   |Apps  |<----------->|P-GW/S-GW|<--------->| eNB|<---> UE
   |      |             +---------+  Mobile   +----+ Radio
   +------+  Internet                Backhaul        Access

   The Mobility Management Entity (MME) is responsible for
   authentication of the mobile device. The MME retains location
   information for each user and selects the Serving Gateway (S-GW)
   for a UE at the initial attach and at the time of an intra-LTE
   handover involving Core Network (CN) node relocation.

   The Serving Gateway (S-GW) resides in the user plane, where it
   forwards and routes packets to and from the eNodeB (eNB) and the
   packet data network gateway (P-GW).
   The S-GW also serves as the local mobility anchor for
   inter-eNodeB handover and mobility between 3GPP networks.

   The P-GW (Packet Data Network Gateway) provides connectivity from
   the UE to external packet data networks by being the point of
   exit and entry of traffic for the UE. A UE may have simultaneous
   connectivity with more than one P-GW for accessing multiple
   Packet Data Networks. The P-GW performs policy enforcement,
   packet filtering for each user, charging support, lawful
   interception and packet screening. Another key role of the P-GW
   is to act as the anchor for mobility between 3GPP and non-3GPP
   technologies such as WiMAX and 3GPP2 (CDMA 1X and EvDO).

   Very often the P-GW and S-GW are co-located. The data traffic
   between the eNB and the S-GW is encapsulated by GTP-U.

   The figure above shows that the end-to-end services from/to a UE
   consist of the following network segments:

   - Radio Access Network (RAN)
   - Mobile Backhaul network that connects the eNB to the S-GW
   - Network within the DC that hosts the S-GW & P-GW
   - Packet Data Network, which can be a dedicated VPN, the
     Internet, or another data network
   - Network within the DC that hosts the App

   The RAN (Radio Access Network) is between the UE (e.g. smart
   phone) and the eNB. 3GPP has a group, TSG RAN, working on
   improving the performance (including latency) of the Radio Access
   Network. There are many factors impacting the latency through the
   RAN.

   The Mobile Backhaul Network connects eNBs to the S-GW/P-GW, with
   data traffic being encapsulated in the GTP protocol. One eNB can
   handle hundreds of UEs, while one S-GW/P-GW can handle millions.
   The mobile backhaul network therefore connects tens of thousands
   of eNBs to the S-GW/P-GW, so the number of network nodes in the
   Mobile Backhaul network can be very large.
   As a result, any new protocol improvement in reducing latency
   there can play a big part in reducing the overall latency for the
   end-to-end services.

15.2. Latency contributed by multiple nodes

   The variation in delay for data packets through the network is
   caused by the network nodes along the path, as the transmission
   delay on a physical link is fixed. When there is no congestion,
   the latency across most routers and switches is very small, on
   the order of ~20us (worst case ~40us). When congestion occurs
   within a node, i.e. with buffers/queues being used to avoid
   dropping packets, latency across a node can be on the order of
   milliseconds. Recent improvements in router architecture have
   greatly improved latency through a node. However, there are no
   standard methods for routers to characterize and expose the
   various latency characteristics of a network node.

   Data packets also traverse network functions, such as FW, DPI,
   and IPS, whose latency varies depending on the depth of the
   processing and the equipment performance.

15.3. Latency through the Data Center that hosts S-GW & P-GW

   The S-GW and P-GW are hosted in a data center. There are
   typically 2~3 tiers of switches connecting the servers that host
   the S-GW & P-GW to the external network, as depicted in the
   following:

                      +---------+
                      | Gateway |
                      +---------+

       \           +-------+         +------+            /
        \        +/------+ |       +/-----+ |           /
         \       | Aggr11| + ----- |AggrN1| +          /
          \      +---+---+/        +------+/          /
           \        /   \            /   \           /
            \      /     \          /     \         /
             \  +---+   +---+    +---+   +---+     /
              \-|T11|...|T1x|    |T21|...|T2y|-----
                +---+   +---+    +---+   +---+
                  |       |        |       |
                +-|-+   +-|-+    +-|-+   +-|-+   Servers
                |   |...|SGW|    | S |   | S |<-
                +---+   +---+    +---+   +---+
                |   |...|PGW|    | S |...| S |
                +---+   +---+    +---+   +---+
                |   |...| S |    | S |...| S |
                +---+   +---+    +---+   +---+

   As the distances within a data center can be small, the
   transmission delay within the data center can be negligible. The
   majority of the latency within the data center is caused by
   switching within the gateway routers, by traffic traversing
   middleboxes such as FW, DPI, IPS, and value added services, and
   by the top-of-rack switches and aggregation switches.

   If the S-GW and P-GW are hosted in a large data center, there
   could be additional latency contributed by
   encapsulation/decapsulation, such as the work specified by NVO3.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5430 Legacy Drive, Suite #175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: linda.dunbar@huawei.com