TSVWG                                                            R. Even
Internet-Draft                                                    Huawei
Intended status: Informational                            March 10, 2019
Expires: September 11, 2019


                        Fast Congestion Response
                 draft-even-fast-congestion-response-00

Abstract

   High link speeds (100 Gb/s) in data centers (DCs) are making network
   transfers complete faster and in fewer RTTs. Short data bursts
   require low latency, while longer data transfers require high
   throughput.
   This document describes the current state of flow control and
   congestion handling in the DC using RoCEv2 and suggests new
   directions for faster congestion control.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 11, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  Problem statement
   4.  Security Considerations
   5.  IANA Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Author's Address

1.  Introduction

   High link speeds (100 Gb/s) in data centers (DCs) are making network
   transfers complete faster and in fewer RTTs. Network traffic in a
   data center is often a mix of short and long flows, where the short
   flows require low latency and the long flows require high
   throughput. Data Center TCP (DCTCP) [RFC8257], an Informational RFC,
   extends Explicit Congestion Notification (ECN) [RFC3168] processing
   to estimate the fraction of bytes that encounter congestion; DCTCP
   then scales the TCP congestion window based on this estimate. DCTCP
   does not change ECN reporting in TCP. Other ECN notification
   mechanisms are specified for RTP in [RFC6679] and for QUIC in
   [I-D.ietf-quic-transport]. ECN notifications are reported from the
   receiver to the sender. In the TCP case the notification includes
   only the occurrence of ECN, while for RTP and QUIC it includes the
   number of ECN-marked packets. What TCP, RTP, and QUIC have in common
   is that the switches in the middle only monitor and report, while
   the analysis and the rate control are done by the data sender.

   In data centers, the InfiniBand Architecture (IBA) offers a rich set
   of I/O services based on an RDMA access method and message-passing
   semantics. RDMA over Converged Ethernet version 2 (RoCEv2) [RoCEv2]
   uses UDP as the transport for RDMA. RoCEv2 Congestion Management
   (RCM) provides the capability to avoid congestion hot spots and
   optimize the throughput of the fabric. RCM relies on link-layer flow
   control, IEEE 802.1Qbb Priority-based Flow Control (PFC), to provide
   a lossless network. RCM uses ECN [RFC3168] to signal congestion to
   the destination.
   The ECN notification is sent back from the receiver to the data
   sender using a RoCEv2 Congestion Notification Packet (CNP), which
   notifies the sender about ECN-marked packets. The rate reduction by
   the sender, as well as the subsequent increase in data injection, is
   left to the implementation.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119] [RFC8174].

3.  Problem statement

   Congestion control using ECN in the DC is done between the receiver
   and the sender. The network measures the traffic and informs the
   receiver about problems using the ECN bits. In the RoCEv2 case, the
   receiver sends a CNP to the sender, and the sender adapts by
   reducing its rate. The sender reduces the rate based on a
   pre-defined policy. The sender also has a policy about when to start
   sending at a higher rate and by how much to increase the traffic. In
   the DC network, where latency and a high transfer rate are
   important, there is a need to define a congestion response mechanism
   that is optimized for the DC network. The behavior of the sender on
   congestion is not specified by RoCEv2.

   This type of congestion management is reactive. High link speeds in
   the DC (100 Gb/s) are making network transfers complete faster and
   in fewer RTTs, so allocating flows their proper rates as quickly as
   possible becomes a priority. Convergence time must become a primary
   metric for congestion control in high-speed networks.

   A proactive direction will provide the sender with more information
   about the congestion, which can be used to optimize the congestion
   response and allows the network to adapt faster to changes in
   traffic conditions. This information should be available to the
   sender quickly enough to allow a fast response (within one RTT or
   less).
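
   As an illustration of the reactive loop described in this section,
   the following Python sketch models a sender that cuts its rate when
   a CNP arrives and recovers when none does. RoCEv2 leaves the
   sender's behavior to the implementation, so the policy used here
   (halving on a CNP, a fixed additive step otherwise), the class and
   function names, the parameter values other than the 100 Gb/s link
   speed, and the simplified switch model (one ECN decision per RTT)
   are all assumptions made for illustration. They are not taken from
   RoCEv2 or from any particular implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReactiveSender:
    """Hypothetical sender-side rate policy driven by CNPs."""

    line_rate_gbps: float = 100.0     # link speed used in this document
    rate_gbps: float = 100.0          # current injection rate
    decrease_factor: float = 0.5      # assumed multiplicative cut per CNP
    increase_step_gbps: float = 5.0   # assumed additive recovery per quiet RTT

    def on_rtt(self, cnp_received: bool) -> float:
        """Apply the policy once per RTT and return the new rate."""
        if cnp_received:
            # Congestion was reported: reduce the rate.
            self.rate_gbps *= self.decrease_factor
        else:
            # No congestion reported: probe upward, capped at line rate.
            self.rate_gbps = min(self.line_rate_gbps,
                                 self.rate_gbps + self.increase_step_gbps)
        return self.rate_gbps


def rtts_to_converge(sender: ReactiveSender,
                     fair_share_gbps: float,
                     tolerance_gbps: float = 5.0,
                     max_rtts: int = 1000) -> Optional[int]:
    """Count RTTs until the rate is within tolerance of the fair share.

    Assumed model: the switch marks ECN whenever the sender exceeds its
    fair share, and the receiver echoes the mark as a CNP that takes
    effect in the same RTT. Returns None if max_rtts is reached first.
    """
    for rtt in range(1, max_rtts + 1):
        cnp_received = sender.rate_gbps > fair_share_gbps
        sender.on_rtt(cnp_received)
        if abs(sender.rate_gbps - fair_share_gbps) <= tolerance_gbps:
            return rtt
    return None
```

   Calling rtts_to_converge(ReactiveSender(), 40.0) counts how many
   RTTs this assumed policy needs to settle near a 40 Gb/s fair share
   when starting from line rate. Under these assumptions the sender
   needs more than one RTT, because it learns only that congestion
   occurred and not what rate it should use.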

   The entity that measures the congestion is the switch in the
   network. Currently it only notifies the receiver about congestion
   (ECN) or may drop packets (the receiver may use IEEE 802.1Qbb to
   provide a lossless network). The receiver NIC informs the sender
   about the ECN marks; the sender then analyzes the congestion and
   decides on and executes an action to address it based on some
   predefined policy.

   The requirement is to allow the network, instead of the endpoints,
   to control the traffic. The proposal is to allow the network to
   analyze the congestion and tell the sender (QPSource in RoCEv2
   terms) how to handle it, signaling at the transport layer directly
   to the data sender. When RoCEv2 is the transport protocol, this can
   be a new Congestion Notification message. This requires a new
   message from the network to the sender (backward notification). The
   proposed solution should only be deployed in an intra-data-center
   environment where both the endpoints and the switching fabric are
   under a single administrative domain.

4.  Security Considerations

   TBD

5.  IANA Considerations

   This document requires no IANA action.

6.  References

6.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RoCEv2]   InfiniBand Trade Association, "Supplement to InfiniBand
              Architecture Specification Volume 1 Release 1.2.2 Annex
              A17: RoCEv2 (IP routable RoCE)".

6.2.  Informative References

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-18 (work
              in progress), January 2019.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D.
              Black, "The Addition of Explicit Congestion Notification
              (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September
              2001.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
              and K. Carlberg, "Explicit Congestion Notification (ECN)
              for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
              2012.

   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
              and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
              Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
              October 2017.

Author's Address

   Roni Even
   Huawei

   Email: roni.even@huawei.com