NWCRG                                                            N. Kuhn
Internet-Draft                                                      CNES
Intended status: Informational                                 E. Lochin
Expires: October 21, 2021                                           ENAC
                                                               F. Michel
                                                               UCLouvain
                                                                M. Welzl
                                                      University of Oslo
                                                          April 19, 2021

               Coding and congestion control in transport
               draft-irtf-nwcrg-coding-and-congestion-08

Abstract

   Forward Erasure Correction (FEC) is a reliability mechanism that is
   distinct and separate from the retransmission logic in reliable
   transfer protocols such as TCP.
   FEC coding can help deal with losses at the end of transfers or with
   networks having non-congestion losses. However, FEC coding
   mechanisms should not hide congestion signals. This memo offers a
   discussion of how FEC coding and congestion control can coexist.
   Another objective is to encourage the research community to also
   consider congestion control aspects when proposing and comparing FEC
   coding solutions in communication systems.

   This document is the product of the Coding for Efficient Network
   Communications Research Group (NWCRG). The scope of the document is
   end-to-end communications: FEC coding for tunnels is out of the
   scope of the document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 21, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
   2. Context
     2.1. Separate channels, separate entities
     2.2. Relation between transport layer and application
          requirements
     2.3. Scope of the document concerning transport multipath and
          multi-streams applications
     2.4. Types of coding
     2.5. Fairness, a policy concern
   3. FEC above the transport
     3.1. Fairness and impact on non-coded flows
     3.2. Congestion control and recovered symbols
     3.3. Interactions between congestion control and coding rates
     3.4. On useless repair symbols
     3.5. On partial ordering
     3.6. On partial reliability
     3.7. On multipath transport
   4. FEC within the transport
     4.1. Fairness and impact on non-coded flows
     4.2. Congestion control and recovered symbols
     4.3. Interactions between congestion control and coding rates
     4.4. On useless repair symbols
     4.5. On partial ordering
     4.6. On partial reliability
     4.7. On transport multipath
   5. FEC below the transport
     5.1. Fairness and impact on non-coded flows
     5.2. Congestion control and recovered symbols
     5.3. Interactions between congestion control and coding rates
     5.4. On useless repair symbols
     5.5. On partial ordering
     5.6. On partial reliability
     5.7. On transport multipath
   6. Research recommendations and questions
     6.1. Activities related to congestion control and coding
     6.2. Open research questions
       6.2.1. Parameter derivation
       6.2.2. New signaling methods and fairness
     6.3. Recommendations and advice for evaluating coding
          mechanisms
   7. Acknowledgements
   8. IANA Considerations
   9. Security Considerations
   10. Informative References
   Authors' Addresses

1. Introduction

   There are cases where deploying FEC coding improves the performance
   of a transmission. As an example, it may take time for a sender to
   detect transfer tail losses (losses that occur at the end of a
   transfer, where, e.g., TCP obtains no more ACKs that would enable it
   to quickly repair the loss via retransmission). Allowing the
   receiver to recover such losses instead of having to rely on a
   retransmission could improve the experience of applications using
   short flows. Another example is a network where non-congestion
   losses are persistent and prevent a sender from exploiting the link
   capacity.

   Coding is a reliability mechanism that is distinct and separate from
   the loss detection of congestion controls. [RFC5681] defines the
   loss-based congestion control of TCP; since FEC coding repairs such
   losses, blindly applying it may easily lead to an implementation that
   also hides a congestion signal from the sender. It is important to
   ensure that such information hiding does not occur.

   FEC coding and congestion control can be seen as two separate
   channels. In practice, implementations may mix the signals that are
   exchanged on these channels. This memo offers a discussion of how
   FEC coding and congestion control coexist. Another objective is to
   encourage the research community also to consider congestion control
   aspects when proposing and comparing FEC coding solutions in
   communication systems. This document does not aim at proposing
   guidelines for characterizing FEC coding solutions.

   We consider an end-to-end unicast data transfer with FEC coding in
   the application (above the transport), within the transport or
   directly below the transport. A typical scenario for the
   considerations in this document is a client browsing the web or
   watching a live video.

   This document represents the collaborative work and consensus of the
   Coding for Efficient Network Communications Research Group (NWCRG);
   it is not an IETF product and is not a standard. The document
   follows the terminology proposed in the taxonomy document [RFC8406].

2. Context

2.1. Separate channels, separate entities

   Figure 1 presents the notations that will be used in this document
   and introduces the Forward Erasure Correction (FEC) and Congestion
   Control (CC) channels. The Forward Erasure Correction channel
   carries repair symbols (from the sender to the receiver) and
   information from the receiver to the sender (e.g.
   signaling which packets have been repaired, loss rate before and/or
   after decoding, etc.). The Congestion Control channel carries
   network packets from a sender to a receiver, and packets signaling
   information about the network (number of packets received vs. lost,
   Explicit Congestion Notification (ECN) marks, etc.) from the receiver
   to the sender. The network packets that are sent by the Congestion
   Control channel may be composed of source packets and/or repair
   symbols.

       SENDER                                 RECEIVER

      +------+                               +------+
      |      | ----- network packets ---->   |      |
      |  CC  |                               |  CC  |
      |      | <--- network information ---  |      |
      +------+                               +------+

      +------+                               +------+
      |      |        source and/or ---->    |      |
      |      | ----- repair symbols ---->    |      |
      | FEC  |                               | FEC  |
      |      | <--- info: repaired symbols --|      |
      +------+                               +------+

                Figure 1: Notations and separate channels

   Inside a host, the CC and FEC entities can be regarded as
   conceptually separate:

        |            ^              |             ^
        | source     | coding       |packets      | sending
        | packets    | rate         |requirements | rate (or
        v            |              v             | window)
       +---------------+source     +-----------------+
       |      FEC      |and/or     |       CC        |
       |               |repair     |                 |network
       |               |symbols    |                 |packets
       +---------------+==>        +-----------------+==>
           ^                           ^
           | signaling about           | network
           | losses and/or             | information
           | repaired symbols

                Figure 2: Separate entities (sender-side)

               |                            |
               | source and/or              | network
               | repair symbols             | packets
               v                            v
       +---------------+           +-----------------+
       |      FEC      |signaling  |       CC        |
       |               |repaired   |                 |network
       |               |symbols    |                 |information
       +---------------+==>        +-----------------+==>

               Figure 3: Separate entities (receiver-side)

   Figure 2 and Figure 3 provide more details than Figure 1.
   Some elements are introduced:

   o  'network information' (input control plane for the transport
      including CC): refers not only to the network information that is
      explicitly signaled from the receiver, but all the information a
      congestion control obtains from a network (e.g., TCP can estimate
      the latency and the available capacity at the bottleneck).

   o  'requirements' (input control plane for the transport including
      CC): refers to application requirements such as upper/lower rate
      bounds, periods of quiescence, or a priority.

   o  'sending rate (or window)' (output control plane for the transport
      including CC): refers to the rate at which a congestion control
      decides to transmit packets based on 'network information'.

   o  'signaling repaired symbols' (input control plane for the FEC):
      refers to the information a FEC sender can obtain from a FEC
      receiver about the performance of the FEC solution as seen by the
      receiver.

   o  'coding rate' (output control plane for the FEC): refers to the
      coding rate that is used by the FEC solution (i.e., the proportion
      of transmitted symbols that carry useful data).

   o  'network packets' (output data plane for the CC): refers to the
      data that is transmitted by a CC sender to a CC receiver. The
      network packets may contain source and/or repair symbols.

   o  'source and/or repair symbols' (data plane for the FEC): refers to
      the data that is transmitted by a FEC sender to a FEC receiver.
      The sender can decide to send source symbols only (meaning that
      the coding rate is 1), repair symbols only (if the solution
      decides not to send the original source symbols) or a mix of both.

   The inputs to FEC (incoming data packets without repair symbols, and
   signaling from the receiver about losses and/or repaired symbols) are
   distinct from the inputs to CC.
   The latter calculates a sending rate or window from network
   information, and it takes the packets to send as input, sometimes
   along with application requirements such as upper/lower rate bounds,
   periods of quiescence, or a priority. It is not clear that the ACK
   signals feeding into a congestion control algorithm are useful to FEC
   in their raw form and, vice versa, information about repaired blocks
   may be quite irrelevant to a CC algorithm.

2.2. Relation between transport layer and application requirements

   The choice of an adequate transport layer may be related to
   application requirements and the services offered by a transport
   protocol [RFC8095]:

   o  The transport layer may provide an unreliable transport service
      (e.g., UDP or DCCP [RFC4340]) or a partially reliable transport
      service (e.g., SCTP with the partial reliability extension
      [RFC3758] or QUIC with the unreliable datagram extension
      [I-D.ietf-quic-datagram]). Depending on the amount of redundancy
      and network conditions, there could be cases where it becomes
      impossible to carry traffic.

   o  The transport layer may implement a retransmission mechanism to
      guarantee the reliability of a data transfer (e.g., TCP).
      Depending on how the FEC and CC functions are scheduled (FEC above
      CC, FEC in CC, FEC below CC), the impact of reliable transport on
      the FEC reliability mechanisms is different.

2.3. Scope of the document concerning transport multipath and
     multi-streams applications

   The application layer may be composed of several streams above FEC
   and transport layer instances. The transport layer may exploit a
   multipath mechanism. The different streams may or may not exploit
   different paths between the sender and the receiver. This section
   describes what is in the scope of this document with regard to
   multi-streams applications and multipath transport protocols.

   The different combinations of multi-stream applications and
   multipath transport are the following: (1) one application layer
   stream as input packets above a combination of FEC and multipath
   (Mpath) transport layers (Figure 4), and (2) multiple application
   layer streams as input packets above a combination of FEC and
   multipath (Mpath) or single path (Spath) transport layers (Figure 5).
   In Figure 4, both stream 1 and stream 2 are considered in the scope
   of the document. In Figure 5, the case with single path transport is
   considered in the scope of this document but the cases with multipath
   transport are not. The case of multiple application level streams
   above multiple transport layer instances is out of the scope of the
   document and not further described.

   +---------------+    +---------------+
   |   Stream 1    |    |   Stream 2    |
   +---------------+    +---------------+

   +---------------+    +---------------+
   |      FEC      |    |Mpath Transport|
   +---------------+    +---------------+

   +---------------+    +-----+   +-----+
   |Mpath Transport|    |Flow1|...|FlowM|
   +---------------+    +-----+   +-----+

   +-----+   +-----+    +-----+   +-----+
   |Flow1|...|FlowM|    | FEC |...| FEC |
   +-----+   +-----+    +-----+   +-----+

       Figure 4: Transport multipath and single stream applications

   +-------+   +-------+ +-------+   +-------+ +-------+   +-------+
   |Stream1|...|StreamM| |Stream1|...|StreamM| |Stream1|...|StreamM|
   +-------+   +-------+ +-------+   +-------+ +-------+   +-------+

   +-------------------+ +-------------------+ +-------------------+
   |                   | |        FEC        | |  Mpath Transport  |
   |        FEC        | +-------------------+ +-------------------+
   |  above/in/below   |
   |  Spath Transport  | +-------------------+ +-------------------+
   |                   | |  Mpath Transport  | |        FEC        |
   +-------------------+ +-------------------+ +-------------------+

   +-------------------+ +-----+       +-----+ +-----+       +-----+
   |        Flow       | |Flow1|  ...  |FlowM| |Flow1|  ...  |FlowM|
   +-------------------+ +-----+       +-----+ +-----+       +-----+

   Figure 5: Transport single path, transport multipath and multi-stream
                               applications

2.4. Types of coding

   [RFC8406] summarizes recommended terminology for Network Coding
   concepts and constructs. In particular, the document identifies the
   following coding types (among many others):

   o  Block Coding: Coding technique where the input Flow must first be
      segmented into a sequence of blocks; FEC encoding and decoding are
      performed independently on a per-block basis.

   o  Sliding Window Coding: General class of coding techniques that
      rely on a sliding encoding window.

   The decoding scheme may not be able to decode all the symbols. The
   chance of decoding the erased packets depends on the size of the
   encoding window, the coding rate and the distribution of erasures in
   the transmission channel. The FEC channel may let the client
   transmit information related to the need for supplementary symbols to
   adapt the level of reliability. Partial and full reliability could
   be envisioned.

   o  Full reliability: The receiver may hold symbols until the decoding
      of source symbols is possible. In particular, if the codec does
      not enable a subset of the system to be inverted, the receiver
      would have to wait for a certain minimum amount of repair packets
      before it can recover all the source symbols.

   o  Partial reliability: The receiver cannot deliver to the upper
      layer source symbols that could not be decoded. For a fixed size
      of encoding window (for Sliding Window Coding) or of blocks (for
      Block Coding) containing the source symbols, increasing the amount
      of repair symbols would increase the chances of recovering the
      erased symbols. However, this would affect memory requirements,
      the cost of the encoding and decoding processes, and the network
      overhead.

2.5. Fairness, a policy concern

   Traffic from or to different end users may share various types of
   bottlenecks. When such a shared bottleneck does not implement some
   form of flow protection, the share of the available capacity between
   single flows can help assess when one flow starves the other.

   As one example, for residential accesses, the data rate can be
   guaranteed for the customer premises equipment, but not necessarily
   for the end user. The quality of service that guarantees fairness
   between the different clients can be seen as a policy concern
   [I-D.briscoe-tsvarea-fair].

   While past efforts have focused on achieving fairness, quantifying
   and limiting harm caused by new algorithms (or algorithms with
   coding) is more practical [BEYONDJAIN]. This document considers
   fairness as the impact of the addition of coded flows on non-coded
   flows when they share the same bottleneck. It is assumed that the
   non-coded flows respond to congestion signals from the network. This
   document does not contribute to the definition of fairness at a wider
   scale.

3. FEC above the transport

        | source                           ^ source
        | packets                          | packets
        v                                  |
       +-------------+                    +-------------+
       |FEC          |           signaling|FEC          |
       |             |            repaired|             |
       |             |             symbols|             |
       |             |                 <==|             |
       +-------------+                    +-------------+
        | source  ^                        ^ source
        | and/or  | sending                | and/or
        | repair  | rate                   | repair
        | symbols | (or window)            | symbols
        v         |                        |
       +-------------+                    +-------------+
       |Transport    |             network|Transport    |
       |(incl. CC)   |         information|             |
       |             |network          <==|             |
       |             |packets             |             |
       +-------------+==>                 +-------------+

           SENDER                            RECEIVER

                    Figure 6: FEC above the transport

   Figure 6 presents an architecture where FEC operates on top of the
   transport.

   The advantage of this approach is that the FEC overhead does not
   contribute to congestion in the network.
   When congestion control is implemented at the transport layer, the
   repair symbols are sent following the congestion window or rate
   determined by the CC mechanism. This can result in improved quality
   of experience for latency-sensitive applications such as VoIP or any
   services that are not fully reliable.

   This approach requires that the transport protocol does not implement
   a fully reliable in-order data transfer service (e.g., TCP). QUIC
   with unreliable datagram extension [I-D.ietf-quic-datagram] is an
   example of a protocol for which this is relevant. In cases where
   QUIC traffic is blocked and a fall-back to TCP is proposed, there is
   a risk of bad interactions between TCP's full reliability and coding
   schemes. For reliable transfers, coding usage does not guarantee
   better performance; instead, it would mainly reduce goodput.

3.1. Fairness and impact on non-coded flows

   The addition of coding within the flow does not influence the
   interaction between coded and non-coded flows. This interaction
   would mainly depend on the congestion controls associated with each
   flow.

3.2. Congestion control and recovered symbols

   The congestion control mechanism receives network packets and may not
   be able to differentiate repair symbols from actual source ones. The
   relevance of adding coding at the application layer is related to the
   needs of the application. For real-time applications using an
   unreliable or partially reliable transport, this approach may reduce
   the number of losses perceived by the application.

3.3. Interactions between congestion control and coding rates

   The coding rate applied at the application layer mainly depends on
   the available rate or congestion window given by the congestion
   control underneath.
   The coding rate could be adapted to avoid adding overhead when the
   minimum required data rate of the application is not provided by the
   congestion control underneath. When the congestion control allows
   sending faster than the application needs, adding coding can reduce
   packet losses and improve the quality of experience (provided that an
   unreliable or partially reliable transport is used).

3.4. On useless repair symbols

   The discussion depends on application needs. The only case where
   adding useless repair symbols does not obviously result in reduced
   goodput is when the application rate is limited (e.g., VoIP traffic).
   In this case, useless repair symbols would only impact the amount of
   data generated in the network. Extra data in the network can,
   however, increase the likelihood of additional delay and/or packet
   loss, which could provoke a congestion control reaction that would
   degrade goodput.

3.5. On partial ordering

   Irrespective of the transport protocol, a FEC mechanism does not
   need to implement a reordering mechanism if the application does not
   need it. However, if the application needs in-order delivery of
   packets, a reordering mechanism at the receiver is required.

3.6. On partial reliability

   The application may require partial reliability. In this case, the
   coding rate of a FEC mechanism could be adapted based on inputs from
   the application and the trade-off between latency and packet loss.
   Partial reliability impacts the type of FEC and type of codec that
   can be used, as discussed in Section 2.4.

3.7. On multipath transport

   Whether the transport protocol exploits multiple paths or not does
   not have an impact on the FEC mechanism.

4. FEC within the transport

        | source                              ^ source
        | packets                             | packets
        v                                     |
       +------------+                        +------------+
       | Transport  |                        | Transport  |
       |            |                        |            |
       | +---+ +--+ |               signaling| +---+ +--+ |
       | |FEC| |CC| |                repaired| |FEC| |CC| |
       | +---+ +--+ |                 symbols| +---+ +--+ |
       |            |                     <==|            |
       |            |network          network|            |
       |            |packets      information|            |
       +------------+ ==>                 <==+------------+

           SENDER                               RECEIVER

                     Figure 7: FEC in the transport

   Figure 7 presents an architecture where FEC operates within the
   transport. The repair symbols are sent within what the congestion
   window or calculated rate allows, such as in [CTCP].

   The advantage of this approach is that it allows a joint optimization
   of CC and FEC. Moreover, the transmission of repair symbols does not
   add congestion in potentially congested networks but helps repair
   lost packets (such as tail losses).

   For reliable transfers, including redundancy reduces goodput for long
   transfers but the amount of repair symbols can be adapted, e.g.,
   depending on the congestion window size. There is a trade-off
   between 1) the capacity that could have carried source packets of
   application data rather than repair symbols, and 2) the benefits
   derived from transmitting repair symbols (e.g., unlocking the receive
   buffer if it is limiting). The coding ratio needs to be carefully
   designed. For small files, sending repair symbols when there is no
   more data to transmit could help to reduce the transfer time.
   Sending repair symbols can avoid the silence period between the
   transmission of the last packet in the send buffer and 1) firing a
   retransmission of lost packets, or 2) the transmission of new
   packets.

4.1. Fairness and impact on non-coded flows

   The addition of coding within the transport may impact the congestion
   control mechanism and hide congestion losses.
   Specific interactions between congestion controls and coding schemes
   can be proposed (see Section 4.2, Section 4.3 and Section 4.4). If
   no specific interaction is introduced, the coding scheme may hide
   congestion losses from the congestion controller and the description
   of Section 5 may apply.

4.2. Congestion control and recovered symbols

   The receiver can differentiate between source packets and repair
   symbols. The receiver may indicate both the number of source packets
   received and the number of repair symbols that were actually useful
   in the recovery process of packets.

4.3. Interactions between congestion control and coding rates

   There is considerable flexibility in the trade-off, inherent to the
   use of coding, between (1) reducing goodput when useless repair
   symbols are transmitted and (2) helping to recover from losses
   earlier than with retransmissions. The receiver may indicate to the
   sender the number of packets that have been received or recovered.
   The sender may use this information to tune the coding ratio. For
   example, coupling an increased transmission rate with an increasing
   or decreasing coding rate could be envisioned. A server may use a
   decreasing coding rate as a probe of the channel capacity and adapt
   the congestion control transmission rate.

4.4. On useless repair symbols

   The sender may exploit the information given by the receiver to
   reduce the number of useless repair symbols and the resulting goodput
   reduction.

4.5. On partial ordering

   The application may require in-order delivery of packets. In this
   case, both FEC and transport layer mechanisms should guarantee that
   packets are delivered in order. If partial ordering is requested by
   the application, both the FEC and transport could relax the
   constraints related to in-order delivery: reordering mechanisms at
   the receiver may not be necessary.

4.6. On partial reliability

   The application may require partial reliability. The reliability
   offered by FEC may be sufficient, with no retransmission required.
   This depends on application needs and the trade-off between latency
   and loss. Partial reliability impacts the type of FEC and type of
   codec that can be used, as discussed in Section 2.4.

4.7. On transport multipath

   The sender may adapt the coding rate of each of the single subpaths,
   whether the congestion control is coupled or not. There is
   considerable flexibility in how the coding rate is tuned depending on
   the characteristics of each subpath.

5. FEC below the transport

        | source                                ^ source
        | packets                               | packets
        v                                       |
       +--------------+                        +--------------+
       |Transport     |                 network|Transport     |
       |(including CC)|             information|              |
       |              |                     <==|              |
       +--------------+                        +--------------+
        | network packets                       ^ network packets
        v                                       |
       +--------------+                        +--------------+
       | FEC          |source                  | FEC          |
       |              |and/or         signaling|              |
       |              |repair          repaired|              |
       |              |symbols          symbols|              |
       |              |==>                  <==|              |
       +--------------+                        +--------------+

            SENDER                                 RECEIVER

                    Figure 8: FEC below the transport

   Figure 8 presents an architecture where FEC is applied end-to-end
   below the transport layer, but above the link layer. Note that it is
   common to apply FEC at the link layer, where it contributes to the
   total capacity that a link exposes to upper layers. This application
   of FEC is out of scope of this document. This includes the use of
   FEC on top of a link layer in scenarios where the link is known by
   configuration. In the scenario considered here, the repair symbols
   are sent on top of what is allowed by the congestion control.

   Including redundancy adds traffic without reducing goodput but incurs
   potential fairness issues.
The effective bit-rate is higher than the 646 CC's computed fair share due to the transmission of repair symbols, 647 and losses are hidden from the transport. This may cause a problem 648 for loss-based congestion detection, but it is not a problem for 649 delay-based congestion detection. 651 The advantage of this approach is that it can result in performance 652 gains when there are persistent transmission losses along the path. 654 The drawback of this approach is that it can induce congestion in 655 already congested networks. The coding ratio needs to be carefully 656 designed. 658 Examples of the solution could be to add a given percentage of the 659 congestion window or rate as supplementary symbols, or to send a 660 fixed amount of repair symbols at a fixed rate. The redundancy flow 661 can be decorrelated from the congestion control that manages source 662 packets: a separate congestion control entity could be introduced to 663 manage the amount of repaired packets to transmit on the FEC channel. 664 The separate congestion control instances could be made to work 665 together while adhering to priorities, as in coupled congestion 666 control for RTP media [RFC8699] in case all traffic can be assumed to 667 take the same path, or otherwise with a multipath congestion window 668 coupling mechanism as in Multipath TCP [RFC6356]. Another 669 possibility would be to exploit a lower than best-effort congestion 670 control [RFC6297] for repair symbols. 672 5.1. Fairness and impact on non-coded flows 674 The coding scheme may hide congestion losses from the congestion 675 controller. There are cases where this can drastically reduce the 676 goodput of non-coded flows. Depending on the congestion control, it 677 may be possible to signal to the congestion control mechanism that 678 there was congestion (loss) even when a packet has been recovered, 679 e.g. using ECN, to reduce the impact on the non-coded flows (see 680 Section 5.2 and [TENTET]). 682 5.2. 
Congestion control and recovered symbols

The congestion control may not be aware of the existence of a coding
scheme underneath it. The congestion control may behave as if no
coding scheme had been introduced. The only way for a coding channel
to indicate that symbols have been lost but recovered is to exploit
existing signaling that is understood by the congestion control
mechanism. An example would be to indicate to a TCP sender that a
packet has been received, yet congestion has occurred, by using ECN
signaling [TENTET].

5.3. Interactions between congestion control and coding rates

The coding rate can be tuned depending on the number of recovered
symbols and the rate at which the sender transmits data. If the
coding scheme is not aware of the congestion control implementation,
it is hard for the coding scheme to apply the relevant coding rate.

5.4. On useless repair symbols

Useless repair symbols only add load on the network without any
actual gain for the coded flow. Using feedback signaling, FEC
mechanisms can measure the ratio between actually used and useless
symbols, and adjust the coding rate.

5.5. On partial ordering

The transport above the FEC channel may support out-of-order delivery
of packets: reordering mechanisms at the receiver may not be
necessary. In cases where the transport requires in-order delivery,
the FEC channel may need to implement a reordering mechanism.
Otherwise, spurious retransmissions may occur at the transport level.

5.6. On partial reliability

The transport or application layer above the FEC channel may require
partial reliability only. In this case, FEC may provide an
unnecessary service if it is not aware of the reliability
requirements. Partial reliability impacts the type of FEC and type
of codec that can be used, as discussed in Section 2.4.

5.7.
On transport multipath

The transport may exploit multiple paths without the FEC channel
being aware of it. This depends on whether FEC is applied to all
subflows or each of the subflows individually. When FEC is applied
to all the flows, there is a risk that the coding rate is
inadequate for the characteristics of the individual paths.

6. Research recommendations and questions

This section provides a short state-of-the-art overview of activities
related to congestion control and coding. The objective is to
identify open research questions and contribute to advice when
evaluating coding mechanisms.

6.1. Activities related to congestion control and coding

We map activities related to congestion control and coding with the
organization presented in this document:

o For the FEC above transport case: [RFC8680].

o For the FEC within transport case:
  [I-D.swett-nwcrg-coding-for-quic], [QUIC-FEC], [RFC5109].

o For the FEC below transport case: [NCTCP],
  [I-D.detchart-nwcrg-tetrys].

6.2. Open research questions

There is a general trade-off, inherent to the use of coding, between
(1) reducing goodput when useless repair symbols are transmitted and
(2) helping to recover from transmission and congestion losses.

6.2.1. Parameter derivation

There is a trade-off related to the amount of redundancy to add, as a
function of the transport layer protocol and application
requirements.

[RFC8095] describes the mechanisms provided by existing IETF
protocols such as TCP, SCTP or RTP. [RFC8406] describes the variety
of coding techniques. The large number of possible combinations makes
deriving optimal parameters very complex. This
depends on application requirements and deployment context.

Appendix C of [RFC8681] describes how to tune the parameters for
a target use case.
However, this discussion does not integrate
congestion-controlled end points.

Research question 1: "Is there a way to dynamically adjust the codec
characteristics depending on the transmission channel, the transport
protocol and application requirements?"

Research question 2: "Should we apply specific per-stream FEC
mechanisms when multiple streams with different reliability needs are
carried?"

6.2.2. New signaling methods and fairness

Recovering lost symbols may hide congestion losses from the
congestion control. Disambiguating acked packets from rebuilt packets
would help the sender adapt its sending rate accordingly. There are
opportunities for introducing interaction between congestion control
and coding schemes to improve the quality of experience while
guaranteeing fairness with other flows.

Some existing solutions already propose to disambiguate acked packets
from rebuilt packets [QUIC-FEC]. New signaling methods and
FEC-recovery-aware congestion controls could be proposed. This would
allow the design of adaptive coding rates.

Research question 3: "Should we quantify the harm that a coded flow
would induce on a non-coded flow? How can this be reduced while
still benefiting from advantages brought by FEC?"

Research question 4: "If transport and FEC senders are collocated
and close to the client, and FEC is applied only on the last mile,
e.g., to ignore losses on a noisy wireless link, would this raise
fairness issues?"

Research question 5: "Should we propose a generic API to allow
dynamic interactions between a transport protocol and a coding
scheme? This should consider existing APIs between application and
transport layers."

6.3. Recommendations and advice for evaluating coding mechanisms

Research Recommendation 1: "From a congestion control point-of-view,
a repaired packet must be considered as a lost packet.
This does not
apply to the usage of FEC on a path that is known to be lossy."

Research Recommendation 2: "New research contributions should be
mapped following the organization of this document (above, below, in
the congestion control) and should consider congestion control
aspects when proposing and comparing FEC coding solutions in
communication systems."

Research Recommendation 3: "When a research work aims at improving
throughput by hiding the packet loss signal from congestion control
(e.g., because the path between the sender and receiver is known to
consist of a noisy wireless link), the authors should 1) discuss the
advantages of using the proposed FEC solution compared to replacing
the congestion control by one that ignores a portion of the
encountered losses, 2) critically discuss the impact of hiding packet
loss from the congestion control mechanism."

7. Acknowledgements

Many thanks to Spencer Dawkins, Dave Oran, Carsten Bormann, Vincent
Roca and Marie-Jose Montpetit for their useful comments that helped
improve the document.

8. IANA Considerations

This memo includes no request to IANA.

9. Security Considerations

FEC and CC schemes can contribute to DoS attacks. This is not
specific to this document.

In case of FEC below the transport, the aggregate rate of source and
repair packets may exceed the rate at which a congestion control
mechanism allows an application to send. This could result in an
application obtaining more than its fair share of the network
capacity.

10. Informative References

[BEYONDJAIN]
   Ware (et al.), R., "Beyond Jain's Fairness Index: Setting
   the Bar For The Deployment of Congestion Control
   Algorithms", HotNets '19 10.1145/3365609.3365855, 2019.

[CTCP]
   Kim (et al.), M., "Network Coded TCP (CTCP)",
   arXiv 1212.2291v3, 2013.
[I-D.briscoe-tsvarea-fair]
   Briscoe, B., "Flow Rate Fairness: Dismantling a Religion",
   draft-briscoe-tsvarea-fair-02 (work in progress), July
   2007.

[I-D.detchart-nwcrg-tetrys]
   Detchart, J., Lochin, E., Lacan, J., and V. Roca, "Tetrys,
   an On-the-Fly Network Coding protocol",
   draft-detchart-nwcrg-tetrys-06 (work in progress), December 2020.

[I-D.ietf-quic-datagram]
   Pauly, T., Kinnear, E., and D. Schinazi, "An Unreliable
   Datagram Extension to QUIC", draft-ietf-quic-datagram-01
   (work in progress), August 2020.

[I-D.swett-nwcrg-coding-for-quic]
   Swett, I., Montpetit, M., Roca, V., and F. Michel, "Coding
   for QUIC", draft-swett-nwcrg-coding-for-quic-04 (work in
   progress), March 2020.

[NCTCP]
   Sundararajan (et al.), J., "Network Coding Meets TCP:
   Theory and Implementation", IEEE
   INFOCOM 10.1109/JPROC.2010.2093850, 2009.

[QUIC-FEC]
   Michel (et al.), F., "QUIC-FEC: Bringing the benefits of
   Forward Erasure Correction to QUIC", IFIP
   Networking 10.23919/IFIPNetworking.2019.8816838, 2019.

[RFC3758]
   Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
   Conrad, "Stream Control Transmission Protocol (SCTP)
   Partial Reliability Extension", RFC 3758,
   DOI 10.17487/RFC3758, May 2004.

[RFC4340]
   Kohler, E., Handley, M., and S. Floyd, "Datagram
   Congestion Control Protocol (DCCP)", RFC 4340,
   DOI 10.17487/RFC4340, March 2006.

[RFC5109]
   Li, A., Ed., "RTP Payload Format for Generic Forward Error
   Correction", RFC 5109, DOI 10.17487/RFC5109, December
   2007.

[RFC5681]
   Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
   Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

[RFC6297]
   Welzl, M. and D. Ros, "A Survey of Lower-than-Best-Effort
   Transport Protocols", RFC 6297, DOI 10.17487/RFC6297, June
   2011.

[RFC6356]
   Raiciu, C., Handley, M., and D.
   Wischik, "Coupled
   Congestion Control for Multipath Transport Protocols",
   RFC 6356, DOI 10.17487/RFC6356, October 2011.

[RFC8095]
   Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind,
   Ed., "Services Provided by IETF Transport Protocols and
   Congestion Control Mechanisms", RFC 8095,
   DOI 10.17487/RFC8095, March 2017.

[RFC8406]
   Adamson, B., Adjih, C., Bilbao, J., Firoiu, V., Fitzek,
   F., Ghanem, S., Lochin, E., Masucci, A., Montpetit, M-J.,
   Pedersen, M., Peralta, G., Roca, V., Ed., Saxena, P., and
   S. Sivakumar, "Taxonomy of Coding Techniques for Efficient
   Network Communications", RFC 8406, DOI 10.17487/RFC8406,
   June 2018.

[RFC8680]
   Roca, V. and A. Begen, "Forward Error Correction (FEC)
   Framework Extension to Sliding Window Codes", RFC 8680,
   DOI 10.17487/RFC8680, January 2020.

[RFC8681]
   Roca, V. and B. Teibi, "Sliding Window Random Linear Code
   (RLC) Forward Erasure Correction (FEC) Schemes for
   FECFRAME", RFC 8681, DOI 10.17487/RFC8681, January 2020.

[RFC8699]
   Islam, S., Welzl, M., and S. Gjessing, "Coupled Congestion
   Control for RTP Media", RFC 8699, DOI 10.17487/RFC8699,
   January 2020.

[TENTET]
   Lochin, E., "On the joint use of TCP and Network Coding",
   NWCRG session IETF 100, 2017.

Authors' Addresses

Nicolas Kuhn
CNES

Email: nicolas.kuhn@cnes.fr

Emmanuel Lochin
ENAC

Email: emmanuel.lochin@enac.fr

Francois Michel
UCLouvain

Email: francois.michel@uclouvain.be

Michael Welzl
University of Oslo

Email: michawe@ifi.uio.no