Internet Engineering Task Force                              Mark Allman
INTERNET DRAFT                              NASA Lewis/Sterling Software
File: draft-ietf-tcpsat-stand-mech-05.txt                     Dan Glover
                                                              NASA Lewis
                                                            August, 1998
                                                 Expires: February, 1999

                 Enhancing TCP Over Satellite Channels
                       using Standard Mechanisms

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as ``work in
   progress.''

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

   The Transmission Control Protocol (TCP) provides reliable delivery
   of data across any network path, including network paths containing
   satellite channels. While TCP works over satellite channels there
   are several IETF standardized mechanisms that enable TCP to more
   effectively utilize the available capacity of the network path.
   This draft outlines some of these TCP mitigations. At this time,
   all mitigations discussed in this draft are IETF standards track
   mechanisms (or are compliant with IETF standards).

1. Introduction

   Satellite channel characteristics have an effect on the way
   transport protocols, such as the Transmission Control Protocol (TCP)
   [Pos81], behave.
   When protocols, such as TCP, perform poorly,
   channel utilization is low. While the performance of a transport
   protocol is important, it is not the only consideration when
   constructing a network containing satellite links. For example,
   data link protocol, application protocol, router buffer size,
   queueing discipline and proxy location are some of the considerations
   that must be taken into account. However, this document focuses on
   improving TCP in the satellite environment and non-TCP
   considerations are left for another document. Finally, there have
   been many satellite mitigations proposed and studied by the research
   community. While these mitigations may prove useful and safe for
   shared networks in the future, this document only considers TCP
   mechanisms which are currently well understood and on the IETF
   standards track (or are compliant with IETF standards).

   This draft is divided up as follows: Section 2 provides a brief
   outline of the characteristics of satellite networks. Section 3
   outlines two non-TCP mechanisms that enable TCP to more effectively
   utilize the available bandwidth. Section 4 outlines the TCP
   mechanisms defined by the IETF that benefit satellite networks.
   Finally, Section 5 provides a summary of what modern TCP
   implementations should include to be considered "satellite
   friendly".

2. Satellite Characteristics

   There is an inherent delay in the delivery of a message over a
   satellite link due to the finite speed of light and the altitude of
   communications satellites.

   Many communications satellites are located at Geostationary Orbit
   (GSO) with an altitude of approximately 36,000 km [Sta94]. At this
   altitude the orbit period is the same as the Earth's rotation
   period. Therefore, each ground station is always able to "see" the
   orbiting satellite at the same position in the sky.
   The propagation
   time for a radio signal to travel twice that distance (corresponding
   to a ground station directly below the satellite) is 239.6
   milliseconds (ms) [Mar78]. For ground stations at the edge of the
   view area of the satellite, the distance traveled is 2 x 41,756 km
   for a total propagation delay of 279.0 ms [Mar78]. These delays are
   for one ground station-to-satellite-to-ground station route (or
   "hop"). Therefore, the propagation delay for a message and the
   corresponding reply (one round-trip time or RTT) could be at least
   558 ms. The RTT is not based solely on satellite propagation time.
   The RTT will be increased by other factors in the network, such as
   the transmission time and propagation time of other links in the
   network path and queueing delay in gateways. Furthermore, the
   satellite propagation delay will be proportionately longer if the
   link includes multiple hops or if intersatellite links are used. As
   satellites become more complex and include on-board processing of
   signals, additional delay may be added.

   Other orbits are possible for use by communications satellites
   including Low Earth Orbit (LEO) and Medium Earth Orbit (MEO)
   [Mar78]. The lower orbits require the use of constellations of
   satellites for constant coverage. In other words, as one satellite
   leaves the ground station's sight, another satellite appears on the
   horizon and the channel is switched to it. The propagation delay to
   a LEO orbit ranges from several milliseconds when communicating with
   a satellite directly overhead, to as much as 80 ms when the
   satellite is on the horizon. These systems are more likely to use
   intersatellite links and have variable path delay depending on
   routing through the network.
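
   The GSO propagation delays quoted earlier in this section can be
   approximated from the distances alone. The following is a minimal
   sketch, not part of the draft: the function names are hypothetical,
   the speed of light is the vacuum value rather than a figure from
   the text, and the uplink and downlink distances are assumed to be
   equal. Under those assumptions the results land within about a
   millisecond of the figures cited from [Mar78]; the small difference
   presumably reflects the rounded 36,000 km altitude.

```python
# Speed of light in vacuum in km/s (a physical constant, not a value
# taken from the draft).
SPEED_OF_LIGHT_KM_S = 299_792.458


def hop_delay_ms(slant_range_km: float) -> float:
    """Delay for one ground station-to-satellite-to-ground station hop.

    Assumes the uplink and downlink distances are equal (hypothetical
    simplification for illustration).
    """
    return 2.0 * slant_range_km / SPEED_OF_LIGHT_KM_S * 1000.0


def min_rtt_ms(slant_range_km: float) -> float:
    """Lower bound on the RTT: one hop for the message, one for the reply."""
    return 2.0 * hop_delay_ms(slant_range_km)


if __name__ == "__main__":
    # 36,000 km and 41,756 km are the distances given in the text.
    for label, km in (("directly below", 36_000), ("edge of view", 41_756)):
        print(f"{label}: hop {hop_delay_ms(km):.1f} ms, "
              f"RTT >= {min_rtt_ms(km):.1f} ms")
```

   Queueing, transmission time and terrestrial links are deliberately
   left out, so the values are lower bounds on what a connection would
   observe.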

   Satellite channels are dominated by two fundamental characteristics,
   as described below:

      NOISE - The strength of a radio signal falls in proportion to
      the square of the distance traveled. For a satellite link the
      distance is large and so the signal becomes weak before reaching
      its destination. This results in a low signal-to-noise ratio.
      Some frequencies are particularly susceptible to atmospheric
      effects such as rain attenuation. For mobile applications,
      satellite channels are especially susceptible to multi-path
      distortion and shadowing (e.g., blockage by buildings). Typical
      bit error rates (BER) for a satellite link today are on the
      order of 1 error per 10 million bits (1 x 10^-7) or less
      frequent. Advanced error control coding (e.g., Reed Solomon)
      can be added to existing satellite services and is currently
      being used by many services. Satellite error performance
      approaching fiber will become more common as advanced error
      control coding is used in new systems. However, many legacy
      satellite systems will continue to exhibit higher BER than newer
      satellite systems and terrestrial channels.

      BANDWIDTH - The radio spectrum is a limited natural resource,
      hence there is a restricted amount of bandwidth available to
      satellite systems which is typically controlled by licenses.
      This scarcity makes it difficult to trade bandwidth to solve
      other design problems. Typical carrier frequencies for current,
      point-to-point, commercial, satellite services are 6 GHz
      (uplink) and 4 GHz (downlink), also known as C band, and 14/12
      GHz (Ku band). A new service at 30/20 GHz (Ka band) will be
      emerging over the next few years. Satellite-based radio
      repeaters are known as transponders. Traditional C band
      transponder bandwidth is typically 36 MHz to accommodate one
      color television channel (or 1200 voice channels). Ku band
      transponders are typically around 50 MHz.
      Furthermore, one
      satellite may carry a few dozen transponders.

   Not only is bandwidth limited by nature, but the allocations for
   commercial communications are limited by international agreements so
   that this scarce resource can be used fairly by many different
   applications.

   Although satellites have certain disadvantages when compared to
   fiber channels, they also have certain advantages over terrestrial
   links. First, satellites have a natural broadcast capability. This
   gives satellites a natural advantage for multicast applications.
   Next, satellites can reach geographically remote areas or countries
   that have little terrestrial infrastructure. A related advantage is
   the ability of satellite links to reach mobile users.

   Satellite channels have several characteristics that differ from
   most terrestrial channels. These characteristics can degrade the
   performance of TCP. These characteristics include:

   Long feedback loop

      Due to the propagation delay of some satellite channels (e.g.,
      approximately 250 ms over a geosynchronous satellite) it may
      take a long time for a TCP sender to determine whether or not a
      packet has been successfully received at the final destination.
      This delay hurts interactive applications such as telnet, as
      well as some of the TCP congestion control algorithms (see
      section 4).

   Large delay*bandwidth product

      The delay*bandwidth product (DBP) defines the amount of data a
      protocol should have "in flight" (data that has been
      transmitted, but not yet acknowledged) at any one time to fully
      utilize the available channel capacity. The delay used in this
      equation is the RTT and the bandwidth is the capacity of the
      bottleneck link in the network path. Because the delay in some
      satellite environments is large, TCP will need to keep a large
      amount of data "in flight".

   Transmission errors

      Satellite channels exhibit a higher bit-error rate (BER) than
      typical terrestrial networks. TCP uses all packet drops as
      signals of network congestion and reduces its window size in an
      attempt to alleviate the congestion. In the absence of
      knowledge about why a packet was dropped (congestion or
      corruption), TCP must assume the drop was due to network
      congestion to avoid congestion collapse [Jac88] [FF98].
      Therefore, packets dropped due to corruption cause TCP to reduce
      the size of its sliding window, even though these packet drops
      do not signal congestion in the network.

   Asymmetric use

      Due to the expense of the equipment used to send data to
      satellites, asymmetric satellite networks are often constructed.
      For example, a host connected to a satellite network will send
      all outgoing traffic over a slow terrestrial link (such as a
      dialup modem channel) and receive incoming traffic via the
      satellite channel. Another common situation arises when both
      the incoming and outgoing traffic are sent using a satellite
      link, but the uplink has less available capacity than the
      downlink due to the expense of the transmitter required to
      provide a high bandwidth back channel. This asymmetry can have
      an impact on TCP performance.

   Variable Round Trip Times

      In some satellite environments, such as low-Earth orbit (LEO)
      constellations, the propagation delay to and from the satellite
      varies over time. This can have a negative impact on TCP's
      ability to accurately set retransmission timeouts and determine
      the appropriate window size.

   Intermittent connectivity

      In non-GSO satellite orbit configurations, TCP connections must
      be transferred from one satellite to another or from one ground
      station to another from time to time. This handoff can cause
      packet loss.
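
   The delay*bandwidth product described in the list above can be
   made concrete with a short calculation. This is a minimal sketch
   with hypothetical function names. The example figures (an RTT of
   560 ms and a T1 rate of approximately 192,000 bytes/second) are
   the ones this document uses in section 4.2; the 1460-byte segment
   size is an assumption chosen only for illustration.

```python
import math


def delay_bandwidth_product(rtt_ms: float, bottleneck_bytes_per_s: float) -> float:
    """Bytes that must be in flight to keep the bottleneck link busy."""
    return rtt_ms * bottleneck_bytes_per_s / 1000.0


def segments_in_flight(dbp_bytes: float, segment_bytes: int) -> int:
    """Whole segments needed to cover the delay*bandwidth product."""
    return math.ceil(dbp_bytes / segment_bytes)


if __name__ == "__main__":
    dbp = delay_bandwidth_product(rtt_ms=560, bottleneck_bytes_per_s=192_000)
    # 1460 bytes per segment is an assumed value, not taken from the draft.
    print(f"DBP: {dbp:.0f} bytes, "
          f"about {segments_in_flight(dbp, 1460)} segments of 1460 bytes")
```

   The result exceeds 65,535 bytes, which is why the window size
   becomes a limiting factor on such a path.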

   Most satellite channels only exhibit a subset of the above
   characteristics. Furthermore, satellite networks are not the only
   environments where the above characteristics are found. However,
   satellite networks do tend to exhibit more of the above problems or
   the above problems are aggravated in the satellite environment. The
   mechanisms outlined in this document should benefit most networks,
   especially those with one or more of the above characteristics.

3. Lower Level Mitigations

   It is recommended that those utilizing satellite channels in their
   networks use the following two non-TCP mechanisms, which can
   increase TCP performance. These mechanisms are Path MTU Discovery
   and forward error correction (FEC) and are outlined in the following
   two sections.

   The data link layer protocol employed over a satellite channel can
   have a large impact on performance of higher layer protocols. While
   beyond the scope of this document, those constructing satellite
   networks should tune these protocols in an appropriate manner to
   ensure that the data link protocol does not limit TCP performance.
   In particular, data link layer protocols often implement a flow
   control window and retransmission mechanisms. When the link level
   window size is too small, performance will suffer just as when the
   TCP window size is too small (see section 4.3 for a discussion of
   appropriate window sizes). The impact that link level
   retransmissions have on TCP transfers is not currently well
   understood. The interaction between TCP retransmissions and link
   level retransmissions is a subject for further research.

3.1 Path MTU Discovery

   Path MTU discovery [MD90] is used to determine the maximum packet
   size a connection can use on a given network path without being
   subjected to IP packet fragmentation.
   The sender transmits a packet
   that is the appropriate size for the local network to which it is
   connected (e.g., 1500 bytes on an Ethernet) and sets the IP "don't
   fragment" (DF) bit. If the packet is too large to be forwarded
   without being fragmented to a given channel along the network path,
   the gateway that would normally fragment the packet and forward the
   fragments will return an ICMP message to the originator of the
   packet. The ICMP message will indicate that the original segment
   could not be transmitted without being fragmented and will also
   contain the size of the largest packet that can be forwarded by the
   gateway. Additional information from the IESG on Path MTU discovery
   is available in [Kno93].

   Path MTU Discovery allows TCP to use the largest possible packet
   size, without incurring the cost of fragmentation and reassembly.
   Large packets reduce the packet overhead by sending more data bytes
   per overhead byte. As outlined in section 4, increasing TCP's
   congestion window is segment based, rather than byte based and
   therefore, larger segments enable TCP senders to increase the
   congestion window more rapidly than smaller segments.

   The disadvantage of Path MTU Discovery is that it may cause a long
   pause before TCP is able to start sending data. For example, assume
   a packet is sent with the DF bit set and one of the intervening
   gateways (G1) returns an ICMP message indicating that it cannot
   forward the segment. At this point, the sending host reduces the
   packet size per the ICMP message returned by G1 and sends another
   packet with the DF bit set. The packet will be forwarded by G1,
   however this does not ensure all subsequent gateways in the network
   path will be able to forward the segment. If a second gateway (G2)
   cannot forward the segment it will return an ICMP message to the
   transmitting host and the process will be repeated.
   Therefore, path
   MTU discovery can waste a large amount of time determining the
   maximum allowable packet size on the network path between the sender
   and receiver. Satellite delays can aggravate this problem (consider
   the case when the channel between G1 and G2 is a satellite link).
   However, in practice, Path MTU Discovery does not consume a large
   amount of time due to wide support of common MTU values.
   Additionally, caching MTU values may be able to eliminate discovery
   time in many instances.

   The relationship between BER and segment size is likely to vary
   depending on the error characteristics of the given channel. This
   relationship deserves further study, however with the use of good
   forward error correction (see section 3.2) larger segments should
   provide better performance in most cases due to the reduction in
   header overhead. While the exact method for choosing the best MTU
   for a satellite link is outside the scope of this document, the use
   of Path MTU Discovery is recommended to allow TCP to use the largest
   possible MTU over the satellite channel.

3.2 Forward Error Correction

   A loss event in TCP is always interpreted as an indication of
   congestion and always causes TCP to reduce its window size. Since
   window growth is based on returning acknowledgments (see section 4),
   TCP spends a long time recovering from loss when operating in
   satellite networks. When packet loss is due to corruption, rather
   than congestion, TCP does not need to reduce its window size.
   However, at the present time detecting corruption loss is a research
   issue.

   Therefore, for TCP to operate efficiently, the channel
   characteristics should be such that nearly all loss is due to
   network congestion. Forward error correction coding (FEC) should
   be used on a satellite link to improve the bit-error rate (BER) of
   the satellite channel.
   Reducing the BER is not always
   possible in satellite environments. However, since TCP takes a long
   time to recover from lost packets because the long propagation delay
   imposed by a satellite link delays feedback from the receiver
   [PS97], the link should be made as clean as possible to prevent TCP
   connections from receiving false congestion signals.

   FEC should not be expected to fix all problems associated with noisy
   satellite links. There are some situations where FEC cannot be
   expected to solve the noise problem (such as military jamming, deep
   space missions, noise caused by rain fade, etc.). In addition, link
   outages can also cause problems in satellite systems that do not
   occur as frequently in terrestrial networks. Finally, FEC is not
   without cost. FEC requires additional hardware and uses some of the
   available bandwidth. It can add delay and timing jitter due to the
   processing time of the coder/decoder.

   Further research is needed into mechanisms that allow TCP to
   differentiate between congestion induced drops and those caused by
   corruption. Such a mechanism would allow TCP to respond to
   congestion in an appropriate manner, as well as repairing corruption
   induced loss without reducing the transmission rate. However, in
   the absence of such a mechanism packet loss must be assumed to
   indicate congestion to preserve network stability. Incorrectly
   interpreting loss as caused by corruption and not reducing the
   transmission rate accordingly can lead to congestive collapse
   [Jac88] [FF98].

4. Standard TCP Mechanisms

   This section includes an outline of the mechanisms that may be
   necessary in satellite or hybrid satellite/terrestrial networks to
   better utilize the available capacity of the link. These mechanisms
   may also be needed to fully utilize fast terrestrial channels.
   Furthermore, these mechanisms do not fundamentally hurt performance
   in a shared terrestrial network. Each of the following sections
   outlines one mechanism and why that mechanism may be needed.

4.1 Congestion Control

   To avoid generating an inappropriate amount of network traffic for
   the current network conditions, during a connection TCP employs four
   congestion control mechanisms [Jac88] [Jac90] [Ste97]. These
   algorithms are slow start, congestion avoidance, fast retransmit and
   fast recovery. These algorithms are used to adjust the amount of
   unacknowledged data that can be injected into the network and to
   retransmit segments dropped by the network.

   TCP uses two variables to accomplish congestion control. The first
   variable is the congestion window (cwnd). This is an upper bound on
   the amount of data the sender can inject into the network before
   receiving an acknowledgment (ACK). The value of cwnd is limited to
   the receiver's advertised window. The congestion window is
   increased or decreased during the transfer based on the inferred
   amount of congestion present in the network. The second variable is
   the slow start threshold (ssthresh). This variable determines which
   algorithm is being used to increase the value of cwnd. If cwnd is
   less than ssthresh the slow start algorithm is used to increase the
   value of cwnd. However, if cwnd is greater than or equal to (or
   just greater than in some TCP implementations) ssthresh the
   congestion avoidance algorithm is used. The initial value of
   ssthresh is the receiver's advertised window size. Furthermore, the
   value of ssthresh is set when congestion is detected.

   The four congestion control algorithms are outlined below, followed
   by a brief discussion of the impact of satellite environments on
   these algorithms.
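
   The role of the two variables described above can be sketched as
   follows. This is an illustrative sketch only: the class and method
   names are hypothetical, all quantities are counted in segments for
   simplicity, and the "greater than or equal to" comparison is used
   (the text notes that some implementations use "greater than").

```python
from dataclasses import dataclass


@dataclass
class CongestionState:
    """Sender-side congestion control variables, in segments (assumed unit)."""
    cwnd: int      # congestion window
    ssthresh: int  # slow start threshold
    rwnd: int      # receiver's advertised window

    def usable_window(self) -> int:
        # cwnd is limited to the receiver's advertised window.
        return min(self.cwnd, self.rwnd)

    def algorithm(self) -> str:
        # Slow start below ssthresh, congestion avoidance otherwise.
        return "slow start" if self.cwnd < self.ssthresh else "congestion avoidance"


if __name__ == "__main__":
    # ssthresh starts at the receiver's advertised window (44 is arbitrary).
    state = CongestionState(cwnd=1, ssthresh=44, rwnd=44)
    print(state.algorithm(), state.usable_window())
```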

4.1.1 Slow Start and Congestion Avoidance

   When a host begins sending data on a TCP connection the host has no
   knowledge of the current state of the network between itself and the
   data receiver. In order to avoid transmitting an inappropriately
   large burst of traffic, the data sender is required to use the slow
   start algorithm at the beginning of a transfer [Jac88] [Bra89]
   [Ste97]. Slow start begins by initializing cwnd to 1 segment
   (although a proposal in the IETF pipeline would increase the size of
   the initial window to roughly 4 Kbytes [AFP98]). This forces TCP to
   transmit one segment and wait for the corresponding ACK. For each
   ACK that is received, the value of cwnd is increased by 1 segment.
   For example, after the first ACK is received cwnd will be 2 segments
   and the sender will be allowed to transmit 2 data packets. This
   continues until cwnd meets or exceeds ssthresh (or, in some
   implementations, only when cwnd exceeds ssthresh), or loss is
   detected.

   When the value of cwnd is greater than or equal to (or just greater
   than in some TCP implementations) ssthresh the congestion avoidance
   algorithm is used to increase cwnd [Jac88] [Bra89] [Ste97]. This
   algorithm increases the size of cwnd more slowly than does slow
   start. Congestion avoidance is used to probe the network for any
   additional capacity. During congestion avoidance, cwnd is increased
   by 1/cwnd for each incoming ACK. Therefore, if one ACK is received
   for every data segment, cwnd will increase by roughly 1 segment per
   round-trip time (RTT).

   The slow start and congestion avoidance algorithms can force poor
   utilization of the available channel bandwidth when using long-delay
   satellite networks [All97]. For example, transmission begins with
   the transmission of one segment. After the first segment is
   transmitted the data sender is forced to wait for the corresponding
   ACK.
   When using a GSO satellite this leads to an idle time of
   roughly 500 ms when no useful work is being accomplished.
   Therefore, slow start takes more real time over GSO satellites than
   on typical terrestrial channels. This holds for congestion
   avoidance, as well [All97]. This is precisely why Path MTU
   Discovery is an important algorithm. While the number of segments
   we transmit is determined by the congestion control algorithms, the
   size of these segments is not. Therefore, using larger packets will
   enable TCP to send more data per segment which yields better channel
   utilization.

4.1.2 Fast Retransmit and Fast Recovery

   TCP's default mechanism to detect dropped segments is a timeout
   [Pos81]. In other words, if the sender does not receive an ACK for
   a given packet within the expected amount of time the segment will
   be retransmitted. The retransmission timeout (RTO) is based on
   observations of the RTT. In addition to retransmitting a segment
   when the RTO expires, TCP also uses the lost segment as an
   indication of congestion in the network. In response to the
   congestion, the value of ssthresh is set to half of the cwnd and the
   value of cwnd is then reduced to 1 segment. This triggers the use
   of the slow start algorithm to increase cwnd until the value of cwnd
   reaches half of its value when congestion was detected. After the
   slow start phase, the congestion avoidance algorithm is used to
   probe the network for additional capacity.

   TCP ACKs always acknowledge the highest in-order segment that has
   arrived. Therefore an ACK for segment X also effectively ACKs all
   segments < X. Furthermore, if a segment arrives out-of-order the
   ACK triggered will be for the highest in-order segment, rather than
   the segment that just arrived. For example, assume segment 11 has
   been dropped somewhere in the network and segment 12 arrives at the
   receiver.
   The receiver is going to send a duplicate ACK covering
   segment 10 (and all previous segments).

   The fast retransmit algorithm uses these duplicate ACKs to detect
   lost segments. If 3 duplicate ACKs arrive at the data originator,
   TCP assumes that a segment has been lost and retransmits the missing
   segment without waiting for the RTO to expire. After a segment is
   resent using fast retransmit, the fast recovery algorithm is used to
   adjust the congestion window. First, the value of ssthresh is set
   to half of the value of cwnd. Next, the value of cwnd is halved.
   Finally, the value of cwnd is artificially increased by 1 segment
   for each duplicate ACK that has arrived. The artificial inflation
   can be done because each duplicate ACK represents 1 segment that has
   left the network. When the cwnd permits, TCP is able to transmit
   new data. This allows TCP to keep data flowing through the network
   at half the rate it was when loss was detected. When an ACK for the
   retransmitted packet arrives, the value of cwnd is reduced back to
   ssthresh (half the value of cwnd when the congestion was detected).

   Fast retransmit can resend only one segment per window of data sent.
   When multiple segments are lost in a given window of data, one of
   the segments will be resent using fast retransmit and the rest of
   the dropped segments must usually wait for the RTO to expire, which
   causes TCP to revert to slow start.

   TCP's response to congestion differs based on the way the congestion
   was detected. If the retransmission timer causes a packet to be
   resent, TCP drops ssthresh to half the current cwnd and reduces the
   value of cwnd to 1 segment (thus triggering slow start). However,
   if a segment is resent via fast retransmit both ssthresh and cwnd
   are set to half the current value of cwnd and congestion avoidance
   is used to send new data.
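
   The two responses just described can be placed side by side in a
   short sketch. The names below are hypothetical and the windows are
   counted in segments for simplicity. The sketch shows only the
   values reached after recovery completes and omits the temporary
   inflation of cwnd by duplicate ACKs during fast recovery.

```python
from typing import NamedTuple


class LossResponse(NamedTuple):
    ssthresh: float  # new slow start threshold, in segments
    cwnd: float      # new congestion window, in segments
    algorithm: str   # algorithm used to send new data afterwards


def on_retransmission_timeout(cwnd: float) -> LossResponse:
    """Loss detected by RTO expiry: halve ssthresh, restart from 1 segment."""
    return LossResponse(ssthresh=cwnd / 2, cwnd=1, algorithm="slow start")


def on_fast_retransmit(cwnd: float) -> LossResponse:
    """Loss detected by duplicate ACKs: ssthresh and cwnd both become cwnd/2."""
    half = cwnd / 2
    return LossResponse(ssthresh=half, cwnd=half,
                        algorithm="congestion avoidance")


if __name__ == "__main__":
    # 32 segments is an arbitrary example window.
    print(on_retransmission_timeout(32))
    print(on_fast_retransmit(32))
```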
The difference is that when retransmitting due to duplicate ACKs, TCP
knows that packets are still flowing through the network and can
therefore infer that the congestion is not severe. However, when
resending a packet due to the expiration of the retransmission timer,
TCP cannot infer anything about the state of the network and
therefore must proceed conservatively by sending new data using the
slow start algorithm.

Note that the fast retransmit/fast recovery algorithms, as discussed
above, can lead to a phenomenon that allows multiple fast retransmits
per window of data [Flo94]. This can reduce the size of the
congestion window multiple times for a single "loss event". The
problem is particularly noticeable in connections that utilize large
congestion windows, since these connections are able to inject enough
new segments into the network during recovery to trigger the multiple
fast retransmits. Reducing cwnd multiple times for a single loss
event has been shown to hurt performance [GJKFV98].

The best way to improve the fast retransmit/fast recovery algorithms
is to use a selective acknowledgment (SACK) based algorithm for loss
recovery. As discussed below, these algorithms are generally able to
quickly recover from multiple lost segments without needlessly
reducing the value of cwnd. However, if SACK is not available for a
particular connection, a simple way to enhance the fast
retransmit/fast recovery algorithms is to allow successive fast
retransmits to occur, but not to reduce the congestion window more
than once per window of data transmitted.

4.1.3 Congestion Control in Satellite Environment

The above algorithms have a negative impact on the performance of
individual TCP connections because the algorithms slowly probe the
network for additional capacity, which in turn wastes bandwidth.
This is especially true over long-delay satellite channels because of
the large amount of time required for the sender to obtain feedback
from the receiver [All97] [AHKO97]. However, the algorithms are
necessary to prevent congestive collapse in a shared network [Jac88].
Therefore, the negative impact on a given connection is more than
offset by the benefit to the entire network.

4.2 Large TCP Windows

The standard TCP window size (65,535 bytes) is not adequate to allow
a single TCP connection to utilize the entire bandwidth available on
some satellite channels. TCP throughput is limited by the following
formula [Pos81]:

     throughput = window size / RTT

Therefore, using the maximum window size of 65,535 bytes and a
geosynchronous satellite channel RTT of 560 ms [Kru95], the maximum
throughput is limited to:

     throughput = 65,535 bytes / 560 ms = 117,027 bytes/second

Therefore, a single standard TCP connection cannot fully utilize, for
example, T1 rate (approximately 192,000 bytes/second) GSO satellite
channels. However, TCP has been extended to support larger windows
[JBB92]. The window scaling options outlined in [JBB92] should be
used in satellite environments, as well as the companion algorithms
PAWS (Protection Against Wrapped Sequence space) and RTTM (Round-Trip
Time Measurements).

It should be noted that for a satellite link shared among many flows,
large windows may not be necessary. For instance, two long-lived TCP
connections each using a window of 65,535 bytes, as in the above
example, can fully utilize a T1 GSO satellite channel.

Using large windows often requires applications or TCP stacks to be
hand tuned (usually by an expert). Research into operating system
mechanisms that are able to adjust the buffer capacity as needed is
currently underway [SMM98].
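
The window-limited throughput arithmetic used earlier in this section
can be checked with a short Python sketch. It is only an
illustration: the function and constant names are hypothetical, and
the window needed to fill the T1 channel is derived here by
rearranging the same formula rather than taken from the text.

```python
def window_limited_throughput(window_bytes: int, rtt_ms: int) -> float:
    """Maximum throughput in bytes/second: window size / RTT."""
    return window_bytes * 1000 / rtt_ms


def window_needed(rate_bytes_per_second: int, rtt_ms: int) -> float:
    """Window in bytes needed to sustain a given rate over a given RTT."""
    return rate_bytes_per_second * rtt_ms / 1000


MAX_UNSCALED_WINDOW = 65_535   # bytes, standard TCP window
GSO_RTT_MS = 560               # geosynchronous RTT used in the text
T1_RATE = 192_000              # bytes/second, approximate T1 rate

if __name__ == "__main__":
    single = window_limited_throughput(MAX_UNSCALED_WINDOW, GSO_RTT_MS)
    print(round(single))                     # 117027
    print(single < T1_RATE)                  # True: one connection falls short
    print(2 * single >= T1_RATE)             # True: two connections suffice
    print(window_needed(T1_RATE, GSO_RTT_MS))  # 107520.0
```

Under these figures a single connection would need a window of
roughly 107,520 bytes to fill the T1 channel, which is larger than
the 65,535 byte limit and is why window scaling is needed.
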
Such automatic tuning will allow stock TCP implementations and
applications to better utilize the capacity provided by the
underlying network.

4.3 Acknowledgment Strategies

There are two standard methods that can be used by TCP receivers to
generate acknowledgments. The method outlined in [Pos81] generates
an ACK for each incoming segment. [Bra89] states that hosts SHOULD
use "delayed acknowledgments". Using this algorithm, an ACK is
generated for every second full-sized segment, or if a second
full-sized segment does not arrive within a given timeout (which must
not exceed 500 ms). The congestion window is increased based on the
number of incoming ACKs, and delayed ACKs reduce the number of ACKs
being sent by the receiver. Therefore, cwnd growth occurs much more
slowly when using delayed ACKs compared to the case when the receiver
ACKs each incoming segment.

A tempting "fix" to the problem caused by delayed ACKs is to simply
turn the mechanism off and let the receiver ACK each incoming
segment. However, this is not recommended. First, [Bra89] says
that a TCP receiver SHOULD generate delayed ACKs. Second, increasing
the number of ACKs by a factor of two in a shared network may have
consequences that are not yet understood. Therefore, disabling
delayed ACKs is still a research issue and, at this time, TCP
receivers should continue to generate delayed ACKs, per [Bra89].

4.4 Selective Acknowledgments

Selective acknowledgments (SACKs) [MMFR96] allow TCP receivers to
inform TCP senders exactly which packets have arrived. SACKs allow
TCP to recover more quickly from lost segments, as well as avoiding
needless retransmissions.

The fast retransmit algorithm can generally only repair one loss per
window of data. When multiple losses occur, the sender generally must
rely on a timeout to determine which segment needs to be
retransmitted next.
While waiting for a timeout, the data segments and their
acknowledgments drain from the network. In the absence of incoming
ACKs to clock new segments into the network, the sender must use the
slow start algorithm to restart transmission. As discussed above, the
slow start algorithm can be time consuming over satellite channels.
When SACKs are employed, the sender is generally able to determine
which segments need to be retransmitted in the first RTT following
loss detection. This allows the sender to continue to transmit
segments (retransmissions and new segments, if appropriate) at an
appropriate rate and therefore sustain the ACK clock. This avoids a
costly slow start period following multiple lost segments. Generally
SACK is able to retransmit all dropped segments within the first RTT
following the loss detection. [MM96] and [FF96] discuss specific
congestion control algorithms that rely on SACK information to
determine which segments need to be retransmitted and when it is
appropriate to transmit those segments. Both these algorithms follow
the basic principles of congestion control outlined in [Jac88] and
reduce the window by half when congestion is detected.

5. Mitigation Summary

Table 1 summarizes the mechanisms that have been discussed in this
document. Those mechanisms denoted "Recommended" are IETF standards
track mechanisms that are recommended by the authors for use in
networks containing satellite channels. Those mechanisms marked
"Required" have been defined by the IETF as required for hosts using
the shared Internet [Bra89]. Along with the section of this document
containing the discussion of each mechanism, we note where the
mechanism needs to be implemented. The codes listed in the last
column are defined as follows: "S" for the data sender, "R" for the
data receiver and "L" for the satellite link.

  Mechanism                Use           Section      Where
+------------------------+-------------+------------+--------+
| Path-MTU Discovery     | Recommended | 3.1        | S      |
| FEC                    | Recommended | 3.2        | L      |
| TCP Congestion Control |             |            |        |
|   Slow Start           | Required    | 4.1.1      | S      |
|   Congestion Avoidance | Required    | 4.1.1      | S      |
|   Fast Retransmit      | Recommended | 4.1.2      | S      |
|   Fast Recovery        | Recommended | 4.1.2      | S      |
| TCP Large Windows      |             |            |        |
|   Window Scaling       | Recommended | 4.2        | S,R    |
|   PAWS                 | Recommended | 4.2        | S,R    |
|   RTTM                 | Recommended | 4.2        | S,R    |
| TCP SACKs              | Recommended | 4.4        | S,R    |
+------------------------+-------------+------------+--------+
                           Table 1

Satellite users should check with their TCP vendors (implementors)
to ensure the recommended mechanisms are supported in their stack in
current and/or future versions. Alternatively, the Pittsburgh
Supercomputing Center tracks TCP implementations and which extensions
they support, as well as providing guidance on tuning various TCP
implementations [PSC].

Research into improving the efficiency of TCP over satellite channels
is ongoing and will be summarized in a planned memo along with other
considerations, such as satellite network architectures.

6. Security Considerations

The authors believe that the recommendations contained in this memo
do not alter the security implications of TCP. However, when using a
broadcast medium such as satellite links to transfer user data and/or
network control traffic, one should be aware of the intrinsic
security implications of such technology.

Eavesdropping on network links is a form of passive attack that, if
performed successfully, could reveal critical traffic control
information that would jeopardize the proper functioning of the
network. These attacks could reduce the ability of the network to
provide data transmission services efficiently.
Eavesdroppers could also compromise the privacy of user data,
especially if end-to-end security mechanisms are not in use. While
passive monitoring can occur on any network, the wireless broadcast
nature of satellite links allows reception of signals without
physical connection to the network, which enables monitoring to be
conducted without detection. However, it should be noted that the
resources needed to monitor a satellite link are non-trivial.

Data encryption at the physical and/or link layers can provide secure
communication over satellite channels. However, this still leaves
traffic vulnerable to eavesdropping on networks before and after
traversing the satellite link. Therefore, end-to-end security
mechanisms should be considered. This document does not make any
recommendations as to which security mechanisms should be employed.
However, those operating and using satellite networks should survey
the currently available network security mechanisms and choose those
that meet their security requirements.

Acknowledgments

This document has benefited from comments from the members of the TCP
Over Satellite Working Group. In particular, we would like to thank
Aaron Falk, Matthew Halsey, Hans Kruse, Matt Mathis, Greg Nakanishi,
Vern Paxson, Jeff Semke, Bill Sepmeier and Eric Travis for their
useful comments about this document. Finally, we are indebted to Luis
Sanchez for providing much needed guidance on the security section.

References

[AFP98] Mark Allman, Sally Floyd, and Craig Partridge. Increasing
TCP's Initial Window. May 1998. Internet-Draft
draft-floyd-incr-init-win-03.txt.

[AHKO97] Mark Allman, Chris Hayes, Hans Kruse, and Shawn Ostermann.
TCP Performance Over Satellite Links. In Proceedings of the 5th
International Conference on Telecommunication Systems, March 1997.

[All97] Mark Allman.
Improving TCP Performance Over Satellite Channels. Master's thesis,
Ohio University, June 1997.

[Bra89] Robert Braden. Requirements for Internet Hosts --
Communication Layers, October 1989. RFC 1122.

[FF96] Kevin Fall and Sally Floyd. Simulation-based Comparisons of
Tahoe, Reno and SACK TCP. Computer Communication Review, July 1996.

[FF98] Sally Floyd and Kevin Fall. Promoting the Use of End-to-End
Congestion Control in the Internet. Submitted to IEEE Transactions
on Networking.

[Flo94] Sally Floyd. TCP and Successive Fast Retransmits. Technical
report, October 1994. ftp://ftp.ee.lbl.gov/papers/fastretrans.ps.

[GJKFV98] Rohit Goyal, Raj Jain, Shiv Kalyanaraman, Sonia Fahmy, and
Bobby Vandalore. Improving the Performance of TCP over the ATM-UBR
service, 1998. Submitted to Computer Communications.

[Jac90] Van Jacobson. Modified TCP Congestion Avoidance Algorithm.
Technical Report, LBL, April 1990.

[JBB92] Van Jacobson, Robert Braden, and David Borman. TCP
Extensions for High Performance, May 1992. RFC 1323.

[Jac88] Van Jacobson. Congestion Avoidance and Control. In ACM
SIGCOMM, 1988.

[Kno93] Steve Knowles. IESG Advice from Experience with Path MTU
Discovery, March 1993. RFC 1435.

[Mar78] James Martin. Communications Satellite Systems. Prentice
Hall, 1978.

[MD90] Jeff Mogul and Steve Deering. Path MTU Discovery, November
1990. RFC 1191.

[MM96] Matt Mathis and Jamshid Mahdavi. Forward Acknowledgment:
Refining TCP Congestion Control. In ACM SIGCOMM, 1996.

[MMFR96] Matt Mathis, Jamshid Mahdavi, Sally Floyd, and Allyn
Romanow. TCP Selective Acknowledgment Options, October 1996.
RFC 2018.

[Pos81] Jon Postel. Transmission Control Protocol, September 1981.
RFC 793.

[PS97] Craig Partridge and Tim Shepard. TCP Performance Over
Satellite Links. IEEE Network, 11(5), September/October 1997.

[PSC] Jamshid Mahdavi.
Enabling High Performance Data Transfers on Hosts.
http://www.psc.edu/networking/perf_tune.html.

[SMM98] Jeff Semke, Jamshid Mahdavi, and Matt Mathis. Automatic TCP
Buffer Tuning. In ACM SIGCOMM, August 1998. To appear.

[Sta94] William Stallings. Data and Computer Communications.
Macmillan, 4th edition, 1994.

[Ste97] W. Richard Stevens. TCP Slow Start, Congestion Avoidance,
Fast Retransmit, and Fast Recovery Algorithms, January 1997.
RFC 2001.

Authors' Addresses:

Mark Allman
NASA Lewis Research Center/Sterling Software
21000 Brookpark Rd. MS 54-2
Cleveland, OH 44135
mallman@lerc.nasa.gov
http://gigahertz.lerc.nasa.gov/~mallman

Dan Glover
NASA Lewis Research Center
21000 Brookpark Rd. MS 54-2
Cleveland, OH 44135
Daniel.R.Glover@lerc.nasa.gov