idnits 2.17.1 draft-touch-tcpm-sno-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 21, 2016) is 2744 days in the past. Is this intentional? 
Checking references for intended status: Experimental ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2385 (Obsoleted by RFC 5925) -- Obsolete informational reference (is this intentional?): RFC 1078 (Obsoleted by RFC 7805) -- Obsolete informational reference (is this intentional?): RFC 4960 (Obsoleted by RFC 9260) -- Obsolete informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) == Outdated reference: A later version (-13) exists of draft-ietf-tcpm-tcp-edo-06 == Outdated reference: A later version (-12) exists of draft-touch-tcpm-tcp-syn-ext-opt-05 Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 TCPM WG J. Touch 2 Internet Draft USC/ISI 3 Intended status: Experimental October 21, 2016 4 Expires: April 2017 6 The TCP Service Number Option (SNO) 7 draft-touch-tcpm-sno-06.txt 9 Status of this Memo 11 This Internet-Draft is submitted in full conformance with the 12 provisions of BCP 78 and BCP 79. This document may not be modified, 13 and derivative works of it may not be created, except to publish it 14 as an RFC and to translate it into languages other than English. 16 This document may contain material from IETF Documents or IETF 17 Contributions published or made publicly available before November 18 10, 2008. The person(s) controlling the copyright in some of this 19 material may not have granted the IETF Trust the right to allow 20 modifications of such material outside the IETF Standards Process. 
21 Without obtaining an adequate license from the person(s) controlling 22 the copyright in such materials, this document may not be modified 23 outside the IETF Standards Process, and derivative works of it may 24 not be created outside the IETF Standards Process, except to format 25 it for publication as an RFC or to translate it into languages other 26 than English. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF), its areas, and its working groups. Note that 30 other groups may also distribute working documents as Internet- 31 Drafts. 33 Internet-Drafts are draft documents valid for a maximum of six 34 months and may be updated, replaced, or obsoleted by other documents 35 at any time. It is inappropriate to use Internet-Drafts as 36 reference material or to cite them other than as "work in progress." 38 The list of current Internet-Drafts can be accessed at 39 http://www.ietf.org/ietf/1id-abstracts.txt 41 The list of Internet-Draft Shadow Directories can be accessed at 42 http://www.ietf.org/shadow.html 44 This Internet-Draft will expire on April 21, 2017. 46 Copyright Notice 48 Copyright (c) 2016 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (http://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with 56 respect to this document. Code Components extracted from this 57 document must include Simplified BSD License text as described in 58 Section 4.e of the Trust Legal Provisions and are provided without 59 warranty as described in the Simplified BSD License. 61 Abstract 63 This document specifies a TCP option for service numbers.
The 64 current SYN destination port is used both to indicate the desired 65 service and as a connection demultiplexing field. This option 66 separates those two functions, retaining the current destination 67 port solely for demultiplexing and indicating the service separately 68 in a service number option (SNO). By decoupling these two functions, 69 SNO allows a larger number of concurrent connections for a single 70 service, as might be useful between fixed addresses of proxies. 72 Table of Contents 74 1. Introduction...................................................3 75 2. Conventions used in this document..............................4 76 3. Background.....................................................4 77 3.1. IANA port numbers.........................................5 78 3.2. DNS SRV records...........................................6 79 3.3. RPC portmapper and RPCBIND................................6 80 3.4. TCPMUX....................................................7 81 3.5. Summary of alternatives and comparison to SNO.............8 82 4. TCP Service Number Option......................................9 83 4.1. Interaction between SNO and the TCP API..................10 84 4.1.1. Active OPEN (Unix connect)..........................11 85 4.1.2. Passive OPEN (Unix listen)..........................11 86 4.1.3. Impact on the TCP OPEN API..........................11 87 4.2. Error conditions.........................................12 88 4.3. Backward compatibility...................................12 89 5. Issues........................................................13 90 5.1. Interaction with other protocols and features............13 91 5.2. Potential use in other transport protocols...............14 92 5.3. Discussion of alternative approaches.....................15 93 5.4. Implementation Issues....................................16 94 6. SNO impact on TCP option space................................17 95 7. 
Security Considerations.......................................17 96 8. IANA considerations...........................................18 97 9. References....................................................18 98 9.1. Normative References.....................................18 99 9.2. Informative References...................................18 100 10. Acknowledgments..............................................20 102 1. Introduction 104 TCP connections are defined by a socket pair, where each TCP socket 105 consists of an IP address and a port number. The IP addresses 106 indicate the network endpoints (hosts) of the connection, and the 107 port numbers allow a pair of IP endpoints to have more than one 108 concurrent connection. TCP connections begin when an application on 109 one host sends a SYN segment to a waiting application on the other 110 host, determined by the destination port in that segment. 112 Port numbers thus serve two distinct purposes. For the entirety of a 113 connection, they help differentiate concurrent connections as part 114 of the socket pair, and are thus used for demultiplexing within a 115 host. For the SYN, the destination port also indicates the waiting 116 application, i.e., the service for that connection, acting as a 117 service identifier. 119 Service identifiers need to be coordinated between the endpoints of 120 a connection, but need not be coordinated with any other component 121 of the network. To avoid the need for explicit pairwise 122 coordination, most Internet transport protocols currently use 123 globally-assigned destination port numbers as service identifiers; 124 this includes TCP, UDP, SCTP, and DCCP [RFC768] [RFC793] [RFC4960] 125 [RFC4340]. An assigned port number can be requested from the 126 Internet Assigned Numbers Authority (IANA) [IANA]. 128 The use of SYN destination ports as both service identifier and 129 demultiplexing identifier can impact TCP performance. 
For a given 130 service, a given pair of endpoints can have at most 2^16 concurrent 131 connections, or even connections in the TIME-WAIT state (which is 132 typically expected to last two minutes) [To1999]. This limits 133 services to at most an average of 550 connections per second, which 134 can be a constraint on proxy-to-proxy services. 136 To reduce this impact, this document specifies the TCP service 137 number option (SNO), which allows services to be specified in an 138 option separate from the current header destination port field. SNO 139 decouples the use of ports for connection demultiplexing and state 140 management from their use to indicate a desired endpoint service. 141 This decoupling can substantially increase the number of concurrent 142 connections to 2^32; even considering current expected TIME-WAIT 143 delay, that can support up to 35.8M connections per second. 145 Although it changes TCP SYNs, it does not otherwise affect the 146 processing of other TCP segments or the TCP state machine. SNO must 147 be implemented at both ends of a TCP connection to be effective. 149 2. Conventions used in this document 151 In examples, "C:" and "S:" indicate lines sent by the client and 152 server respectively. 154 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 155 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 156 document are to be interpreted as described in RFC-2119 [RFC2119]. 158 In this document, these words will appear with that interpretation 159 only when in ALL CAPS. Lower case uses of these words are not to be 160 interpreted as carrying RFC-2119 significance. 162 In this document, the characters ">>" preceding an indented line(s) 163 indicates a compliance requirement statement using the key words 164 listed above. This convention aids reviewers in quickly identifying 165 or finding the explicit compliance requirements of this RFC. 167 3. 
Background 169 TCP supports multiplexing as one of its six core facilities, 170 allowing a single pair of hosts to have multiple concurrent TCP 171 sessions (see Sec. 1.5 of [RFC793]). An endpoint address is 172 associated with a port number, forming a socket; and "A pair of 173 sockets uniquely identifies each connection." Although ports can be 174 bound to services uniquely at each endpoint, RFC 793 notes that it 175 is useful to attach frequently-used services to fixed ports which 176 are publicly known, but that other services may be discovered by 177 dynamic means. This document addresses one impact of that 178 suggestion, and specifies an alternative which alleviates that 179 impact. 181 The Internet currently relies on the use of fixed, publicly-agreed 182 port numbers for most services, whether for public access (e.g., 183 HTTP, FTP, DNS) or between pre-arranged pairs (e.g., X11, SSL). Some 184 of these services use one public port to negotiate other ports for 185 further exchanges (e.g., FTP, H.323, RPC). 187 There are several current methods for determining the port for a 188 public service: 190 o Index the service in IANA's port registry 192 o Index the service in the host's DNS SRV records 194 o Ask the host directly using an RPC portmapper/bind-like service 196 o Ask the host for a hand-off using the TCPMUX port (port #1) 198 Many of these alternatives, including the use of strings as service 199 identifiers, were described in principle in RFC 814, and have 200 evolved into deployed capabilities [RFC814]. Each of these 201 alternatives is summarized below, and each either significantly 202 limits the number of concurrent connections for a service or incurs 203 additional latency or management overhead compared to SNO. 205 3.1. IANA port numbers 207 The Internet Assigned Numbers Authority currently manages globally 208 reserved port numbers [RFC6335]. 
The desired port number for a 209 service is determined either by an operating system index to a copy 210 of IANA's table (e.g., getservbyname() in Unix, which indexes the 211 /etc/services file), or is fixed inside the application. 213 The port number space 0..65535 is split into three ranges [RFC2780]: 215 o 0..1023 "well-known", also called "system" ports 217 o 1024..49151 "registered", also called "user" ports 219 o 49152..65535 "dynamic", also called "private" ports 221 The terms "well-known" and "registered" are misnomers; both of those 222 port ranges are managed by IANA, and are equally registered and 223 well-known; they are currently known together as "assigned" 224 [RFC6335]. System ports are intended for services that run in 225 privileged mode, sometimes known as "root", although that 226 distinction is blurred in current operating systems. 228 IANA-managed ports are allocated globally, for all hosts everywhere 229 on the public Internet, even though the meaning of a port need be 230 known only for a particular host. A given service is typically 231 assigned a single port, which then limits the number of concurrent 232 connections between two hosts to 2^16, i.e., the number expressible 233 in the source port field. This assumes that the source port can be 234 arbitrary; in many implementations, when a service is bound to a SYN 235 destination port it is prohibited for use in other connections, 236 e.g., as a source port for outgoing SYNs. 238 3.2. DNS SRV records 240 DNS SRV resource records provide a way to find the port number for a 241 service based on its string name [RFC2782]. A host asks the DNS to 242 index "_servicename._tcp.hostname" (underscores required) and the 243 response is a record that includes both the port number and the host's 244 IP address. 246 SRV records allow port numbers to be allocated on a per-host basis, 247 and allow multiple ports to be indicated for a given service.
A 248 system that wants to support a large number of concurrent HTTP 249 connections could advertise the HTTP service as available on the 250 entire unassigned (dynamic) port range, in addition to port 80. This 251 can increase the number of concurrent connections to 2^30 (2^14 252 dynamic ports and 2^16 source ports), which would be nearly as good 253 as SNO (2^32). 255 However, SRV lookups require an additional protocol exchange for 256 each first-time access, which can traverse much of the same path the 257 TCP SYN will, i.e., incurring an additional round-trip time of delay 258 (because DNS servers are often located near the hosts they serve). 259 Further, using SRV records requires that the dynamic ports be 260 allocated in advance, and they cannot be reclaimed once advertised. 261 SRV advertisement may be useful for a single known service, but does 262 not support a larger number of connections for any (or every) 263 service on-demand. 265 Additional challenges for DNS SRV records are autonomy, robustness, 266 and size of the name space. Many hosts do not have control over 267 their DNS entries; moving port lookup into the DNS could limit the 268 services that a host can deploy for public access. This solution 269 also makes the DNS a required part of the Internet architecture, 270 even for accessing services on hosts with well-known IP addresses 271 (e.g., the DNS itself). This decreases network robustness, because 272 access of services on a host depends on access to the DNS. 274 3.3. RPC portmapper and RPCBIND 276 An alternative to indexing the service name at a separate host via 277 the DNS would be to contact the intended host directly and request 278 the lookup there. This is how the RPC portmapper (v2) and RPCBIND 279 (v3 and v4) services work, where the source host contacts the 280 destination on port #111 [RFC1833][RFC5531]. 
This service was 281 designed for the same basic reason as the TCP port option of this 282 document: to allow a small subset of a potentially large set of 283 services to be dynamically bound to a small number of ports. The 284 differences between portmapper and RPCBIND are not important here, 285 so they are discussed as a single example. 287 In both portmapper and RPCBIND the source host contacts the 288 destination host on port 111, and issues a request including the 289 desired destination RPC service name. A response indicates the 290 appropriate port for that RPC service. 292 Like the DNS SRV solution, portmapper/RPCBIND requires a separate 293 round-trip (one for UDP; more for TCP) to perform the lookup 294 operation. This adds to both the communication overhead and 295 connection establishment latency. 297 The portmapper service also allows services to be selected on 298 version, i.e., to have different versions of a service on different 299 ports, accessed using the same version name but a different version 300 number. This is handled in some IANA entries, DNS SRV records, and 301 TCPMUX by using a port keyword that embeds the version number in the 302 name, e.g., 'imap' vs. 'imap3'. In most other cases, versioning is 303 indicated and negotiated in-band, inside the protocol (e.g., HTTP). 305 Unfortunately, portmapper has the same limitation as DNS SRV 306 records; once a port is advertised for a given service, it cannot be 307 reclaimed for use by another service. Further, once a given service 308 is advertised, it is likely that the requesting host will cache the 309 response. As a result, dynamic ports can be used to extend the port 310 space for a given service in advance, but they need to be pinned to 311 that service when it is first requested from that host. Again, this 312 limits the ability to flexibly support large numbers of connections 313 for any (or every) service. 315 3.4. 
TCPMUX 317 TCPMUX is a service on TCP port #1 which allows a host to provide a 318 port name handoff service for itself [RFC1078]. A source host opens 319 a connection to port 1 on a destination host and transmits 320 'portname' in the data stream; the destination replies with 321 either '+' (yes, the service is available) or '-' 322 (no, the service is not available). If the service is available, the 323 connection is transferred to the desired service while still in the 324 OPEN state. 326 TCPMUX modifies the semantics of TCP connection establishment; its 327 connections always succeed, and upon receipt of the named service 328 the application must determine whether to proceed or not. This 329 document seeks a more conventional TCP semantics, where unavailable 330 services result in a rejected connection (e.g., RST in reply and/or 331 ICMP error message). 333 TCPMUX further requires all new connections to be received on a 334 single port; this again limits the number of connections between two 335 machines to 2^16, which provides no benefit compared to existing 336 assigned ports as currently used in SYN segments. 338 3.5. Summary of alternatives and comparison to SNO 340 Each of the alternatives presented has a significant limitation. 341 These alternatives are summarized as follows: 343 o IANA ports: limits a given service to 2^16 concurrent connections 344 between two IP addresses; fewer if system/user/dynamic boundaries 345 are preserved 347 o DNS SRV records: requires an extra round-trip exchange for 348 lookup, not typically under host control, allows up to 2^30 349 concurrent connections but requires that the additional space of 350 2^14 be allocated to services on a given host in advance. 
352 o Portmapper: requires an extra round-trip exchange for lookup, 353 allows up to 2^30 concurrent connections but requires that the 354 additional space of 2^14 be allocated to services on a given host 355 in advance 357 o TCPMUX: destroys semantics of TCP connection establishment, 358 limits connections per endpoint pair to 2^16 over all services 360 SNO allows the destination host to associate services with processes 361 on a per-connection basis, while avoiding unnecessary additional 362 round-trips or connections and also while reducing message overhead. 363 This enables every service to support up to 2^32 concurrent 364 connections, by decoupling the demultiplexing and service identifier 365 role of SYN destination ports. 367 The basic operation of SNO is as follows: 369 o The source host issues a SYN, picking both source and destination 370 port numbers arbitrarily that are not currently in use (active or 371 pending connection). 373 o The SYN includes SNO, which indicates the IANA assigned port 374 number of the desired service. 376 o The destination host, upon receiving the SYN with SNO, determines 377 whether the service indicated in the option is running. If so, a 378 SYN-ACK is issued with a zero-length SNO, indicating success of 379 the lookup and handoff. The service is bound to that connection 380 at the destination. 382 o If the service is not available, the appropriate RST and/or ICMP 383 error messages are returned. 385 The benefits to TCP SNO are that: 387 o For a given service, the number of connections between two given 388 IP addresses is no longer limited to 2^16; it is expanded to 389 2^32. 391 o SNO support is provided at the same host as the intended service, 392 so the fate of both is shared (i.e., it is more robust than 393 decoupled service such as DNS SRV). 395 o SNO is embedded in the TCP SYN segment, avoiding extra round 396 trips and messages. 398 o NAT traversal is preserved. 
400 o TCP connection semantics are maintained, i.e., services not 401 available never connect. 403 4. TCP Service Number Option 405 The TCP service number option (SNO) extends the TCP header to 406 include a 16-bit port field indicating the desired service, as shown in 407 Figure 1. 409 >> New implementations of TCP MAY implement SNO. 411 >> SNO SHOULD NOT appear in any TCP segment except SYN and SYN-ACK. 412 SNO MUST be silently ignored if present in any segment except SYN and SYN- 413 ACK. 415 SNO includes the mandatory KIND and LENGTH fields [RFC793], as well 416 as the desired service port number. The current specification uses 417 the TCP Experimental Option format, with an ExID of 0x5323 in 418 network-standard byte order (ASCII for "S#") [RFC6994]. 420 The KIND is a single octet (byte) which indicates this is an 421 experimental option; SNO is supported on both experimental options 422 (253 and 254); there is no difference as to which experimental 423 option is used. The LENGTH is a single octet (byte) interpreted as 424 an unsigned number that indicates the length of this option in 425 octets (bytes), including the KIND, LENGTH, and ExID fields, as well as the 426 octets of the Service-Number. 428 +--------+--------+--------+--------+ 429 | 253 | 6 | 0x53 | 0x23 | 430 +--------+--------+--------+--------+ 431 | Service-Number | 432 +--------+--------+ 434 Figure 1 TCP SNO SYN option format 436 Upon receipt of a TCP SYN segment including SNO ('TCP SYN/SNO'), the 437 Service-Number is matched against a list of available services. 438 Available services are those that listen on the indicated port 439 number. E.g., a web server that listens for incoming connections on 440 port 80 will respond to connections with SYN segments with SNO=80. 442 The way in which SNO and TCP destination port numbers interact is 443 described in Section 4.1. When an incoming TCP SYN/SNO is considered 444 valid, the connection is completed by returning a SYN-ACK with a 445 null SNO.
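As an illustration only, the option layout of Figure 1 and the 4-octet null form returned in the SYN-ACK can be sketched in Python as follows. The function names encode_sno and decode_sno are hypothetical and are not part of this specification; the KIND values, ExID, and lengths are taken from the text above.

```python
import struct

SNO_KINDS = (253, 254)  # TCP experimental option kinds permitted for SNO
SNO_EXID = 0x5323       # ExID given in the text, ASCII "S#"


def encode_sno(service_number=None, kind=253):
    """Hypothetical helper: build SNO octets in network byte order.

    service_number=None yields the 4-octet null SNO (SYN-ACK form);
    otherwise the 6-octet SYN form carrying the 16-bit Service-Number.
    """
    if kind not in SNO_KINDS:
        raise ValueError("SNO uses experimental option kind 253 or 254")
    if service_number is None:
        return struct.pack("!BBH", kind, 4, SNO_EXID)
    if not 0 <= service_number <= 0xFFFF:
        raise ValueError("Service-Number must fit in 16 bits")
    return struct.pack("!BBHH", kind, 6, SNO_EXID, service_number)


def decode_sno(option):
    """Hypothetical helper: parse one TCP option.

    Returns ("null", None), ("service", n), or None when the octets
    are not a well-formed SNO.
    """
    if len(option) < 4:
        return None
    kind, length, exid = struct.unpack("!BBH", option[:4])
    if kind not in SNO_KINDS or exid != SNO_EXID or length != len(option):
        return None
    if length == 4:
        return ("null", None)
    if length == 6:
        return ("service", struct.unpack("!H", option[4:6])[0])
    return None
```

For example, encode_sno(80) produces the six octets a client would place in a SYN to request the service assigned to port 80, and encode_sno() produces the null form a server would echo in its SYN-ACK.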
447 +--------+--------+--------+--------+ 448 | 253 | 4 | 0x53 | 0x23 | 449 +--------+--------+--------+--------+ 451 Figure 2 TCP Null SNO format, as used in SYN-ACK 453 >> A TCP SYN/SNO answered with a TCP SYN-ACK with a non-null SNO (LENGTH 454 > 4) or lacking the SNO option MUST cause the initiator to abort the 455 connection by issuing a RST and by reporting an error to the 456 application as if the port were not available. 458 The TCB for that connection is then associated with the process for 459 the matching service, which then handles all further interactions 460 with the connection. 462 4.1. Interaction between SNO and the TCP API 464 TCP currently uses TCP port numbers to demultiplex connections as 465 well as to indicate the desired service at the destination. SNO 466 retains the demultiplexing capability, but overrides service 467 identification. 469 TCP specifies port numbers for connections in the OPEN command. The 470 current OPEN command is described in RFC 793 Sections 2.7 and 3.8 471 as: 473 OPEN (local port, foreign socket, active/passive 474 [, timeout] [, precedence] [, security/compartment] 475 [, options]) 476 -> local connection name 478 The OPEN call is used to initiate connections, corresponding to Unix 479 connect, and to wait for incoming connection requests, corresponding 480 to Unix listen. The impact of the SNO option on each of these 481 variants is described below. 483 4.1.1. Active OPEN (Unix connect) 485 During a TCP active OPEN command, SNO interprets the port number of 486 the foreign TCP socket as the SNO Service-Number and selects a random 487 number as the foreign port. The OPEN command can be extended to 488 override that random selection by extending the foreign socket to 489 include both the service identifier and port number as separate 490 fields. 492 4.1.2. Passive OPEN (Unix listen) 494 During a TCP passive OPEN command, SNO interprets the local port 495 number as the SNO service identifier.
The OPEN command can be 496 extended to allow the listening application to also indicate a 497 specific destination port by extending the local port to include 498 both a service identifier and port number as separate fields. 500 4.1.3. Impact on the TCP OPEN API 502 Both active OPEN and passive OPEN may need to extend the current 503 port numbers to include separate service identifiers. It may be 504 useful to consider that only one service identifier is ever used, 505 i.e., an active OPEN may need a separate foreign service identifier, 506 and a passive OPEN may need a separate local service identifier, but 507 separate service identifiers for both foreign and local would never 508 occur. As a result, it may be more convenient to consider the TCP 509 OPEN API as being extended with a single service field as follows: 511 SNOPEN (local port, foreign socket, service, active/passive 512 [, timeout] [, precedence] [, security/compartment] 513 [, options]) 514 -> local connection name 516 Legacy uses of the OPEN call can be trivially converted to the new 517 SNOPEN description. A legacy active OPEN uses the port of the 518 foreign socket as the service; a legacy passive OPEN uses the local 519 port as the service. 521 However, because the most common use is to allow the active foreign 522 port or passive local port to "float" (be unspecified, and thus filled 523 by the OS with an arbitrary value), most implementations will not 524 need to modify the TCP OPEN API implementation, or can extend the 525 API using a separate interface (e.g., Unix setsockopt). 527 4.2. Error conditions 529 There are two error conditions for a SYN segment with the SNO option 530 to be considered: 532 o SNO not supported 534 o Invalid port (i.e., no application listening on that port) 536 The case where SNO is not supported is already addressed in TCP as 537 an unknown option [RFC793].
Implementations are expected to ignore 538 it, which means the SYN-ACK would not include the SNO confirmation 539 response. 541 >> For an invalid port, the receiving TCP should act as it would if 542 the destination port were a service that is not available, i.e., it 543 SHOULD return an ICMP port unreachable error message [RFC1122]. This 544 message MUST include the received TCP header including the SNO 545 option in its entirety. The destination TCP MUST also send a RST in 546 response. Other interactions are the result of backward 547 compatibility, and are discussed in Section 4.3. 549 4.3. Backward compatibility 551 The TCP SNO option is designed to interact correctly only on SNO- 552 supporting implementations. 554 SNO connection attempts to non-SNO endpoints will be rejected; the 555 SNO SYN will receive a non-SNO SYN-ACK, at which point the SNO 556 endpoint will terminate the connection attempt. 558 Services on SNO endpoints will support both SNO and non-SNO incoming 559 connections, without the need for recompilation or relinking. 561 >> Outgoing connections intended to be compatible with both 562 implementations MUST either attempt both SNO and non-SNO connections 563 in parallel or retry a failed SNO attempt with a non-SNO attempt. 565 5. Issues 567 The TCP SNO option interacts with some other IP and TCP services, 568 notably security services. Variants of the option may be useful in 569 other transport protocols. Also, there were a number of alternate 570 designs considered which this document captures in summary. 572 5.1. Interaction with other protocols and features 574 TCP SNO potentially interacts with any other protocol that 575 interprets or modifies TCP port numbers. This includes IPsec and 576 other firewall systems, TCP/MD5 and other TCP security mechanisms, 577 FTP and other in-band exchange of ports, and network address 578 translators (NATs). 580 IPsec uses port numbers to perform access control in transport mode 581 [RFC4301]. 
Security policies can define port-specific access 582 control (PROTECT, BYPASS, DISCARD), as well as port-specific 583 algorithms and keys. Similarly, firewall policies allow or block 584 traffic based on port numbers. 586 Use of port numbers in IPsec selectors and firewalls may assume that 587 the numbers correspond to well-known services. It is useful to note 588 that there is no such requirement; any service may run on any port, 589 subject to mutual agreement between the endpoint hosts. Use of SNO 590 may interfere with this assumption both within IPsec and in other 591 firewalling systems, but it does not add a new vulnerability. New 592 implementations of IPsec and firewall systems may want to support 593 interpreting SNO in these policy rules, but again should not rely on 594 either port numbers to indicate a specific service. 596 TCP SNO occupies space in the TCP SYN segment. Such space is 597 severely limited in cases where TCP-level security is present, as 598 noted in detail in Section 6. 600 >> TCP SNO MUST be protected in the same way that the existing SYN 601 destination port is protected. 603 For IPsec, this is not an issue because the entire TCP header and 604 payload are protected by all IPsec modes. None of the TCP header is 605 protected by application-layer security, e.g., TLS, so again this is 606 not an issue [RFC5246]. 608 The resulting primary concern is TCP-level security, e.g., legacy 609 TCP/MD5 and its successor TCP-AO [RFC2385][RFC5925]. TCP/MD5 always 610 excludes TCP options in its hash calculation; thus it fails to 611 protect current critical TCP options such as alternate checksums, 612 window scale, and timestamp options [RFC793] [RFC7323]. TCP-AO 613 allows options to be included or excluded, depending on a per- 614 connection parameter. This document recommends, as per above, that 615 SNO, as all options, be included in TCP-level protection.
Note that it may be difficult to use SNO together with any of these
TCP-layer protection mechanisms unless the TCP option space is
extended, as with TCP EDO and/or EDO-SYN [To2016a][To2016b].

A number of protocols exchange port numbers in-band, notably to
coordinate separate concurrent connections, e.g., FTP (file
transfer) and SIP (teleconferencing) [RFC959][RFC3261]. Because
these protocols coordinate the specific port numbers in advance,
there is no need for SNO to indicate the desired service. As a
result, it is unlikely that it would be useful to augment these
protocols to support SNO in their creation of subordinate
connections. SNO could still be useful in establishing the primary
(first) connection for these services.

Network address and port translators, known collectively as NATs,
not only read TCP ports, but may also translate them [RFC2993]. This
interferes with the use of ports for service identification
[RFC3234]. SNO may allow services to be identified behind NATs if
NATs are not further extended to translate SNO. It is thus unknown
whether SNO will help restore service identification in the presence
of NATs.

TCP connections using SNO continue to use IP addresses and ports,
although both port numbers are typically set arbitrarily.
Translation of these ports should not interfere with the operation
of NATs, though this has not been verified and is not a design
requirement.

5.2. Potential use in other transport protocols

As noted earlier, SNO may be a useful addition to a variety of other
transport protocols, such as UDP, SCTP and DCCP [RFC768] [RFC4960]
[RFC4340]. Adding SNO support to SCTP and DCCP should be
straightforward because both already have an option space. These are
not addressed further in this document, because this document
focuses only on TCP.

DCCP already includes a Service Code that provides a similar way to
separately identify services, but these codes are 32 bits and use a
separate IANA registered space. DCCP does not use Service Codes as a
way to expand the number of concurrent connections to a given IANA
transport service.

UDP lacks options, so adding support for SNO is not feasible.

5.3. Discussion of alternative approaches

The current proposal assumes that the source TCP selects both source
and destination port numbers randomly, and that SNO occurs only in
SYNs and SYN-ACKs. A number of alternative approaches were
considered during the development of the approach presented herein.
These include:

   o  A portmapper-like service that returns a specific port number

   o  Continued demuxing based on SNO

   o  Dynamic overwriting of the destination port

The first approach, of returning a specific port number for a
service, requires a separate round trip and messages to initiate a
connection. We avoid both the additional time and messages in the
proposed solution, which integrates the lookup in the SYN.

Continued demultiplexing based on SNO would violate TCP connection
semantics, which indicate that a connection be uniquely identified
by the 4-tuple: <source address, source port, destination address,
destination port>. Although SNO demuxing would increase the
connection tuple space, this seems unnecessary as it is already over
2^32 concurrent connections between a single pair of host addresses.
Finally, this variant incurs the SNO option overhead on every
message, which seems unnecessarily inefficient. The proposed
solution is more efficient and sufficiently increases the utility of
the entire current connection name space.

Dynamic overwriting of the destination port complicates the
connection establishment on the source side, because the SYN-ACK
would have a different port pair than the SYN. It would further
interfere with NAT traversal.
The primary utility for overwriting the port number would be to
facilitate demultiplexing at the receiver, but demultiplexing should
already use the entire 4-tuple anyway. Overall, this variant seems
unnecessarily complex for no real benefit.

5.4. Implementation Issues

Prototypes underway in both FreeBSD and Linux indicate substantial
challenges with implementing SNO due to errors in option processing
as well as optimizations that interfere with SNO's decoupling of
service and connection identifiers.

Option processing has never been sufficiently described to ensure
interoperable implementation. Both FreeBSD and Linux assume that TCP
options can be processed at a single location in both incoming and
outgoing TCP header processing, but this has never been true. In
particular, options that determine whether a segment is valid (TCP
MD5, TCP-AO, header checksum, etc.) must be processed before any
other header fields are interpreted, whereas options that are
interpreted in the context of header fields (e.g., SACK) must be
interpreted afterwards.

Keeping track of TCP state can require multiple data structures on
both endpoints, but these structures are currently optimized
assuming that port numbers are overloaded as both service and
connection identifiers. Connections can be in any of the following
11 states: CLOSED, LISTEN, SYN-SENT, SYN-RECEIVED, ESTABLISHED,
FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT.
CLOSED is fictional because no connection context exists.
The remaining states are often grouped as follows:

   o  LISTEN - no connection state yet; tracking which ports are bound

   o  Active - SYN-SENT, ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2,
      CLOSE-WAIT, CLOSING, and LAST-ACK (sometimes also TIME-WAIT), in
      which full connection state is kept

There are two states that are typically not kept in detail:
SYN-RECEIVED and TIME-WAIT. SYN-RECEIVED keeps track of a connection
that has received a SYN but not yet the final ACK of the three-way
handshake; its state is typically not kept in detail to avoid DOS
attacks that overload a server with half-open connections [RFC4987].
Similarly, the TIME-WAIT state is often ignored or kept in aggregate
to avoid state accumulation on busy servers [Fa99].

The challenge in implementing SNO involves using the LISTEN queue
for SYN-RECEIVED states. Connections in the LISTEN state are indexed
by the service number: for legacy TCP connections, this is the SYN
destination port, and for SNO connections this is the SNO service
number. In the SYN-RECEIVED state, connections always need to be
indexed by the receive port number of the incoming ACK segment. As a
result, SNO implementations need a distinct SYN-RECEIVED queue; they
cannot reuse the LISTEN queue to keep track of pending half-open
connections.

The additional state needed for the SYN-RECEIVED queue is the same
regardless of whether it shares space with the LISTEN queue - each
receive port for half-open connections needs to be listed. The key
difference is the index to the queue.

6. SNO impact on TCP option space

SNO needs to fit inside the available TCP option space, which
provides 40 bytes for options.
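
The option-space budget discussed in this section can be checked with
a short calculation. The following is only an illustrative sketch:
the function and constant names are hypothetical and not part of
this specification, and the sizes are the ones this section lists
for MSS, timestamp, window scale, and SACK (counted as 2 + 8N bytes
for N blocks).

```python
# Illustrative sketch only; all names here are hypothetical and are
# not defined by this document.

TCP_OPTION_SPACE = 40      # bytes available for TCP options

# Sizes (in bytes) of other SYN options, as listed in this section.
MSS_LEN = 4
TIMESTAMP_LEN = 10
WINDOW_SCALE_LEN = 3

SNO_EXPERIMENTAL_LEN = 6   # experimental variant described herein
SNO_STANDARD_LEN = 4       # size a standards-track variant would use


def sack_len(blocks: int) -> int:
    """SACK size as counted in this section: 2 + 8N bytes."""
    return 2 + 8 * blocks


def remaining_option_space(sack_blocks: int = 1) -> int:
    """Bytes left for SNO after the other SYN options are placed."""
    used = MSS_LEN + TIMESTAMP_LEN + WINDOW_SCALE_LEN + sack_len(sack_blocks)
    return TCP_OPTION_SPACE - used


def sno_fits(sno_len: int, sack_blocks: int = 1) -> bool:
    """True if an SNO option of sno_len bytes fits in what remains."""
    return sno_len <= remaining_option_space(sack_blocks)
```

Under these assumptions, remaining_option_space(1) yields the 13
bytes cited in this section, and both the 6-byte and 4-byte variants
fit. The same arithmetic shows the margin is sensitive to the other
options present: with two SACK blocks only 5 bytes would remain, so
only the 4-byte variant would fit.
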
It is useful to consider that TCP SYN segments may include other
options, notably:

   o  4 bytes of MSS [RFC793]

   o  10 bytes of timestamp [RFC7323]

   o  3 bytes of window scale [RFC7323]

   o  2 + 8N bytes of SACK, for N SACK blocks [RFC2018][RFC6675]

This leaves only 13 bytes for the SNO option (assuming 1 SACK
block), which is more than sufficient. The experimental variant
described herein uses 6 bytes; a standards-track variant would use
only 4 bytes.

7. Security Considerations

The SNO option raises three areas of security concern:

   1. Interaction with IPsec and firewalls

   2. Interaction with TCP/MD5 and TCP-AO security

   3. Increased DOS impact

The impact on IPsec and firewalls is discussed in detail in Section
5.1. As noted there, SNO defeats the assumption that port numbers
correspond to specific services, an assumption that was already
defeated between consenting hosts. The SNO option thus raises no new
vulnerability.

The impact of SNO on TCP/MD5 and TCP-AO is also discussed in
Section 5.1. Use of these services without inclusion of TCP options
makes all options vulnerable, including SNO.

The additional resources incurred by parsing the SNO option are
minimal.

8. IANA considerations

This document specifies a new TCP option that uses the shared
experimental options format, with ExID = 0x5323 in network-standard
byte order (representing ASCII "S#") [RFC6994]. This ExID has
already been registered with IANA.

9. References

9.1. Normative References

[RFC793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
          793, Sep. 1981 (STANDARD).

[RFC1122] Braden, R. (ed.), "Requirements for Internet Hosts -
          Communication Layers", STD 3, RFC 1122, Oct. 1989
          (STANDARD).

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, Mar. 1997 (BEST
          CURRENT PRACTICE).

[RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
          Signature Option", RFC 2385, Aug. 1998 (PROPOSED
          STANDARD).

[RFC6994] Touch, J., "Shared Use of Experimental TCP Options",
          RFC 6994, Aug. 2013 (PROPOSED STANDARD).

9.2. Informative References

[Fa99]    T. Faber, J. Touch, and W. Yue, "The TIME-WAIT state in
          TCP and Its Effect on Busy Servers", in Proc. IEEE
          Infocom, 1999, pp. 1573-1583.

[Fr2008]  Freire, S., A. Zuquete, "A TCP-layer name service for TCP
          ports", Proc. Usenix, 2008.

[IANA]    Internet Assigned Numbers Authority, www.iana.org

[RFC768]  Postel, J., "User Datagram Protocol", RFC 768, Aug. 1980
          (STANDARD).

[RFC814]  Clark, D., "NAME, ADDRESSES, PORTS, AND ROUTES", RFC 814,
          Jul. 1982 (UNKNOWN).

[RFC959]  Postel, J., J. Reynolds, "FILE TRANSFER PROTOCOL (FTP)",
          STD 9, RFC 959, Oct. 1985 (STANDARD).

[RFC1078] Lottor, M., "TCP Port Service Multiplexer (TCPMUX)",
          RFC 1078, Nov. 1988 (UNKNOWN).

[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
          Selective Acknowledgment Options", RFC 2018, Oct. 1996.

[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol
          Specification Version 2", RFC 5531, May 2009 (DRAFT
          STANDARD).

[RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
          RFC 1833, Aug. 1995 (PROPOSED STANDARD).

[RFC2780] Bradner, S., V. Paxson, "IANA Allocation Guidelines For
          Values In the Internet Protocol and Related Headers", BCP
          37, RFC 2780, Mar. 2000 (BEST CURRENT PRACTICE).

[RFC2782] Gulbrandsen, A., P. Vixie, L. Esibov, "A DNS RR for
          specifying the location of services (DNS SRV)", RFC 2782,
          Feb. 2000 (PROPOSED STANDARD).

[RFC2993] Hain, T., "Architectural Implications of NAT", RFC 2993,
          Nov. 2000 (INFORMATIONAL).

[RFC3234] Carpenter, B., S. Brim, "Middleboxes: Taxonomy and
          Issues", RFC 3234, Feb. 2002 (INFORMATIONAL).

[RFC3261] Rosenberg, J., H.
          Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R.
          Sparks, M. Handley, E. Schooler, "SIP: Session Initiation
          Protocol", RFC 3261, Jun. 2002 (PROPOSED STANDARD).

[RFC4301] Kent, S., K. Seo, "Security Architecture for the Internet
          Protocol", RFC 4301, Dec. 2005 (PROPOSED STANDARD).

[RFC4340] Kohler, E., M. Handley, S. Floyd, "Datagram Congestion
          Control Protocol (DCCP)", RFC 4340, Mar. 2006 (PROPOSED
          STANDARD).

[RFC4960] Stewart, R. (Ed.), "Stream Control Transmission Protocol",
          RFC 4960, Sep. 2007 (PROPOSED STANDARD).

[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
          Mitigations", RFC 4987, Aug. 2007.

[RFC5246] Dierks, T., E. Rescorla, "The Transport Layer Security
          (TLS) Protocol Version 1.2", RFC 5246, Aug. 2008 (PROPOSED
          STANDARD).

[RFC5925] Touch, J., A. Mankin, R. Bonica, "The TCP Authentication
          Option", RFC 5925, Jun. 2010 (PROPOSED STANDARD).

[RFC6335] Cotton, M., L. Eggert, J. Touch, M. Westerlund, S.
          Cheshire, "Internet Assigned Numbers Authority (IANA)
          Procedures for the Management of the Transport Protocol
          Port Number and Service Name Registry", BCP 165, RFC 6335,
          Aug. 2011.

[RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
          and Y. Nishida, "A Conservative Loss Recovery Algorithm
          Based on Selective Acknowledgment (SACK) for TCP", RFC
          6675, Aug. 2012.

[RFC7323] Borman, D., R. Braden, V. Jacobson, R. Scheffenegger
          (Ed.), "TCP Extensions for High Performance", RFC 7323,
          Sep. 2014 (PROPOSED STANDARD).

[To1999]  Touch, J., T. Faber, "The TIME-WAIT state in TCP and its
          Effect on Busy Servers", Proc. Infocom, 1999.

[To2006]  Touch, J., "A TCP Option for Port Names",
          draft-touch-tcp-portnames-00.txt (work in progress), Apr.
          2006.

[To2016a] Touch, J., W. Eddy, "TCP Extended Data Offset Option",
          draft-ietf-tcpm-tcp-edo-06 (work in progress), Jun. 2016.

[To2016b] Touch, J., T.
          Faber, "TCP SYN Extended Option Space Using an Out-of-Band
          Segment", draft-touch-tcpm-tcp-syn-ext-opt-05 (work in
          progress), Oct. 2016.

10. Acknowledgments

This work was inspired by discussions on the IETF mailing list,
notably by suggestions by Keith Moore and Noel Chiappa. Bob Braden
noted some of the origins of the named service concept.

This document is based on an earlier version that used strings
rather than IANA port numbers, where the receiving host used the
strings to directly identify services [To2006]. A similar approach
that also used strings was proposed and implemented in Linux, except
that the strings were resolved by a separate server and transmitted
in the TCP segment as data (e.g., as with TCPMUX) [Fr2008].

This effort is partly supported by USC/ISI's Postel Center.

This document was initially prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Joe Touch
   USC/ISI
   4676 Admiralty Way
   Marina del Rey, CA 90292-6695
   U.S.A.

   Phone: +1 (310) 448-9151
   Email: touch@isi.edu
   URL:   http://www.isi.edu/touch