Internet Engineering Task Force                  R. E. Gilligan (Sun)
INTERNET-DRAFT                                   S. Thomson (Bellcore)
                                                 J. Bound (Digital)

                                                 June 21, 1995

               IPv6 Program Interfaces for BSD Systems
                  draft-ietf-ipngwg-bsd-api-01.txt

Abstract

In order to implement the version 6 Internet Protocol (IPv6) [1] in an
operating system based on Berkeley Unix (4.x BSD), changes must be made
to the application program interface (API). TCP/IP applications written
for BSD-based operating systems have in the past enjoyed a high degree
of portability because most of the systems derived from BSD provide the
same API, known informally as "the socket interface". We would like the
same portability with IPv6. This memo presents a set of extensions to
the BSD socket API to support IPv6.
The changes include a new data
structure to carry IPv6 addresses, new name to address translation
library functions, new address conversion functions, and some new
setsockopt() options. The extensions are designed to provide access to
IPv6 features, while introducing a minimum of change into the system and
providing complete compatibility for existing IPv4 applications.

Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
This Internet Draft expires on December 20, 1995. Internet Drafts may
be updated, replaced, or obsoleted by other documents at any time. It
is not appropriate to use Internet Drafts as reference material or to
cite them other than as a "working draft" or "work in progress."

To learn the current status of any Internet-Draft, please check the
1id-abstracts.txt listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net, nic.nordu.net, ftp.isi.edu, or
munnari.oz.au.

Distribution of this memo is unlimited.

1. Introduction

While IPv4 addresses are 32 bits long, IPv6 nodes are identified by
128-bit addresses. The socket interface API makes the size of an IP
address quite visible to an application; virtually all TCP/IP
applications for BSD-based systems have knowledge of the size of an IP
address. Those parts of the API that expose the addresses need to be
extended to accommodate the larger IPv6 address size. This paper
defines a set of extensions to the socket interface API to support IPv6.
This specification is preliminary. The API extensions are expected to
evolve as we gain more implementation experience.

2.
Design Considerations

There are a number of important considerations in designing changes to
this well-worn API:

   - The extended API should provide both source and binary
     compatibility for programs written to the original API. That
     is, existing program binaries should continue to operate when
     run on a system supporting the new API. In addition, existing
     applications that are re-compiled and run on a system supporting
     the new API should continue to operate. Simply put, the API
     changes for IPv6 should not break existing programs.

   - The changes to the API should be as small as possible in order
     to simplify the task of converting existing IPv4 applications to
     IPv6.

   - Where possible, applications should be able to use the extended
     API to interoperate with both IPv6 and IPv4 hosts. Applications
     should not need to know which type of host they are
     communicating with.

   - IPv6 addresses carried in data structures should be 64-bit
     aligned. This is necessary in order to obtain optimum
     performance on 64-bit machine architectures.

Because of the importance of providing IPv4 compatibility in the API,
our extensions are explicitly designed to operate on machines that
provide complete support for both IPv4 and IPv6. A subset of this API
could probably be designed for operation on systems that support only
IPv6. However, this is not addressed in this document.

2.1. Overview of Changes

The socket interface API consists of a few distinct components:

   - Core socket functions.

   - Address data structures.

   - Name-to-address translation functions.

   - Address conversion functions.

The core socket functions -- those functions that deal with such things
as setting up and tearing down TCP connections, and sending and
receiving UDP packets -- were designed to be transport independent.
Where protocol addresses are passed as function arguments, they are
carried via opaque pointers.
A protocol specific address data structure
is defined for each protocol that the socket functions support.
Applications must cast these protocol specific address structures into
the generic "sockaddr" data type when using the socket functions. These
functions need not change for IPv6, but a new IPv6 specific address data
structure is needed.

The "sockaddr_in" structure is the protocol specific data structure for
IPv4. This data structure actually includes 8 octets of unused space,
and it is tempting to try to use this space to adapt the sockaddr_in
structure to IPv6. Unfortunately, the sockaddr_in structure is not
large enough to hold the 16-octet IPv6 address as well as the other
information (2-octet address family and 2-octet port number) that is
needed. So a new address data structure must be defined for IPv6.

The name-to-address translation functions in the socket interface are
gethostbyname() and gethostbyaddr(). Gethostbyname() does not provide
enough flexibility to accommodate more than one protocol family. To
solve this problem, we introduced a new name-to-address translation
function which is analogous to gethostbyname(), but supports addresses
in both the IPv4 and IPv6 address families. Gethostbyaddr() does not,
strictly speaking, need to be replaced since it carries an address
family argument and can be extended to support both address families
without introducing compatibility problems. However, we have chosen to
introduce a new function to maintain symmetry with the replacement for
gethostbyname(). The new functions both carry an address family
parameter, so they can be extended to operate with other protocol
families in addition to IPv4 and IPv6.

The address conversion functions -- inet_ntoa() and inet_addr() --
convert IPv4 addresses between binary and printable form. These
functions are quite specific to 32-bit IPv4 addresses.
We have designed
two analogous functions which convert both IPv4 and IPv6 addresses, and
carry an address type parameter so that they can be extended to other
protocol families as well.

Finally, a few miscellaneous features are needed to support IPv6. A new
interface is needed in order to support the IPv6 flow label and priority
header fields. New interfaces are needed in order to receive IPv6
multicast packets and control the sending of multicast packets. And an
interface is necessary in order to pass IPv6 source route information
between the application and the system.

3. Implementation Experience

A few issues exposed in experimenting with prototype implementations of
IPv6 helped to guide the design of this API:

First, we discovered that, by providing a way to represent the addresses
of IPv4 nodes as IPv6 addresses, we could greatly simplify the
applications' task of providing IPv4 compatibility. New applications
could interoperate with IPv4 nodes by using the new API and expressing
the addresses of IPv4 nodes they interoperate with as IPv6 addresses.
For example, a client application could open a TCP connection to an IPv4
server by giving the IPv6 representation of the server's IPv4 address in
the connect() call. Most applications do not even need to know whether
the peer is an IPv4 or IPv6 node. Such applications can simply treat
IPv6 addresses as opaque values; they need not understand the
"structure" by which IPv4 addresses are encoded within IPv6 addresses.
Yet the structure can be decoded by those applications that do need to
know whether the peer is IPv6 or IPv4. This should prove to be a
significant simplification since most applications will need to
interoperate with both IPv4 and IPv6 nodes for some time to come.
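
As a rough sketch of this idea, the C fragment below encodes an IPv4
address as an IPv6 address and tests for that encoding. It assumes the
IPv4-mapped layout used later in this memo (80 zero bits, 16 one bits,
then the 32-bit IPv4 address). The type struct addr6 and the functions
map_ipv4() and is_mapped() are hypothetical names used only for
illustration; they are not part of the API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit IPv6 address in network byte order. */
struct addr6 {
    uint8_t bytes[16];
};

/* Encode four IPv4 octets as an IPv4-mapped IPv6 address (::FFFF:a.b.c.d). */
static struct addr6 map_ipv4(const uint8_t v4[4])
{
    struct addr6 a;

    assert(v4 != NULL);
    memset(&a, 0, sizeof(a));       /* octets 0..9 stay zero */
    a.bytes[10] = 0xFF;             /* octets 10..11 hold the FFFF marker */
    a.bytes[11] = 0xFF;
    memcpy(&a.bytes[12], v4, 4);    /* octets 12..15 hold the IPv4 address */
    return a;
}

/* Return non-zero if the address starts with the IPv4-mapped prefix. */
static int is_mapped(const struct addr6 *a)
{
    static const uint8_t prefix[12] =
        { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF };

    assert(a != NULL);
    return memcmp(a->bytes, prefix, sizeof(prefix)) == 0;
}
```

An application that treats addresses as opaque values never needs the
second function; only code that must distinguish IPv4 peers from IPv6
peers would test the prefix.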

Second, we learned that existing applications written to the IPv4 API
could be made to interoperate with IPv6 nodes to a limited degree. This
technique does not work for all applications, but does for certain
applications, such as those that do not "look at" the peer address that
is provided by the API (e.g., the source address provided by the
recvfrom() function when a UDP packet is received, or the client address
returned by the accept() function).

Third, we learned that the common application practice of passing open
socket descriptors between processes across an exec() call can cause
problems. It is possible, for example, for an application using the
extended API to pass an open socket to an older application using the
original API. The old application could be confused if the socket
functions return IPv6 address structures to it. The solution designed
was to provide a mechanism by which applications could have explicit
control over what form of addresses are returned.

4. Interface Specification

This section specifies the interface changes for IPv6.

The data types of the structure elements given in the following section
are intended to be examples, not absolute requirements. System
implementations may use other types if they are appropriate. In some
cases, such as when a field of a data structure holds a protocol value,
the structure field must be of some minimum size. These size
requirements are noted in the text. For example, since the UDP and TCP
port values are 16-bit quantities, the sin6_port field must be at least
a 16-bit data type. We specify the sin6_port field as a u_short type,
but an implementation may use any data type that is at least 16 bits
long.

4.1. New Address Family

A new address family macro, named AF_INET6, is defined in .
The AF_INET6 definition is used to distinguish between
the original sockaddr_in address data structure, and the new
sockaddr_in6 data structure.

A new protocol family macro, named PF_INET6, is defined in
. Like most of the other protocol family macros, this
will usually be defined to have the same value as the corresponding
address family macro:

    #define PF_INET6 AF_INET6

PF_INET6 is used in the first argument to the socket() function to
indicate that an IPv6 socket is being created.

4.2. IPv6 Address Data Structure

A new data structure to hold a single IPv6 address is defined in
:

    struct in_addr6 {
        u_char s6_addr[16];     /* IPv6 address */
    };

This data structure contains an array of sixteen 8-bit elements, which
make up one 128-bit IPv6 address.

The IPv6 address is stored in network byte order.

4.3. Socket Address Structure for 4.3 BSD-Based Systems

In the socket interface, a different protocol-specific data structure
is defined to carry the addresses for each protocol suite.
Each protocol-specific data structure is designed so it can be cast
into a protocol-independent data structure -- the "sockaddr"
structure. Each has a "family" field which overlays the "sa_family"
of the sockaddr data structure. This field can be used to identify
the type of the data structure.

The sockaddr_in structure is the protocol-specific address data
structure for IPv4. It is used to pass addresses between applications
and the system in the socket functions. We have defined the following
structure in to carry IPv6 addresses:

    struct sockaddr_in6 {
        u_short         sin6_family;    /* AF_INET6 */
        u_short         sin6_port;      /* Transport layer port # */
        u_long          sin6_flowinfo;  /* IPv6 flow information */
        struct in_addr6 sin6_addr;      /* IPv6 address */
    };

This structure is designed to be compatible with the sockaddr data
structure used in the 4.3 BSD release.
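
As an illustration only, the 4.3 BSD layout above can be checked with a
short C sketch. The type names draft_in_addr6 and draft_sockaddr_in6 and
the two helper functions are hypothetical, chosen so that the sketch
does not clash with system headers. Fixed-width integer types stand in
for u_short and u_long, on the assumption that these are 16 and 32 bits
wide; on a system where u_long is wider the layout would differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the in_addr6 structure from section 4.2. */
struct draft_in_addr6 {
    uint8_t s6_addr[16];                /* IPv6 address, network byte order */
};

/* Hypothetical copy of the 4.3 BSD sockaddr_in6 variant. */
struct draft_sockaddr_in6 {
    uint16_t sin6_family;               /* AF_INET6 */
    uint16_t sin6_port;                 /* transport layer port # */
    uint32_t sin6_flowinfo;             /* IPv6 flow information */
    struct draft_in_addr6 sin6_addr;    /* IPv6 address */
};

/* Compile-time check: the address field begins on a 64-bit boundary. */
static_assert(offsetof(struct draft_sockaddr_in6, sin6_addr) % 8 == 0,
              "sin6_addr is not 64-bit aligned within the structure");

/* Octet offset of the address field within the structure. */
static size_t draft_sin6_addr_offset(void)
{
    return offsetof(struct draft_sockaddr_in6, sin6_addr);
}

/* Total size of the structure in octets. */
static size_t draft_sockaddr_in6_size(void)
{
    return sizeof(struct draft_sockaddr_in6);
}
```

Under these assumptions the address field starts 8 octets into the
structure and the whole structure occupies 24 octets.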

The sin6_family field is used to identify this as a sockaddr_in6
structure. This field is designed to overlay the sa_family field when
the buffer is cast to a sockaddr data structure. The value of this
field must be AF_INET6.

The sin6_port field is used to store the 16-bit UDP or TCP port
number. This field is used in the same way as the sin_port field of
the sockaddr_in structure. The port number is stored in network byte
order.

The sin6_flowinfo field is a 32-bit field that is used to store the
24-bit IPv6 flow label, and the 4-bit priority field. The IPv6 flow
label is represented as the low-order 24 bits of the 32-bit field, and
the priority is represented in the next 4 bits above this. The
high-order 4 bits of this field are reserved. The sin6_flowinfo field
is stored in network byte order. The use of this field is explained in
section 4.9.

The sin6_addr field is a single in_addr6 structure (defined in the
previous section). This field holds one 128-bit IPv6 address. The
address is stored in network byte order.

The ordering of elements in this structure is specifically designed so
that the sin6_addr field will be aligned on a 64-bit boundary. This
is done for optimum performance on 64-bit architectures.

4.4. Socket Address Structure for 4.4 BSD-Based Systems

The 4.4 BSD release includes a small, but incompatible change to the
socket interface. The "sa_family" field of the sockaddr data
structure was changed from a 16-bit value to an 8-bit value, and the
space saved used to hold a length field, named "sa_len". The
sockaddr_in6 data structure given in the previous section cannot be
correctly cast into the newer sockaddr data structure. For this
For this 291 reason, we have defined the following alternative IPv6 address data 292 structure to be used on systems based on 4.4 BSD: 294 #define SIN6_LEN 296 struct sockaddr_in6 { 297 u_char sin6_len; /* length of this struct */ 298 u_char sin6_family; /* AF_INET6 */ 299 u_short sin6_port; /* Transport layer port # */ 300 u_long sin6_flowinfo; /* IPv6 flow information */ 301 struct in_addr6 sin6_addr; /* IPv6 address */ 302 }; 304 This structure is defined in the header file. The only 305 differences between this data structure and the 4.3 BSD variant are 306 the inclusion of the length field, and the change of the family field 307 to a 8-bit data type. The definitions of all the other fields are 308 identical to the 4.3 BSD variant defined in the previous section. 310 Systems that provide this version of the sockaddr_in6 data structure 311 must include the SIN6_LEN macro definition in . This 312 macro allows applications to determine whether they are being built on 313 a system that supports the 4.3 BSD or 4.4 BSD variants of the data 314 structure. Applications can be written to run on both systems by 315 simply making their assignments and use of the sin6_len field 316 conditional on the SIN6_LEN field. For example, to fill in an IPv6 317 address structure in an application, one might write: 319 struct sockaddr_in6 sin6; 321 bzero((char *) &sin6, sizeof(struct sockaddr_in6)); 322 #ifdef SIN6_LEN 323 sin6.sin6_len = sizeof(struct sockaddr_in6); 324 #endif 325 sin6.sin6_family = AF_INET6; 326 sin6.sin6_port = 23; 328 Note that the size of the sockaddr_in6 structure is larger than the 329 size of the sockaddr structure. Applications that use the 330 sockaddr_in6 structure need to be aware that they can not use 331 sizeof(sockaddr) to allocate a buffer to hold a sockaddr_in6 332 structure. They should use sizeof(sockaddr_in6) instead. 334 4.5. 
The Socket Functions

Applications use the socket() function to create a socket descriptor
that represents a communication endpoint. The arguments to the socket()
function tell the system which protocol to use, and what format address
structure will be used in subsequent functions. For example, to create
an IPv4/TCP socket, applications make the call:

    s = socket (PF_INET, SOCK_STREAM, 0);

To create an IPv4/UDP socket, applications make the call:

    s = socket (PF_INET, SOCK_DGRAM, 0);

Applications may create IPv6/TCP and IPv6/UDP sockets by simply using
the constant PF_INET6 instead of PF_INET in the first argument. For
example, to create an IPv6/TCP socket, applications make the call:

    s = socket (PF_INET6, SOCK_STREAM, 0);

To create an IPv6/UDP socket, applications make the call:

    s = socket (PF_INET6, SOCK_DGRAM, 0);

Once the application has created a PF_INET6 socket, it must use the
sockaddr_in6 address structure when passing addresses in to the system.
The functions which the application uses to pass addresses into the
system are:

    bind()
    connect()
    sendmsg()
    sendto()

The system will use the sockaddr_in6 address structure to return
addresses to applications that are using PF_INET6 sockets. The
functions that return an address from the system to an application
are:

    accept()
    recvfrom()
    recvmsg()
    getpeername()
    getsockname()

No changes to the syntax of the socket functions are needed to support
IPv6, since all of the "address carrying" functions use an opaque
address pointer, and carry an address length as a function argument.

4.6. Compatibility with IPv4 Applications

In order to support the large base of applications using the original
API, system implementations must provide complete source and binary
compatibility with the original API.
This means that systems must
continue to support PF_INET sockets and the sockaddr_in address
structure. Applications must be able to create IPv4/TCP and IPv4/UDP
sockets using the PF_INET constant in the socket() function, as
described in the previous section. Applications should be able to hold
a combination of IPv4/TCP, IPv4/UDP, IPv6/TCP and IPv6/UDP sockets
simultaneously within the same process.

Applications using the original API should continue to operate as they
did on systems supporting only IPv4. That is, they should continue to
interoperate with IPv4 nodes. It is not clear, though, how, or even if,
those IPv4 applications should interoperate with IPv6 nodes. The open
issues section (section 7) discusses some of the alternatives.

4.7. Compatibility with IPv4 Nodes

The API also provides a different type of compatibility: the ability for
applications using the extended API to interoperate with IPv4 nodes.
This feature uses the IPv4-mapped IPv6 address format defined in the
IPv6 addressing architecture specification [3]. This address format
allows the IPv4 address of an IPv4 node to be represented as an IPv6
address. The IPv4 address is encoded into the low-order 32 bits of the
IPv6 address, and the high-order 96 bits hold the fixed prefix
0:0:0:0:0:FFFF. IPv4-mapped addresses are written as follows:

    ::FFFF:

Applications may use PF_INET6 sockets to open TCP connections to IPv4
nodes, or send UDP packets to IPv4 nodes, by simply encoding the
destination's IPv4 address as an IPv4-mapped IPv6 address, and passing
that address, within a sockaddr_in6 structure, in the connect() or
sendto() call.
When applications use PF_INET6 sockets to accept TCP
connections from IPv4 nodes, or receive UDP packets from IPv4 nodes, the
system returns the peer's address to the application in the accept(),
recvfrom(), or getpeername() call using a sockaddr_in6 structure encoded
this way.

We expect that few applications will need to know which type of node
they are interoperating with. However, for those applications that do
need to know, the following function is provided:

    int is_ipv4_addr (const struct in_addr6 *ap);

The "ap" argument to this function points to a buffer holding an IPv6
address in network byte order. The function returns true (non-zero)
if that address is an IPv4-mapped address, and returns 0 otherwise.
When an application using the extended API accepts a TCP connection,
or receives a UDP packet, it may determine whether the peer is an IPv4
node by applying the is_ipv4_addr() function to the address returned
by accept() or recvfrom().

4.8. Sockets Passed Across exec()

Unix allows open sockets to be passed across an exec() call. It is a
relatively common application practice to pass open sockets across
exec() calls. Because of this, it is possible for an application
using the original API to pass an open PF_INET socket to an
application that is expecting to receive a PF_INET6 socket.
Similarly, it is possible for an application using the extended API to
pass an open PF_INET6 socket to an application using the original API,
which would be equipped only to deal with PF_INET sockets. Either of
these cases could cause problems, because the application which is
passed the open socket might not know how to decode the address
structures returned in subsequent socket functions.

To remedy this problem, we have defined a new setsockopt() option that
allows an application to "transform" a PF_INET6 socket into a PF_INET
socket and vice-versa.

An IPv6 application that is passed an open socket from an unknown
process may use the IP_ADDRFORM setsockopt() option to "convert" the
socket to PF_INET6. Once that has been done, the system will return
sockaddr_in6 address structures in subsequent socket functions.
Similarly, an IPv6 application that is about to pass an open PF_INET6
socket to a program that may not be IPv6 capable may "downgrade" the
socket to PF_INET before calling exec(). After that, the system will
return sockaddr_in address structures to the application that was
exec()'ed.

The macro definition for IP_ADDRFORM is in .

The IP_ADDRFORM option is at the IPPROTO_IP level. The only valid
option values are PF_INET6 and PF_INET. For example, to convert a
PF_INET6 socket to PF_INET, a program would call:

    int addrform = PF_INET;
    if (setsockopt(s, IPPROTO_IP, IP_ADDRFORM, (char *) &addrform,
                   sizeof(addrform)) == -1)
        perror("setsockopt IP_ADDRFORM");

An application may use IP_ADDRFORM in the getsockopt() function to learn
whether an open socket is a PF_INET or PF_INET6 socket. For example:

    int addrform;
    int len = sizeof(int);

    if (getsockopt(s, IPPROTO_IP, IP_ADDRFORM, (char *) &addrform,
                   &len) == -1)
        perror("getsockopt IP_ADDRFORM");
    else if (addrform == PF_INET)
        printf("This is an IPv4 socket.\n");
    else if (addrform == PF_INET6)
        printf("This is an IPv6 socket.\n");
    else
        printf("This system is broken.\n");

4.9. Flow Information

The IPv6 header has a 24-bit field to hold a "flow label", and a 4-bit
field to hold a "priority". Applications have control over what values
for these fields are used in packets that they originate, and have
access to the field values of packets that they receive.

The sin6_flowinfo field of the sockaddr_in6 structure is used to carry
the flow information between the application and the system.
An application may specify a flow label and priority to use in the
transmitted packets of an actively opened TCP connection by setting the
sin6_flowinfo field of the destination address sockaddr_in6 structure
passed in the connect() function. An application may specify the flow
label and priority to use in transmitted UDP packets by setting the
sin6_flowinfo field of the destination address sockaddr_in6 structure
passed in the sendto() function. If an application does not care what
values are used, it should set the sin6_flowinfo value to zero.

An application may specify the flow label and priority to use in
transmitted packets of a passively accepted TCP connection, by setting
the sin6_flowinfo field of the address passed in the bind() function.

The flow label and priority that appeared in received UDP packets are
passed up to the application in the sin6_flowinfo field of the source
address sockaddr_in6 structure that is returned in the recvfrom() call.
The flow information that appeared in the received SYN segment of a
passively accepted TCP connection is returned to the application in the
sin6_flowinfo field of the source address sockaddr_in6 structure that is
returned by the accept() call.

4.10. Handling IPv6 Source Routes

IPv6 makes more use of the source routing mechanism than IPv4. In order
for source routing to operate properly, the node receiving a request
packet that bears a source route must reverse that source route when
sending the reply. In the case of TCP, the reversal can be done in the
transport protocol implementation transparently to the application. But
in the case of UDP, the application must perform the reversal itself.
The transport protocol code cannot perform the reversal for UDP packets
because a UDP application may receive a number of requests and generate
replies asynchronously.
A "reply" sent by an application may not match
the "request" most recently passed up to the application.

The API for source routing has two components: providing a source route
to be used with originated traffic -- actively opened TCP connections
and UDP packets being sent -- and retrieving the source route of
received traffic -- passively accepted TCP connections and received UDP
packets. An application may always provide a source route with TCP
connections being originated and UDP packets being sent. But to receive
source routes, the application must enable an option.

To provide a source route, an application simply provides an array of
sockaddr_in6 data structures in the address argument of the sendto()
function (when sending a UDP packet), or the connect() function (when
actively opening a TCP connection). The length argument of the function
is the total length, in octets, of the array. The elements of the array
represent the full source route, including both the source and
destination identifying addresses. The elements of the array are ordered
from destination to source. That is, the first element of the array
represents the destination identifying address, and the last element of
the array represents the source identifying address. If the application
provides a source route, the source identifying address cannot be
omitted. The sin6_addr field of the source identifying address may be
set to zero, however, in which case the system will select an
appropriate source address. The sin6_port field of the destination
identifying address must be assigned. The sin6_port field of the source
identifying address may be set to zero, in which case the system will
select an appropriate source port number. The sin6_port and
sin6_flowinfo fields of the intermediate addresses must be set to zero.
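
The ordering rules above can be sketched in C as follows. This is an
illustration under stated assumptions, not a definitive implementation:
build_send_route() is a hypothetical helper, and the sockaddr_in6 and
in6_addr types are taken from the headers of a present-day POSIX system,
whose structures differ in detail from the ones defined in this memo.
The sketch only fills in the array; it does not call connect() or
sendto().

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Fill route[] in sending order: destination first, then the hops from
 * last to first, then the source identifying address.  hops[] is given
 * in travel order (first hop first).  Returns the number of octets
 * used, suitable as the address length argument, or 0 if route[] has
 * fewer than nhops + 2 elements.
 */
static size_t build_send_route(struct sockaddr_in6 *route, size_t capacity,
                               const struct in6_addr *dst,
                               unsigned short dst_port,
                               const struct in6_addr *hops, size_t nhops)
{
    size_t n = nhops + 2;       /* destination + hops + source */
    size_t i;

    assert(route != NULL && dst != NULL);
    if (capacity < n)
        return 0;
    memset(route, 0, n * sizeof(route[0]));

    /* Element 0: destination identifying address; its port must be set. */
    route[0].sin6_family = AF_INET6;
    route[0].sin6_port = htons(dst_port);
    route[0].sin6_addr = *dst;

    /* Elements 1..nhops: last hop first; port and flowinfo stay zero. */
    for (i = 0; i < nhops; i++) {
        route[1 + i].sin6_family = AF_INET6;
        route[1 + i].sin6_addr = hops[nhops - 1 - i];
    }

    /* Last element: source; zero address and port let the system choose. */
    route[n - 1].sin6_family = AF_INET6;

    return n * sizeof(route[0]);
}
```

On a 4.4 BSD-based system each element would also need its sin6_len
field set, which this sketch leaves at zero.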
The arrangement of the address structures in the address buffer passed
to connect() or sendto() is shown in the figure below:

    +--------------------+
    |                    |
    |  sockaddr_in6[0]   |  Destination Identifying Address
    |                    |
    +--------------------+
    |                    |
    |  sockaddr_in6[1]   |  Last Source-Route Hop Address
    |                    |
    +--------------------+
    .                    .
    .                    .
    .                    .
    +--------------------+
    |                    |
    | sockaddr_in6[N-1]  |  First Source-Route Hop Address
    |                    |
    +--------------------+
    |                    |
    |  sockaddr_in6[N]   |  Source Identifying Address
    |                    |
    +--------------------+

         Address buffer when sending a source route

The IP_RCVSRCRT setsockopt() option controls the reception of source
routes. The option is disabled by default. Applications must
explicitly enable the option using the setsockopt() function in order to
receive source routes.

The macro definition for IP_RCVSRCRT is in .

The IP_RCVSRCRT option is at the IPPROTO_IP level. An example of how an
application might use this option is:

    int on = 1;  /* value == 1 means enable the option */

    if (setsockopt(s, IPPROTO_IP, IP_RCVSRCRT, (char *) &on,
                   sizeof(on)) == -1)
        perror("setsockopt IP_RCVSRCRT");

When the IP_RCVSRCRT option is disabled, only a single sockaddr_in6
address structure is returned to applications in the address argument
of the recvfrom() and accept() functions. This address represents the
source identifying address of the UDP packet received or the TCP
connection accepted.

When the IP_RCVSRCRT option is enabled, the address argument of the
recvfrom() function (when receiving UDP packets) and the accept()
function (when passively accepting TCP connections) points to an array
of sockaddr_in6 structures. When the function returns, the array will
hold two elements -- source and destination address -- when the received
UDP packet or TCP SYN packet does not carry a source route.
The array
will hold more than two elements when the received packet carries a
source route.

The addresses in the array are ordered from source to destination. That
is, the first element of the array holds the source identifying address
of the received packet. Following this in the array are the intermediary
hops. And the last element of the array holds the destination
identifying address. Note that this is the opposite of the order
specified for sending. This ordering was chosen so that the address
array received in a recvfrom() call can be used in a subsequent sendto()
call without requiring the application to re-order the addresses in the
array. Similarly, the address array received in an accept() call can be
used unchanged in a subsequent connect() call.

The address length argument of the recvfrom() and accept() functions
indicates the length, in octets, of the full address array. This
argument is a value-result parameter. The application sets the maximum
size of the address buffer when it makes the call, and the system
modifies the value to return the actual size of the buffer to the
application.

The sin6_port field of the first and last array elements (source and
destination identifying addresses) will hold the source and destination
UDP or TCP port number of the received packet. The sin6_port field of
the intermediate elements of the array will be zero.

The address buffer returned to the application in the recvfrom() or
accept() functions when the IP_RCVSRCRT option is enabled is shown
below:

    +--------------------+
    |                    |
    |  sockaddr_in6[0]   |  Source Identifying Address
    |                    |
    +--------------------+
    |                    |
    |  sockaddr_in6[1]   |  First Source-Route Hop Address
    |                    |
    +--------------------+
    .                    .
    .                    .
    .                    .
    +--------------------+
    |                    |
    | sockaddr_in6[N-1]  |  Last Source-Route Hop Address
    |                    |
    +--------------------+
    |                    |
    |  sockaddr_in6[N]   |  Destination Identifying Address
    |                    |
    +--------------------+

         Address buffer when receiving a source route

Since IPv6 allows the number of elements in a source route to be very
large, it is impractical for all applications that have enabled the
reception of source routes to provide buffer space to hold the maximum
number of elements. Some applications may choose a buffer size that is
appropriate for their own use. This means that it is possible that a
received source route may be too large to fit into the buffer provided
by the application. In this circumstance, the system should return only
a single address element -- the source identifying address -- to the
application. This case is clearly distinguishable to the application
because in all other cases, the system returns at least two address
elements -- the source and destination identifying addresses.

4.11. Unicast Hop Limit

A new setsockopt() option is used to control the hop limit used in
outgoing unicast IPv6 packets. The name of this option is
IP_UNICAST_HOPS, and it is used at the IPPROTO_IP layer. The macro
definition for IP_UNICAST_HOPS resides in the header
file. The following example illustrates how it is used:

    int hoplimit = 10;

    if (setsockopt(s, IPPROTO_IP, IP_UNICAST_HOPS, (char *) &hoplimit,
                   sizeof(hoplimit)) == -1)
        perror("setsockopt IP_UNICAST_HOPS");

When the IP_UNICAST_HOPS option is set with setsockopt(), the option
value given is used as the hop limit for all subsequent unicast packets
sent via that socket. If the option is not set, the system selects a
default value.

The IP_UNICAST_HOPS option may be used in the getsockopt() function to
determine the hop limit value that the system will use for subsequent
unicast packets sent via that socket. For example:

    int hoplimit;
    int len = sizeof(hoplimit);

    if (getsockopt(s, IPPROTO_IP, IP_UNICAST_HOPS, (char *) &hoplimit,
                   &len) == -1)
        perror("getsockopt IP_UNICAST_HOPS");
    else
        printf("Using %d for hop limit.\n", hoplimit);

4.12. Sending and Receiving Multicast Packets

IPv6 applications may send UDP multicast packets by simply specifying an
IPv6 multicast address in the address argument of the sendto() function.

A few setsockopt() options at the IPPROTO_IP layer are used to control
some of the parameters of sending multicast packets. These options are
optional: applications may send multicast packets without using these
options. The setsockopt() options for controlling the sending of
multicast packets are summarized below:

    IP_MULTICAST_IF         Set the interface to use for outgoing
                            multicast packets.

    IP_MULTICAST_HOPS       Set the hop limit to use for outgoing
                            multicast packets. (Note a separate
                            option - IP_UNICAST_HOPS - is provided
                            to set the hop limit to use for outgoing
                            unicast packets.)

    IP_MULTICAST_LOOP       Controls whether outgoing multicast
                            packets sent should be delivered back to
                            the local application. A toggle.

The reception of multicast packets is controlled by the two setsockopt()
options summarized below:

    IP_ADD_MEMBERSHIP       Join a multicast group. Requests
                            that multicast packets sent to a
                            particular multicast address
                            be delivered to this socket.

    IP_DROP_MEMBERSHIP      Leave a multicast group. Requests that
                            multicast packets sent to a particular
                            multicast address no longer be delivered
                            to this socket.

4.13. Name-to-Address Translation Functions

We have defined two new functions analogous to gethostbyname() and
gethostbyaddr() which support addresses in both the IPv4 and IPv6
address families. The names of the new functions are hostname2addr()
and addr2hostname(). These functions were designed to have semantics
similar to gethostbyname() and gethostbyaddr(), so that existing IPv4
applications can be easily ported to IPv6.

hostname2addr() is defined similarly to gethostbyname(), but enables
applications to specify the type of address to be looked up:

    struct hostent *hostname2addr(const char *name, int af);

This new function looks up the given name in the name service and
returns the completed hostent structure if the lookup succeeds, and NULL
otherwise. The name argument is the domain name of the host to look up.
The af argument specifies the type of the address -- IPv4 (AF_INET) or
IPv6 (AF_INET6) -- to return to the caller in the h_addr_list field of
the hostent structure.

If the af argument is AF_INET, hostname2addr() queries the name service
for IPv4 addresses and, if any are found, returns a hostent structure
that includes an array of IPv4 addresses. Each IPv4 address is encoded
in network byte order.

If the af argument is AF_INET6, the processing is as follows: the
hostname2addr() function first queries the name service for IPv6
addresses. If IPv6 addresses are found, they are returned in an array in
the hostent structure. If no IPv6 addresses are found, the function
queries the name service for IPv4 addresses. If IPv4 addresses are
found, they are returned as IPv4-mapped IPv6 addresses. As in IPv4,
each IPv6 address returned in the hostent structure is encoded in
network byte order.

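To make the IPv4-mapped form concrete, the sketch below shows one way
the fallback result could be encoded. It assumes the layout given in
the addressing architecture specification [3]: 80 zero bits, then 16
one bits, then the 32-bit IPv4 address. The helper names
ipv4_to_mapped() and is_ipv4_mapped() are hypothetical and are not
defined by this document; both operate on raw octets in network byte
order.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of this specification).
 * Writes the IPv4-mapped IPv6 form of the 4-octet address v4 into
 * the 16-octet buffer v6.  Both are in network byte order.
 */
static void
ipv4_to_mapped(const unsigned char v4[4], unsigned char v6[16])
{
    assert(v4 != NULL && v6 != NULL);

    memset(v6, 0, 10);          /* octets 0..9: 80 zero bits   */
    v6[10] = 0xff;              /* octets 10..11: 16 one bits  */
    v6[11] = 0xff;
    memcpy(v6 + 12, v4, 4);     /* octets 12..15: IPv4 address */
}

/*
 * Hypothetical helper: returns nonzero if the 16-octet address v6
 * carries the IPv4-mapped prefix, and zero otherwise.
 */
static int
is_ipv4_mapped(const unsigned char v6[16])
{
    static const unsigned char prefix[12] =
        { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

    assert(v6 != NULL);
    return memcmp(v6, prefix, sizeof(prefix)) == 0;
}
```

A caller that receives such an address can recover the embedded IPv4
address from the last four octets of the 16-octet value.
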
The second new function, called addr2hostname(), is defined in exactly
the same way as the gethostbyaddr() function, except that it now
supports both the IPv4 and IPv6 address families:

    struct hostent *addr2hostname(const void *addr, int len, int af);

addr2hostname() performs an address-to-name lookup on the address
specified, returning a completed hostent structure if the lookup
succeeds, or NULL if the lookup fails. This function supports both the
AF_INET and AF_INET6 address families. If the af argument is AF_INET,
then len must be specified as 4 octets and addr must refer to an IPv4
address. If af is AF_INET6, then len must be specified as 16 octets and
addr must refer to an IPv6 address. If the addr argument is an
IPv4-mapped IPv6 address, an IPv4 address-to-name lookup is performed on
the embedded IPv4 address.

A new name-to-address translation library function is now under
development at Berkeley [2]. This new function, named getconninfo(),
will subsume the functionality of gethostbyname() and hostname2addr(),
as well as the getservbyname() and getservbyport() functions. The new
function is specifically designed to be "transport independent", so it
should be directly usable by IPv6 applications.

System implementations should provide the addr2hostname() and
hostname2addr() functions in order to simplify the porting of existing
IPv4 applications to IPv6. System implementations may also provide the
getconninfo() function, once it is defined, so that newly written
applications can be transport independent.

The getconninfo() function is expected to be published as a separate
specification document, not included in this spec.

Implementations must retain the BSD gethostbyname() and gethostbyaddr()
functions in order to provide source and binary compatibility for
existing applications.

4.14. Address Conversion Functions

BSD Unix provides two functions, inet_addr() and inet_ntoa(), to convert
an IPv4 address between binary and printable form. IPv6 applications
need similar functions. We have defined the following two functions to
convert both IPv6 and IPv4 addresses:

    int ascii2addr(int af, const char *cp, void *ap);

and

    char *addr2ascii(int af, const void *ap, int len, char *cp);

The first function converts an ascii string to an address in the address
family specified by the af argument. Currently AF_INET and AF_INET6
address families are supported. The cp argument points to the ascii
string being passed in. The ap argument points to a buffer into which
the function stores the address. ascii2addr() returns the length of the
address in octets if the conversion succeeds, and -1 otherwise. The
function does not modify the storage pointed to by ap if the conversion
fails. The application must ensure that the buffer referred to by ap is
large enough to hold the converted address.

If the af argument is AF_INET, the function accepts a string in the
standard IPv4 dotted decimal form:

    ddd.ddd.ddd.ddd

where ddd is a one to three digit decimal number between 0 and 255.

If the af argument is AF_INET6, then the function accepts a string in
one of the standard IPv6 printing forms defined in the addressing
architecture specification [3].

The second function converts an address into a printable string. The af
argument specifies the form of the address. This can be AF_INET or
AF_INET6. The ap argument points to a buffer holding an IPv4 address if
the af argument is AF_INET, and an IPv6 address if the af argument is
AF_INET6. The len argument specifies the length in octets of the address
pointed to by ap, and must be 4 if af is AF_INET, or 16 if af is
AF_INET6. The cp argument points to a buffer that the function can use
to store the ascii string.
If the cp argument is NULL, the function
uses its own private static buffer. If the application specifies a cp
argument, it must be large enough to hold the ascii conversion of the
address specified as an argument, including the terminating null octet.
For IPv6 addresses, the buffer must be at least 46 octets. For IPv4
addresses, the buffer must be at least 16 octets.

The addr2ascii() function returns a pointer to the buffer containing the
ascii string if the conversion succeeds, and NULL otherwise. The
function does not modify the storage pointed to by cp if the conversion
fails.

5. Security Considerations

IPv6 provides a number of new security mechanisms, many of which need to
be accessible to applications. A companion document detailing the
extensions to the socket interfaces to support IPv6 security is being
written [4]. At some point in the future, that document and this one
may be merged into a single API specification.

6. Change History

Changes from the March 1995 Edition

   - Changed the definition of the ipv6_addr structure to be an array
     of sixteen chars instead of four longs. This change is
     necessary to support machines which implement the socket
     interface, but do not have a 32-bit addressable word. Virtually
     all machines which provide the socket interface do support an
     8-bit addressable data type.

   - Added a more detailed explanation that the data types defined in
     this document are not intended to be hard and fast
     requirements. Systems may use other data types if they wish.

   - Added a note flagging the fact that the sockaddr_in6 structure
     is not the same size as the sockaddr structure.

   - Changed the sin6_flowlabel field to sin6_flowinfo to accommodate
     the addition of the priority field to the IPv6 header.

Changes from the October 1994 Edition

   - Added variant of sockaddr_in6 for 4.4 BSD-based systems (sa_len
     compatibility).

   - Removed references to SIT transition specification, and added
     reference to addressing architecture document, for definition of
     IPv4-mapped addresses.

   - Added a solution to the problem of the application not providing
     enough buffer space to hold a received source route.

   - Moved discussion of IPv4 applications interoperating with IPv6
     nodes to open issues section.

   - Added length parameter to addr2ascii() function to be consistent
     with addr2hostname().

   - Changed IP_MULTICAST_TTL to IP_MULTICAST_HOPS to match IPv6
     terminology, and added IP_UNICAST_HOPS option to match
     IP_MULTICAST_HOPS.

   - Removed specification of numeric values for AF_INET6,
     IP_ADDRFORM, and IP_RCVSRCRT, since they need not be the same on
     different implementations.

   - Added a definition for the in_addr6 IPv6 address data
     structure. Added this so that applications could use
     sizeof(struct in_addr6) to get the size of an IPv6 address,
     and so that a structured type could be used in
     is_ipv4_addr().

7. Open Issues

A few open issues for IPv6 socket interface API specification remain,
including:

   - The multicast API needs to be documented in more detail.

   - Should we add a timeout parameter to hostname2addr() and
     addr2hostname()? DNS lookups need to be given some finite
     timeout interval, so it might be nice to let the application
     specify that interval.

   - Can existing IPv4 applications interoperate with IPv6 nodes?

7.1. IPv4 Applications Interoperating with IPv6 Nodes

This problem primarily has to do with how IPv4 applications
represent addresses of IPv6 nodes. What address should be returned to
the application when an IPv6/UDP packet is received, or an IPv6/TCP
connection is accepted? The peer's address could be any arbitrary
128-bit IPv6 address. But the application is only equipped to deal with
32-bit IPv4 addresses encoded in sockaddr_in data structures.

We have not discovered any solution that provides complete transparent
interoperability with IPv6 nodes for applications using the original
IPv4 API. However, two techniques that partially solve the problem are:

   1) Prohibit communication between IPv4 applications and IPv6 nodes.
      Only UDP packets received from IPv4 nodes would be passed up to
      the application, and only TCP connections received from IPv4
      nodes would be accepted. UDP packets from IPv6 nodes would be
      dropped, and TCP connections from IPv6 nodes would be refused.

   2) The system could generate a local 32-bit cookie to represent the
      full 128-bit IPv6 address, and pass this value to the
      application. The system would maintain a mapping from cookie
      value into the 128-bit IPv6 address that it represents. When
      the application passed a cookie back into the system (for
      example, in a sendto() or connect() call) the system would use
      the 128-bit IPv6 address that the cookie represents.

      The cookie would have to be chosen so as to be an invalid IPv4
      address (e.g. an address on net 127.0.0.0), and the system would
      have to make sure that these cookie values did not escape into
      the Internet as the source or destination addresses of IPv4
      packets.

Both of these techniques have drawbacks. This is an area for further
study. System implementors may use one of these techniques or implement
another solution.

Acknowledgments

Thanks to the many people who made suggestions and provided feedback on
the numerous revisions of this document, including: Dave Borman, Mark
Hasson, Alan Cox, Wan-Yen Hsu, Alex Conta, Richard Stevens, Dan
McDonald, Alan Lloyd, Christian Huitema, Steve Deering, Andrew
Cherenson, Charles Lynn, Ran Atkinson, Erik Nordmark, Josh Osborne,
Glenn Trewitt, Fred Baker, Robert Elz, Dean D. Throop, and Francis
Dupont. Craig Partridge suggested the addr2ascii() and ascii2addr()
functions.

Ramesh Govindan made a number of contributions and co-authored an
earlier version of this paper.

References

[1] R. Hinden. "Internet Protocol, Version 6 (IPv6) Specification".
    Internet Draft. June 1995.

[2] K. Sklower. Private communication.

[3] R. Hinden, S. Deering. "IP Version 6 Addressing Architecture".
    Internet Draft. June 1995.

[4] D. McDonald. "IPv6 Security API for BSD Sockets". Internet
    Draft. January 1995.

Authors' Address

    Jim Bound
    Digital Equipment Corporation
    110 Spitbrook Road ZK3-3/U14
    Nashua, NH 03062-2698
    Phone: +1 603 881 0400
    Email: bound@zk3.dec.com

    Susan Thomson
    Bell Communications Research
    MRE 2P-343, 445 South Street
    Morristown, NJ 07960
    Telephone: +1 201 829 4514
    Email: set@thumper.bellcore.com

    Robert E. Gilligan
    Sun Microsystems, Inc.
    2550 Garcia Avenue
    Mailstop UMTV05-44
    Mountain View, CA 94043-1100
    Phone: +1 415 336 1012
    Email: bob.gilligan@eng.sun.com