idnits 2.17.1

draft-stevens-advanced-api-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 130 instances of too long lines in the document, the longest
     one being 11 characters in excess of 72.

  ** The abstract seems to contain references ([2]), which it shouldn't.
     Please replace those with straight textual mentions of the documents in
     question.

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 476: '... via raw sockets MUST be in network
     by...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 249 has weird spacing: '...ip6_vfc ip6_...'

  == Line 250 has weird spacing: '...p6_flow ip6_c...'

  == Line 251 has weird spacing: '...p6_plen ip6_c...'

  == Line 252 has weird spacing: '...ip6_nxt ip6_...'

  == Line 253 has weird spacing: '...p6_hlim ip6_c...'

  == (32 more instances...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 26, 1997) is 9892 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '' and '' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: '0' is mentioned on line 395, but not defined

  == Missing Reference: '8' is mentioned on line 618, but not defined

  ** Obsolete normative reference: RFC 1883 (ref. '1') (Obsoleted by RFC 2460)

  -- Unexpected draft version: The latest known version of
     draft-ietf-ipngwg-bsd-api is -06, but you're referring to -07.

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-ipngwg-bsd-api (ref. '2')

  ** Obsolete normative reference: RFC 1981 (ref. '3') (Obsoleted by RFC 8201)

  ** Obsolete normative reference: RFC 1970 (ref. '4') (Obsoleted by RFC 2461)

  == Outdated reference: A later version (-15) exists of
     draft-ietf-rsvp-spec-14

     Summary: 15 errors (**), 0 flaws (~~), 11 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 INTERNET-DRAFT                      W. Richard Stevens (Consultant)
3 Expires: September 26, 1997         Matt Thomas (AltaVista)
4                                     March 26, 1997

6 Advanced Sockets API for IPv6
7

9 Abstract

11 Specifications are in progress for changes to the sockets API to
12 support IP version 6 [2]. These changes are for TCP and UDP-based
13 applications and will support most end-user applications in use
14 today: Telnet and FTP clients and servers, HTTP clients and servers,
15 and the like.

17 But another class of applications exists that will also be run under
18 IPv6. We call these "advanced" applications and today this includes
19 programs such as Ping, Traceroute, routing daemons, multicast routing
20 daemons, router discovery daemons, and the like. The API feature
21 typically used by these programs that makes them "advanced" is a raw
22 socket to access ICMPv4, IGMPv4, or IPv4, along with some knowledge
23 of the packet header formats used by these protocols. To provide
24 portability for applications that use raw sockets under IPv6, some
25 standardization is needed for the advanced API features.

27 There are other features of IPv6 that some applications will need to
28 access: interface identification (specifying the outgoing interface
29 and determining the incoming interface) and IPv6 extension headers
30 that are not addressed in [2]: Hop-by-Hop options, Destination
31 options, and the Routing header (source routing). This document
32 provides API access to these features too.

34 Status of this Memo

36 This document is an Internet Draft. Internet Drafts are working
37 documents of the Internet Engineering Task Force (IETF), its Areas,
38 and its Working Groups. Note that other groups may also distribute
39 working documents as Internet Drafts.

41 Internet Drafts are draft documents valid for a maximum of six
42 months. Internet Drafts may be updated, replaced, or obsoleted by
43 other documents at any time. It is not appropriate to use Internet
44 Drafts as reference material or to cite them other than as a "working
45 draft" or "work in progress".

47 To learn the current status of any Internet-Draft, please check the
48 "1id-abstracts.txt" listing contained in the internet-drafts Shadow
49 Directories on: ftp.is.co.za (Africa), nic.nordu.net (Europe),
50 ds.internic.net (US East Coast), ftp.isi.edu (US West Coast), and
51 munnari.oz.au (Pacific Rim).

53 Table of Contents

55 1. Introduction

57 2. Common Structures and Definitions
58    2.1. The ip6_hdr Structure
59       2.1.1. IPv6 Next Header Values
60    2.2. The icmp6_hdr Structure
61       2.2.1. ICMPv6 Type and Code Values
62       2.2.2. ICMPv6 Neighbor Discovery Type and Code Values
63    2.3. Address Testing Functions
64    2.4. Protocols File

66 3. IPv6 Raw Sockets
67    3.1. Checksums
68    3.2. ICMPv6 Type Filtering

70 4. Ancillary Data
71    4.1. The msghdr Structure
72    4.2. The cmsghdr Structure
73    4.3. Ancillary Data Object Functions
74       4.3.1. CMSG_FIRSTHDR
75       4.3.2. CMSG_NXTHDR
76       4.3.3. CMSG_DATA
77       4.3.4. CMSG_SPACE
78       4.3.5. CMSG_LEN
79    4.4. Summary of Options Described Using Ancillary Data
80    4.5. TCP Access to Ancillary Data

82 5. Packet Information
83    5.1. Specifying/Receiving the Interface
84    5.2. Specifying/Receiving Source/Destination Address
85    5.3. Specifying/Receiving the Hop Limit
86    5.4. Specifying the Next Hop Address
87    5.5. Additional Errors with sendmsg()

89 6. Flow Labels
90    6.1. inet6_flow_assign
91    6.2. inet6_flow_free
92    6.3. inet6_flow_reuse

94 7. Hop-By-Hop Options
95    7.1. Receiving Hop-by-Hop Options
96    7.2. Sending Hop-by-Hop Options
97    7.3. Hop-by-Hop and Destination Options Processing
98       7.3.1. inet6_option_space
99       7.3.2. inet6_option_init
100       7.3.3. inet6_option_append
101       7.3.4. inet6_option_alloc
102       7.3.5. inet6_option_next
103       7.3.6. inet6_option_find
104       7.3.7. Options Examples

106 8. Destination Options
107    8.1. Receiving Destination Options
108    8.2. Sending Destination Options

110 9. Source Route Option
111    9.1. inet6_srcrt_space
112    9.2. inet6_srcrt_init
113    9.3. inet6_srcrt_add
114    9.4. inet6_srcrt_lasthop
115    9.5. inet6_srcrt_reverse
116    9.6. inet6_srcrt_segments
117    9.7. inet6_srcrt_getaddr
118    9.8. inet6_srcrt_getflags
119    9.9. Source Route Example

121 10. Ordering of Ancillary Data and IPv6 Extension Headers

123 11. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses

125 12. rresvport_af

127 13. Future Items
128    13.1. Path MTU Discovery and UDP
129    13.2. Neighbor Reachability and UDP

131 14. Summary of New Definitions

133 15. Security Considerations

135 16. Change History

137 17. References

139 18. Acknowledgments

141 19. Authors' Addresses

143 1. Introduction

145 Specifications are in progress for changes to the sockets API to
146 support IP version 6 [2]. These changes are for TCP and UDP-based
147 applications. The current document defines some of the "advanced"
148 features of the sockets API that are required for applications to
149 take advantage of additional features of IPv6.

151 Today, the portability of applications using IPv4 raw sockets is
152 quite high, but this is mainly because most IPv4 implementations
153 started from a common base (the Berkeley source code) or at least
154 started with the Berkeley headers. This allows programs such as Ping
155 and Traceroute, for example, to compile with minimal effort on many
156 hosts that support the sockets API. With IPv6, however, there is no
157 common source code base that implementors are starting from, and the
158 possibility for divergence at this level between different
159 implementations is high.
To avoid a complete lack of portability
160 amongst applications that use raw IPv6 sockets, some standardization
161 is necessary.

163 There are also features from the basic IPv6 specification that are
164 not addressed in [2]: sending and receiving Hop-by-Hop options,
165 Destination options, and Routing headers, specifying the outgoing
166 interface, and being told of the receiving interface.

168 This document can be divided into the following main sections.

170   1. Definitions of the basic constants and structures required for
171      applications to use raw IPv6 sockets. This includes structure
172      definitions for the IPv6 and ICMPv6 headers and all associated
173      constants (e.g., values for the Next Header field).

175   2. Some basic semantic definitions for IPv6 raw sockets. For
176      example, a raw ICMPv4 socket requires the application to
177      calculate and store the ICMPv4 header checksum. But with IPv6
178      this would require the application to choose the source IPv6
179      address because the source address is part of the pseudo header
180      that ICMPv6 now uses for its checksum computation. It should be
181      defined that with a raw ICMPv6 socket the kernel always
182      calculates and stores the ICMPv6 header checksum.

184   3. Packet information: how applications can obtain the received
185      interface, destination address, and received hop limit, along
186      with specifying these values on a per-packet basis. There is a
187      class of applications that need this capability and the technique
188      should be portable.

190   4. Access to the optional Hop-by-Hop, Destination, and Routing
191      headers.

193   5. Additional features required for IPv6 application portability.

195 The packet information along with access to the extension headers
196 (Hop-by-Hop options, Destination options, and Routing header) are
197 specified using the "ancillary data" fields that were added to the
198 4.3BSD Reno sockets API in 1990.
The reason is that these ancillary
199 data fields are part of the Posix.1g standard (which should be
200 approved in 1997) and should therefore be adopted by most vendors.

202 This document does not address application access to either the
203 authentication header or the encapsulating security payload header.

205 All examples in this document omit error checking in favor of brevity
206 and clarity.

208 We note that many of the functions and socket options defined in this
209 document may have error returns that are not defined in this
210 document. Many of these possible error returns will be recognized
211 only as implementations proceed.

213 Datatypes in this document follow the Posix.1g format: u_intN_t means
214 an unsigned integer of exactly N bits (e.g., u_int16_t) and u_intNm_t
215 means an unsigned integer of at least N bits (e.g., u_int32m_t).

217 Note that we use the (unofficial) terminology ICMPv4, IGMPv4, and
218 ARPv4 to avoid any confusion with the newer ICMPv6 protocol.

220 2. Common Structures and Definitions

222 Many advanced applications examine fields in the IPv6 header and set
223 and examine fields in the various ICMPv6 headers. Common structure
224 definitions for these headers are required, along with common
225 constant definitions for the structure members.

227 When an include file is specified, that include file is allowed to
228 include other files that do the actual declaration or definition.

230 2.1. The ip6_hdr Structure

232 The following structure is defined as a result of including
233 . Note that this is a new header.

235     struct ip6_hdr {
236       union {
237         struct ip6_hdrctl {
238           u_int32_t ctl6_flow; /* 24 bits of flow-ID */
239           u_int16_t ctl6_plen; /* payload length */
240           u_int8_t  ctl6_nxt;  /* next header */
241           u_int8_t  ctl6_hlim; /* hop limit */
242         } un_ctl6;
243         u_int8_t un_vfc;       /* 4 bits version, 4 bits priority */
244       } ip6_ctlun;
245       struct in6_addr ip6_src; /* source address */
246       struct in6_addr ip6_dst; /* destination address */
247     };

249     #define ip6_vfc   ip6_ctlun.un_vfc
250     #define ip6_flow  ip6_ctlun.un_ctl6.ctl6_flow
251     #define ip6_plen  ip6_ctlun.un_ctl6.ctl6_plen
252     #define ip6_nxt   ip6_ctlun.un_ctl6.ctl6_nxt
253     #define ip6_hlim  ip6_ctlun.un_ctl6.ctl6_hlim
254     #define ip6_hops  ip6_ctlun.un_ctl6.ctl6_hlim

256 2.1.1. IPv6 Next Header Values

258 IPv6 defines many new values for the Next Header field. The
259 following constants are defined as a result of including
260 .

262     #define IPPROTO_HOPOPTS   0 /* IPv6 Hop-by-Hop options */
263     #define IPPROTO_IPV6     41 /* IPv6 header */
264     #define IPPROTO_ROUTING  43 /* IPv6 Routing header */
265     #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */
266     #define IPPROTO_ESP      50 /* encapsulating security payload */
267     #define IPPROTO_AH       51 /* authentication header */
268     #define IPPROTO_ICMPV6   58 /* ICMPv6 */
269     #define IPPROTO_NONE     59 /* IPv6 no next header */
270     #define IPPROTO_DSTOPTS  60 /* IPv6 Destination options */

272 Berkeley-derived IPv4 implementations also define IPPROTO_IP to be 0.
273 This should not be a problem since IPPROTO_IP is used only with IPv4
274 sockets and IPPROTO_HOPOPTS only with IPv6 sockets.

276 2.2. The icmp6_hdr Structure

278 The ICMPv6 header is needed by numerous IPv6 applications including
279 Ping, Traceroute, router discovery daemons, and neighbor discovery
280 daemons. The following structure is defined as a result of including
281 . Note that this is a new header.

283     struct icmp6_hdr {
284       u_int8_t  icmp6_type;  /* type field */
285       u_int8_t  icmp6_code;  /* code field */
286       u_int16_t icmp6_cksum; /* checksum field */
287       union {
288         u_int32_t icmp6_un_data32[1]; /* type-specific field */
289         u_int16_t icmp6_un_data16[2]; /* type-specific field */
290         u_int8_t  icmp6_un_data8[4];  /* type-specific field */
291       } icmp6_dataun;
292     };

294     #define icmp6_data32   icmp6_dataun.icmp6_un_data32
295     #define icmp6_data16   icmp6_dataun.icmp6_un_data16
296     #define icmp6_data8    icmp6_dataun.icmp6_un_data8
297     #define icmp6_pptr     icmp6_data32[0] /* parameter prob */
298     #define icmp6_mtu      icmp6_data32[0] /* packet too big */
299     #define icmp6_id       icmp6_data16[0] /* echo request/reply */
300     #define icmp6_seq      icmp6_data16[1] /* echo request/reply */
301     #define icmp6_maxdelay icmp6_data16[0] /* mcast group membership */

303 2.2.1. ICMPv6 Type and Code Values

305 In addition to a common structure for the ICMPv6 header, common
306 definitions are required for the ICMPv6 type and code fields. The
307 following constants are also defined as a result of including
308 .

310     #define ICMPV6_DEST_UNREACH  1
311     #define ICMPV6_PACKET_TOOBIG 2
312     #define ICMPV6_TIME_EXCEEDED 3
313     #define ICMPV6_PARAMPROB     4

315     #define ICMPV6_INFOMSG_MASK 0x80 /* all informational messages */

317     #define ICMPV6_ECHOREQUEST   128
318     #define ICMPV6_ECHOREPLY     129
319     #define ICMPV6_MGM_QUERY     130
320     #define ICMPV6_MGM_REPORT    131
321     #define ICMPV6_MGM_REDUCTION 132

323     #define ICMPV6_DEST_UNREACH_NOROUTE     0 /* no route to destination */
324     #define ICMPV6_DEST_UNREACH_ADMIN       1 /* communication with destination */
325                                               /* administratively prohibited */
326     #define ICMPV6_DEST_UNREACH_NOTNEIGHBOR 2 /* not a neighbor */
327     #define ICMPV6_DEST_UNREACH_ADDR        3 /* address unreachable */
328     #define ICMPV6_DEST_UNREACH_NOPORT      4 /* bad port */

330     #define ICMPV6_TIME_EXCEED_HOPS       0 /* Hop Limit == 0 in transit */
331     #define ICMPV6_TIME_EXCEED_REASSEMBLY 1 /* Reassembly time out */

333     #define ICMPV6_PARAMPROB_HEADER     0 /* erroneous header field */
334     #define ICMPV6_PARAMPROB_NEXTHEADER 1 /* unrecognized Next Header */
335     #define ICMPV6_PARAMPROB_OPTION     2 /* unrecognized IPv6 option */

337 The five ICMP message types defined by IPv6 neighbor discovery
338 (133-137) are defined in the next section.

340 2.2.2. ICMPv6 Neighbor Discovery Type and Code Values

342 The following constants are defined as a result of including
343 .

345     #define ND6_ROUTER_SOLICITATION    133
346     #define ND6_ROUTER_ADVERTISEMENT   134
347     #define ND6_NEIGHBOR_SOLICITATION  135
348     #define ND6_NEIGHBOR_ADVERTISEMENT 136
349     #define ND6_REDIRECT               137

351     enum nd6_option {
352       ND6_OPT_SOURCE_LINKADDR=1,
353       ND6_OPT_TARGET_LINKADDR=2,
354       ND6_OPT_PREFIX_INFORMATION=3,
355       ND6_OPT_REDIRECTED_HEADER=4,
356       ND6_OPT_MTU=5,
357       ND6_OPT_ENDOFLIST=256
358     };

360     struct nd6_router_solicit { /* router solicitation */
361       struct icmp6_hdr rsol_hdr;
362     };

364     #define rsol_type     rsol_hdr.icmp6_type
365     #define rsol_code     rsol_hdr.icmp6_code
366     #define rsol_cksum    rsol_hdr.icmp6_cksum
367     #define rsol_reserved rsol_hdr.icmp6_data32[0]

369     struct nd6_router_advert { /* router advertisement */
370       struct icmp6_hdr radv_hdr;
371       u_int32_t radv_reachable;  /* reachable time */
372       u_int32_t radv_retransmit; /* reachable retransmit time */
373     };

375     #define radv_type            radv_hdr.icmp6_type
376     #define radv_code            radv_hdr.icmp6_code
377     #define radv_cksum           radv_hdr.icmp6_cksum
378     #define radv_maxhoplimit     radv_hdr.icmp6_data8[0]
379     #define radv_m_o_res         radv_hdr.icmp6_data8[1]
380     #define ND6_RADV_M_BIT       0x80
381     #define ND6_RADV_O_BIT       0x40
382     #define radv_router_lifetime radv_hdr.icmp6_data16[1]

384     struct nd6_nsolicitation { /* neighbor solicitation */
385       struct icmp6_hdr nsol6_hdr;
386       struct in6_addr nsol6_target;
387     };

389     struct nd6_nadvertisement { /* neighbor advertisement */
390       struct icmp6_hdr nadv6_hdr;
391       struct in6_addr nadv6_target;

393     };

395     #define nadv6_flags              nadv6_hdr.icmp6_data32[0]
396     #define ND6_NADVERFLAG_ISROUTER  0x80
397     #define ND6_NADVERFLAG_SOLICITED 0x40
398     #define ND6_NADVERFLAG_OVERRIDE  0x20

400     struct nd6_redirect { /* redirect */
401       struct icmp6_hdr redirect_hdr;
402       struct in6_addr redirect_target;
403       struct in6_addr redirect_destination;
404     };

406     struct nd6_opt_prefix_info { /* prefix information */
407       u_int8_t  opt_type;
408       u_int8_t  opt_length;
409       u_int8_t  opt_prefix_length;
410       u_int8_t  opt_l_a_res;
411       u_int32_t opt_valid_life;
412       u_int32_t opt_preferred_life;
413       u_int32_t opt_reserved2;
414       struct in6_addr opt_prefix;
415     };

417     #define ND6_OPT_PI_L_BIT 0x80
418     #define ND6_OPT_PI_A_BIT 0x40

420     struct nd6_opt_mtu { /* MTU option */
421       u_int8_t  opt_type;
422       u_int8_t  opt_length;
423       u_int16_t opt_reserved;
424       u_int32_t opt_mtu;
425     };

427 2.3. Address Testing Functions

429 The basic API ([2]) defines some functions for testing an IPv6
430 address for certain properties. This API extends those definitions
431 with additional address testing functions, defined as a result of
432 including .

434     int IN6_ARE_ADDR_EQUAL(const struct in6_addr *,
435                            const struct in6_addr *);

437 2.4. Protocols File

439 Many hosts provide the file /etc/protocols that contains the names of
440 the various IP protocols and their protocol number (e.g., the value
441 of the protocol field in the IPv4 header for that protocol, such as 1
442 for ICMP). Some programs then call the function getprotobyname() to
443 obtain the protocol value that is then specified as the third
444 argument to the socket() function. For example, the Ping program
445 contains code of the form

447     struct protoent *proto;

449     proto = getprotobyname("icmp");

451     s = socket(AF_INET, SOCK_RAW, proto->p_proto);

453 Common names are required for the new IPv6 protocols in this file, to
454 provide portability of applications that call the getprotoXXX()
455 functions.

457 We define the two protocol names

459     ipv6
460     icmpv6

462 with values 41 and 58 (decimal), respectively.

464 3. IPv6 Raw Sockets

466 Raw sockets bypass the transport layer (TCP or UDP). With IPv4, raw
467 sockets are used to access ICMPv4, IGMPv4, and to read and write IPv4
468 datagrams containing a protocol field that the kernel does not
469 process. An example of the latter is a routing daemon for OSPF,
470 since it uses IPv4 protocol field 89.
With IPv6, raw sockets will be
471 used for ICMPv6 and to read and write IPv6 datagrams containing a
472 Next Header field that the kernel does not process. Examples of the
473 latter are a routing daemon for OSPF for IPv6 and RSVP (protocol
474 field 46).

476 All data sent via raw sockets MUST be in network byte order and all
477 data received via raw sockets will be in network byte order. This
478 differs from the IPv4 raw sockets, which did not specify a byte
479 ordering and typically used the host's byte order.

481 Another difference from IPv4 raw sockets is that complete packets
482 (that is, IPv6 packets with extension headers) cannot be transferred
483 via the IPv6 raw sockets API. Instead, ancillary data objects are
484 used to transfer the extension headers, as described later in this
485 document. Should an application need access to the complete IPv6
486 packet, some other technique, such as the datalink interfaces BPF or
487 DLPI, must be used.

489 All fields in the IPv6 header that an application might want to
490 change (i.e., everything other than the version number) can be
491 modified by the application. All fields in a received IPv6 header
492 (other than the version number and Next Header fields) and all
493 extension headers are also made available to the application. Hence
494 there is no need for a socket option similar to the IPv4 IP_HDRINCL
495 socket option.

497 When we say "an ICMPv6 raw socket" we mean a socket created by
498 calling the socket function with the three arguments PF_INET6,
499 SOCK_RAW, and IPPROTO_ICMPV6.

501 3.1. Checksums

503 The kernel will calculate and insert the ICMPv6 checksum for ICMPv6
504 raw sockets, since this checksum is mandatory.

506 For other raw IPv6 sockets (that is, for raw IPv6 sockets created
507 with a third argument other than IPPROTO_ICMPV6), the application
508 must set the new IPV6_CHECKSUM socket option to have the kernel
509 compute and store a checksum.
This option prevents applications from
510 having to perform source address selection on the packets they send.
511 The checksum will incorporate the IPv6 pseudo-header, defined in
512 Section 8.1 of [1]. This new socket option also specifies an integer
513 offset into the user data of where the checksum is to be placed.

515     int offset = 2;
516     setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM, &offset, sizeof(offset));

518 By default, this socket option is disabled, which means the kernel
519 will not calculate and store a checksum. If the offset is set to -1,
520 this tells the kernel not to calculate and store a checksum.

522 (Note: Since the checksum is always calculated by the kernel for an
523 ICMPv6 socket, applications are not able to generate ICMPv6 packets
524 with incorrect checksums (presumably for testing purposes) using this
525 API.)

527 3.2. ICMPv6 Type Filtering

529 ICMPv4 raw sockets receive most ICMPv4 messages received by the
530 kernel. (We say "most" and not "all" because Berkeley-derived
531 kernels never pass echo requests, timestamp requests, or address mask
532 requests to a raw socket. Instead these three messages are processed
533 entirely by the kernel.) But ICMPv6 is a superset of ICMPv4, also
534 including the functionality of IGMPv4 and ARPv4. This means that an
535 ICMPv6 raw socket can potentially receive many more messages than
536 would be received with an ICMPv4 raw socket: ICMP messages similar to
537 ICMPv4, along with neighbor solicitations, neighbor advertisements,
538 and the three group membership messages.

540 Most applications using an ICMPv6 raw socket care about only a small
541 subset of the ICMPv6 message types. Transferring extraneous ICMPv6
542 messages from the kernel to user space can incur significant overhead.
543 Therefore this API includes a method of filtering ICMPv6 messages by
544 the ICMPv6 type field.

546 Each ICMPv6 raw socket has an associated filter whose datatype is
547 defined as

549     struct icmp6_filter;

551 This structure, along with the functions and constants defined later
552 in this section, is defined as a result of including the
553 header.

555 The current filter is fetched and stored using getsockopt() and
556 setsockopt() with a level of IPPROTO_ICMPV6 and an option name of
557 ICMPV6_FILTER.

559 Six functions operate on an icmp6_filter structure:

561     void ICMPV6_FILTER_SETPASSALL (struct icmp6_filter *);
562     void ICMPV6_FILTER_SETBLOCKALL(struct icmp6_filter *);

564     void ICMPV6_FILTER_SETPASS ( int, struct icmp6_filter *);
565     void ICMPV6_FILTER_SETBLOCK( int, struct icmp6_filter *);

567     int ICMPV6_FILTER_WILLPASS (int, const struct icmp6_filter *);
568     int ICMPV6_FILTER_WILLBLOCK(int, const struct icmp6_filter *);

570 The first argument to the last four functions (an integer) is an
571 ICMPv6 message type, between 0 and 255. The pointer argument to all
572 six functions is a pointer to a filter that is modified by the first
573 four functions and examined by the last two functions.

575 The first two functions, SETPASSALL and SETBLOCKALL, let us specify
576 that all ICMPv6 messages are passed to the application or that all
577 ICMPv6 messages are blocked from being passed to the application.

579 The next two functions, SETPASS and SETBLOCK, let us specify that
580 messages of a given ICMPv6 type should be passed to the application
581 or not passed to the application (blocked).

583 The final two functions, WILLPASS and WILLBLOCK, return true or false
584 depending whether the specified message type is passed to the
585 application or blocked from being passed to the application by the
586 filter pointed to by the second argument.

588 When an ICMPv6 raw socket is created, it will by default pass all
589 ICMPv6 message types to the application.
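
The pass/block semantics of the six functions described above can be exercised without a raw socket. The following is a minimal sketch, not part of the draft: the names demo_filter, demo_setpass and so on are hypothetical stand-ins for the opaque icmp6_filter and the ICMPV6_FILTER_xxx functions, and the one-bit-per-type layout is only one possible representation. It builds the filter a neighbor discovery daemon might want, passing only message types 133 through 137.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the opaque icmp6_filter:
 * one bit per ICMPv6 type (0..255), bit set means "pass". */
struct demo_filter {
    uint32_t bits[8];                    /* 8 * 32 = 256 bits */
};

void demo_setpassall(struct demo_filter *f)  { memset(f, 0xFF, sizeof(*f)); }
void demo_setblockall(struct demo_filter *f) { memset(f, 0, sizeof(*f)); }

void demo_setpass(int type, struct demo_filter *f)
{
    assert(type >= 0 && type <= 255);    /* ICMPv6 type is one octet */
    f->bits[type >> 5] |= (uint32_t)1 << (type & 31);
}

void demo_setblock(int type, struct demo_filter *f)
{
    assert(type >= 0 && type <= 255);
    f->bits[type >> 5] &= ~((uint32_t)1 << (type & 31));
}

int demo_willpass(int type, const struct demo_filter *f)
{
    assert(type >= 0 && type <= 255);
    return (f->bits[type >> 5] & ((uint32_t)1 << (type & 31))) != 0;
}

int demo_willblock(int type, const struct demo_filter *f)
{
    return !demo_willpass(type, f);
}

/* Block everything, then pass only the five neighbor discovery
 * types (133 through 137) listed in Section 2.2.2. */
void demo_nd_only(struct demo_filter *f)
{
    int type;

    demo_setblockall(f);
    for (type = 133; type <= 137; type++)
        demo_setpass(type, f);
}
```

With a real implementation the resulting filter would then be installed with setsockopt() at level IPPROTO_ICMPV6; that step is omitted here because it requires a raw socket and the privileges to create one.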

591 As an example, a Ping program could execute the following:

593     struct icmp6_filter myfilt;

595     fd = socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6);

597     ICMPV6_FILTER_SETBLOCKALL(&myfilt);
598     ICMPV6_FILTER_SETPASS(ICMPV6_ECHOREPLY, &myfilt);
599     setsockopt(fd, IPPROTO_ICMPV6, ICMPV6_FILTER, &myfilt, sizeof(myfilt));

601 The filter structure is declared and then initialized to block all
602 message types. The filter structure is then changed to allow ICMPv6
603 echo reply messages to be passed to the application and the filter is
604 installed using setsockopt().

606 The icmp6_filter structure is similar to the fd_set datatype used
607 with the select() function in the sockets API. The icmp6_filter
608 structure is an opaque datatype and the application should not care
609 how it is implemented. All the application does with this datatype
610 is allocate a variable of this type, pass a pointer to a variable of
611 this type to getsockopt() and setsockopt(), and operate on a variable
612 of this type using the six functions that we just defined.

614 Nevertheless, it is worth showing a simple implementation of this
615 datatype and the six functions, which can be implemented as C macros.
617 struct icmp6_filter { 618 u_int32_t data[8]; /* 8*32 = 256 bits */ 619 }; 621 #define ICMPV6_FILTER_WILLPASS(type, filterp) \ 622 ((((filterp)->data[(type) >> 5]) & (1 << ((type) & 31))) != 0) 623 #define ICMPV6_FILTER_WILLBLOCK(type, filterp) \ 624 ((((filterp)->data[(type) >> 5]) & (1 << ((type) & 31))) == 0) 625 #define ICMPV6_FILTER_SETPASS(type, filterp) \ 626 ((((filterp)->data[(type) >> 5]) |= (1 << ((type) & 31)))) 627 #define ICMPV6_FILTER_SETBLOCK(type, filterp) \ 628 ((((filterp)->data[(type) >> 5]) &= ~(1 << ((type) & 31)))) 629 #define ICMPV6_FILTER_SETPASSALL(filterp) \ 630 memset((filterp), 0xFF, sizeof(struct icmp6_filter)) 631 #define ICMPV6_FILTER_SETBLOCKALL(filterp) \ 632 memset((filterp), 0, sizeof(struct icmp6_filter)) 634 (Note: These sample definitions have two limitations that an 635 implementation may want to change. The first four macros evaluate 636 their first argument two times. The last two macros require the 637 inclusion of the header for the memset() function.) 639 4. Ancillary Data 641 4.2BSD allowed file descriptors to be transferred between separate 642 processes across a UNIX domain socket using the sendmsg() and 643 recvmsg() functions. Two members of the msghdr structure, 644 msg_accrights and msg_accrightslen, were used to send and receive the 645 descriptors. When the OSI protocols were added to 4.3BSD Reno in 646 1990 the names of these two fields in the msghdr structure were 647 changed to msg_control and msg_controllen, because they were used by 648 the OSI protocols for "control information", although the comments in 649 the source code call this "ancillary data". 651 Other than the OSI protocols, the use of ancillary data has been 652 rare. In 4.4BSD, for example, the only use of ancillary data with 653 IPv4 is to return the destination address of a received UDP datagram 654 if the IP_RECVDSTADDR socket option is set. With Unix domain sockets 655 ancillary data is still used to send and receive descriptors.
657 Nevertheless the ancillary data fields of the msghdr structure 658 provide a clean way to pass information in addition to the data that 659 is being read or written. The inclusion of the msg_control and 660 msg_controllen members of the msghdr structure along with the cmsghdr 661 structure that is pointed to by the msg_control member is required by 662 the Posix.1g sockets API standard (which should be completed during 663 1997). 665 In this document ancillary data is used to exchange the following 666 optional information between the application and the kernel: 668 1. the send/receive interface and source/destination address, 669 2. the hop limit, 670 3. next hop address, 671 4. Hop-by-Hop options, 672 5. Destination options, and 673 6. Routing header. 675 Before describing these uses in detail, we review the definition of 676 the msghdr structure itself, the cmsghdr structure that defines an 677 ancillary data object, and some functions that operate on the 678 ancillary data objects. 680 4.1. The msghdr Structure 682 The msghdr structure is used by the recvmsg() and sendmsg() 683 functions. Its Posix.1g definition is: 685 struct msghdr { 686 void *msg_name; /* ptr to socket address structure */ 687 size_t msg_namelen; /* size of socket address structure */ 688 struct iovec *msg_iov; /* scatter/gather array */ 689 size_t msg_iovlen; /* # elements in msg_iov */ 690 void *msg_control; /* ancillary data */ 691 size_t msg_controllen; /* ancillary data buffer length */ 692 int msg_flags; /* flags on received message */ 693 }; 695 The structure is declared as a result of including . 697 (Note: Before Posix.1g the two "void *" pointers were typically "char 698 *", and the three size_t members were typically integers. The change 699 in msg_control to a "void *" pointer affects any code that increments 700 this pointer.) 702 Most Berkeley-derived implementations limit the amount of ancillary 703 data in a call to sendmsg() to no more than 108 bytes (an mbuf). 
704 This API requires a minimum of 10240 bytes of ancillary data, but it 705 is recommended that the amount be limited only by the buffer space 706 reserved by the socket (which can be modified by the SO_SNDBUF socket 707 option). (Note: This magic number 10240 was picked as a value that 708 should always be large enough. 108 bytes is clearly too small as the 709 maximum size of a Type 0 Routing header is 376 bytes.) 711 4.2. The cmsghdr Structure 713 The cmsghdr structure describes ancillary data objects transferred by 714 recvmsg() and sendmsg(). Its Posix.1g definition is: 716 struct cmsghdr { 717 size_t cmsg_len; /* #bytes, including this header */ 718 int cmsg_level; /* originating protocol */ 719 int cmsg_type; /* protocol-specific type */ 720 /* followed by unsigned char cmsg_data[]; */ 721 }; 723 This structure is declared as a result of including . 725 As shown in this definition, normally there is no member with the 726 name cmsg_data[]. Instead, the data portion is accessed using the 727 CMSG_xxx() functions, as described shortly. Nevertheless, it is 728 common to refer to the cmsg_data[] member. 730 (Note: Before Posix.1g the cmsg_len member was an integer, and not a 731 size_t. On a 32-bit architecture this probably has no effect, but on 732 a 64-bit architecture this could change the size of this member from 733 4 bytes to 8 bytes and force 8 byte alignment for the structure.) 735 When ancillary data is sent or received, any number of ancillary data 736 objects can be specified by the msg_control and msg_controllen 737 members of the msghdr structure, because each object is preceded by a 738 cmsghdr structure defining the object's length (the cmsg_len member). 739 Historically Berkeley-derived implementations have passed only one 740 object at a time, but this API allows multiple objects to be passed 741 in a single call to sendmsg() or recvmsg(). The following example 742 shows two ancillary data objects in a control buffer. 
744 |<--------------------------- msg_controllen -------------------------->| 745 | | 746 |<----- ancillary data object ----->|<----- ancillary data object ----->| 747 |<---------- CMSG_SPACE() --------->|<---------- CMSG_SPACE() --------->| 748 | | | 749 |<---------- cmsg_len ---------->| |<--------- cmsg_len ----------->| | 750 |<--------- CMSG_LEN() --------->| |<-------- CMSG_LEN() ---------->| | 751 | | | | | 752 +-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+ 753 |cmsg_|cmsg_|cmsg_|XX| |XX|cmsg_|cmsg_|cmsg_|XX| |XX| 754 |len |level|type |XX|cmsg_data[]|XX|len |level|type |XX|cmsg_data[]|XX| 755 +-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+ 756 ^ 757 | 758 msg_control 759 points here 761 The fields shown as "XX" are possible padding, between the cmsghdr 762 structure and the data, and between the data and the next cmsghdr 763 structure, if required by the implementation. 765 4.3. Ancillary Data Object Functions 767 To aid in the manipulation of ancillary data objects, three functions 768 from 4.4BSD are defined by Posix.1g: CMSG_DATA(), CMSG_NXTHDR(), and 769 CMSG_FIRSTHDR(). Before describing these functions, we show the 770 following example of how they might be used with a call to recvmsg(). 772 struct msghdr msg; 773 struct cmsghdr *cmsgptr; 775 /* fill in msg */ 777 /* call recvmsg() */ 779 for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL; 780 cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) { 781 if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) { 782 u_char *ptr; 784 ptr = CMSG_DATA(cmsgptr); 785 /* process data pointed to by ptr */ 786 } 787 } 789 We now describe the three Posix.1g functions, followed by two more 790 that are new with this API: CMSG_SPACE() and CMSG_LEN(). All these 791 functions are defined as a result of including . 793 4.3.1. 
CMSG_FIRSTHDR 795 struct cmsghdr *CMSG_FIRSTHDR(const struct msghdr *mhdr); 797 CMSG_FIRSTHDR() returns a pointer to the first cmsghdr structure in 798 the msghdr structure pointed to by mhdr. The function returns NULL 799 if there is no ancillary data pointed to by the msghdr structure 800 (that is, if either msg_control is NULL or if msg_controllen is less 801 than the size of a cmsghdr structure). 803 One possible implementation could be 805 #define CMSG_FIRSTHDR(mhdr) \ 806 ( (mhdr)->msg_controllen >= sizeof(struct cmsghdr) ? \ 807 (struct cmsghdr *)(mhdr)->msg_control : \ 808 (struct cmsghdr *)NULL ) 810 (Note: Most existing implementations do not test the value of 811 msg_controllen, and just return the value of msg_control. The value 812 of msg_controllen must be tested, because if the application asks 813 recvmsg() to return ancillary data, by setting msg_control to point 814 to the application's buffer and setting msg_controllen to the length 815 of this buffer, the kernel indicates that no ancillary data is 816 available by setting msg_controllen to 0 on return. It is also 817 easier to put this test into this macro than to make the application 818 perform the test.) 820 4.3.2. CMSG_NXTHDR 822 struct cmsghdr *CMSG_NXTHDR(const struct msghdr *mhdr, 823 const struct cmsghdr *cmsg); 825 CMSG_NXTHDR() returns a pointer to the cmsghdr structure describing 826 the next ancillary data object. mhdr is a pointer to a msghdr 827 structure and cmsg is a pointer to a cmsghdr structure. If there is 828 not another ancillary data object, the return value is NULL. 830 The following behavior of this function is new to this API: if the 831 value of the cmsg pointer is NULL, a pointer to the cmsghdr structure 832 describing the first ancillary data object is returned. That is, 833 CMSG_NXTHDR(mhdr, NULL) is equivalent to CMSG_FIRSTHDR(mhdr). If 834 there are no ancillary data objects, the return value is NULL.
This 835 provides an alternative way of coding the processing loop shown 836 earlier: 838 struct msghdr msg; 839 struct cmsghdr *cmsgptr = NULL; 841 /* fill in msg */ 843 /* call recvmsg() */ 845 while ((cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) != NULL) { 846 if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) { 847 u_char *ptr; 849 ptr = CMSG_DATA(cmsgptr); 850 /* process data pointed to by ptr */ 851 } 852 } 854 One possible implementation could be: 856 #define CMSG_NXTHDR(mhdr, cmsg) \ 857 ( ((cmsg) == NULL) ? CMSG_FIRSTHDR(mhdr) : \ 858 (((u_char *)(cmsg) + ALIGN((cmsg)->cmsg_len) \ 859 + ALIGN(sizeof(struct cmsghdr)) > \ 860 (u_char *)((mhdr)->msg_control) + (mhdr)->msg_controllen) ? \ 861 (struct cmsghdr *)NULL : \ 862 (struct cmsghdr *)((u_char *)(cmsg) + ALIGN((cmsg)->cmsg_len))) ) 864 The macro ALIGN(), which is implementation dependent, rounds its 865 argument up to the next even multiple of whatever alignment is 866 required (probably a multiple of 4 or 8 bytes). 868 4.3.3. CMSG_DATA 870 unsigned char *CMSG_DATA(const struct cmsghdr *cmsg); 872 CMSG_DATA() returns a pointer to the data (what is called the 873 cmsg_data[] member, even though such a member is not defined in the 874 structure) following a cmsghdr structure. 876 One possible implementation could be: 878 #define CMSG_DATA(cmsg) ( (u_char *)(cmsg) + \ 879 ALIGN(sizeof(struct cmsghdr)) ) 881 4.3.4. CMSG_SPACE 883 unsigned int CMSG_SPACE(unsigned int length); 885 This function is new with this API. Given the length of an ancillary 886 data object, CMSG_SPACE() returns the space required by the object 887 and its cmsghdr structure, including any padding needed to satisfy 888 alignment requirements. This function can be used, for example, to 889 allocate space dynamically for the ancillary data. This function 890 should not be used to initialize the cmsg_len member of a cmsghdr 891 structure; instead use the CMSG_LEN() function. 
893 One possible implementation could be: 895 #define CMSG_SPACE(length) ( ALIGN(sizeof(struct cmsghdr)) + \ 896 ALIGN(length) ) 898 4.3.5. CMSG_LEN 900 unsigned int CMSG_LEN(unsigned int length); 902 This function is new with this API. Given the length of an ancillary 903 data object, CMSG_LEN() returns the value to store in the cmsg_len 904 member of the cmsghdr structure, taking into account any padding 905 needed to satisfy alignment requirements. 907 One possible implementation could be: 909 #define CMSG_LEN(length) ( ALIGN(sizeof(struct cmsghdr)) + (length) ) 911 Note the difference between CMSG_SPACE() and CMSG_LEN(), shown also 912 in the figure in Section 4.2: the former accounts for any required 913 padding at the end of the ancillary data object and the latter is the 914 actual length to store in the cmsg_len member of the ancillary data 915 object. 917 4.4. Summary of Options Described Using Ancillary Data 918 There are six types of optional information described in this 919 document that are passed between the application and the kernel using 920 ancillary data: 922 1. the send/receive interface and source/destination address, 923 2. the hop limit, 924 3. next hop address, 925 4. Hop-by-Hop options, 926 5. Destination options, and 927 6. Routing header. 929 First, to receive any of this optional information (other than the 930 next hop address, which can only be set), the application must call 931 setsockopt() to turn on the corresponding flag: 933 int on = 1; 935 setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on)); 936 setsockopt(fd, IPPROTO_IPV6, IPV6_HOPLIMIT, &on, sizeof(on)); 937 setsockopt(fd, IPPROTO_IPV6, IPV6_HOPOPTS, &on, sizeof(on)); 938 setsockopt(fd, IPPROTO_IPV6, IPV6_DSTOPTS, &on, sizeof(on)); 939 setsockopt(fd, IPPROTO_IPV6, IPV6_SRCRT, &on, sizeof(on)); 941 When any of these options are enabled, the corresponding data is 942 returned as control information by recvmsg(), as one or more 943 ancillary data objects.
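(Note: As a minimal sketch, not part of the API definition, a receiver might dispatch on the returned ancillary data objects as follows. The structure rx_summary and the function summarize_control() are hypothetical names introduced only for this illustration. The sketch assumes a system whose headers already define IPV6_PKTINFO and IPV6_HOPLIMIT as cmsg_type values, and it only walks a msghdr that recvmsg() has already filled in.)

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical summary of the control information in one message. */
struct rx_summary {
    int nobjects;      /* number of ancillary data objects seen */
    int saw_pktinfo;   /* nonzero if an IPV6_PKTINFO object was present */
    int hoplimit;      /* received hop limit, or -1 if not present */
};

/* Walk every ancillary data object in msg and record what was found. */
static void summarize_control(struct msghdr *msg, struct rx_summary *out)
{
    struct cmsghdr *c;

    assert(msg != NULL && out != NULL);
    out->nobjects = 0;
    out->saw_pktinfo = 0;
    out->hoplimit = -1;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        out->nobjects++;
        if (c->cmsg_level != IPPROTO_IPV6)
            continue;                    /* not one of our objects */
        if (c->cmsg_type == IPV6_PKTINFO) {
            out->saw_pktinfo = 1;
        } else if (c->cmsg_type == IPV6_HOPLIMIT &&
                   c->cmsg_len >= CMSG_LEN(sizeof(int))) {
            /* Copy rather than cast: cmsg_data[] need not be int-aligned. */
            memcpy(&out->hoplimit, CMSG_DATA(c), sizeof(int));
        }
    }
}
```

An application would call summarize_control(&msg, &summary) immediately after recvmsg() returns. If msg_controllen was set to 0 by the kernel, the loop body never runs and hoplimit stays at -1.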
945 Nothing special need be done to send any of this optional 946 information; the application just calls sendmsg() and specifies one 947 or more ancillary data objects as control information. 949 We also summarize the three cmsghdr fields that describe the 950 ancillary data objects: 952 cmsg_level cmsg_type cmsg_data[] #times 953 ------------ ------------ ------------------------ ------ 954 IPPROTO_IPV6 IPV6_PKTINFO in6_pktinfo structure once 955 IPPROTO_IPV6 IPV6_HOPLIMIT int once 956 IPPROTO_IPV6 IPV6_NEXTHOP socket address structure once 957 IPPROTO_IPV6 IPV6_HOPOPTS implementation dependent mult. 958 IPPROTO_IPV6 IPV6_DSTOPTS implementation dependent mult. 959 IPPROTO_IPV6 IPV6_SRCRT implementation dependent once 961 The final column indicates how many times an ancillary data object of 962 that type can appear as control information. The Hop-by-Hop and 963 Destination options can appear multiple times, while all the others 964 can appear only one time. 966 All these options are described in detail in following sections. All 967 the constants beginning with IPV6_ are defined as a result of 968 including the header. 970 (Note: It is up to the implementation what it passes as ancillary 971 data for the Hop-by-Hop option, Destination option, and source route 972 option, since the API to these features is through a set of 973 inet6_option_XXX() and inet6_srcrt_XXX() functions that we define 974 later. These functions serve two purposes: to simplify the interface 975 to these features (instead of requiring the application to know the 976 intimate details of the extension header formats), and to hide the 977 actual implementation from the application. Nevertheless, we show 978 some examples of these features that store the actual extension 979 header as the ancillary data. Implementations need not use this 980 technique.) 982 4.5. TCP Access to Ancillary Data 984 The summary in the previous section assumes a UDP socket. 
Sending 985 and receiving ancillary data is easy with UDP: the application calls 986 sendmsg() and recvmsg() instead of sendto() and recvfrom(). 988 But there might be cases where a TCP application wants to send or 989 receive this optional information. For example, a TCP client might 990 want to specify a source route and this needs to be done before 991 calling connect(). Similarly a TCP server might want to know the 992 received interface after accept() returns along with any Destination 993 options. 995 One new socket option is defined to allow TCP access to these 996 optional fields, although it is valid to use this with UDP or raw 997 sockets as well. Setting the socket option specifies any of the 998 optional output fields: 1000 setsockopt(fd, IPPROTO_IPV6, IPV6_PKTOPTIONS, &buf, len); 1002 The fourth argument points to a buffer containing one or more 1003 ancillary data objects, and the fifth argument is the total length of 1004 all these objects. The application fills in this buffer exactly as 1005 if the buffer were being passed to sendmsg() as control information. 1007 The corresponding receive option 1009 getsockopt(fd, IPPROTO_IPV6, IPV6_PKTOPTIONS, &buf, &len); 1011 returns a buffer with one or more ancillary data objects for all the 1012 optional receive information that the application has previously 1013 specified that it wants to receive. The fourth argument points to 1014 the buffer that is filled in by the call. The fifth argument is a 1015 pointer to a value-result integer: when the function is called the 1016 integer specifies the size of the buffer pointed to by the fourth 1017 argument, and on return this integer contains the actual number of 1018 bytes that were returned. The application processes this buffer 1019 exactly as if the buffer were returned by recvmsg() as control 1020 information. 
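(Note: The following is a sketch, under stated assumptions, of how an application might fill such a buffer with two ancillary data objects and compute the total length to pass as the fifth argument. The helper names append_object() and build_options() are hypothetical and are not defined by this API. The sketch assumes the system headers define IPV6_HOPLIMIT and IPV6_NEXTHOP, and that the caller supplies a buffer aligned for a cmsghdr structure.)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: append one ancillary data object at offset *used.
 * Returns 0 on success, -1 if the object does not fit in the buffer.
 */
static int append_object(void *buf, size_t buflen, size_t *used,
                         int level, int type, const void *data, size_t len)
{
    struct cmsghdr *c;

    assert(buf != NULL && used != NULL && data != NULL);
    if (*used > buflen || buflen - *used < CMSG_SPACE(len))
        return -1;

    c = (struct cmsghdr *)((unsigned char *)buf + *used);
    memset(c, 0, CMSG_SPACE(len));   /* clear header and padding */
    c->cmsg_len = CMSG_LEN(len);     /* header plus data, no trailing pad */
    c->cmsg_level = level;
    c->cmsg_type = type;
    memcpy(CMSG_DATA(c), data, len);
    *used += CMSG_SPACE(len);        /* advance past the trailing pad */
    return 0;
}

/*
 * Build a hop limit object followed by a next hop object.
 * Returns the total length of both objects, or 0 if buf is too small.
 */
static size_t build_options(void *buf, size_t buflen, int hoplimit,
                            const struct sockaddr_in6 *nexthop)
{
    size_t used = 0;

    if (append_object(buf, buflen, &used, IPPROTO_IPV6, IPV6_HOPLIMIT,
                      &hoplimit, sizeof(hoplimit)) < 0 ||
        append_object(buf, buflen, &used, IPPROTO_IPV6, IPV6_NEXTHOP,
                      nexthop, sizeof(*nexthop)) < 0)
        return 0;
    return used;
}
```

The value returned by build_options() is the length an application would hand to setsockopt() or store in msg_controllen. Note that each object's cmsg_len comes from CMSG_LEN() while the running offset advances by CMSG_SPACE().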
1022 When using getsockopt() with the IPV6_PKTOPTIONS option, only the 1023 options from the most recently received segment are retained and 1024 returned to the caller. Also, none of the ancillary data that we 1025 describe in this document is ever returned as control information by 1026 recvmsg() on a TCP socket. 1028 The options set by calling setsockopt() for IPV6_PKTOPTIONS are 1029 called "sticky" options because once set they apply to all packets 1030 sent on that socket. They may, however, be overridden with ancillary 1031 data specified in a call to sendmsg(). 1033 But the following three options are considered a set: Hop-by-Hop, 1034 Destination, and Routing header options. If any of these three 1035 options are specified in a call to sendmsg(), then none of these 1036 three from the socket's sticky options are sent for this packet. For 1037 example, if the application calls setsockopt() for IPV6_PKTOPTIONS 1038 and sets sticky values for the Hop-by-Hop and Destination options, 1039 but then calls sendmsg() specifying just a Routing header as an 1040 ancillary data object, then only the Routing header is sent with this 1041 packet. The two sticky options, Hop-by-Hop and Destination, are not 1042 sent for this packet. 1044 5. Packet Information 1046 There are four pieces of information that an application can specify 1047 for an outgoing packet using ancillary data: 1049 1. the source IPv6 address, 1050 2. the outgoing interface index, 1051 3. the outgoing hop limit, and 1052 4. the next hop address. 1054 Three similar pieces of information can be returned for a received 1055 packet as ancillary data: 1057 1. the destination IPv6 address, 1058 2. the arriving interface index, and 1059 3. the arriving hop limit. 1061 The flow label can also be considered as packet information, but its 1062 semantics differ from these three, so we describe it in Section 6. 
1064 The first two pieces of information are contained in an in6_pktinfo 1065 structure that is sent as ancillary data with sendmsg() and received 1066 as ancillary data with recvmsg(). This structure is defined as a 1067 result of including the header. 1069 struct in6_pktinfo { 1070 struct in6_addr ipi6_addr; /* src/dst IPv6 address */ 1071 int ipi6_ifindex; /* send/recv interface index */ 1072 }; 1074 In the cmsghdr structure containing this ancillary data, the 1075 cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be 1076 IPV6_PKTINFO, and the first byte of cmsg_data[] will be the first 1077 byte of the in6_pktinfo structure. 1079 This information is returned as ancillary data by recvmsg() only if 1080 the application has enabled the IPV6_PKTINFO socket option: 1082 int on = 1; 1083 setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on)); 1085 Nothing special need be done to send this information: just specify 1086 the control information as ancillary data for sendmsg(). 1088 (Note: The hop limit is not contained in the in6_pktinfo structure 1089 for the following reason. Some UDP servers want to respond to client 1090 requests by sending their reply out the same interface on which the 1091 request was received and with the source IPv6 address of the reply 1092 equal to the destination IPv6 address of the request. To do this the 1093 application can enable just the IPV6_PKTINFO socket option and then 1094 use the received control information from recvmsg() as the outgoing 1095 control information for sendmsg(). The application need not examine 1096 or modify the in6_pktinfo structure at all. But if the hop limit 1097 were contained in this structure, the application would have to parse 1098 the received control information and change the hop limit member, 1099 since the received hop limit is not the desired value for an outgoing 1100 packet.) 1102 5.1. 
Specifying/Receiving the Interface 1104 Interfaces on an IPv6 node are identified by a small positive 1105 integer, as described in Section 4 of [2]. That document also 1106 describes a function to map an interface name to its interface index, 1107 a function to map an interface index to its interface name, and a 1108 function to return all the interface names and indexes. Notice from 1109 this document that no interface is ever assigned an index of 0. 1111 When specifying the outgoing interface, if the ipi6_ifindex value is 1112 0, the kernel will choose the outgoing interface. If the application 1113 specifies an outgoing interface for a multicast packet, the interface 1114 specified by the ancillary data overrides any interface specified by 1115 the IPV6_MULTICAST_IF socket option (described in [2]), for that call 1116 to sendmsg() only. 1118 When the IPV6_PKTINFO socket option is enabled, the received 1119 interface index is always returned as the ipi6_ifindex member of the 1120 in6_pktinfo structure. 1122 5.2. Specifying/Receiving Source/Destination Address 1124 The source IPv6 address can be specified by calling bind() before 1125 each output operation, but supplying the source address together with 1126 the data requires less overhead (i.e., fewer system calls) and 1127 requires less state to be stored and protected in a multithreaded 1128 application. 1130 When specifying the source IPv6 address as ancillary data, if the 1131 ipi6_addr member of the in6_pktinfo structure is the unspecified 1132 address (IN6ADDR_ANY_INIT), then (a) if an address is currently bound 1133 to the socket, it is used as the source address, or (b) if no address 1134 is currently bound to the socket, the kernel will choose the source 1135 address. If the ipi6_addr member is not the unspecified address, but 1136 the socket has already bound a source address, then the ipi6_addr 1137 value overrides the already-bound source address for this output 1138 operation only.
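(Note: The source address rule above can be modeled by the following sketch. It is only an illustration of the selection order, not kernel code, and the function name pick_source() is hypothetical. A NULL return stands for "the kernel chooses the source address".)

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/*
 * Model of the source address selection rule.
 *   ancillary: the ipi6_addr value supplied with this output operation
 *   bound:     the address bound to the socket, or NULL if none
 * Returns the address to use, or NULL when the kernel must choose.
 */
static const struct in6_addr *
pick_source(const struct in6_addr *ancillary, const struct in6_addr *bound)
{
    assert(ancillary != NULL);

    /* A specified ipi6_addr overrides any bound address, this call only. */
    if (!IN6_IS_ADDR_UNSPECIFIED(ancillary))
        return ancillary;

    /* Case (a): fall back to the address bound to the socket. */
    if (bound != NULL && !IN6_IS_ADDR_UNSPECIFIED(bound))
        return bound;

    /* Case (b): nothing bound, so the kernel chooses. */
    return NULL;
}
```

The override in the first test is not remembered: the next output operation that passes the unspecified address falls back to the bound address again.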
1140 When the in6_pktinfo structure is returned as ancillary data by 1141 recvmsg(), the ipi6_addr member contains the destination IPv6 address 1142 from the received packet. 1144 5.3. Specifying/Receiving the Hop Limit 1146 The outgoing hop limit is normally specified with either the 1147 IPV6_UNICAST_HOPS socket option or the IPV6_MULTICAST_HOPS socket 1148 option, both of which are described in [2]. Specifying the hop limit 1149 as ancillary data lets the application override either the kernel's 1150 default or a previously specified value, for either a unicast 1151 destination or a multicast destination, for a single output 1152 operation. Returning the received hop limit is useful for programs 1153 such as Traceroute and for IPv6 applications that need to verify that 1154 the received hop limit is 255 (e.g., that the packet has not been 1155 forwarded). 1157 The received hop limit is returned as ancillary data by recvmsg() 1158 only if the application has enabled the IPV6_HOPLIMIT socket option: 1160 int on = 1; 1161 setsockopt(fd, IPPROTO_IPV6, IPV6_HOPLIMIT, &on, sizeof(on)); 1163 In the cmsghdr structure containing this ancillary data, the 1164 cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be 1165 IPV6_HOPLIMIT, and the first byte of cmsg_data[] will be the first 1166 byte of the integer hop limit. 1168 Nothing special need be done to specify the outgoing hop limit: just 1169 specify the control information as ancillary data for sendmsg(). As 1170 specified in [2], the interpretation of the integer hop limit value 1171 is 1173 x < -1: return an error of EINVAL 1174 x == -1: use kernel default 1175 0 <= x <= 255: use x 1176 x >= 256: return an error of EINVAL 1178 5.4. Specifying the Next Hop Address 1180 The IPV6_NEXTHOP ancillary data object specifies the next hop for the 1181 datagram as a socket address structure. 
In the cmsghdr structure 1182 containing this ancillary data, the cmsg_level member will be 1183 IPPROTO_IPV6, the cmsg_type member will be IPV6_NEXTHOP, and the 1184 first byte of cmsg_data[] will be the first byte of the socket 1185 address structure. 1187 This is a privileged option. 1189 If the socket address structure contains an IPv6 address (i.e., the 1190 sin6_family member is AF_INET6), then the node identified by that 1191 address must be a neighbor of the sending host. If that address 1192 equals the destination IPv6 address of the datagram, then this is 1193 equivalent to the existing SO_DONTROUTE socket option. 1195 5.5. Additional Errors with sendmsg() 1197 With the IPV6_PKTINFO socket option there are no additional errors 1198 possible with the call to recvmsg(). But when specifying the 1199 outgoing interface or the source address, additional errors are 1200 possible from sendmsg(): 1202 ENXIO The interface specified by ipi6_ifindex does not exist. 1204 ENETDOWN The interface specified by ipi6_ifindex is not enabled 1205 for IPv6 use. 1207 EADDRNOTAVAIL ipi6_ifindex specifies an interface but the address 1208 ipi6_addr is not available for use on that interface. 1210 EHOSTUNREACH No route to the destination exists over the interface 1211 specified by ipi6_ifindex. 1213 6. Flow Labels 1215 IPv6 allows packets to be explicitly labeled as belonging to a flow 1216 of related packets (Section 6 of [1]). All packets with a given IPv6 1217 source address that share the same flow label must have the following 1218 fields in common as well: destination address (unicast or multicast), 1219 priority, Hop-by-Hop options header, and if a Routing header is 1220 present, all extension headers up to and including the Routing 1221 header. Flow label values must be uniformly distributed in the range 1222 [1, 2^24-1] so that routers may use any portion of the flow label as 1223 a hash key to access stored state for the flow.
1225 The following points must be considered in designing an API to 1226 specify flow labels. 1228 - Space is already allocated in the sockaddr_in6 structure for the 1229 flow label. This implies that the process specifies the value 1230 (setting it to 0 to indicate no flow), in a call to connect() for 1231 a connected socket, or in a call to sendto() or sendmsg() for an 1232 unconnected socket. (Note: The sin6_flowinfo field performs 1233 double duty, carrying both the outgoing flow and the incoming 1234 flow. UDP applications that read requests using recvfrom() and 1235 then send a reply using sendto() must not use the incoming flow 1236 label for the outgoing reply.) 1238 - Generation of flow labels should be in the kernel, since they must 1239 be unique for a given source address, destination address and 1240 priority. The kernel also must keep track of the assigned flow 1241 labels to prevent them from being reused by a new flow within the 1242 flow-state lifetime (6 seconds default). 1244 - These first two points imply that the kernel assigns the flow 1245 label, but the process needs a way to obtain its value from the 1246 kernel. 1248 - To assign a flow label the process must specify the destination 1249 address and priority. (Note: The use of the priority field in the 1250 IPv6 header is still subject to change. The basic API spec [2] 1251 removed all references to this field for this reason. Therefore 1252 it is unspecified how a process specifies a nonzero priority 1253 field.) 1255 - All packets belonging to the same flow must also have the same 1256 Hop-by-Hop header and, if a Routing header is present, all 1257 extension headers up to and including the Routing header. 1258 Therefore, when a process asks to have a flow label assigned, it 1259 should also specify these extension headers that must remain 1260 constant for the flow. 
1262 - For a connected socket (TCP or UDP) the process must be able to 1263 specify a flow label either when the connection is established (as 1264 part of the sockaddr_in6 structure that is passed to connect()), 1265 or after the connection is established (the kernel should notice 1266 that the socket is already connected when it is asked to assign 1267 the flow label, and then start using it for that socket). On 1268 these connected sockets the process calls write() or send(), and 1269 does not specify a sockaddr_in6 structure with the flow 1270 label--hence the requirement that the kernel store the value and 1271 automatically use it. 1273 - For an unconnected UDP socket the process must ask the kernel to 1274 assign the flow label, obtain the value, and then use that value 1275 in subsequent calls to sendto() or sendmsg(). 1277 - It should be possible for a UDP application that will communicate 1278 with N peer processes to assign up to N different flow labels to a 1279 given socket. The process obtains the N values from the kernel 1280 and then uses the correct one for each of the N peers. 1282 - getpeername() can return the assigned flow label for a connected 1283 socket, but this function cannot be used to return the flow label 1284 for an unconnected socket. 1286 - Flows are defined between a source and destination. It should be 1287 possible for multiple sockets between a given source and 1288 destination to share the same flow label. This implies that it 1289 must be possible for a flow label assigned to one socket to be 1290 "reused" by another socket. 1292 One way a TCP client could do this, for example, is to obtain a 1293 flow to a given destination and then simply use that flow label in 1294 the socket address structures for multiple connect()s to the same 1295 server (e.g., Web clients).
But it should also be possible to use 1296 some already assigned flow on an already connected socket, 1297 implying some way to tell the kernel to use an already assigned 1298 flow on a given socket. 1300 - There is some error checking that the kernel could perform with 1301 regard to flow labels, and the API should not address these, but 1302 leave them up to the implementation. For example, what if the 1303 process asks the kernel to allocate a flow label to DST1 for 1304 SOCKFD1 but then calls connect(SOCKFD1) connecting to DST2 using 1305 the flow label that was assigned to DST1? Or when a UDP 1306 application allocates multiple flow labels, but uses them 1307 incorrectly? Or when a UDP application allocates a flow to a 1308 destination, but then sends datagrams with the flow label set to 1309 0? 1311 - Flow labels are often mentioned along with RSVP, but the 1312 interaction between RSVP reservations and IPv6 flow labels is 1313 unclear (Section 1.2 of [5]). We note that RSVP is receiver-driven, 1314 while IPv6 flow labels must be chosen by the sender. 1316 - Lastly, the use of flow labels is still experimental. All this 1317 API can provide is some way to allocate flow labels within the 1318 rules provided in [1], allowing the kernel to enforce the 1319 requirements on common packet fields and freeing the application 1320 from the burden of selecting unique pseudo-random flow labels. 1322 The interface to the flow label feature is through three 1323 inet6_flow_XXX() functions. The function prototypes for these 1324 functions are all in the header. 1326 6.1. inet6_flow_assign 1328 int inet6_flow_assign(int fd, struct sockaddr_in6 *sin6, 1329 const void *buf, size_t len); 1331 To cause a flow label to be assigned the application must specify the 1332 socket, destination address, priority, and the optional headers that 1333 are not allowed to change for the flow.
1335 The socket address structure pointed to by sin6 specifies the 1336 destination address and priority. The flow label and port number 1337 fields are ignored. 1339 The buffer specified by the buf and len arguments contains the Hop- 1340 by-Hop options, the Destination options that precede the optional 1341 Routing header, and the optional Routing header. The format of the 1342 buffer is a sequence of ancillary data objects, as described with the 1343 IPV6_PKTOPTIONS socket option. 1345 The flow label is assigned and returned in the sin6_flowinfo member 1346 of the socket address structure. 1348 This function returns 0 on success, -1 on error. 1350 If an earlier connect() or accept() has already connected the socket 1351 to the destination address supplied in this call, then subsequent 1352 output operations will have the assigned flow label in the IPv6 1353 header. 1355 If the socket is not connected then the application must use the 1356 returned flow label in a subsequent call to connect(), sendto(), or 1357 sendmsg(). 1359 (Note: It makes no sense to assign a flow to a listening TCP socket, 1360 since a destination address is required to assign the flow.) (Note: 1361 Since the socket address structure pointed to by the second argument 1362 is both a value and a result, implementations might consider using 1363 ioctl() for flow label access. Note that if this function were 1364 implemented using setsockopt() followed by getsockopt(), it would not 1365 be thread safe.) 1367 6.2. inet6_flow_free 1369 int inet6_flow_free(int fd, const struct sockaddr_in6 *sin6); 1371 A previously assigned flow label can be explicitly freed. If this 1372 function is not called, the flow label is automatically freed on the 1373 last close of the socket. 1375 The flow label field in the socket address structure specifies the 1376 flow label that is being freed. 1378 This function returns 0 on success, -1 on error. 1380 6.3.
inet6_flow_reuse 1382 int inet6_flow_reuse(int currfd, int newfd, 1383 const struct sockaddr_in6 *sin6); 1385 A flow label assigned to one socket can be used on another socket 1386 (subject to the basic limitations of flow labels, of course, such as 1387 packets belonging to the flow from both sockets having the same 1388 destination address, etc.). This function needs to be called only if 1389 the new socket is already connected. If the new socket is not 1390 already connected, the application can just specify the known flow 1391 label in a call to connect(), sendto(), or sendmsg(). 1393 This function specifies that the flow label previously assigned to 1394 the socket currfd is also to be used on the socket newfd. 1396 The caller must fill in the destination address, priority, and flow 1397 label fields of the socket address structure. 1399 If the socket newfd is already connected to the destination address, 1400 subsequent output operations will have the assigned flow label in the 1401 IPv6 header. 1403 This function returns 0 on success, -1 on error. 1405 7. Hop-By-Hop Options 1407 A variable number of Hop-by-Hop options can appear in a single Hop- 1408 by-Hop options header. Each option in the header is TLV-encoded with 1409 a type, length, and value. 1411 Today only three Hop-by-Hop options are defined for IPv6 [1]: Jumbo 1412 Payload, Pad1, and PadN, although a proposal exists for a router- 1413 alert Hop-by-Hop option. The Jumbo Payload option should not be 1414 passed back to an application and an application should receive an 1415 error if it attempts to set it. This option is processed entirely by 1416 the kernel. It is indirectly specified by datagram-based 1417 applications as the size of the datagram to send and indirectly 1418 passed back to these applications as the length of the received 1419 datagram. 
The two pad options are for alignment purposes and are 1420 automatically inserted by a sending kernel when needed and ignored by 1421 the receiving kernel. This section of the API is therefore defined 1422 for future Hop-by-Hop options that an application may need to specify 1423 and receive. 1425 Individual Hop-by-Hop options (and Destination options, which are 1426 described shortly, and which are similar to the Hop-by-Hop options) 1427 may have specific alignment requirements. For example, the 4-byte 1428 Jumbo Payload length should appear on a 4-byte boundary, and IPv6 1429 addresses are normally aligned on an 8-byte boundary. These 1430 requirements and the terminology used with these options are 1431 discussed in Section 4.2 and Appendix A of [1]. The alignment of 1432 each option is specified by two values, called x and y, written as 1433 "xn + y". This states that the option must appear at an integer 1434 multiple of x bytes from the beginning of the options header (x can 1435 have the values 1, 2, 4, or 8), plus y bytes (y can have a value 1436 between 0 and 7, inclusive). The Pad1 and PadN options are inserted 1437 as needed to maintain the required alignment. Whatever code builds 1438 either a Hop-by-Hop options header or a Destination options header 1439 must know the values of x and y for each option. 1441 Multiple Hop-by-Hop options can be specified by the application. 1442 Normally one ancillary data object describes all the Hop-by-Hop 1443 options (since each option is itself TLV-encoded) but the application 1444 can specify multiple ancillary data objects for the Hop-by-Hop 1445 options, each object specifying one or more options. Care must be 1446 taken designing the API for these options since 1448 1. it may be possible for some future Hop-by-Hop options to be 1449 generated by the application and processed entirely by the 1450 application (e.g., the kernel may not know the alignment 1451 restrictions for the option), 1453 2. 
it must be possible for the kernel to insert its own Hop-by-Hop 1454 options in an outgoing packet (e.g., the Jumbo Payload option), 1456 3. the application can place one or more Hop-by-Hop options into a 1457 single ancillary data object, 1459 4. if the application specifies multiple ancillary data objects, 1460 each containing one or more Hop-by-Hop options, the kernel must 1461 combine these into a single Hop-by-Hop options header, and 1463 5. it must be possible for the kernel to remove some Hop-by-Hop 1464 options from a received packet before returning the remaining 1465 Hop-by-Hop options to the application. (This removal might 1466 consist of the kernel converting the option into a pad option of 1467 the same length.) 1469 Finally, we note that access to some Hop-by-Hop options or to some 1470 Destination options might require special privilege. That is, 1471 normal applications (without special privilege) might be forbidden 1472 from setting certain options in outgoing packets, and might never see 1473 certain options in received packets. 1475 7.1. Receiving Hop-by-Hop Options 1477 To receive Hop-by-Hop options the application must enable the 1478 IPV6_HOPOPTS socket option: 1480 int on = 1; 1481 setsockopt(fd, IPPROTO_IPV6, IPV6_HOPOPTS, &on, sizeof(on)); 1483 All the Hop-by-Hop options are returned as one ancillary data object 1484 described by a cmsghdr structure. The cmsg_level member will be 1485 IPPROTO_IPV6 and the cmsg_type member will be IPV6_HOPOPTS. These 1486 options are then processed by calling the inet6_option_next() and 1487 inet6_option_find() functions, described shortly. 1489 7.2. Sending Hop-by-Hop Options 1491 To send one or more Hop-by-Hop options, the application just 1492 specifies them as ancillary data in a call to sendmsg(). No socket 1493 option need be set. 1495 Normally all the Hop-by-Hop options are specified by a single 1496 ancillary data object.
Multiple ancillary data objects, each 1497 containing one or more Hop-by-Hop options, can also be specified, in 1498 which case the kernel will combine all the Hop-by-Hop options into a 1499 single Hop-by-Hop extension header. But it should be more efficient 1500 to use a single ancillary data object to describe all the Hop-by-Hop 1501 options. The cmsg_level member is set to IPPROTO_IPV6 and the 1502 cmsg_type member is set to IPV6_HOPOPTS. The option is normally 1503 constructed using the inet6_option_init(), inet6_option_append(), and 1504 inet6_option_alloc() functions, described shortly. 1506 Additional errors may be possible from sendmsg() if the specified 1507 option is in error. 1509 7.3. Hop-by-Hop and Destination Options Processing 1511 Building and parsing the Hop-by-Hop and Destination options is 1512 complicated for the reasons given earlier. We therefore define a set 1513 of functions to help the application. The function prototypes for 1514 these functions are all in the header. 1516 7.3.1. inet6_option_space 1518 int inet6_option_space(int nbytes); 1520 This function returns the number of bytes required to hold an option 1521 when it is stored as ancillary data, including the cmsghdr structure 1522 at the beginning, and any padding at the end (to make its size a 1523 multiple of 8 bytes). The argument is the size of the structure 1524 defining the option, which must include any pad bytes at the 1525 beginning (the value y in the alignment term "xn + y"), the type 1526 byte, the length byte, and the option data. 1528 (Note: If multiple options are stored in a single ancillary data 1529 object, which is the recommended technique, this function 1530 overestimates the amount of space required by the size of N-1 cmsghdr 1531 structures, where N is the number of options to be stored in the 1532 object. This is of little consequence, since it is assumed that most 1533 Hop-by-Hop option headers and Destination option headers carry only 1534 one option (p. 
33 of [1]).) 1536 7.3.2. inet6_option_init 1538 int inet6_option_init(void *bp, struct cmsghdr **cmsgp, int type); 1540 This function is called once per ancillary data object that will 1541 contain either Hop-by-Hop or Destination options. It returns 0 on 1542 success or -1 on an error. 1544 bp is a pointer to previously allocated space that will contain the 1545 ancillary data object. It must be large enough to contain all the 1546 individual options to be added by later calls to 1547 inet6_option_append() and inet6_option_alloc(). 1549 cmsgp is a pointer to a pointer to a cmsghdr structure. *cmsgp is 1550 initialized by this function to point to the cmsghdr structure 1551 constructed by this function in the buffer pointed to by bp. 1553 type is either IPV6_HOPOPTS or IPV6_DSTOPTS. This type is stored in 1554 the cmsg_type member of the cmsghdr structure pointed to by *cmsgp. 1556 7.3.3. inet6_option_append 1558 int inet6_option_append(struct cmsghdr *cmsg, const u_int8_t *typep, 1559 int multx, int plusy); 1561 This function appends a Hop-by-Hop option or a Destination option 1562 into an ancillary data object that has been initialized by 1563 inet6_option_init(). This function returns 0 if it succeeds or -1 on 1564 an error. 1566 cmsg is a pointer to the cmsghdr structure that must have been 1567 initialized by inet6_option_init(). 1569 typep is a pointer to the 8-bit option type. It is assumed that this 1570 field is immediately followed by the 8-bit option data length field, 1571 which is then followed immediately by the option data. The caller 1572 initializes these three fields (the type-length-value, or TLV) before 1573 calling this function. 1575 The option type must have a value from 2 to 255, inclusive. (0 and 1 1576 are reserved for the Pad1 and PadN options, respectively.) 1578 The option data length must have a value between 0 and 255, 1579 inclusive, and is the length of the option data that follows. 
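The alignment term "xn + y" described earlier determines how much padding has to precede each appended option. The sketch below shows one way that computation could be done, under the assumption that offsets are counted from the start of the options header (so the first option starts at offset 2, after the Next Header and Hdr Ext Len bytes). The function names are hypothetical and are not part of this API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of pad bytes needed so that an option's type byte lands at an
 * offset of the form multx*n + plusy from the start of the options header.
 * "offset" is where the next byte would otherwise be written.
 */
size_t option_pad_before(size_t offset, int multx, int plusy)
{
    assert(multx == 1 || multx == 2 || multx == 4 || multx == 8);
    assert(plusy >= 0 && plusy <= 7);

    size_t m = (size_t)multx;
    size_t want = (size_t)plusy % m;   /* required remainder */
    size_t have = offset % m;          /* current remainder  */
    return (want + m - have) % m;
}

/*
 * Number of pad bytes needed after the last option so that the whole
 * options header is a multiple of 8 bytes long.
 */
size_t header_pad_after(size_t offset)
{
    return (8 - offset % 8) % 8;
}
```

For instance, an option with 8n + 2 alignment placed first in the header (offset 2) needs no padding, while an option with 4n + 3 alignment that would otherwise start at offset 16 needs 3 bytes of padding. A header that ends at offset 28 needs 4 trailing pad bytes.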
1581 multx is the value x in the alignment term "xn + y" described 1582 earlier. It must have a value of 1, 2, 4, or 8. 1584 plusy is the value y in the alignment term "xn + y" described 1585 earlier. It must have a value between 0 and 7, inclusive. 1587 7.3.4. inet6_option_alloc 1589 u_int8_t *inet6_option_alloc(struct cmsghdr *cmsg, int datalen, 1590 int multx, int plusy); 1592 This function appends a Hop-by-Hop option or a Destination option 1593 into an ancillary data object that has been initialized by 1594 inet6_option_init(). This function returns a pointer to the 8-bit 1595 option type field that starts the option on success, or NULL on an 1596 error. 1598 The difference between this function and inet6_option_append() is 1599 that the latter copies the contents of a previously built option into 1600 the ancillary data object while the current function returns a 1601 pointer to the space in the data object where the option's TLV must 1602 then be built by the caller. 1604 cmsg is a pointer to the cmsghdr structure that must have been 1605 initialized by inet6_option_init(). 1607 datalen is the value of the option data length byte for this option. 1608 This value is required as an argument to allow the function to 1609 determine if padding must be appended at the end of the option. (The 1610 inet6_option_append() function does not need a data length argument 1611 since the option data length must already be stored by the caller.) 1612 multx is the value x in the alignment term "xn + y" described 1613 earlier. It must have a value of 1, 2, 4, or 8. 1615 plusy is the value y in the alignment term "xn + y" described 1616 earlier. It must have a value between 0 and 7, inclusive. 1618 7.3.5. inet6_option_next 1620 int inet6_option_next(const struct cmsghdr *cmsg, u_int8_t **tptrp); 1622 This function processes the next Hop-by-Hop option or Destination 1623 option in an ancillary data object. 
If another option remains to be 1624 processed, the return value of the function is 0 and *tptrp points to 1625 the 8-bit option type field (which is followed by the 8-bit option 1626 data length, followed by the option data). If no more options remain 1627 to be processed, the return value is -1 and *tptrp is NULL. If an 1628 error occurs, the return value is -1 and *tptrp is not NULL. 1630 cmsg is a pointer to a cmsghdr structure of which cmsg_level equals 1631 IPPROTO_IPV6 and cmsg_type equals either IPV6_HOPOPTS or 1632 IPV6_DSTOPTS. 1634 tptrp is a pointer to a pointer to an 8-bit byte and *tptrp is used 1635 by the function to remember its place in the ancillary data object 1636 each time the function is called. The first time this function is 1637 called for a given ancillary data object, *tptrp must be set to NULL. 1638 Each time this function returns success, *tptrp points to the 8-bit 1639 option type field for the next option to be processed. 1641 7.3.6. inet6_option_find 1643 int inet6_option_find(const struct cmsghdr *cmsg, u_int8_t **tptrp, 1644 int type); 1646 This function is similar to the previously described 1647 inet6_option_next() function, except this function lets the caller 1648 specify the option type to be searched for, instead of always 1649 returning the next option in the ancillary data object. 1651 cmsg is a pointer to a cmsghdr structure of which cmsg_level equals 1652 IPPROTO_IPV6 and cmsg_type equals either IPV6_HOPOPTS or 1653 IPV6_DSTOPTS. 1655 tptrp is a pointer to a pointer to an 8-bit byte and *tptrp is used 1656 by the function to remember its place in the ancillary data object 1657 each time the function is called. The first time this function is 1658 called for a given ancillary data object, *tptrp must be set to NULL. 1660 This function starts searching for an option of the specified type 1661 beginning after the value of *tptrp.
If an option of the specified 1662 type is located, this function returns 0 and *tptrp points to the 1663 8-bit option type field for the option of the specified type. If an 1664 option of the specified type is not located, the return value is -1 1665 and *tptrp is NULL. If an error occurs, the return value is -1 and 1666 *tptrp is not NULL. 1668 7.3.7. Options Examples 1670 We now provide an example that builds two Hop-by-Hop options. First 1671 we define two options, called X and Y, taken from the example in 1672 Appendix A of [1]. We assume that all options will have structure 1673 definitions similar to what is shown below. 1675 /* option X and option Y are defined in [1], pp. 33-34 */ 1676 #define IPV6_OPT_X_TYPE X /* replace X with assigned value */ 1677 #define IPV6_OPT_X_LEN 12 1678 #define IPV6_OPT_X_MULTX 8 /* 8n + 2 alignment */ 1679 #define IPV6_OPT_X_OFFSETY 2 1681 struct ipv6_opt_X { 1682 u_int8_t opt_X_pad[IPV6_OPT_X_OFFSETY]; 1683 u_int8_t opt_X_type; 1684 u_int8_t opt_X_len; 1685 u_int32_t opt_X_val1; 1686 u_int64_t opt_X_val2; 1687 }; 1689 #define IPV6_OPT_Y_TYPE Y /* replace Y with assigned value */ 1690 #define IPV6_OPT_Y_LEN 7 1691 #define IPV6_OPT_Y_MULTX 4 /* 4n + 3 alignment */ 1692 #define IPV6_OPT_Y_OFFSETY 3 1694 struct ipv6_opt_Y { 1695 u_int8_t opt_Y_pad[IPV6_OPT_Y_OFFSETY]; 1696 u_int8_t opt_Y_type; 1697 u_int8_t opt_Y_len; 1698 u_int8_t opt_Y_val1; 1699 u_int16_t opt_Y_val2; 1700 u_int32_t opt_Y_val3; 1701 }; 1703 We now show the code fragment to build one ancillary data object 1704 containing both options. 
1706 struct msghdr msg; 1707 struct cmsghdr *cmsgptr; 1708 struct ipv6_opt_X optX; 1709 struct ipv6_opt_Y optY; 1711 msg.msg_control = malloc(inet6_option_space(sizeof(optX) + sizeof(optY))); 1713 inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS); 1715 optX.opt_X_type = IPV6_OPT_X_TYPE; 1716 optX.opt_X_len = IPV6_OPT_X_LEN; 1717 optX.opt_X_val1 = <32-bit value>; 1718 optX.opt_X_val2 = <64-bit value>; 1719 inet6_option_append(cmsgptr, &optX.opt_X_type, 1720 IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY); 1722 optY.opt_Y_type = IPV6_OPT_Y_TYPE; 1723 optY.opt_Y_len = IPV6_OPT_Y_LEN; 1724 optY.opt_Y_val1 = <8-bit value>; 1725 optY.opt_Y_val2 = <16-bit value>; 1726 optY.opt_Y_val3 = <32-bit value>; 1727 inet6_option_append(cmsgptr, &optY.opt_Y_type, 1728 IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY); 1730 msg.msg_controllen = CMSG_SPACE(cmsgptr->cmsg_len - CMSG_LEN(0)); 1732 The call to inet6_option_init() builds the cmsghdr structure in the 1733 control buffer. 1735 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1736 | cmsg_len = CMSG_LEN(0) = 12 | 1737 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1738 | cmsg_level = IPPROTO_IPV6 | 1739 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1740 | cmsg_type = IPV6_HOPOPTS | 1741 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1743 Here we assume a 32-bit architecture where sizeof(struct cmsghdr) 1744 equals 12, with a desired alignment of 4-byte boundaries (that is, 1745 the ALIGN() macro shown in the sample implementations of the 1746 CMSG_xxx() functions rounds up to a multiple of 4). 1748 The first call to inet6_option_append() appends the X option. Since 1749 this is the first option in the ancillary data object, 2 bytes are 1750 allocated for the Next Header byte and for the Hdr Ext Len byte. The 1751 former will be set by the kernel, depending on the type of header 1752 that follows this header, and the latter byte is set to 1.
These 2 1753 bytes form the 2 bytes of padding (IPV6_OPT_X_OFFSETY) required at 1754 the beginning of this option. 1756 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1757 | cmsg_len = 28 | 1758 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1759 | cmsg_level = IPPROTO_IPV6 | 1760 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1761 | cmsg_type = IPV6_HOPOPTS | 1762 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1763 | Next Header | Hdr Ext Len=1 | Option Type=X |Opt Data Len=12| 1764 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1765 | 4-octet field | 1766 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1767 | | 1768 + 8-octet field + 1769 | | 1770 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1772 The cmsg_len member of the cmsghdr structure is incremented by 16, 1773 the size of the option. 1775 The next call to inet6_option_append() appends the Y option to the 1776 ancillary data object. 
1778 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1779 | cmsg_len = 44 | 1780 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1781 | cmsg_level = IPPROTO_IPV6 | 1782 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1783 | cmsg_type = IPV6_HOPOPTS | 1784 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1785 | Next Header | Hdr Ext Len=3 | Option Type=X |Opt Data Len=12| 1786 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1787 | 4-octet field | 1788 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1789 | | 1790 + 8-octet field + 1791 | | 1792 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1793 | PadN Option=1 |Opt Data Len=1 | 0 | Option Type=Y | 1794 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1795 |Opt Data Len=7 | 1-octet field | 2-octet field | 1796 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1797 | 4-octet field | 1798 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1799 | PadN Option=1 |Opt Data Len=2 | 0 | 0 | 1800 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1802 16 bytes are appended by this function, so cmsg_len becomes 44. The 1803 inet6_option_append() function notices that the appended data 1804 requires 4 bytes of padding at the end, to make the size of the 1805 ancillary data object a multiple of 8, and appends the PadN option 1806 before returning. The Hdr Ext Len byte is incremented by 2 to become 1807 3. 1809 Alternately, the application could build two ancillary data objects, 1810 one per option, although this will probably be less efficient than 1811 combining the two options into a single ancillary data object (as 1812 just shown). The kernel must combine these into a single Hop-by-Hop 1813 extension header in the final IPv6 packet. 
1815 struct msghdr msg; 1816 struct cmsghdr *cmsgptr; 1817 struct ipv6_opt_X optX; 1818 struct ipv6_opt_Y optY; 1820 msg.msg_control = malloc(inet6_option_space(sizeof(optX)) + inet6_option_space(sizeof(optY))); 1822 inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS); 1824 optX.opt_X_type = IPV6_OPT_X_TYPE; 1825 optX.opt_X_len = IPV6_OPT_X_LEN; 1826 optX.opt_X_val1 = <32-bit value>; 1827 optX.opt_X_val2 = <64-bit value>; 1828 inet6_option_append(cmsgptr, &optX.opt_X_type, 1829 IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY); 1830 msg.msg_controllen = CMSG_SPACE(cmsgptr->cmsg_len - CMSG_LEN(0)); 1832 inet6_option_init((u_char *)msg.msg_control + msg.msg_controllen, 1833 &cmsgptr, IPV6_HOPOPTS); 1835 optY.opt_Y_type = IPV6_OPT_Y_TYPE; 1836 optY.opt_Y_len = IPV6_OPT_Y_LEN; 1837 optY.opt_Y_val1 = <8-bit value>; 1838 optY.opt_Y_val2 = <16-bit value>; 1839 optY.opt_Y_val3 = <32-bit value>; 1840 inet6_option_append(cmsgptr, &optY.opt_Y_type, 1841 IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY); 1842 msg.msg_controllen += CMSG_SPACE(cmsgptr->cmsg_len - CMSG_LEN(0)); 1844 Each call to inet6_option_init() builds a new cmsghdr structure, and 1845 the final result looks like the following: 1847 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1848 | cmsg_len = 28 | 1849 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1850 | cmsg_level = IPPROTO_IPV6 | 1851 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1852 | cmsg_type = IPV6_HOPOPTS | 1853 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1854 | Next Header | Hdr Ext Len=1 | Option Type=X |Opt Data Len=12| 1855 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1856 | 4-octet field | 1857 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1858 | | 1859 + 8-octet field + 1860 | | 1861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1862 | cmsg_len = 28 | 1863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1864 | cmsg_level = IPPROTO_IPV6 | 1865
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1866 | cmsg_type = IPV6_HOPOPTS | 1867 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1868 | Next Header | Hdr Ext Len=1 | Pad1 Option=0 | Option Type=Y | 1869 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1870 |Opt Data Len=7 | 1-octet field | 2-octet field | 1871 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1872 | 4-octet field | 1873 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1874 | PadN Option=1 |Opt Data Len=2 | 0 | 0 | 1875 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1877 When the kernel combines these two options into a single Hop-by-Hop 1878 extension header, the first 3 bytes of the second ancillary data 1879 object (the Next Header byte, the Hdr Ext Len byte, and the Pad1 1880 option) will be combined into a PadN option occupying 3 bytes. 1882 The following code fragment is a redo of the first example shown 1883 (building two options in a single ancillary data object) but this 1884 time we use inet6_option_alloc(). 
1886 u_int8_t *typep; 1887 struct msghdr msg; 1888 struct cmsghdr *cmsgptr; 1889 struct ipv6_opt_X *optXp; /* now a pointer, not a struct */ 1890 struct ipv6_opt_Y *optYp; /* now a pointer, not a struct */ 1892 msg.msg_control = malloc(inet6_option_space(sizeof(*optXp) + sizeof(*optYp))); 1893 inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS); 1895 typep = inet6_option_alloc(cmsgptr, IPV6_OPT_X_LEN, 1896 IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY); 1897 optXp = (struct ipv6_opt_X *) (typep - IPV6_OPT_X_OFFSETY); 1898 optXp->opt_X_type = IPV6_OPT_X_TYPE; 1899 optXp->opt_X_len = IPV6_OPT_X_LEN; 1900 optXp->opt_X_val1 = <32-bit value>; 1901 optXp->opt_X_val2 = <64-bit value>; 1903 typep = inet6_option_alloc(cmsgptr, IPV6_OPT_Y_LEN, 1904 IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY); 1905 optYp = (struct ipv6_opt_Y *) (typep - IPV6_OPT_Y_OFFSETY); 1906 optYp->opt_Y_type = IPV6_OPT_Y_TYPE; 1907 optYp->opt_Y_len = IPV6_OPT_Y_LEN; 1908 optYp->opt_Y_val1 = <8-bit value>; 1909 optYp->opt_Y_val2 = <16-bit value>; 1910 optYp->opt_Y_val3 = <32-bit value>; 1912 msg.msg_controllen = CMSG_SPACE(cmsgptr->cmsg_len - CMSG_LEN(0)); 1914 Notice that inet6_option_alloc() returns a pointer to the 8-bit 1915 option type field. If the program wants a pointer to an option 1916 structure that includes the padding at the front (as shown in our 1917 definitions of the ipv6_opt_X and ipv6_opt_Y structures), the y- 1918 offset at the beginning of the structure must be subtracted from the 1919 returned pointer. 1921 The following code fragment shows the processing of Hop-by-Hop 1922 options using the inet6_option_next() function.
1924 struct msghdr msg; 1925 struct cmsghdr *cmsgptr; 1927 /* fill in msg */ 1929 /* call recvmsg() */ 1931 for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL; 1932 cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) { 1933 if (cmsgptr->cmsg_level == IPPROTO_IPV6 && 1934 cmsgptr->cmsg_type == IPV6_HOPOPTS) { 1936 u_int8_t *tptr = NULL; 1938 while (inet6_option_next(cmsgptr, &tptr) == 0) { 1939 if (*tptr == IPV6_OPT_X_TYPE) { 1940 struct ipv6_opt_X *optXp; 1941 optXp = (struct ipv6_opt_X *) (tptr - IPV6_OPT_X_OFFSETY); 1942 optXp->opt_X_val1; 1943 optXp->opt_X_val2; 1945 } else if (*tptr == IPV6_OPT_Y_TYPE) { 1946 struct ipv6_opt_Y *optYp; 1948 optYp = (struct ipv6_opt_Y *) (tptr - IPV6_OPT_Y_OFFSETY); 1949 optYp->opt_Y_val1; 1950 optYp->opt_Y_val2; 1951 optYp->opt_Y_val3; 1952 } 1953 } 1954 if (tptr != NULL) 1955 ; 1956 } 1957 } 1959 8. Destination Options 1961 A variable number of Destination options can appear in one or more 1962 Destination option headers. As defined in [1], a Destination options 1963 header appearing before a Routing header is processed by the first 1964 destination plus any subsequent destinations specified in the Routing 1965 header, while a Destination options header appearing after a Routing 1966 header is processed only by the final destination. As with the Hop- 1967 by-Hop options, each option in a Destination options header is TLV- 1968 encoded with a type, length, and value. 1970 Today no Destination options are defined for IPv6 [1], although 1971 proposals exist to use Destination options with mobility and 1972 anycasting. 1974 8.1. 
Receiving Destination Options 1976 To receive Destination options the application must enable the 1977 IPV6_DSTOPTS socket option: 1979 int on = 1; 1980 setsockopt(fd, IPPROTO_IPV6, IPV6_DSTOPTS, &on, sizeof(on)); 1982 All the Destination options appearing before a Routing header are 1983 returned as one ancillary data object described by a cmsghdr 1984 structure and all the Destination options appearing after a Routing 1985 header are returned as another ancillary data object described by a 1986 cmsghdr structure. For these ancillary data objects, the cmsg_level 1987 member will be IPPROTO_IPV6 and the cmsg_type member will be 1988 IPV6_DSTOPTS. These options are then processed by calling the 1989 inet6_option_next() and inet6_option_find() functions. 1991 8.2. Sending Destination Options 1993 To send one or more Destination options, the application just 1994 specifies them as ancillary data in a call to sendmsg(). No socket 1995 option need be set. 1997 As described earlier, one set of Destination options can appear 1998 before a Routing header, and one set can appear after a Routing 1999 header. Each set can consist of one or more options. 2001 Normally all the Destination options in a set are specified by a 2002 single ancillary data object, since each option is itself TLV- 2003 encoded. Multiple ancillary data objects, each containing one or 2004 more Destination options, can also be specified, in which case the 2005 kernel will combine all the Destination options in the set into a 2006 single Destination extension header. But it should be more efficient 2007 to use a single ancillary data object to describe all the Destination 2008 options in a set. The cmsg_level member is set to IPPROTO_IPV6 and 2009 the cmsg_type member is set to IPV6_DSTOPTS. The option is normally 2010 constructed using the inet6_option_init(), inet6_option_append(), and 2011 inet6_option_alloc() functions.
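Because each option is TLV-encoded, a sequence of options can be checked for consistency by walking it one option at a time. The sketch below illustrates the kind of check that an implementation might apply to an options area before accepting it. It assumes the buffer starts at the first option (that is, after the Next Header and Hdr Ext Len bytes) and uses the Pad1 and PadN type values 0 and 1 mentioned earlier. The function name is hypothetical and is not part of this API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_TYPE_PAD1 0   /* single byte, no length or data */
#define OPT_TYPE_PADN 1   /* type, length, then "length" bytes of padding */

/*
 * Walk a buffer of TLV-encoded options.  Returns the number of options
 * that are not padding, or -1 if an option runs past the end of the buffer.
 */
int count_tlv_options(const uint8_t *opts, size_t len)
{
    size_t i = 0;
    int count = 0;

    assert(opts != NULL || len == 0);

    while (i < len) {
        if (opts[i] == OPT_TYPE_PAD1) {
            i += 1;                    /* Pad1 has no length byte */
            continue;
        }
        if (i + 2 > len)
            return -1;                 /* no room for the length byte */

        size_t datalen = opts[i + 1];
        if (i + 2 + datalen > len)
            return -1;                 /* option data is truncated */

        if (opts[i] != OPT_TYPE_PADN)
            count++;
        i += 2 + datalen;
    }
    return count;
}
```

A real implementation would probably also verify the alignment of each option and the total length against the Hdr Ext Len byte; this sketch checks only that the lengths are self-consistent.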
2013 Additional errors may be possible from sendmsg() if the specified 2014 option is in error. 2016 9. Source Route Option 2018 Source routing in IPv6 is accomplished by specifying a Routing header 2019 as an extension header. There can be different types of Routing 2020 headers, but IPv6 currently defines only the Type 0 Routing header 2021 [1]. This type supports up to 23 intermediate nodes. With this 2022 maximum number of intermediate nodes, a source, and a destination, 2023 there are 24 hops, each of which is defined as a strict or loose hop. 2025 Source routing with IPv4 sockets API (the IP_OPTIONS socket option) 2026 requires the application to build the source route in the format that 2027 appears as the IPv4 header option, requiring intimate knowledge of 2028 the IPv4 options format. This IPv6 API, however, defines eight 2029 functions that the application calls to build and examine a Routing 2030 header. Four functions build a Routing header: 2032 inet6_srcrt_space() - return #bytes required for ancillary data 2033 inet6_srcrt_init() - initialize ancillary data for Routing header 2034 inet6_srcrt_add() - add IPv6 address & flags to Routing header 2035 inet6_srcrt_lasthop() - specify the flags for the final hop 2037 Four functions deal with a returned Routing header: 2039 inet6_srcrt_reverse() - reverse a Routing header 2040 inet6_srcrt_segments() - return #segments in a Routing header 2041 inet6_srcrt_getaddr() - fetch one address from a Routing header 2042 inet6_srcrt_getflags() - fetch one flag from a Routing header 2044 The function prototypes for these functions are all in the 2045 header. 2047 A Routing header is passed between the application and the kernel as 2048 an ancillary data object. The cmsg_level member has a value of 2049 IPPROTO_IPV6 and the cmsg_type member has a value of IPV6_SRCRT. 
The 2050 contents of the cmsg_data[] member is implementation dependent and 2051 should not be accessed directly by the application, but should be 2052 accessed using the eight functions that we are about to describe. 2054 The following constants are defined in the header: 2056 #define IPV6_SRCRT_LOOSE 0 /* this hop need not be a neighbor */ 2057 #define IPV6_SRCRT_STRICT 1 /* this hop must be a neighbor */ 2059 #define IPV6_SRCRT_TYPE_0 0 /* IPv6 Routing header type 0 */ 2061 When a Routing header is specified, the destination address specified 2062 for connect(), sendto(), or sendmsg() is the final destination 2063 address of the datagram. The Routing header then contains the 2064 addresses of all the intermediate nodes. 2066 9.1. inet6_srcrt_space 2068 size_t inet6_srcrt_space(int type, int segments); 2070 This function returns the number of bytes required to hold a Routing 2071 header of the specified type containing the specified number of 2072 segments (addresses). The number of segments must be between 1 and 2073 23, inclusive. The return value includes the size of the cmsghdr 2074 structure that precedes the Routing header, and any required padding. 2076 If the return value is 0, then either the type of the Routing header 2077 is not supported by this implementation or the number of segments is 2078 invalid for this type of Routing header. 2080 (Note: This function returns the size but does not allocate the space 2081 required for the ancillary data. This allows an application to 2082 allocate a larger buffer, if other ancillary data objects are 2083 desired, since all the ancillary data objects must be specified to 2084 sendmsg() as a single msg_control buffer.) 2086 9.2. inet6_srcrt_init 2088 struct cmsghdr *inet6_srcrt_init(void *bp, int type); 2090 This function initializes the buffer pointed to by bp to contain a 2091 cmsghdr structure followed by a Routing header of the specified type. 
2092 The cmsg_len member of the cmsghdr structure is initialized to the 2093 size of the structure plus the amount of space required by the 2094 Routing header. The cmsg_level and cmsg_type members are also 2095 initialized as required. 2097 The caller must allocate the buffer and its size can be determined by 2098 calling inet6_srcrt_space(). 2100 The return value is the pointer to the cmsghdr structure, and this is 2101 then used as the first argument to the next two functions. If the 2102 type of Routing header is not supported by the implementation, the 2103 return value is NULL. 2105 9.3. inet6_srcrt_add 2107 int inet6_srcrt_add(struct cmsghdr *cmsg, 2108 const struct in6_addr *addr, unsigned int flags); 2110 This function adds the address pointed to by addr to the end of the 2111 Routing header being constructed and sets the type of this hop to the 2112 value of flags. For an IPv6 Type 0 Routing header, flags must be 2113 either IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT. 2115 If successful, the cmsg_len member of the cmsghdr structure is 2116 updated to account for the new address in the Routing header and the 2117 return value of the function is 0. 2119 If the address would exceed the limits of the Routing header, the 2120 return value of the function is ENOSPC. If flags specifies an 2121 invalid value for the Routing header, the return value of the 2122 function is EINVAL. 2124 9.4. inet6_srcrt_lasthop 2126 int inet6_srcrt_lasthop(struct cmsghdr *cmsg, 2127 unsigned int flags); 2129 This function specifies the Strict/Loose flag for the final hop of a 2130 source route. For an IPv6 Type 0 Routing header, flags must be 2131 either IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT. 2133 Notice that a source route that specifies N intermediate nodes 2134 requires N+1 Strict/Loose flags. This requires N calls to 2135 inet6_srcrt_add() followed by one call to inet6_srcrt_lasthop(). 2137 9.5. 
inet6_srcrt_reverse 2139 int inet6_srcrt_reverse(const struct cmsghdr *in, struct cmsghdr *out); 2141 This function takes a Routing header that was received as ancillary 2142 data (pointed to by the first argument) and writes a new Routing 2143 header that sends datagrams along the reverse of that route. Both 2144 arguments are allowed to point to the same buffer (that is, the 2145 reversal can occur in place). The return value of the function is 0 2146 on success. 2148 If the type of Routing header is not supported by the implementation, 2149 the return value of the function is EOPNOTSUPP. If the Routing 2150 header information is invalid, the return value of the function is 2151 EINVAL. 2153 9.6. inet6_srcrt_segments 2155 int inet6_srcrt_segments(const struct cmsghdr *cmsg); 2157 This function returns the number of segments (addresses) contained in 2158 the Routing header described by cmsg. On success the return value is 2159 between 1 and 23, inclusive. The return value is -1 if the cmsghdr 2160 structure does not describe a valid Routing header or describes a Routing 2161 header of an unsupported type. 2163 9.7. inet6_srcrt_getaddr 2165 struct in6_addr *inet6_srcrt_getaddr(struct cmsghdr *cmsg, int index); 2167 This function returns a pointer to the IPv6 address specified by 2168 index (which must have a value between 1 and the value returned by 2169 inet6_srcrt_segments()) in the Routing header described by cmsg. An 2170 application should first call inet6_srcrt_segments() to obtain the 2171 number of segments in the Routing header. 2173 If index refers to an address beyond the end of the Routing header, 2174 the return value is NULL. 2176 9.8. inet6_srcrt_getflags 2178 int inet6_srcrt_getflags(const struct cmsghdr *cmsg, int offset); 2180 This function returns the flags value indexed by offset (which must 2181 have a value between 0 and the value returned by 2182 inet6_srcrt_segments()) in the Routing header described by cmsg.
For 2183 an IPv6 Type 0 Routing header the return value will be either 2184 IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT. 2186 If offset refers to a segment beyond the end of the Routing header, 2187 the return value is -1. 2189 (Note: Addresses are indexed starting at 1, and flags starting at 0, 2190 to maintain consistency with the terminology and figures in [1].) 2192 9.9. Source Route Example 2194 As an example of these source routing functions, we go through the 2195 function calls for the example on p. 18 of [1]. The source is S, the 2196 destination is D, and the three intermediate nodes are I1, I2, and 2197 I3. f0, f1, f2, and f3 are the Strict/Loose flags for each hop. 2199 f0 f1 f2 f3 2200 S -----> I1 -----> I2 -----> I3 -----> D 2202 src: * S S S S S 2203 dst: D I1 I2 I3 D D 2204 A[1]: I1 I2 I1 I1 I1 I1 2205 A[2]: I2 I3 I3 I2 I2 I2 2206 A[3]: I3 D D D I3 I3 2207 #seg: 3 3 2 1 0 3 2209 check: f0 f1 f2 f3 2211 src and dst are the source and destination IPv6 addresses in the IPv6 2212 header. A[1], A[2], and A[3] are the three addresses in the Routing 2213 header. #seg is the Segments Left field in the Routing header. 2214 check indicates which bit of the Strict/Loose Bit Map (0 through 3, 2215 specified as f0 through f3) that node checks. 2217 The six values in the column beneath node S are the values in the 2218 Routing header specified by the application using sendmsg(). 
The 2219 function calls by the sender would look like: 2221 void *ptr; 2222 struct msghdr msg; 2223 struct cmsghdr *cmsgptr; 2224 struct sockaddr_in6 I1, I2, I3, D; 2225 unsigned int f0, f1, f2, f3; 2227 ptr = malloc(inet6_srcrt_space(IPV6_SRCRT_TYPE_0, 3)); 2228 cmsgptr = inet6_srcrt_init(ptr, IPV6_SRCRT_TYPE_0); 2230 inet6_srcrt_add(cmsgptr, &I1.sin6_addr, f0); 2231 inet6_srcrt_add(cmsgptr, &I2.sin6_addr, f1); 2232 inet6_srcrt_add(cmsgptr, &I3.sin6_addr, f2); 2233 inet6_srcrt_lasthop(cmsgptr, f3); 2235 msg.msg_control = ptr; 2236 msg.msg_controllen = cmsgptr->cmsg_len; /* cmsg_len already includes the cmsghdr */ 2238 /* finish filling in msg{}, msg_name = D */ 2239 /* call sendmsg() */ 2241 We also assume that the source address for the socket is not 2242 specified (i.e., the asterisk in the figure). 2244 The four columns of six values that are then shown between the five 2245 nodes are the values of the fields in the packet while the packet is 2246 in transit between the two nodes. Notice that before the packet is 2247 sent by the source node S, the source address is chosen (replacing 2248 the asterisk), I1 becomes the destination address of the datagram, 2249 the two addresses A[2] and A[3] are "shifted up", and D is moved to 2250 A[3]. If f0 is IPV6_SRCRT_STRICT, then I1 must be a neighbor of S. 2252 The columns of values that are shown beneath the destination node are 2253 the values returned by recvmsg(), assuming the application has 2254 enabled both the IPV6_PKTINFO and IPV6_SRCRT socket options. The 2255 source address is S (contained in the sockaddr_in6 structure pointed 2256 to by the msg_name member), the destination address is D (returned as 2257 an ancillary data object in an in6_pktinfo structure), and the 2258 ancillary data object specifying the source route will contain three 2259 addresses (I1, I2, and I3) and four flags (f0, f1, f2, and f3).
The 2260 number of segments in the Routing header is known from the Hdr Ext 2261 Len field in the Routing header (a value of 6, indicating 3 2262 addresses). 2264 The return value from inet6_srcrt_segments() will be 3 and 2265 inet6_srcrt_getaddr(1) will return I1, inet6_srcrt_getaddr(2) will 2266 return I2, and inet6_srcrt_getaddr(3) will return I3. The return 2267 value from inet6_srcrt_getflags(0) will be f0, inet6_srcrt_getflags(1) will 2268 return f1, inet6_srcrt_getflags(2) will return f2, and 2269 inet6_srcrt_getflags(3) will return f3. 2271 If the receiving application then calls inet6_srcrt_reverse(), the 2272 order of the three addresses will become I3, I2, and I1, and the 2273 order of the four Strict/Loose flags will become f3, f2, f1, and f0. 2275 We can also show what an implementation might store in the ancillary 2276 data object as the Routing header is being built by the sending 2277 process. If we assume a 32-bit architecture where sizeof(struct 2278 cmsghdr) equals 12, with a desired alignment of 4-byte boundaries, 2279 then the call to inet6_srcrt_space(IPV6_SRCRT_TYPE_0, 3) returns 68: 12 bytes for the 2280 cmsghdr structure and 56 bytes for the Routing header (8 + 3*16). 2282 The call to inet6_srcrt_init() initializes the ancillary data object 2283 to contain a Type 0 Routing header: 2285 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2286 | cmsg_len = 20 | 2287 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2288 | cmsg_level = IPPROTO_IPV6 | 2289 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2290 | cmsg_type = IPV6_SRCRT | 2291 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2292 | Next Header | Hdr Ext Len=0 | Routing Type=0| Seg Left=0 | 2293 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2294 | Reserved | Strict/Loose Bit Map | 2295 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2297 The first call to inet6_srcrt_add() adds I1 to the list.
2299 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2300 | cmsg_len = 36 | 2301 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2302 | cmsg_level = IPPROTO_IPV6 | 2303 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2304 | cmsg_type = IPV6_SRCRT | 2305 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2306 | Next Header | Hdr Ext Len=2 | Routing Type=0| Seg Left=1 | 2307 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2308 | Reserved |X| Strict/Loose Bit Map | 2309 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2310 | | 2311 + + 2312 | | 2313 + Address[1] = I1 + 2314 | | 2315 + + 2316 | | 2317 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2319 Bit 0 of the Strict/Loose Bit Map contains the value f0, which we 2320 just mark as X. cmsg_len is incremented by 16, the Hdr Ext Len field 2321 is incremented by 2, and the Segments Left field is incremented by 1. 2323 The next call to inet6_srcrt_add() adds I2 to the list. 
2325 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2326 | cmsg_len = 52 | 2327 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2328 | cmsg_level = IPPROTO_IPV6 | 2329 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2330 | cmsg_type = IPV6_SRCRT | 2331 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2332 | Next Header | Hdr Ext Len=4 | Routing Type=0| Seg Left=2 | 2333 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2334 | Reserved |X|X| Strict/Loose Bit Map | 2335 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2336 | | 2337 + + 2338 | | 2339 + Address[1] = I1 + 2340 | | 2341 + + 2342 | | 2343 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2344 | | 2345 + + 2346 | | 2347 + Address[2] = I2 + 2348 | | 2349 + + 2350 | | 2351 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2353 The next bit of the Strict/Loose Bit Map contains the value f1. 2354 cmsg_len is incremented by 16, the Hdr Ext Len field is incremented 2355 by 2, and the Segments Left field is incremented by 1. 2357 The last call to inet6_srcrt_add() adds I3 to the list. 
2359 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2360 | cmsg_len = 68 | 2361 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2362 | cmsg_level = IPPROTO_IPV6 | 2363 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2364 | cmsg_type = IPV6_SRCRT | 2365 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2366 | Next Header | Hdr Ext Len=6 | Routing Type=0| Seg Left=3 | 2367 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2368 | Reserved |X|X|X| Strict/Loose Bit Map | 2369 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2370 | | 2371 + + 2372 | | 2373 + Address[1] = I1 + 2374 | | 2375 + + 2376 | | 2377 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2378 | | 2379 + + 2380 | | 2381 + Address[2] = I2 + 2382 | | 2383 + + 2384 | | 2385 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2386 | | 2387 + + 2388 | | 2389 + Address[3] = I3 + 2390 | | 2391 + + 2392 | | 2393 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2395 The next bit of the Strict/Loose Bit Map contains the value f2. 2396 cmsg_len is incremented by 16, the Hdr Ext Len field is incremented 2397 by 2, and the Segments Left field is incremented by 1. 2399 Finally, the call to inet6_srcrt_lasthop() sets the next bit of the 2400 Strict/Loose Bit Map to the value specified by f3. All the lengths 2401 remain unchanged. 2403 10. Ordering of Ancillary Data and IPv6 Extension Headers 2404 Three IPv6 extension headers can be specified by the application and 2405 returned to the application using ancillary data with sendmsg() and 2406 recvmsg(): Hop-by-Hop options, Destination options, and the Routing 2407 header. 
When multiple ancillary data objects are transferred via 2408 sendmsg() or recvmsg() and these objects represent any of these three 2409 extension headers, their placement in the control buffer is directly 2410 tied to their location in the corresponding IPv6 datagram. This API 2411 imposes some ordering constraints when using multiple ancillary data 2412 objects with sendmsg(). 2414 When multiple IPv6 Hop-by-Hop options having the same option type are 2415 specified, these options will be inserted into the Hop-by-Hop options 2416 header in the same order as they appear in the control buffer. But 2417 when multiple Hop-by-Hop options having different option types are 2418 specified, these options may be reordered by the kernel to reduce 2419 padding in the Hop-by-Hop options header. Hop-by-hop options may 2420 appear anywhere in the control buffer and will always be collected by 2421 the kernel and placed into a single Hop-by-Hop options header that 2422 immediately follows the IPv6 header. 2424 Similar rules apply to the Destination options: (1) those of the same 2425 type will appear in the same order as they are specified, and (2) 2426 those of differing types may be reordered. But the kernel will build 2427 up to two Destination options headers: one to precede the Routing 2428 header and one to follow the Routing header. If the application 2429 specifies a Routing header then all Destination options that appear 2430 in the control buffer before the Routing header will appear in a 2431 Destination options header before the Routing header and these 2432 options might be reordered, subject to the two rules that we just 2433 stated. Similarly all Destination options that appear in the control 2434 buffer after the Routing header will appear in a Destination options 2435 header after the Routing header, and these options might be 2436 reordered, subject to the two rules that we just stated. 
2438 As an example, assume that an application specifies control 2439 information to sendmsg() containing six ancillary data objects: the 2440 first containing two Hop-by-Hop options, the second containing one 2441 Destination option, the third containing two Destination options, the 2442 fourth containing a source route, the fifth containing a Hop-by-Hop 2443 option, and the sixth containing two Destination options. We also 2444 assume that all the Hop-by-Hop options are of different types, as are 2445 all the Destination options. We number these options 1-9, 2446 corresponding to their order in the control buffer, and show them on 2447 the left below. 2449 In the middle we show the final arrangement of the options in the 2450 extension headers built by the kernel. On the right we show the four 2451 ancillary data objects returned to the receiving application. 2453 Sender's Receiver's 2454 Ancillary Data --> IPv6 Extension --> Ancillary Data 2455 Objects Headers Objects 2456 ------------------ --------------- -------------- 2457 HOPOPT-1,2 (first) HOPHDR(J,7,1,2) HOPOPT-7,1,2 2458 DSTOPT-3 DSTHDR(4,5,3) DSTOPT-4,5,3 2459 DSTOPT-4,5 RTGHDR(6) SRCRT-6 2460 SRCRT-6 DSTHDR(8,9) DSTOPT-8,9 2461 HOPOPT-7 2462 DSTOPT-8,9 (last) 2464 The sender's two Hop-by-Hop ancillary data objects are reordered, as 2465 are the first two Destination ancillary data objects. We also show a 2466 Jumbo Payload option (denoted as J) inserted by the kernel before the 2467 sender's three Hop-by-Hop options. The first three Destination 2468 options must appear in a Destination header before the Routing 2469 header, and the final two Destination options must appear in a 2470 Destination header after the Routing header. 
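The grouping in this example can be sketched in C as follows. This is only a model of the placement rules in this section, not kernel code and not part of the API: the names obj_kind, anc_obj, ext_plan, and plan_headers are hypothetical, options are represented by their labels 1-9, and MAX_OPTS is an arbitrary capacity. The sketch keeps options in control-buffer order, which is one permitted arrangement; a kernel may reorder options of differing types (the figure shows 4,5,3 and 7,1,2) and may insert its own options such as Jumbo Payload. Destination options sent without any Routing header are grouped with the trailing header, as this section also specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPTS 16   /* arbitrary capacity chosen for this sketch */

/* Hypothetical model types; none of these names come from the API. */
enum obj_kind { OBJ_HOPOPT, OBJ_DSTOPT, OBJ_SRCRT };

struct anc_obj {
    enum obj_kind kind;
    int nopts;     /* number of options carried; 0 for OBJ_SRCRT */
    int opts[4];   /* option labels, in control-buffer order */
};

struct ext_plan {
    int has_rthdr;            /* nonzero if a Routing header is present */
    int nhop, ndst1, ndst2;   /* option counts for the three lists below */
    int hop[MAX_OPTS];        /* the single Hop-by-Hop options header */
    int dst1[MAX_OPTS];       /* Destination options before the Routing header */
    int dst2[MAX_OPTS];       /* Destination options after it, or with none */
};

/* Append the options of one ancillary data object to a header's list. */
static void append_opts(int *list, int *n, const struct anc_obj *o)
{
    assert(o->nopts >= 0 && o->nopts <= 4);
    for (int i = 0; i < o->nopts; i++) {
        assert(*n < MAX_OPTS);
        list[(*n)++] = o->opts[i];
    }
}

/* Decide which extension header each sender option lands in. */
void plan_headers(const struct anc_obj *objs, size_t nobjs, struct ext_plan *p)
{
    int seen_rthdr = 0;

    memset(p, 0, sizeof(*p));

    /* First pass: find out whether a Routing header is present at all. */
    for (size_t i = 0; i < nobjs; i++)
        if (objs[i].kind == OBJ_SRCRT)
            p->has_rthdr = 1;

    /* Second pass: walk the control buffer in order. */
    for (size_t i = 0; i < nobjs; i++) {
        switch (objs[i].kind) {
        case OBJ_HOPOPT:   /* collected from anywhere in the buffer */
            append_opts(p->hop, &p->nhop, &objs[i]);
            break;
        case OBJ_SRCRT:    /* marks the split point for Destination options */
            seen_rthdr = 1;
            break;
        case OBJ_DSTOPT:
            if (p->has_rthdr && !seen_rthdr)
                append_opts(p->dst1, &p->ndst1, &objs[i]);
            else
                append_opts(p->dst2, &p->ndst2, &objs[i]);
            break;
        }
    }
}
```

Feeding the six sender objects from the figure to plan_headers() places options 1, 2, and 7 in the Hop-by-Hop list, options 3, 4, and 5 in the Destination list that precedes the Routing header, and options 8 and 9 in the Destination list that follows it.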
2472 If Destination options are specified in the control buffer after a 2473 Routing header, or if Destination options are specified without a 2474 Routing header, the kernel will place those Destination options after 2475 an authentication header and/or an encapsulating security payload 2476 header, if present. 2478 11. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses 2480 The various socket options and ancillary data specifications defined 2481 in this document apply only to true IPv6 sockets. It is possible to 2482 create an IPv6 socket that actually sends and receives IPv4 packets, 2483 using IPv4-mapped IPv6 addresses, but the mapping of the options 2484 defined in this document to an IPv4 datagram is beyond the scope of 2485 this document. 2487 In general, attempting to specify an IPv6-only option, such as the 2488 Hop-by-Hop options, Destination options, or Routing header on an IPv6 2489 socket that is using IPv4-mapped IPv6 addresses, will probably result 2490 in an error. Some implementations, however, may provide access to 2491 the packet information (source/destination address, send/receive 2492 interface, and hop limit) on an IPv6 socket that is using IPv4-mapped 2493 IPv6 addresses. 2495 12. rresvport_af 2497 The rresvport() function is used by the rcmd() function, and this 2498 function is in turn called by many of the "r" commands such as 2499 rlogin. While new applications are not being written to use the 2500 rcmd() function, legacy applications such as rlogin will continue to 2501 use it and these will be ported to IPv6. 2503 rresvport() creates an IPv4/TCP socket and binds a "reserved port" to 2504 the socket. Instead of defining an IPv6 version of this function we 2505 define a new function that takes an address family as its argument. 
2507 #include 2509 int rresvport_af(int *port, int family); 2511 This function behaves the same as the existing rresvport() function, 2512 but in addition to creating an IPv4/TCP socket, it can also create an 2513 IPv6/TCP socket. The family argument is either AF_INET or AF_INET6, 2514 and a new error return is EAFNOSUPPORT if the address family is not 2515 supported. 2517 (Note: There is little consensus on which header defines the 2518 rresvport() and rcmd() function prototypes. 4.4BSD defines them in 2519 , others in , and others don't define the function 2520 prototypes at all.) 2522 (Note: We define this function only, and do not define something like 2523 rcmd_af() or rcmd6(). The reason is that rcmd() calls 2524 gethostbyname(), which returns the type of address: AF_INET or 2525 AF_INET6. It should therefore be possible to modify rcmd() to 2526 support either IPv4 or IPv6, based on the address family returned by 2527 gethostbyname().) 2529 13. Future Items 2531 Some additional items may require standardization, but no concrete 2532 proposals have been made for the API to perform these tasks. These 2533 may be addressed in a later document. 2535 13.1. Path MTU Discovery and UDP 2537 A standard method may be desirable for a UDP application to determine 2538 the "maximum send transport-message size" (Section 5.1 of [3]) to a 2539 given destination. This would let the UDP application send smaller 2540 datagrams to the destination, avoiding fragmentation. 2542 13.2. Neighbor Reachability and UDP 2543 A standard method may be desirable for a UDP application to tell the 2544 kernel that it is making forward progress with a given peer (Section 2545 7.3.1 of [4]). This could save unneeded neighbor solicitations and 2546 neighbor advertisements. 2548 14. Summary of New Definitions 2550 The following list summarizes the constants and structure 2551 definitions discussed in this memo, sorted by header.
2553 ICMPV6_DEST_UNREACH 2554 ICMPV6_DEST_UNREACH_ADDR 2555 ICMPV6_DEST_UNREACH_ADMIN 2556 ICMPV6_DEST_UNREACH_NOPORT 2557 ICMPV6_DEST_UNREACH_NOROUTE 2558 ICMPV6_DEST_UNREACH_NOTNEIGHBOR 2559 ICMPV6_ECHOREPLY 2560 ICMPV6_ECHOREQUEST 2561 ICMPV6_INFOMSG_MASK 2562 ICMPV6_MGM_QUERY 2563 ICMPV6_MGM_REDUCTION 2564 ICMPV6_MGM_REPORT 2565 ICMPV6_PACKET_TOOBIG 2566 ICMPV6_PARAMPROB 2567 ICMPV6_PARAMPROB_HEADER 2568 ICMPV6_PARAMPROB_NEXTHEADER 2569 ICMPV6_PARAMPROB_OPTION 2570 ICMPV6_TIME_EXCEEDED 2571 ICMPV6_TIME_EXCEED_HOPS 2572 ICMPV6_TIME_EXCEED_REASSEMBLY 2573 ND6_NADVERFLAG_ISROUTER 2574 ND6_NADVERFLAG_OVERRIDE 2575 ND6_NADVERFLAG_SOLICITED 2576 ND6_NEIGHBOR_ADVERTISEMENT 2577 ND6_NEIGHBOR_SOLICITATION 2578 ND6_OPT_ENDOFLIST 2579 ND6_OPT_MTU 2580 ND6_OPT_PI_A_BIT 2581 ND6_OPT_PI_L_BIT 2582 ND6_OPT_PREFIX_INFORMATION 2583 ND6_OPT_REDIRECTED_HEADER 2584 ND6_OPT_SOURCE_LINKADDR 2585 ND6_OPT_TARGET_LINKADDR 2586 ND6_RADV_M_BIT 2587 ND6_RADV_O_BIT 2588 ND6_REDIRECT 2589 ND6_ROUTER_ADVERTISEMENT 2590 ND6_ROUTER_SOLICITATION 2591 enum nd6_option{}; 2592 struct icmp6_filter{}; 2593 struct icmp6_hdr{}; 2594 struct nd6_nadvertisement{}; 2595 struct nd6_nsolicitation{}; 2596 struct nd6_opt_mtu{}; 2597 struct nd6_opt_prefix_info{}; 2598 struct nd6_redirect{}; 2599 struct nd6_router_advert{}; 2600 struct nd6_router_solicit{}; 2602 IPPROTO_AH 2603 IPPROTO_DSTOPTS 2604 IPPROTO_ESP 2605 IPPROTO_FRAGMENT 2606 IPPROTO_HOPOPTS 2607 IPPROTO_ICMPV6 2608 IPPROTO_IPV6 2609 IPPROTO_NONE 2610 IPPROTO_ROUTING 2611 IPV6_DSTOPTS 2612 IPV6_HOPLIMIT 2613 IPV6_HOPOPTS 2614 IPV6_NEXTHOP 2615 IPV6_PKTINFO 2616 IPV6_PKTOPTIONS 2617 IPV6_SRCRT 2618 IPV6_SRCRT_LOOSE 2619 IPV6_SRCRT_STRICT 2620 IPV6_SRCRT_TYPE_0 2621 struct in6_pktinfo{}; 2623 struct ip6_hdr{}; 2625 struct cmsghdr{}; 2626 struct msghdr{}; 2628 The following list summarizes the function and macro prototypes 2629 discussed in this memo, sorted by header. 
2631 void ICMPV6_FILTER_SETBLOCK(int, struct icmp6_filter *); 2632 void ICMPV6_FILTER_SETBLOCKALL(struct icmp6_filter *); 2633 void ICMPV6_FILTER_SETPASS(int, struct icmp6_filter *); 2634 void ICMPV6_FILTER_SETPASSALL(struct icmp6_filter *); 2635 int ICMPV6_FILTER_WILLBLOCK(int, 2636 const struct icmp6_filter *); 2638 int ICMPV6_FILTER_WILLPASS(int, 2639 const struct icmp6_filter *); 2641 int IN6_ARE_ADDR_EQUAL(const struct in6_addr *, 2642 const struct in6_addr *); 2644 int inet6_flow_assign(int, struct sockaddr_in6 *, 2645 const void *, size_t); 2646 int inet6_flow_free(int, const struct sockaddr_in6 *); 2647 int inet6_flow_reuse(int, int, 2648 const struct sockaddr_in6 *); 2650 u_int8_t *inet6_option_alloc(struct cmsghdr *, 2651 int, int, int); 2652 int inet6_option_append(struct cmsghdr *, 2653 const u_int8_t *, int, int); 2654 int inet6_option_find(const struct cmsghdr *, 2655 u_int8_t *, int); 2656 int inet6_option_init(void *, struct cmsghdr **, int); 2657 int inet6_option_next(const struct cmsghdr *, 2658 u_int8_t **); 2659 int inet6_option_space(int); 2661 int inet6_srcrt_add(struct cmsghdr *, 2662 const struct in6_addr *, 2663 unsigned int); 2664 struct in6_addr *inet6_srcrt_getaddr(struct cmsghdr *, 2665 int); 2666 int inet6_srcrt_getflags(const struct cmsghdr *, int); 2667 struct cmsghdr *inet6_srcrt_init(void *, int); 2668 int inet6_srcrt_lasthop(struct cmsghdr *, unsigned int); 2669 int inet6_srcrt_reverse(const struct cmsghdr *, 2670 struct cmsghdr *); 2671 int inet6_srcrt_segments(const struct cmsghdr *); 2672 size_t inet6_srcrt_space(int, int); 2674 unsigned char *CMSG_DATA(const struct cmsghdr *); 2675 struct cmsghdr *CMSG_FIRSTHDR(const struct msghdr *); 2676 unsigned int CMSG_LEN(unsigned int); 2677 struct cmsghdr *CMSG_NXTHDR(const struct msghdr *mhdr, 2678 const struct cmsghdr *); 2679 unsigned int CMSG_SPACE(unsigned int); 2681 int rresvport_af(int *, int); 2683 15.
Security Considerations 2685 Allowing an application to pick flow labels at will could permit 2686 interference with the routing of packets sent by another application 2687 from the same host, or theft of a bandwidth reservation or other 2688 network state created on behalf of another user. 2690 The setting of certain Hop-by-Hop options and Destination options may 2691 be restricted to privileged processes. Similarly some Hop-by-Hop 2692 options and Destination options may not be returned to nonprivileged 2693 applications. 2695 16. Change History 2697 Changes from the February 1997 Edition (-01 draft) 2699 - Changed the name of the ip6hdr structure to ip6_hdr (Section 2.1) 2700 for consistency with the icmp6hdr structure. Also changed the 2701 name of the ip6hdrctl structure contained within the ip6_hdr 2702 structure to ip6_hdrctl (Section 2.1). Finally, changed the name 2703 of the icmp6hdr structure to icmp6_hdr (Section 2.2). All other 2704 occurrences of this structure name, within the Neighbor Discovery 2705 structures in Section 2.2.1, already contained the underscore. 2707 - The "struct nd_router_solicit" and "struct nd_router_advert" 2708 should both begin with "nd6_". (Section 2.2.2). 2710 - Changed the name of in6_are_addr_equal to IN6_ARE_ADDR_EQUAL 2711 (Section 2.3) for consistency with basic API address testing 2712 functions. The header defining this macro is . 2714 - getprotobyname("ipv6") now returns 41, not 0 (Section 2.4). 2716 - The first occurrence of "struct icmpv6_filter" in Section 3.2 2717 should be "struct icmp6_filter". 2719 - Changed the name of the CMSG_LENGTH() macro to CMSG_LEN() 2720 (Section 4.3.5), since LEN is used throughout the 2721 headers. 2723 - Corrected the argument name for the sample implementations of the 2724 CMSG_SPACE() and CMSG_LEN() macros to be "length" (Sections 4.3.4 2725 and 4.3.5). 
2727 - Corrected the socket option mentioned in Section 5.1 to specify 2728 the interface for multicasting from IPV6_ADD_MEMBERSHIP to 2729 IPV6_MULTICAST_IF. 2731 - There were numerous errors in the previous draft that specified 2732 that should have been . These have 2733 all been corrected and the locations of all definitions are now 2734 summarized in the new Section 14 ("Summary of New Definitions"). 2736 Changes from the October 1996 Edition (-00 draft) 2738 - Added numerous rationale notes using the format (Note: ...). 2740 - Added note that not all errors may be defined. 2742 - Added note about ICMPv4, IGMPv4, and ARPv4 terminology. 2744 - Changed the name of to . 2746 - Changed some names in Section 2.2.1: ICMPV6_PKT_TOOBIG to 2747 ICMPV6_PACKET_TOOBIG, ICMPV6_TIME_EXCEED to ICMPV6_TIME_EXCEEDED, 2748 ICMPV6_ECHORQST to ICMPV6_ECHOREQUEST, ICMPV6_ECHORPLY to 2749 ICMPV6_ECHOREPLY, ICMPV6_PARAMPROB_HDR to 2750 ICMPV6_PARAMPROB_HEADER, ICMPV6_PARAMPROB_NXT_HDR to 2751 ICMPV6_PARAMPROB_NEXTHEADER, and ICMPV6_PARAMPROB_OPTS to 2752 ICMPV6_PARAMPROB_OPTION. 2754 - Prepended the prefix "icmp6_" to the three members of the 2755 icmp6_dataun union of the icmp6hdr structure (Section 2.2). 2757 - Moved the neighbor discovery definitions into the 2758 header, instead of being in their own header 2759 (Section 2.2.1). 2761 - Changed Section 2.3 ("Address Testing"). The basic macros are 2762 now in the basic API. 2764 - Added the new Section 2.4 on "Protocols File". 2766 - Added note to raw sockets description that something like BPF or 2767 DLPI must be used to read or write entire IPv6 packets. 2769 - Corrected example of IPV6_CHECKSUM socket option (Section 3.1). 2770 Also defined value of -1 to disable. 2772 - Noted that defines all the ICMPv6 filtering 2773 constants, macros, and structures (Section 3.2). 2775 - Added note on magic number 10240 for amount of ancillary data 2776 (Section 4.1). 2778 - Added possible padding to picture of ancillary data (Section 2779 4.2).
2781 - Defined header for CMSG_xxx() functions (Section 2782 4.2). 2784 - Note that the data returned by getsockopt(IPV6_PKTOPTIONS) for a 2785 TCP socket is just from the optional headers, if present, of the 2786 most recently received segment. Also note that control 2787 information is never returned by recvmsg() for a TCP socket. 2789 - Changed header for struct in6_pktinfo from to 2790 (Section 5). 2792 - Removed the old Sections 5.1 and 5.2, because the interface 2793 identification functions went into the basic API. 2795 - Redid Section 5 to support the hop limit field. 2797 - New Section 5.4 ("Next Hop Address"). 2799 - New Section 6 ("Flow Labels"). 2801 - Changed all of Sections 7 and 8 dealing with Hop-by-Hop and 2802 Destination options. We now define a set of inet6_option_XXX() 2803 functions. 2805 - Changed header for IPV6_SRCRT_xxx constants from 2806 to (Section 9). 2808 - Add inet6_srcrt_lasthop() function, and fix errors in description 2809 of source routing (Section 9). 2811 - Reworded some of the source routing descriptions to conform to 2812 the terminology in [1]. 2814 - Added the example from [1] for the Routing header (Section 9.9). 2816 - Expanded the example in Section 10 to show multiple options per 2817 ancillary data object, and to show the receiver's ancillary data 2818 objects. 2820 - New Section 11 ("IPv6-Specific Options with IPv4-Mapped IPv6 2821 Addresses"). 2823 - New Section 12 ("rresvport_af"). 2825 - Redid old Section 10 ("Additional Items") into new Section 13 2826 ("Future Items"). 2828 17. References 2830 [1] Deering, S., Hinden, R., "Internet Protocol, Version 6 (IPv6), 2831 Specification", RFC 1883, Dec. 1995. 2833 [2] Gilligan, R. E., Thomson, S., Bound, J., Stevens, W., "Basic 2834 Socket Interface Extensions for IPv6", Internet-Draft, draft- 2835 ietf-ipngwg-bsd-api-07.txt, January 1997. 2837 [3] McCann, J., Deering, S., Mogul, J, "Path MTU Discovery for IP 2838 version 6", RFC 1981, Aug. 1996. 
2840 [4] Narten, T., Nordmark, E., Simpson, W., "Neighbor Discovery for 2841 IP Version 6 (IPv6)", RFC 1970, Aug. 1996. 2843 [5] Braden, R., Zhang, L., Berson, S., Herzog, S., Jamin, S., 2844 "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional 2845 Specification", Internet-Draft, draft-ietf-rsvp-spec-14.txt, 2846 November 1996. 2848 18. Acknowledgments 2850 Matt Thomas and Jim Bound have been working on the technical details 2851 in this draft for over a year. Keith Sklower is the original 2852 implementor of ancillary data in the BSD networking code. Craig Metz 2853 provided lots of feedback, suggestions, and comments based on his 2854 implementing many of these features as the document was being 2855 written. 2857 Matt Crawford designed the flow label interface. 2859 The following provided comments on earlier drafts: Hamid Asayesh, Ran 2860 Atkinson, Karl Auerbach, Matt Crawford, Sam T. Denton, Richard 2861 Draves, Francis Dupont, Bob Gilligan, Tim Hartrick, Masaki Hirabaru, 2862 Yoshinobu Inoue, Mukesh Kacker, A. N. Kuznetsov, der Mouse, John Moy, 2863 Thomas Narten, Erik Nordmark, Tom Pusateri, Pedro Roque, Sameer Shah, 2864 Peter Sjodin, Stephen P. Spackman, Quaizar Vohra, Carl Williams, 2865 Steve Wise, and Kazu Yamamoto. 2867 19. Authors' Addresses 2868 W. Richard Stevens 2869 1202 E. Paseo del Zorro 2870 Tucson, AZ 85718 2871 Email: rstevens@kohala.com 2873 Matt Thomas 2874 AltaVista Internet Software 2875 LJO2-1/J8 2876 30 Porter Rd 2877 Littleton, MA 01460 2878 Email: mattthomas@earthlink.net