idnits 2.17.1

draft-ietf-ipngwg-rfc2292bis-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 175 instances of too long lines in the document, the longest
     one being 12 characters in excess of 72.

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 2872: '... - Replaced MUST with must....'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 287 has weird spacing: '...ip6_vfc ip6_...'

  == Line 288 has weird spacing: '...p6_flow ip6_c...'

  == Line 289 has weird spacing: '...p6_plen ip6_c...'

  == Line 290 has weird spacing: '...ip6_nxt ip6_...'

  == Line 291 has weird spacing: '...p6_hlim ip6_c...'

  == (39 more instances...)
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 16, 2002) is 7862 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '4' on line 461

  -- Looks like a reference, but probably isn't: '2' on line 3425

  -- Looks like a reference, but probably isn't: '1' on line 3417

  -- Looks like a reference, but probably isn't: '0' on line 668

  -- Looks like a reference, but probably isn't: '8' on line 973

  -- Looks like a reference, but probably isn't: '3' on line 3433

  -- Looks like a reference, but probably isn't: '8192' on line 3458

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'BASICAPI'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'POSIX'

  ** Obsolete normative reference: RFC 1981 (Obsoleted by RFC 8201)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'TCPIPILLUST'

     Summary: 7 errors (**), 0 flaws (~~), 8 warnings (==), 13 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  INTERNET-DRAFT                                                    W.
Richard Stevens
3  Expires: April 16, 2003                      Matt Thomas (Consultant)
4  Obsoletes RFC 2292                                Erik Nordmark (Sun)
5                                                Tatuya Jinmei (Toshiba)
6                                                       October 16, 2002

8                     Advanced Sockets API for IPv6
9

11  Abstract

13  This document provides sockets APIs to support "advanced" IPv6
14  applications, as a supplement to a separate specification, RFC 2553.
15  The expected applications include Ping, Traceroute, routing daemons
16  and the like, which typically use raw sockets to access IPv6 or
17  ICMPv6 header fields. This document proposes some portable
18  interfaces for applications that use raw sockets under IPv6. There
19  are other features of IPv6 that some applications will need to
20  access: interface identification (specifying the outgoing interface
21  and determining the incoming interface), IPv6 extension headers, and
22  path MTU information. This document provides API access to these
23  features too. Additionally, some extended interfaces to libraries
24  for the "r" commands are defined. The extension will provide better
25  backward compatibility to existing implementations that are not
26  IPv6-capable.

28  Status of this Memo

30  This document is an Internet-Draft and is in full conformance with
31  all provisions of Section 10 of RFC2026.

33  Internet-Drafts are working documents of the Internet Engineering
34  Task Force (IETF), its areas, and its working groups. Note that
35  other groups may also distribute working documents as Internet-
36  Drafts.

38  Internet-Drafts are draft documents valid for a maximum of six months
39  and may be updated, replaced, or obsoleted by other documents at any
40  time. It is inappropriate to use Internet-Drafts as reference
41  material or to cite them other than as "work in progress."

43  The list of current Internet-Drafts can be accessed at
44  http://www.ietf.org/ietf/1id-abstracts.txt

46  The list of Internet-Draft Shadow Directories can be accessed at
47  http://www.ietf.org/shadow.html.

49  This Internet Draft expires April 16, 2003.

51  Table of Contents

53  1. Introduction

55  2. Common Structures and Definitions
56     2.1. The ip6_hdr Structure
57        2.1.1. IPv6 Next Header Values
58        2.1.2. IPv6 Extension Headers
59        2.1.3. IPv6 Options
60     2.2. The icmp6_hdr Structure
61        2.2.1. ICMPv6 Type and Code Values
62        2.2.2. ICMPv6 Neighbor Discovery Definitions
63        2.2.3. Multicast Listener Discovery Definitions
64        2.2.4. ICMPv6 Router Renumbering Definitions
65     2.3. Address Testing Macros
66     2.4. Protocols File

68  3. IPv6 Raw Sockets
69     3.1. Checksums
70     3.2. ICMPv6 Type Filtering
71     3.3. ICMPv6 Verification of Received Packets

73  4. Access to IPv6 and Extension Headers
74     4.1. TCP Implications
75     4.2. UDP and Raw Socket Implications

77  5. Extensions to Socket Ancillary Data
78     5.1. CMSG_NXTHDR
79     5.2. CMSG_SPACE
80     5.3. CMSG_LEN

82  6. Packet Information
83     6.1. Specifying/Receiving the Interface
84     6.2. Specifying/Receiving Source/Destination Address
85     6.3.
Specifying/Receiving the Hop Limit
86     6.4. Specifying the Next Hop Address
87     6.5. Specifying/Receiving the Traffic Class value
88     6.6. Additional Errors with sendmsg() and setsockopt()
89     6.7. Summary of outgoing interface selection

91  7. Routing Header Option
92     7.1. inet6_rth_space
93     7.2. inet6_rth_init
94     7.3. inet6_rth_add
95     7.4. inet6_rth_reverse
96     7.5. inet6_rth_segments
97     7.6. inet6_rth_getaddr

99  8. Hop-By-Hop Options
100    8.1. Receiving Hop-by-Hop Options
101    8.2. Sending Hop-by-Hop Options

103  9. Destination Options
104    9.1. Receiving Destination Options
105    9.2. Sending Destination Options

107  10. Hop-by-Hop and Destination Options Processing
108    10.1. inet6_opt_init
109    10.2. inet6_opt_append
110    10.3. inet6_opt_finish
111    10.4. inet6_opt_set_val
112    10.5. inet6_opt_next
113    10.6. inet6_opt_find
114    10.7. inet6_opt_get_val

116  11. Additional Advanced API Functions
117    11.1.
Sending with the Minimum MTU
118    11.2. Sending without fragmentation
119    11.3. Path MTU Discovery and UDP
120    11.4. Determining the current path MTU

122  12. Ordering of Ancillary Data and IPv6 Extension Headers

124  13. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses

126  14. Extended interfaces for rresvport, rcmd and rexec
127    14.1. rresvport_af
128    14.2. rcmd_af
129    14.3. rexec_af

131  15. Summary of New Definitions

133  16. Security Considerations

135  17. Change History

137  18. References

139  19. Acknowledgments

141  20. Authors' Addresses

143  21. Appendix A: Ancillary Data Overview
144    21.1. The msghdr Structure
145    21.2. The cmsghdr Structure
146    21.3. Ancillary Data Object Macros
147       21.3.1. CMSG_FIRSTHDR
148       21.3.2. CMSG_NXTHDR
149       21.3.3. CMSG_DATA
150       21.3.4. CMSG_SPACE
151       21.3.5. CMSG_LEN

153  22. Appendix B: Examples using the inet6_rth_XXX() functions
154    22.1. Sending a Routing Header
155    22.2.
Receiving Routing Headers

157  23. Appendix C: Examples using the inet6_opt_XXX() functions
158    23.1. Building options
159    23.2. Parsing received options

161  1. Introduction

163  A separate specification [BASICAPI] contains changes to the sockets
164  API to support IP version 6. Those changes are for TCP and UDP-based
165  applications. This document defines some of the "advanced" features
166  of the sockets API that are required for applications to take
167  advantage of additional features of IPv6.

169  Today, the portability of applications using IPv4 raw sockets is
170  quite high, but this is mainly because most IPv4 implementations
171  started from a common base (the Berkeley source code) or at least
172  started with the Berkeley header files. This allows programs such as
173  Ping and Traceroute, for example, to compile with minimal effort on
174  many hosts that support the sockets API. With IPv6, however, there
175  is no common source code base that implementors are starting from,
176  and the possibility for divergence at this level between different
177  implementations is high. To avoid a complete lack of portability
178  amongst applications that use raw IPv6 sockets, some standardization
179  is necessary.

181  There are also features from the basic IPv6 specification that are
182  not addressed in [BASICAPI]: sending and receiving Routing headers,
183  Hop-by-Hop options, and Destination options, specifying the outgoing
184  interface, being told of the receiving interface, and control of path
185  MTU information.

187  This document updates and replaces RFC 2292. This revision is based
188  on implementation experience with RFC 2292, as well as some
189  additional extensions that have been found to be useful through
190  IPv6 deployment. Note, however, that further work on this document
191  may still be needed.
Once the API specification becomes mature and
192  is deployed among implementations, it may be formally standardized by
193  a more appropriate body, such as has been done with the Basic API
194  [BASICAPI].

196  This document can be divided into the following main sections.

198  1. Definitions of the basic constants and structures required for
199     applications to use raw IPv6 sockets. This includes structure
200     definitions for the IPv6 and ICMPv6 headers and all associated
201     constants (e.g., values for the Next Header field).

203  2. Some basic semantic definitions for IPv6 raw sockets. For
204     example, a raw ICMPv4 socket requires the application to calculate
205     and store the ICMPv4 header checksum. But with IPv6 this would
206     require the application to choose the source IPv6 address because
207     the source address is part of the pseudo header that ICMPv6 now
208     uses for its checksum computation. It should be defined that with
209     a raw ICMPv6 socket the kernel always calculates and stores the
210     ICMPv6 header checksum.

212  3. Packet information: how applications can obtain the received
213     interface, destination address, and received hop limit, along with
214     specifying these values on a per-packet basis. There is a class
215     of applications that need this capability and the technique should
216     be portable.

218  4. Access to the optional Routing header, Hop-by-Hop options, and
219     Destination options extension headers.

221  5. Additional features required for improved IPv6 application
222     portability.

224  The packet information along with access to the extension headers
225  (Routing header, Hop-by-Hop options, and Destination options) are
226  specified using the "ancillary data" fields that were added to the
227  4.3BSD Reno sockets API in 1990. The reason is that these ancillary
228  data fields are part of the Posix standard [POSIX] and should
229  therefore be adopted by most vendors.
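
As an illustration of the ancillary data mechanism mentioned above, the
following is a minimal sketch (not part of the specification) of how one
ancillary data object can be packed into a control buffer and later
located by walking the buffer with the CMSG_xxx() macros. The function
names put_hoplimit() and get_hoplimit() are hypothetical. The sketch
assumes that the IPV6_HOPLIMIT ancillary data type is available from
<netinet/in.h>, which holds on common implementations but may need a
feature-test macro on some systems, and that the caller supplies a
buffer suitably aligned for a struct cmsghdr.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: store one hop limit value as a single ancillary
 * data object in buf.  Returns the number of control bytes used, or 0
 * if buf is too small.
 */
size_t put_hoplimit(void *buf, size_t buflen, int hoplimit)
{
    struct msghdr msg;
    struct cmsghdr *cm;

    if (buflen < CMSG_SPACE(sizeof(int)))
        return 0;

    memset(buf, 0, buflen);
    memset(&msg, 0, sizeof(msg));
    msg.msg_control = buf;
    msg.msg_controllen = CMSG_SPACE(sizeof(int));

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = IPPROTO_IPV6;      /* originating protocol */
    cm->cmsg_type = IPV6_HOPLIMIT;      /* assumed to be defined, see above */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &hoplimit, sizeof(int));

    return CMSG_SPACE(sizeof(int));
}

/*
 * Hypothetical helper: walk the ancillary data in buf and return the
 * hop limit carried by the first matching object, or -1 if none.
 */
int get_hoplimit(void *buf, size_t len)
{
    struct msghdr msg;
    struct cmsghdr *cm;
    int hoplimit;

    assert(buf != NULL || len == 0);

    memset(&msg, 0, sizeof(msg));
    msg.msg_control = buf;
    msg.msg_controllen = len;

    for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IPV6 &&
            cm->cmsg_type == IPV6_HOPLIMIT &&
            cm->cmsg_len == CMSG_LEN(sizeof(int))) {
            memcpy(&hoplimit, CMSG_DATA(cm), sizeof(int));
            return hoplimit;
        }
    }
    return -1;
}
```

In a real application the same walk would be applied to the msg_control
buffer filled in by recvmsg(), and the packed buffer would be handed to
sendmsg(). The sketch works on memory only so that it can be compiled
and exercised without opening a socket.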

231  This document does not address application access to either the
232  authentication header or the encapsulating security payload header.

234  Many examples in this document omit error checking in favor of
235  brevity and clarity.

237  We note that some of the functions and socket options defined in this
238  document may have error returns that are not defined in this
239  document. Some of these possible error returns will be recognized
240  only as implementations proceed.

242  Datatypes in this document follow the Posix format: intN_t means a
243  signed integer of exactly N bits (e.g., int16_t) and uintN_t means an
244  unsigned integer of exactly N bits (e.g., uint32_t).

246  Note that we use the (unofficial) terminology ICMPv4, IGMPv4, and
247  ARPv4 to avoid any confusion with the newer ICMPv6 protocol.

249  2. Common Structures and Definitions

251  Many advanced applications examine fields in the IPv6 header and set
252  and examine fields in the various ICMPv6 headers. Common structure
253  definitions for these protocol headers are required, along with
254  common constant definitions for the structure members.

256  This API assumes that the fields in the protocol headers are left in
257  the network byte order, which is big-endian for the Internet
258  protocols. If not, then either these constants or the fields being
259  tested must be converted at run-time, using something like htons() or
260  htonl().

262  Two new header files are defined: <netinet/ip6.h> and
263  <netinet/icmp6.h>.

265  When an include file is specified, that include file is allowed to
266  include other files that do the actual declaration or definition.

268  2.1. The ip6_hdr Structure

270  The following structure is defined as a result of including
271  <netinet/ip6.h>. Note that this is a new header.

273      struct ip6_hdr {
274        union {
275          struct ip6_hdrctl {
276            uint32_t ip6_un1_flow; /* 4 bits version, 8 bits TC, 20 bits flow-ID */
277            uint16_t ip6_un1_plen; /* payload length */
278            uint8_t  ip6_un1_nxt;  /* next header */
279            uint8_t  ip6_un1_hlim; /* hop limit */
280          } ip6_un1;
281          uint8_t ip6_un2_vfc;     /* 4 bits version, top 4 bits tclass */
282        } ip6_ctlun;
283        struct in6_addr ip6_src;   /* source address */
284        struct in6_addr ip6_dst;   /* destination address */
285      };

287      #define ip6_vfc   ip6_ctlun.ip6_un2_vfc
288      #define ip6_flow  ip6_ctlun.ip6_un1.ip6_un1_flow
289      #define ip6_plen  ip6_ctlun.ip6_un1.ip6_un1_plen
290      #define ip6_nxt   ip6_ctlun.ip6_un1.ip6_un1_nxt
291      #define ip6_hlim  ip6_ctlun.ip6_un1.ip6_un1_hlim
292      #define ip6_hops  ip6_ctlun.ip6_un1.ip6_un1_hlim

294  2.1.1. IPv6 Next Header Values

296  IPv6 defines many new values for the Next Header field. The
297  following constants are defined as a result of including
298  <netinet/in.h>.

300      #define IPPROTO_HOPOPTS   0 /* IPv6 Hop-by-Hop options */
301      #define IPPROTO_IPV6     41 /* IPv6 header */
302      #define IPPROTO_ROUTING  43 /* IPv6 Routing header */
303      #define IPPROTO_FRAGMENT 44 /* IPv6 fragment header */
304      #define IPPROTO_ESP      50 /* encapsulating security payload */
305      #define IPPROTO_AH       51 /* authentication header */
306      #define IPPROTO_ICMPV6   58 /* ICMPv6 */
307      #define IPPROTO_NONE     59 /* IPv6 no next header */
308      #define IPPROTO_DSTOPTS  60 /* IPv6 Destination options */

310  Berkeley-derived IPv4 implementations also define IPPROTO_IP to be 0.
311  This should not be a problem since IPPROTO_IP is used only with IPv4
312  sockets and IPPROTO_HOPOPTS only with IPv6 sockets.

314  2.1.2. IPv6 Extension Headers

316  Six extension headers are defined for IPv6. We define structures for
317  all except the Authentication header and Encapsulating Security
318  Payload header, both of which are beyond the scope of this document.
319  The following structures are defined as a result of including
320  <netinet/ip6.h>.

322      /* Hop-by-Hop options header */
323      struct ip6_hbh {
324        uint8_t ip6h_nxt;        /* next header */
325        uint8_t ip6h_len;        /* length in units of 8 octets */
326          /* followed by options */
327      };

329      /* Destination options header */
330      struct ip6_dest {
331        uint8_t ip6d_nxt;        /* next header */
332        uint8_t ip6d_len;        /* length in units of 8 octets */
333          /* followed by options */
334      };

336      /* Routing header */
337      struct ip6_rthdr {
338        uint8_t ip6r_nxt;        /* next header */
339        uint8_t ip6r_len;        /* length in units of 8 octets */
340        uint8_t ip6r_type;       /* routing type */
341        uint8_t ip6r_segleft;    /* segments left */
342          /* followed by routing type specific data */
343      };

345      /* Type 0 Routing header */
346      struct ip6_rthdr0 {
347        uint8_t  ip6r0_nxt;      /* next header */
348        uint8_t  ip6r0_len;      /* length in units of 8 octets */
349        uint8_t  ip6r0_type;     /* always zero */
350        uint8_t  ip6r0_segleft;  /* segments left */
351        uint32_t ip6r0_reserved; /* reserved field */
352          /* followed by up to 127 struct in6_addr */
353      };

355      /* Fragment header */
356      struct ip6_frag {
357        uint8_t  ip6f_nxt;       /* next header */
358        uint8_t  ip6f_reserved;  /* reserved field */
359        uint16_t ip6f_offlg;     /* offset, reserved, and flag */
360        uint32_t ip6f_ident;     /* identification */
361      };

363      #if BYTE_ORDER == BIG_ENDIAN
364      #define IP6F_OFF_MASK      0xfff8 /* mask out offset from _offlg */
365      #define IP6F_RESERVED_MASK 0x0006 /* reserved bits in ip6f_offlg */
366      #define IP6F_MORE_FRAG     0x0001 /* more-fragments flag */
367      #else /* BYTE_ORDER == LITTLE_ENDIAN */
368      #define IP6F_OFF_MASK      0xf8ff /* mask out offset from _offlg */
369      #define IP6F_RESERVED_MASK 0x0600 /* reserved bits in ip6f_offlg */
370      #define IP6F_MORE_FRAG     0x0100 /* more-fragments flag */
371      #endif

373  2.1.3. IPv6 Options

375  Several options are defined for IPv6, and we define structures and
376  macro definitions for some of them below. The following structures
377  are defined as a result of including <netinet/ip6.h>.

379      /* IPv6 options */
380      struct ip6_opt {
381        uint8_t ip6o_type;
382        uint8_t ip6o_len;
383      };

385      /*
386       * The high-order 3 bits of the option type define the behavior
387       * when processing an unknown option and whether or not the option
388       * content changes in flight.
389       */
390      #define IP6OPT_TYPE(o)        ((o) & 0xc0)
391      #define IP6OPT_TYPE_SKIP      0x00
392      #define IP6OPT_TYPE_DISCARD   0x40
393      #define IP6OPT_TYPE_FORCEICMP 0x80
394      #define IP6OPT_TYPE_ICMP      0xc0
395      #define IP6OPT_MUTABLE        0x20

397      #define IP6OPT_PAD1         0x00 /* 00 0 00000 */
398      #define IP6OPT_PADN         0x01 /* 00 0 00001 */
399      #define IP6OPT_JUMBO        0xc2 /* 11 0 00010 */
400      #define IP6OPT_NSAP_ADDR    0xc3 /* 11 0 00011 */
401      #define IP6OPT_TUNNEL_LIMIT 0x04 /* 00 0 00100 */
402      #define IP6OPT_ROUTER_ALERT 0x05 /* 00 0 00101 */

404      /* Jumbo Payload Option */
405      struct ip6_opt_jumbo {
406        uint8_t ip6oj_type;
407        uint8_t ip6oj_len;
408        uint8_t ip6oj_jumbo_len[4];
409      };
410      #define IP6OPT_JUMBO_LEN 6

412      /* NSAP Address Option */
413      struct ip6_opt_nsap {
414        uint8_t ip6on_type;
415        uint8_t ip6on_len;
416        uint8_t ip6on_src_nsap_len;
417        uint8_t ip6on_dst_nsap_len;
418          /* followed by source NSAP */
419          /* followed by destination NSAP */
420      };

422      /* Tunnel Limit Option */
423      struct ip6_opt_tunnel {
424        uint8_t ip6ot_type;
425        uint8_t ip6ot_len;
426        uint8_t ip6ot_encap_limit;
427      };

429      /* Router Alert Option */
430      struct ip6_opt_router {
431        uint8_t ip6or_type;
432        uint8_t ip6or_len;
433        uint8_t ip6or_value[2];
434      };

436      /* Router alert values (in network byte order) */
437      #ifdef _BIG_ENDIAN
438      #define IP6_ALERT_MLD  0x0000
439      #define IP6_ALERT_RSVP 0x0001
440      #define IP6_ALERT_AN   0x0002
441      #else
442      #define IP6_ALERT_MLD  0x0000
443      #define IP6_ALERT_RSVP 0x0100
444      #define IP6_ALERT_AN   0x0200
445      #endif

447  2.2. The icmp6_hdr Structure

449  The ICMPv6 header is needed by numerous IPv6 applications including
450  Ping, Traceroute, router discovery daemons, and neighbor discovery
451  daemons.
The following structure is defined as a result of including
452  <netinet/icmp6.h>. Note that this is a new header.

454      struct icmp6_hdr {
455        uint8_t  icmp6_type;            /* type field */
456        uint8_t  icmp6_code;            /* code field */
457        uint16_t icmp6_cksum;           /* checksum field */
458        union {
459          uint32_t icmp6_un_data32[1];  /* type-specific field */
460          uint16_t icmp6_un_data16[2];  /* type-specific field */
461          uint8_t  icmp6_un_data8[4];   /* type-specific field */
462        } icmp6_dataun;
463      };

465      #define icmp6_data32   icmp6_dataun.icmp6_un_data32
466      #define icmp6_data16   icmp6_dataun.icmp6_un_data16
467      #define icmp6_data8    icmp6_dataun.icmp6_un_data8
468      #define icmp6_pptr     icmp6_data32[0] /* parameter prob */
469      #define icmp6_mtu      icmp6_data32[0] /* packet too big */
470      #define icmp6_id       icmp6_data16[0] /* echo request/reply */
471      #define icmp6_seq      icmp6_data16[1] /* echo request/reply */
472      #define icmp6_maxdelay icmp6_data16[0] /* mcast group membership */

474  2.2.1. ICMPv6 Type and Code Values

476  In addition to a common structure for the ICMPv6 header, common
477  definitions are required for the ICMPv6 type and code fields. The
478  following constants are also defined as a result of including
479  <netinet/icmp6.h>.

481      #define ICMP6_DST_UNREACH    1
482      #define ICMP6_PACKET_TOO_BIG 2
483      #define ICMP6_TIME_EXCEEDED  3
484      #define ICMP6_PARAM_PROB     4
485      #define ICMP6_INFOMSG_MASK 0x80 /* all informational messages */

487      #define ICMP6_ECHO_REQUEST 128
488      #define ICMP6_ECHO_REPLY   129

490      #define ICMP6_DST_UNREACH_NOROUTE 0 /* no route to destination */
491      #define ICMP6_DST_UNREACH_ADMIN   1 /* communication with */
492                                          /* destination */
493                                          /* admin.
prohibited */
494      #define ICMP6_DST_UNREACH_BEYONDSCOPE 2 /* beyond scope of source address */
495      #define ICMP6_DST_UNREACH_ADDR        3 /* address unreachable */
496      #define ICMP6_DST_UNREACH_NOPORT      4 /* bad port */

498      #define ICMP6_TIME_EXCEED_TRANSIT    0 /* Hop Limit == 0 in transit */
499      #define ICMP6_TIME_EXCEED_REASSEMBLY 1 /* Reassembly time out */

501      #define ICMP6_PARAMPROB_HEADER     0 /* erroneous header field */
502      #define ICMP6_PARAMPROB_NEXTHEADER 1 /* unrecognized Next Header */
503      #define ICMP6_PARAMPROB_OPTION     2 /* unrecognized IPv6 option */

505  The five ICMP message types defined by IPv6 neighbor discovery
506  (133-137) are defined in the next section.

508  2.2.2. ICMPv6 Neighbor Discovery Definitions

510  The following structures and definitions are defined as a result of
511  including <netinet/icmp6.h>.

513      #define ND_ROUTER_SOLICIT   133
514      #define ND_ROUTER_ADVERT    134
515      #define ND_NEIGHBOR_SOLICIT 135
516      #define ND_NEIGHBOR_ADVERT  136
517      #define ND_REDIRECT         137

519      struct nd_router_solicit {     /* router solicitation */
520        struct icmp6_hdr nd_rs_hdr;
521          /* could be followed by options */
522      };

524      #define nd_rs_type     nd_rs_hdr.icmp6_type
525      #define nd_rs_code     nd_rs_hdr.icmp6_code
526      #define nd_rs_cksum    nd_rs_hdr.icmp6_cksum
527      #define nd_rs_reserved nd_rs_hdr.icmp6_data32[0]

529      struct nd_router_advert {      /* router advertisement */
530        struct icmp6_hdr nd_ra_hdr;
531        uint32_t nd_ra_reachable;    /* reachable time */
532        uint32_t nd_ra_retransmit;   /* retransmit timer */
533          /* could be followed by options */
534      };

536      #define nd_ra_type            nd_ra_hdr.icmp6_type
537      #define nd_ra_code            nd_ra_hdr.icmp6_code
538      #define nd_ra_cksum           nd_ra_hdr.icmp6_cksum
539      #define nd_ra_curhoplimit     nd_ra_hdr.icmp6_data8[0]
540      #define nd_ra_flags_reserved  nd_ra_hdr.icmp6_data8[1]
541      #define ND_RA_FLAG_MANAGED    0x80
542      #define ND_RA_FLAG_OTHER      0x40
543      #define nd_ra_router_lifetime nd_ra_hdr.icmp6_data16[1]

545      struct nd_neighbor_solicit {   /* neighbor solicitation */
546        struct icmp6_hdr
nd_ns_hdr;
547        struct in6_addr nd_ns_target; /* target address */
548          /* could be followed by options */
549      };

551      #define nd_ns_type     nd_ns_hdr.icmp6_type
552      #define nd_ns_code     nd_ns_hdr.icmp6_code
553      #define nd_ns_cksum    nd_ns_hdr.icmp6_cksum
554      #define nd_ns_reserved nd_ns_hdr.icmp6_data32[0]

556      struct nd_neighbor_advert {    /* neighbor advertisement */
557        struct icmp6_hdr nd_na_hdr;
558        struct in6_addr nd_na_target; /* target address */
559          /* could be followed by options */
560      };

562      #define nd_na_type           nd_na_hdr.icmp6_type
563      #define nd_na_code           nd_na_hdr.icmp6_code
564      #define nd_na_cksum          nd_na_hdr.icmp6_cksum
565      #define nd_na_flags_reserved nd_na_hdr.icmp6_data32[0]
566      #if BYTE_ORDER == BIG_ENDIAN
567      #define ND_NA_FLAG_ROUTER    0x80000000
568      #define ND_NA_FLAG_SOLICITED 0x40000000
569      #define ND_NA_FLAG_OVERRIDE  0x20000000
570      #else /* BYTE_ORDER == LITTLE_ENDIAN */
571      #define ND_NA_FLAG_ROUTER    0x00000080
572      #define ND_NA_FLAG_SOLICITED 0x00000040
573      #define ND_NA_FLAG_OVERRIDE  0x00000020
574      #endif

576      struct nd_redirect {           /* redirect */
577        struct icmp6_hdr nd_rd_hdr;
578        struct in6_addr nd_rd_target; /* target address */
579        struct in6_addr nd_rd_dst;    /* destination address */
580          /* could be followed by options */
581      };

583      #define nd_rd_type     nd_rd_hdr.icmp6_type
584      #define nd_rd_code     nd_rd_hdr.icmp6_code
585      #define nd_rd_cksum    nd_rd_hdr.icmp6_cksum
586      #define nd_rd_reserved nd_rd_hdr.icmp6_data32[0]

588      struct nd_opt_hdr {            /* Neighbor discovery option header */
589        uint8_t nd_opt_type;
590        uint8_t nd_opt_len;          /* in units of 8 octets */
591          /* followed by option specific data */
592      };

594      #define ND_OPT_SOURCE_LINKADDR    1
595      #define ND_OPT_TARGET_LINKADDR    2
596      #define ND_OPT_PREFIX_INFORMATION 3
597      #define ND_OPT_REDIRECTED_HEADER  4
598      #define ND_OPT_MTU                5

600      struct nd_opt_prefix_info {    /* prefix information */
601        uint8_t nd_opt_pi_type;
602        uint8_t nd_opt_pi_len;
603        uint8_t nd_opt_pi_prefix_len;
604        uint8_t nd_opt_pi_flags_reserved;
605        uint32_t nd_opt_pi_valid_time;
606        uint32_t nd_opt_pi_preferred_time;
607        uint32_t nd_opt_pi_reserved2;
608        struct in6_addr nd_opt_pi_prefix;
609      };

611      #define ND_OPT_PI_FLAG_ONLINK 0x80
612      #define ND_OPT_PI_FLAG_AUTO   0x40

614      struct nd_opt_rd_hdr {         /* redirected header */
615        uint8_t  nd_opt_rh_type;
616        uint8_t  nd_opt_rh_len;
617        uint16_t nd_opt_rh_reserved1;
618        uint32_t nd_opt_rh_reserved2;
619          /* followed by IP header and data */
620      };

622      struct nd_opt_mtu {            /* MTU option */
623        uint8_t  nd_opt_mtu_type;
624        uint8_t  nd_opt_mtu_len;
625        uint16_t nd_opt_mtu_reserved;
626        uint32_t nd_opt_mtu_mtu;
627      };

629  We note that the nd_na_flags_reserved flags have the same byte
630  ordering problems as we discussed with ip6f_offlg.

632  2.2.3. Multicast Listener Discovery Definitions

634  The following structures and definitions are defined as a result of
635  including <netinet/icmp6.h>.

637      #define MLD_LISTENER_QUERY     130
638      #define MLD_LISTENER_REPORT    131
639      #define MLD_LISTENER_REDUCTION 132

641      struct mld_hdr {
642        struct icmp6_hdr mld_icmp6_hdr;
643        struct in6_addr mld_addr;    /* multicast address */
644      };
645      #define mld_type     mld_icmp6_hdr.icmp6_type
646      #define mld_code     mld_icmp6_hdr.icmp6_code
647      #define mld_cksum    mld_icmp6_hdr.icmp6_cksum
648      #define mld_maxdelay mld_icmp6_hdr.icmp6_data16[0]
649      #define mld_reserved mld_icmp6_hdr.icmp6_data16[1]

651  2.2.4. ICMPv6 Router Renumbering Definitions

653  The following structures and definitions are defined as a result of
654  including <netinet/icmp6.h>.

656      #define ICMP6_ROUTER_RENUMBERING 138 /* router renumbering */

658      struct icmp6_router_renum {    /* router renumbering header */
659        struct icmp6_hdr rr_hdr;
660        uint8_t  rr_segnum;
661        uint8_t  rr_flags;
662        uint16_t rr_maxdelay;
663        uint32_t rr_reserved;
664      };
665      #define rr_type   rr_hdr.icmp6_type
666      #define rr_code   rr_hdr.icmp6_code
667      #define rr_cksum  rr_hdr.icmp6_cksum
668      #define rr_seqnum rr_hdr.icmp6_data32[0]

670      /* Router renumbering flags */
671      #define ICMP6_RR_FLAGS_TEST       0x80
672      #define ICMP6_RR_FLAGS_REQRESULT  0x40
673      #define ICMP6_RR_FLAGS_FORCEAPPLY 0x20
674      #define ICMP6_RR_FLAGS_SPECSITE   0x10
675      #define ICMP6_RR_FLAGS_PREVDONE   0x08

677      struct rr_pco_match {          /* match prefix part */
678        uint8_t  rpm_code;
679        uint8_t  rpm_len;
680        uint8_t  rpm_ordinal;
681        uint8_t  rpm_matchlen;
682        uint8_t  rpm_minlen;
683        uint8_t  rpm_maxlen;
684        uint16_t rpm_reserved;
685        struct in6_addr rpm_prefix;
686      };

688      /* PCO code values */
689      #define RPM_PCO_ADD       1
690      #define RPM_PCO_CHANGE    2
691      #define RPM_PCO_SETGLOBAL 3

693      struct rr_pco_use {            /* use prefix part */
694        uint8_t  rpu_uselen;
695        uint8_t  rpu_keeplen;
696        uint8_t  rpu_ramask;
697        uint8_t  rpu_raflags;
698        uint32_t rpu_vltime;
699        uint32_t rpu_pltime;
700        uint32_t rpu_flags;
701        struct in6_addr rpu_prefix;
702      };
703      #define ICMP6_RR_PCOUSE_RAFLAGS_ONLINK 0x20
704      #define ICMP6_RR_PCOUSE_RAFLAGS_AUTO   0x10

706      #if BYTE_ORDER == BIG_ENDIAN
707      #define ICMP6_RR_PCOUSE_FLAGS_DECRVLTIME 0x80000000
708      #define ICMP6_RR_PCOUSE_FLAGS_DECRPLTIME 0x40000000
709      #elif BYTE_ORDER == LITTLE_ENDIAN
710      #define ICMP6_RR_PCOUSE_FLAGS_DECRVLTIME 0x80
711      #define ICMP6_RR_PCOUSE_FLAGS_DECRPLTIME 0x40
712      #endif

714      struct rr_result {             /* router renumbering result message */
715        uint16_t rrr_flags;
716        uint8_t  rrr_ordinal;
717        uint8_t  rrr_matchedlen;
718        uint32_t rrr_ifid;
719        struct in6_addr rrr_prefix;
720      };
721      #if BYTE_ORDER == BIG_ENDIAN
722      #define ICMP6_RR_RESULT_FLAGS_OOB 0x0002
723      #define
ICMP6_RR_RESULT_FLAGS_FORBIDDEN 0x0001 724 #elif BYTE_ORDER == LITTLE_ENDIAN 725 #define ICMP6_RR_RESULT_FLAGS_OOB 0x0200 726 #define ICMP6_RR_RESULT_FLAGS_FORBIDDEN 0x0100 727 #endif 729 2.3. Address Testing Macros 731 The basic API ([BASICAPI]) defines some macros for testing an IPv6 732 address for certain properties. This API extends those definitions 733 with additional address testing macros, defined as a result of 734 including . 736 int IN6_ARE_ADDR_EQUAL(const struct in6_addr *, 737 const struct in6_addr *); 739 This macro returns non-zero if the addresses are equal; otherwise it 740 returns zero. 742 2.4. Protocols File 744 Many hosts provide the file /etc/protocols that contains the names of 745 the various IP protocols and their protocol number (e.g., the value 746 of the protocol field in the IPv4 header for that protocol, such as 1 747 for ICMP). Some programs then call the function getprotobyname() to 748 obtain the protocol value that is then specified as the third 749 argument to the socket() function. For example, the Ping program 750 contains code of the form 752 struct protoent *proto; 754 proto = getprotobyname("icmp"); 756 s = socket(AF_INET, SOCK_RAW, proto->p_proto); 758 Common names are required for the new IPv6 protocols in this file, to 759 provide portability of applications that call the getprotoXXX() 760 functions. 762 We define the following protocol names with the values shown. These 763 are taken under http://www.iana.org/numbers.html. 765 hopopt 0 # hop-by-hop options for ipv6 766 ipv6 41 # ipv6 767 ipv6-route 43 # routing header for ipv6 768 ipv6-frag 44 # fragment header for ipv6 769 esp 50 # encapsulating security payload for ipv6 770 ah 51 # authentication header for ipv6 771 ipv6-icmp 58 # icmp for ipv6 772 ipv6-nonxt 59 # no next header for ipv6 773 ipv6-opts 60 # destination options for ipv6 775 3. IPv6 Raw Sockets 777 Raw sockets bypass the transport layer (TCP or UDP). 
       With IPv4, raw
778    sockets are used to access ICMPv4, IGMPv4, and to read and write IPv4
779    datagrams containing a protocol field that the kernel does not
780    process. An example of the latter is a routing daemon for OSPF,
781    since it uses IPv4 protocol field 89. With IPv6, raw sockets will be
782    used for ICMPv6 and to read and write IPv6 datagrams containing a
783    Next Header field that the kernel does not process. Examples of the
784    latter are a routing daemon for OSPF for IPv6 and RSVP (protocol
785    field 46).

787    All data sent via raw sockets must be in network byte order and all
788    data received via raw sockets will be in network byte order. This
789    differs from the IPv4 raw sockets, which did not specify a byte
790    ordering and used the host's byte order for certain IP header fields.

792    Another difference from IPv4 raw sockets is that complete packets
793    (that is, IPv6 packets with extension headers) cannot be sent or
794    received using the IPv6 raw sockets API. Instead, ancillary data
795    objects are used to transfer the extension headers and hoplimit
796    information, as described in Section 6. Should an application need
797    access to the complete IPv6 packet, some other technique, such as the
798    datalink interfaces BPF or DLPI, must be used.

800    All fields except the flow label in the IPv6 header that an
801    application might want to change (i.e., everything other than the
802    version number) can be modified using ancillary data and/or socket
803    options by the application for output. All fields except the flow
804    label in a received IPv6 header (other than the version number and
805    Next Header fields) and all extension headers that an application
806    might want to know are also made available to the application as
807    ancillary data on input. Hence there is no need for a socket option
808    similar to the IPv4 IP_HDRINCL socket option, and on receipt the
809    application will only receive the payload, i.e.,
       the data after the
810    IPv6 header and all the extension headers.

812    This API does not define access to the flow label field, because
813    today there is no standard usage of the field.

815    When writing to a raw socket the kernel will automatically fragment
816    the packet if its size exceeds the path MTU, inserting the required
817    fragment headers. On input the kernel reassembles received
818    fragments, so the reader of a raw socket never sees any fragment
819    headers.

821    When we say "an ICMPv6 raw socket" we mean a socket created by
822    calling the socket function with the three arguments AF_INET6,
823    SOCK_RAW, and IPPROTO_ICMPV6.

825    Most IPv4 implementations give special treatment to a raw socket
826    created with a third argument to socket() of IPPROTO_RAW, whose value
827    is normally 255, to have it mean that the application will send down
828    complete packets including the IPv4 header. (Note: This feature was
829    added to IPv4 in 1988 by Van Jacobson to support traceroute, allowing
830    a complete IP header to be passed by the application, before the
831    IP_HDRINCL socket option was added.) We note that IPPROTO_RAW has no
832    special meaning to an IPv6 raw socket (and the IANA currently
833    reserves the value of 255 when used as a next-header field).

835 3.1. Checksums

837    The kernel will calculate and insert the ICMPv6 checksum for ICMPv6
838    raw sockets, since this checksum is mandatory.

840    For other raw IPv6 sockets (that is, for raw IPv6 sockets created
841    with a third argument other than IPPROTO_ICMPV6), the application
842    must set the new IPV6_CHECKSUM socket option to have the kernel (1)
843    compute and store a checksum for output, and (2) verify the received
844    checksum on input, discarding the packet if the checksum is in error.
845    This option prevents applications from having to perform source
846    address selection on the packets they send. The checksum will
847    incorporate the IPv6 pseudo-header, defined in Section 8.1 of
848    [RFC-2460].
       This new socket option also specifies an integer offset
849    into the user data of where the checksum is located.

851     int  offset = 2;
852     setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM, &offset, sizeof(offset));

854    By default, this socket option is disabled. Setting the offset to -1
855    also disables the option. By disabled we mean (1) the kernel will
856    not calculate and store a checksum for outgoing packets, and (2) the
857    kernel will not verify a checksum for received packets.

859    This option assumes the use of the 16-bit one's complement of the
860    one's complement sum as the checksum algorithm and that the checksum
861    field is aligned on a 16-bit boundary. Thus, specifying a positive
862    odd value as offset is invalid, and setsockopt() will fail for such
863    offset values.

865    An attempt to set IPV6_CHECKSUM for an ICMPv6 socket will fail.
866    Also, an attempt to set or get IPV6_CHECKSUM for a non-raw IPv6
867    socket will fail.

869    (Note: Since the checksum is always calculated by the kernel for an
870    ICMPv6 socket, applications are not able to generate ICMPv6 packets
871    with incorrect checksums (presumably for testing purposes) using this
872    API.)

874 3.2. ICMPv6 Type Filtering

876    ICMPv4 raw sockets receive most ICMPv4 messages received by the
877    kernel. (We say "most" and not "all" because Berkeley-derived
878    kernels never pass echo requests, timestamp requests, or address mask
879    requests to a raw socket. Instead these three messages are processed
880    entirely by the kernel.) But ICMPv6 is a superset of ICMPv4, also
881    including the functionality of IGMPv4 and ARPv4. This means that an
882    ICMPv6 raw socket can potentially receive many more messages than
883    would be received with an ICMPv4 raw socket: ICMP messages similar to
884    ICMPv4, along with neighbor solicitations, neighbor advertisements,
885    and the three multicast listener discovery messages.

887    Most applications using an ICMPv6 raw socket care about only a small
888    subset of the ICMPv6 message types. Transferring extraneous ICMPv6
889    messages from the kernel to the user can incur significant overhead.
890    Therefore this API includes a method of filtering ICMPv6 messages by
891    the ICMPv6 type field.

893    Each ICMPv6 raw socket has an associated filter whose datatype is
894    defined as

896     struct icmp6_filter;

898    This structure, along with the macros and constants defined later in
899    this section, is defined as a result of including
900    <netinet/icmp6.h>.

902    The current filter is fetched and stored using getsockopt() and
903    setsockopt() with a level of IPPROTO_ICMPV6 and an option name of
904    ICMP6_FILTER.

906    Six macros operate on an icmp6_filter structure:

908     void ICMP6_FILTER_SETPASSALL (struct icmp6_filter *);
909     void ICMP6_FILTER_SETBLOCKALL(struct icmp6_filter *);

911     void ICMP6_FILTER_SETPASS ( int, struct icmp6_filter *);
912     void ICMP6_FILTER_SETBLOCK( int, struct icmp6_filter *);

914     int  ICMP6_FILTER_WILLPASS (int,
915                                 const struct icmp6_filter *);
916     int  ICMP6_FILTER_WILLBLOCK(int,
917                                 const struct icmp6_filter *);

919    The first argument to the last four macros (an integer) is an ICMPv6
920    message type, between 0 and 255. The pointer argument to all six
921    macros is a pointer to a filter that is modified by the first four
922    macros and is examined by the last two macros.

924    The first two macros, SETPASSALL and SETBLOCKALL, let us specify that
925    all ICMPv6 messages are passed to the application or that all ICMPv6
926    messages are blocked from being passed to the application.

928    The next two macros, SETPASS and SETBLOCK, let us specify that
929    messages of a given ICMPv6 type should be passed to the application
930    or not passed to the application (blocked).

932    The final two macros, WILLPASS and WILLBLOCK, return true or false
933    depending on whether the specified message type is passed to the
934    application or blocked from being passed to the application by the
935    filter pointed to by the second argument.

937    When an ICMPv6 raw socket is created, it will by default pass all
938    ICMPv6 message types to the application.

940    As an example, a program that wants to receive only router
941    advertisements could execute the following:

943     struct icmp6_filter  myfilt;

945     fd = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);

947     ICMP6_FILTER_SETBLOCKALL(&myfilt);
948     ICMP6_FILTER_SETPASS(ND_ROUTER_ADVERT, &myfilt);
949     setsockopt(fd, IPPROTO_ICMPV6, ICMP6_FILTER, &myfilt, sizeof(myfilt));

951    The filter structure is declared and then initialized to block all
952    message types. The filter structure is then changed to allow router
953    advertisement messages to be passed to the application and the filter
954    is installed using setsockopt().

956    In order to clear an installed filter the application can issue a
957    setsockopt for ICMP6_FILTER with a zero length. When no such filter
958    has been installed, getsockopt() will return the kernel default
959    filter.

961    The icmp6_filter structure is similar to the fd_set datatype used
962    with the select() function in the sockets API. The icmp6_filter
963    structure is an opaque datatype and the application should not care
964    how it is implemented. All the application does with this datatype
965    is allocate a variable of this type, pass a pointer to a variable of
966    this type to getsockopt() and setsockopt(), and operate on a variable
967    of this type using the six macros that we just defined.

969    Nevertheless, it is worth showing a simple implementation of this
970    datatype and the six macros.

972     struct icmp6_filter {
973       uint32_t  icmp6_filt[8];  /* 8*32 = 256 bits */
974     };

976     #define ICMP6_FILTER_WILLPASS(type, filterp) \
977       ((((filterp)->icmp6_filt[(type) >> 5]) & (1U << ((type) & 31))) != 0)
978     #define ICMP6_FILTER_WILLBLOCK(type, filterp) \
979       ((((filterp)->icmp6_filt[(type) >> 5]) & (1U << ((type) & 31))) == 0)
980     #define ICMP6_FILTER_SETPASS(type, filterp) \
981       ((((filterp)->icmp6_filt[(type) >> 5]) |= (1U << ((type) & 31))))
982     #define ICMP6_FILTER_SETBLOCK(type, filterp) \
983       ((((filterp)->icmp6_filt[(type) >> 5]) &= ~(1U << ((type) & 31))))
984     #define ICMP6_FILTER_SETPASSALL(filterp) \
985       memset((filterp), 0xFF, sizeof(struct icmp6_filter))
986     #define ICMP6_FILTER_SETBLOCKALL(filterp) \
987       memset((filterp), 0, sizeof(struct icmp6_filter))

989    (Note: These sample definitions have two limitations that an
990    implementation may want to change. The first four macros evaluate
991    their first argument two times. The last two macros require the
992    inclusion of the <string.h> header for the memset() function.)

994 3.3. ICMPv6 Verification of Received Packets

996    The protocol stack will verify the ICMPv6 checksum and discard any
997    packets with invalid checksums.

999    An implementation might perform additional validity checks on the
1000    ICMPv6 message content and discard malformed packets. However, a
1001    portable application must not assume that such validity checks have
1002    been performed.

1004    The protocol stack should not automatically discard packets if the
1005    ICMP type is unknown to the stack. For extensibility reasons
1006    received ICMP packets with any type (informational or error) must be
1007    passed to the applications (subject to ICMP6_FILTER filtering on the
1008    type value and the checksum verification).

1010 4. Access to IPv6 and Extension Headers

1012    Applications need to be able to control IPv6 header and extension
1013    header content when sending as well as being able to receive the
1014    content of these headers.
        This is done by defining socket option
1015    types which can be used both with setsockopt and with ancillary data.
1016    Ancillary data is discussed in Appendix A. The following optional
1017    information can be exchanged between the application and the kernel:

1019    1. The send/receive interface and source/destination address,
1020    2. The hop limit,
1021    3. Next hop address,
1022    4. The traffic class,
1023    5. Routing header,
1024    6. Hop-by-Hop options header, and
1025    7. Destination options header.

1027    First, to receive any of this optional information (other than the
1028    next hop address, which can only be set) on a UDP or raw socket, the
1029    application must call setsockopt() to turn on the corresponding flag:

1031     int  on = 1;

1033     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO,  &on, sizeof(on));
1034     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPLIMIT, &on, sizeof(on));
1035     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVRTHDR,    &on, sizeof(on));
1036     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPOPTS,  &on, sizeof(on));
1037     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVDSTOPTS,  &on, sizeof(on));
1038     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS,   &on, sizeof(on));

1040    When any of these options are enabled, the corresponding data is
1041    returned as control information by recvmsg(), as one or more
1042    ancillary data objects.

1044    This document does not define how to receive the optional information
1045    on a TCP socket. See Section 4.1 for more details.

1047    Two different mechanisms exist for sending this optional information:

1049    1. Using setsockopt to specify the option content for a socket.
1050       These are known as "sticky" options since they affect all
1051       transmitted packets on the socket until either a new setsockopt
1052       is done or the options are overridden using ancillary data.

1054    2. Using ancillary data to specify the option content for a single
1055       datagram. This only applies to datagram and raw sockets; not to
1056       TCP sockets.
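
The two sending mechanisms can be illustrated with the traffic class option. The following is a minimal sketch, not part of this API: the helper names set_tclass_sticky() and set_tclass_cmsg() are hypothetical, and the caller is assumed to supply a suitably aligned control buffer of at least CMSG_SPACE(sizeof(int)) bytes.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Mechanism 1 (sticky option): applies to every packet sent on fd
 * until changed.  Returns the setsockopt() result. */
int set_tclass_sticky(int fd, int tclass)
{
    return setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS,
                      &tclass, sizeof(tclass));
}

/* Mechanism 2 (ancillary data): prepares msg so that a single
 * sendmsg() call carries tclass.  buf must be aligned for a cmsghdr.
 * Returns 0 on success, -1 if buf is too small. */
int set_tclass_cmsg(struct msghdr *msg, void *buf, size_t buflen,
                    int tclass)
{
    struct cmsghdr *cmsg;

    if (buflen < CMSG_SPACE(sizeof(int)))
        return -1;

    memset(buf, 0, CMSG_SPACE(sizeof(int)));
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(int));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IPV6;          /* same as the "level"       */
    cmsg->cmsg_type  = IPV6_TCLASS;           /* same as the "option name" */
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int)); /* not CMSG_SPACE()          */
    memcpy(CMSG_DATA(cmsg), &tclass, sizeof(tclass));
    return 0;
}
```

An application would call set_tclass_sticky() once to establish a default, and use set_tclass_cmsg() before an individual sendmsg() call on a UDP or raw socket when only that datagram should differ.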

1058    The three socket option parameters and the three cmsghdr fields that
1059    describe the options/ancillary data objects are summarized as:

1061     opt level/    optname/           optval/
1062     cmsg_level    cmsg_type          cmsg_data[]
1063     ------------  -----------------  ------------------------
1064     IPPROTO_IPV6  IPV6_PKTINFO       in6_pktinfo structure
1065     IPPROTO_IPV6  IPV6_HOPLIMIT      int
1066     IPPROTO_IPV6  IPV6_NEXTHOP       socket address structure
1067     IPPROTO_IPV6  IPV6_RTHDR         ip6_rthdr structure
1068     IPPROTO_IPV6  IPV6_HOPOPTS       ip6_hbh structure
1069     IPPROTO_IPV6  IPV6_DSTOPTS       ip6_dest structure
1070     IPPROTO_IPV6  IPV6_RTHDRDSTOPTS  ip6_dest structure
1071     IPPROTO_IPV6  IPV6_TCLASS        int

1073    (Note: IPV6_HOPLIMIT can be used only as an ancillary data item.)

1075    All these options are described in detail in Sections 6, 7, 8 and 9.
1076    All the constants beginning with IPV6_ are defined as a result of
1077    including <netinet/in.h>.

1079    Note: We intentionally use the same constant for the cmsg_level
1080    member as is used as the second argument to getsockopt() and
1081    setsockopt() (what is called the "level"), and the same constant for
1082    the cmsg_type member as is used as the third argument to getsockopt()
1083    and setsockopt() (what is called the "option name").

1085    Issuing getsockopt() for the above options will return the sticky
1086    option value, i.e., the value set with setsockopt(). If no sticky
1087    option value has been set, getsockopt() will return the following
1088    values:

1090    -
1091      For the IPV6_PKTINFO option, it will return an in6_pktinfo structure
1092      with ipi6_addr being in6addr_any and ipi6_ifindex being zero.

1094    -
1095      For the IPV6_TCLASS option, it will return the kernel default value.

1097    -
1098      For other options, it will indicate the lack of the option value
1099      with optlen being zero.
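
The "optlen being zero" convention for the remaining options can be wrapped in a small helper. This is a minimal sketch under stated assumptions: the name sticky_option_is_set() is hypothetical, and the 2048-byte buffer is an assumption chosen because it is the largest extension header that an 8-bit length field counting 8-octet units can describe. It is meant for options such as IPV6_RTHDR, IPV6_HOPOPTS, IPV6_DSTOPTS, IPV6_RTHDRDSTOPTS and IPV6_NEXTHOP, not for IPV6_PKTINFO or IPV6_TCLASS, which return default values instead of a zero length.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Returns 1 if a sticky value is set for optname on fd, 0 if
 * getsockopt() reports a zero-length value (no sticky option),
 * and -1 if getsockopt() itself fails. */
int sticky_option_is_set(int fd, int optname)
{
    unsigned char buf[2048];      /* assumed upper bound, see lead-in */
    socklen_t len = sizeof(buf);

    if (getsockopt(fd, IPPROTO_IPV6, optname, buf, &len) < 0)
        return -1;
    return len != 0;
}
```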

1101    The application does not explicitly need to access the data
1102    structures for the Routing header, Hop-by-Hop options header, and
1103    Destination options header, since the API to these features is
1104    through a set of inet6_rth_XXX() and inet6_opt_XXX() functions that
1105    we define in Section 7 and Section 10. Those functions simplify the
1106    interface to these features instead of requiring the application to
1107    know the intimate details of the extension header formats.

1109    When specifying extension headers, this API assumes the header
1110    ordering and the number of occurrences of each header as described in
1111    [RFC-2460]. More details about the ordering issue will be discussed
1112    in Section 12.

1114 4.1. TCP Implications

1116    It is not possible to use ancillary data to transmit the above
1117    options for TCP since there is not a one-to-one mapping between send
1118    operations and the TCP segments being transmitted. Instead an
1119    application can use setsockopt to specify them as sticky options.
1120    When the application uses setsockopt to specify the above options it
1121    is expected that TCP will start using the new information when
1122    sending segments. However, TCP may or may not use the new
1123    information when retransmitting segments that were originally sent
1124    when the old sticky options were in effect.

1126    It is unclear how a TCP application can use received information
1127    (such as extension headers) due to the lack of mapping between
1128    received TCP segments and receive operations. In particular, the
1129    received information could not be used for access control purposes
1130    as it can on UDP and raw sockets.

1132    This specification therefore does not define how to get the received
1133    information on TCP sockets. The result of the IPV6_RECVxxx options
1134    on a TCP socket is undefined as well.

1136 4.2. UDP and Raw Socket Implications

1138    The receive behavior for UDP and raw sockets is quite
1139    straightforward.
        After the application has enabled an IPV6_RECVxxx
1140    socket option it will receive ancillary data items for every
1141    recvmsg() call containing the requested information. However, if the
1142    information is not present in the packet the ancillary data item will
1143    not be included. For example, if the application enables
1144    IPV6_RECVRTHDR and a received datagram does not contain a Routing
1145    header there will not be an IPV6_RTHDR ancillary data item. Note
1146    that due to buffering in the socket implementation there might be
1147    some packets queued when an IPV6_RECVxxx option is enabled and they
1148    might not have the ancillary data information.

1150    For sending, the application has the choice between using sticky
1151    options and ancillary data. The application can also use both, having
1152    the sticky options specify the "default" and using ancillary data to
1153    override the default options.

1155    When an ancillary data item is specified in a call to sendmsg(), the
1156    item will override an existing sticky option of the same name (if
1157    previously specified). For example, if the application has set
1158    IPV6_RTHDR using a sticky option and later passes IPV6_RTHDR as
1159    ancillary data this will override the IPV6_RTHDR sticky option and
1160    the routing header of the outgoing packet will be from the ancillary
1161    data item, not from the sticky option. Note, however, that sticky
1162    options other than IPV6_RTHDR will not be affected by the IPV6_RTHDR
1163    ancillary data item; the overriding mechanism only works for the same
1164    type of sticky options and ancillary data items.

1166    (Note: the overriding rule is different from the one in RFC 2292. In
1167    RFC 2292, an ancillary data item overrode all sticky options
1168    previously defined. This was reasonable, because sticky options
1169    could only be specified as a set by a single socket option.
        However,
1170    in this API, each option is separated so that it can be specified as
1171    a single sticky option. Additionally, there are many more ancillary
1172    data items and sticky options than in RFC 2292, including
1173    ancillary-only ones. Thus, it should be natural for application
1174    programmers to separate the overriding rule as well.)

1176    An application can also temporarily disable a particular sticky
1177    option by specifying a corresponding ancillary data item whose value
1178    would disable the sticky option if it were used as the argument of a
1179    socket option. For example, if the application has set IPV6_HOPOPTS
1180    as a sticky option and later passes IPV6_HOPOPTS with a zero length
1181    as an ancillary data item, the packet will not have a Hop-by-Hop
1182    options header.

1184 5. Extensions to Socket Ancillary Data

1186    This specification uses ancillary data as defined in Posix with some
1187    compatible extensions, which are described in the following
1188    subsections. Section 21 will provide a detailed overview of
1189    ancillary data and related structures and macros, including the
1190    extensions.

1192 5.1. CMSG_NXTHDR

1194     struct cmsghdr *CMSG_NXTHDR(const struct msghdr *mhdr,
1195                                 const struct cmsghdr *cmsg);

1197    CMSG_NXTHDR() returns a pointer to the cmsghdr structure describing
1198    the next ancillary data object. Mhdr is a pointer to a msghdr
1199    structure and cmsg is a pointer to a cmsghdr structure. If there is
1200    not another ancillary data object, the return value is NULL.

1202    The following behavior of this macro is new to this API: if the value
1203    of the cmsg pointer is NULL, a pointer to the cmsghdr structure
1204    describing the first ancillary data object is returned. That is,
1205    CMSG_NXTHDR(mhdr, NULL) is equivalent to CMSG_FIRSTHDR(mhdr). If
1206    there are no ancillary data objects, the return value is NULL.

1208 5.2. CMSG_SPACE

1210     socklen_t CMSG_SPACE(socklen_t length);

1212    This macro is new with this API.
        Given the length of an ancillary
1213    data object, CMSG_SPACE() returns an upper bound on the space
1214    required by the object and its cmsghdr structure, including any
1215    padding needed to satisfy alignment requirements. This macro can be
1216    used, for example, when allocating space dynamically for the
1217    ancillary data. This macro should not be used to initialize the
1218    cmsg_len member of a cmsghdr structure; instead use the CMSG_LEN()
1219    macro.

1221 5.3. CMSG_LEN

1223     socklen_t CMSG_LEN(socklen_t length);

1225    This macro is new with this API. Given the length of an ancillary
1226    data object, CMSG_LEN() returns the value to store in the cmsg_len
1227    member of the cmsghdr structure, taking into account any padding
1228    needed to satisfy alignment requirements.

1230    Note the difference between CMSG_SPACE() and CMSG_LEN(), shown also
1231    in the figure in Section 21.2: the former accounts for any required
1232    padding at the end of the ancillary data object and the latter is the
1233    actual length to store in the cmsg_len member of the ancillary data
1234    object.

1236 6. Packet Information

1238    There are five pieces of information that an application can specify
1239    for an outgoing packet using ancillary data:

1241    1. the source IPv6 address,
1242    2. the outgoing interface index,
1243    3. the outgoing hop limit,
1244    4. the next hop address, and
1245    5. the outgoing traffic class value.

1247    Four similar pieces of information can be returned for a received
1248    packet as ancillary data:

1250    1. the destination IPv6 address,
1251    2. the arriving interface index,
1252    3. the arriving hop limit, and
1253    4. the arriving traffic class value.

1255    The first two pieces of information are contained in an in6_pktinfo
1256    structure that is set with setsockopt() or sent as ancillary data
1257    with sendmsg() and received as ancillary data with recvmsg(). This
1258    structure is defined as a result of including <netinet/in.h>.

1260     struct in6_pktinfo {
1261       struct in6_addr  ipi6_addr;    /* src/dst IPv6 address */
1262       unsigned int     ipi6_ifindex; /* send/recv interface index */
1263     };

1265    In the socket option and cmsghdr, the level will be IPPROTO_IPV6, the
1266    type will be IPV6_PKTINFO, and the first byte of the option value and
1267    cmsg_data[] will be the first byte of the in6_pktinfo structure. An
1268    application can clear any sticky IPV6_PKTINFO option by doing a
1269    "regular" setsockopt with ipi6_addr being in6addr_any and
1270    ipi6_ifindex being zero.

1272    This information is returned as ancillary data by recvmsg() only if
1273    the application has enabled the IPV6_RECVPKTINFO socket option:

1275     int  on = 1;
1276     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));

1278    (Note: The hop limit is not contained in the in6_pktinfo structure
1279    for the following reason. Some UDP servers want to respond to client
1280    requests by sending their reply out the same interface on which the
1281    request was received and with the source IPv6 address of the reply
1282    equal to the destination IPv6 address of the request. To do this the
1283    application can enable just the IPV6_RECVPKTINFO socket option and
1284    then use the received control information from recvmsg() as the
1285    outgoing control information for sendmsg(). The application need not
1286    examine or modify the in6_pktinfo structure at all. But if the hop
1287    limit were contained in this structure, the application would have to
1288    parse the received control information and change the hop limit
1289    member, since the received hop limit is not the desired value for an
1290    outgoing packet.)

1292 6.1. Specifying/Receiving the Interface

1294    Interfaces on an IPv6 node are identified by a small positive
1295    integer, as described in Section 4 of [BASICAPI].
        That document also
1296    describes a function to map an interface name to its interface index,
1297    a function to map an interface index to its interface name, and a
1298    function to return all the interface names and indexes. Notice from
1299    this document that no interface is ever assigned an index of 0.

1301    When specifying the outgoing interface, if the ipi6_ifindex value is
1302    0, the kernel will choose the outgoing interface.

1304    The ordering among various options that can specify the outgoing
1305    interface, including IPV6_PKTINFO, is defined in Section 6.7.

1307    When the IPV6_RECVPKTINFO socket option is enabled, the received
1308    interface index is always returned as the ipi6_ifindex member of the
1309    in6_pktinfo structure.

1311 6.2. Specifying/Receiving Source/Destination Address

1313    The source IPv6 address can be specified by calling bind() before
1314    each output operation, but supplying the source address together with
1315    the data requires less overhead (i.e., fewer system calls) and
1316    requires less state to be stored and protected in a multithreaded
1317    application.

1319    When specifying the source IPv6 address as ancillary data, if the
1320    ipi6_addr member of the in6_pktinfo structure is the unspecified
1321    address (IN6ADDR_ANY_INIT or in6addr_any), then (a) if an address is
1322    currently bound to the socket, it is used as the source address, or
1323    (b) if no address is currently bound to the socket, the kernel will
1324    choose the source address. If the ipi6_addr member is not the
1325    unspecified address, but the socket has already bound a source
1326    address, then the ipi6_addr value overrides the already-bound source
1327    address for this output operation only.

1329    The kernel must verify that the requested source address is indeed a
1330    unicast address assigned to the node. When the address is a scoped
1331    one, there may be ambiguity about its scope zone. This is
1332    particularly the case for link-local addresses.
        In such a case, the
1333    kernel must first determine the appropriate scope zone based on the
1334    zone of the destination address or the outgoing interface (if known),
1335    then qualify the address. This also means that it is not feasible to
1336    specify the source address for a non-binding socket by the
1337    IPV6_PKTINFO sticky option, unless the outgoing interface is also
1338    specified. The application should simply use bind() for such
1339    purposes.

1341    IPV6_PKTINFO can also be used as a sticky option for specifying the
1342    socket's default source address. However, the ipi6_addr member must
1343    be the unspecified address for TCP sockets, because it is not
1344    possible to dynamically change the source address of a TCP
1345    connection. When the IPV6_PKTINFO option is specified for a TCP
1346    socket with a non-unspecified address, the call will fail. This
1347    restriction should be applied even before the socket binds a specific
1348    address.

1350    When the in6_pktinfo structure is returned as ancillary data by
1351    recvmsg(), the ipi6_addr member contains the destination IPv6 address
1352    from the received packet.

1354 6.3. Specifying/Receiving the Hop Limit

1356    The outgoing hop limit is normally specified with either the
1357    IPV6_UNICAST_HOPS socket option or the IPV6_MULTICAST_HOPS socket
1358    option, both of which are described in [BASICAPI]. Specifying the
1359    hop limit as ancillary data lets the application override either the
1360    kernel's default or a previously specified value, for either a
1361    unicast destination or a multicast destination, for a single output
1362    operation. Returning the received hop limit is useful for IPv6
1363    applications that need to verify that the received hop limit is 255
1364    (i.e., that the packet has not been forwarded).
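
As an illustration of the hop limit check just mentioned, a receiver could scan the control information returned by recvmsg() for an IPV6_HOPLIMIT item. The following is a minimal sketch, assuming the msghdr has already been filled in by recvmsg() on a socket with IPV6_RECVHOPLIMIT enabled; the helper names received_hoplimit() and was_not_forwarded() are hypothetical and not part of this API.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Returns the hop limit carried in an IPV6_HOPLIMIT ancillary data
 * item of msg, or -1 if no such item is present. */
int received_hoplimit(struct msghdr *msg)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IPV6 &&
            cmsg->cmsg_type == IPV6_HOPLIMIT &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int hlim;

            /* copy out; cmsg_data[] need not be int-aligned */
            memcpy(&hlim, CMSG_DATA(cmsg), sizeof(hlim));
            return hlim;
        }
    }
    return -1;
}

/* Returns 1 only if a hop limit was received and it equals 255. */
int was_not_forwarded(struct msghdr *msg)
{
    return received_hoplimit(msg) == 255;
}
```

Treating a missing item as a failure (rather than as success) is a deliberate choice in this sketch: packets queued before the option was enabled might carry no ancillary data.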

1366    The received hop limit is returned as ancillary data by recvmsg()
1367    only if the application has enabled the IPV6_RECVHOPLIMIT socket
1368    option:

1370     int  on = 1;
1371     setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPLIMIT, &on, sizeof(on));

1373    In the cmsghdr structure containing this ancillary data, the
1374    cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be
1375    IPV6_HOPLIMIT, and the first byte of cmsg_data[] will be the first
1376    byte of the integer hop limit.

1378    Nothing special need be done to specify the outgoing hop limit: just
1379    specify the control information as ancillary data for sendmsg(). As
1380    specified in [BASICAPI], the interpretation of the integer hop limit
1381    value is

1383     x < -1:        return an error of EINVAL
1384     x == -1:       use kernel default
1385     0 <= x <= 255: use x
1386     x >= 256:      return an error of EINVAL

1388    This API defines IPV6_HOPLIMIT as an ancillary-only option, that is,
1389    the option name cannot be used as a socket option. This is because
1390    [BASICAPI] has more fine-grained socket options: IPV6_UNICAST_HOPS
1391    and IPV6_MULTICAST_HOPS.

1393 6.4. Specifying the Next Hop Address

1395    The IPV6_NEXTHOP ancillary data object specifies the next hop for the
1396    datagram as a socket address structure. In the cmsghdr structure
1397    containing this ancillary data, the cmsg_level member will be
1398    IPPROTO_IPV6, the cmsg_type member will be IPV6_NEXTHOP, and the
1399    first byte of cmsg_data[] will be the first byte of the socket
1400    address structure.

1402    This is a privileged option. (Note: It is implementation defined and
1403    beyond the scope of this document to define what "privileged" means.
1404    Unix systems use this term to mean the process must have an effective
1405    user ID of 0.)

1407    This API only defines the case where the socket address contains an
1408    IPv6 address (i.e., the sa_family member is AF_INET6).
   And, in this
   case, the node identified by that address must be a neighbor of the
   sending host. If that address equals the destination IPv6 address of
   the datagram, then this is equivalent to the existing SO_DONTROUTE
   socket option.

   This option does not have any meaning for multicast destinations. In
   such a case, the specified next hop will be ignored.

   When the outgoing interface is specified by IPV6_PKTINFO as well, the
   next hop specified by this option must be reachable via the specified
   interface.

   In order to clear a sticky IPV6_NEXTHOP option the application must
   issue a setsockopt for IPV6_NEXTHOP with a zero length.

6.5. Specifying/Receiving the Traffic Class value

   The outgoing traffic class is normally set to 0. Specifying the
   traffic class as ancillary data lets the application override either
   the kernel's default or a previously specified value, for either a
   unicast destination or a multicast destination, for a single output
   operation. Returning the received traffic class is useful for
   programs such as a diffserv debugging tool and for user level ECN
   (explicit congestion notification) implementation.

   The received traffic class is returned as ancillary data by recvmsg()
   only if the application has enabled the IPV6_RECVTCLASS socket
   option:

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &on, sizeof(on));

   In the cmsghdr structure containing this ancillary data, the
   cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be
   IPV6_TCLASS, and the first byte of cmsg_data[] will be the first byte
   of the integer traffic class.

   To specify the outgoing traffic class value, just specify the control
   information as ancillary data for sendmsg() or using setsockopt().
   Just like the hop limit value, the interpretation of the integer
   traffic class value is

      x < -1:        return an error of EINVAL
      x == -1:       use kernel default
      0 <= x <= 255: use x
      x >= 256:      return an error of EINVAL

   In order to clear a sticky IPV6_TCLASS option the application can
   specify -1 as the value.

   There are cases where the kernel needs to control the traffic class
   value, and this conflicts with the user-specified value on the
   outgoing traffic. An example is an implementation of ECN in the
   kernel, setting 2 bits of the traffic class value. In such cases, the
   kernel should override the user-specified value. On the incoming
   traffic, the kernel may mask some of the bits in the traffic class
   field.

6.6. Additional Errors with sendmsg() and setsockopt()

   With the IPV6_PKTINFO socket option there are no additional errors
   possible with the call to recvmsg(). But when specifying the
   outgoing interface or the source address, additional errors are
   possible from sendmsg() or setsockopt(). Note that some
   implementations might only be able to return this type of error for
   setsockopt(). The following are examples, but some of these may not
   be provided by some implementations, and some implementations may
   define additional errors:

      ENXIO          The interface specified by ipi6_ifindex does not
                     exist.

      ENETDOWN       The interface specified by ipi6_ifindex is not
                     enabled for IPv6 use.

      EADDRNOTAVAIL  ipi6_ifindex specifies an interface but the address
                     ipi6_addr is not available for use on that
                     interface.

      EHOSTUNREACH   No route to the destination exists over the
                     interface specified by ipi6_ifindex.

6.7. Summary of outgoing interface selection

   This document and [BASICAPI] specify various methods that affect the
   selection of the packet's outgoing interface.
   This subsection
   summarizes the ordering among those in order to ensure deterministic
   behavior.

   For a given outgoing packet on a given socket, the outgoing interface
   is determined in the following order:

   1. if an interface is specified in an IPV6_PKTINFO ancillary data
      item, the interface is used.
   2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky
      option, the interface is used.
   3. otherwise, if the destination address is a multicast address and
      the IPV6_MULTICAST_IF socket option is specified for the socket,
      the interface is used.
   4. otherwise, if an IPV6_NEXTHOP ancillary data item is specified,
      the interface to the next hop is used.
   5. otherwise, if an IPV6_NEXTHOP sticky option is specified,
      the interface to the next hop is used.
   6. otherwise, the outgoing interface should be determined in an
      implementation dependent manner.

   In particular, the ordering above means that if the application
   specifies an interface by the IPV6_MULTICAST_IF socket option
   (described in [BASICAPI]) as well as specifying a different interface
   by the IPV6_PKTINFO sticky option, the latter will override the
   former for every multicast packet on the corresponding socket. The
   reason for the ordering comes from the expectation that the source
   address is specified as well and that the pair of the address and the
   outgoing interface should be preferred.

   In any case, the kernel must also verify that the source and
   destination addresses do not break their scope zones with regard to
   the outgoing interface.

7. Routing Header Option

   Source routing in IPv6 is accomplished by specifying a Routing header
   as an extension header. There can be different types of Routing
   headers, but IPv6 currently defines only the Type 0 Routing header
   [RFC-2460].
   This type supports up to 127 intermediate nodes (limited
   by the length field in the extension header). With this maximum
   number of intermediate nodes, a source, and a destination, there are
   128 hops.

   Source routing with the IPv4 sockets API (the IP_OPTIONS socket
   option) requires the application to build the source route in the
   format that appears as the IPv4 header option, requiring intimate
   knowledge of the IPv4 options format. This IPv6 API, however,
   defines six functions that the application calls to build and examine
   a Routing header, and the ability to use sticky options or ancillary
   data to communicate this information between the application and the
   kernel using the IPV6_RTHDR option.

   Three functions build a Routing header:

      inet6_rth_space()    - return #bytes required for Routing header
      inet6_rth_init()     - initialize buffer data for Routing header
      inet6_rth_add()      - add one IPv6 address to the Routing header

   Three functions deal with a returned Routing header:

      inet6_rth_reverse()  - reverse a Routing header
      inet6_rth_segments() - return #segments in a Routing header
      inet6_rth_getaddr()  - fetch one address from a Routing header

   The function prototypes for these functions are defined as a result
   of including <netinet/in.h>.

   To receive a Routing header the application must enable the
   IPV6_RECVRTHDR socket option:

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_RECVRTHDR, &on, sizeof(on));

   Each received Routing header is returned as one ancillary data object
   described by a cmsghdr structure with cmsg_type set to IPV6_RTHDR.
   When multiple Routing headers are received, multiple ancillary data
   objects (with cmsg_type set to IPV6_RTHDR) will be returned to the
   application.

   To send a Routing header the application specifies it either as
   ancillary data in a call to sendmsg() or using setsockopt().
   For the
   sending side, this API assumes the number of occurrences of the
   Routing header as described in [RFC-2460]. That is, applications can
   specify at most one outgoing Routing header.

   The application can remove any sticky Routing header by calling
   setsockopt() for IPV6_RTHDR with a zero option length.

   When using ancillary data a Routing header is passed between the
   application and the kernel as follows: The cmsg_level member has a
   value of IPPROTO_IPV6 and the cmsg_type member has a value of
   IPV6_RTHDR. The contents of the cmsg_data[] member are implementation
   dependent and should not be accessed directly by the application, but
   should be accessed using the six functions that we are about to
   describe.

   The following constant is defined as a result of including
   <netinet/in.h>:

      #define IPV6_RTHDR_TYPE_0    0 /* IPv6 Routing header type 0 */

   When a Routing header is specified, the destination address specified
   for connect(), sendto(), or sendmsg() is the final destination
   address of the datagram. The Routing header then contains the
   addresses of all the intermediate nodes.

7.1. inet6_rth_space

      socklen_t inet6_rth_space(int type, int segments);

   This function returns the number of bytes required to hold a Routing
   header of the specified type containing the specified number of
   segments (addresses). For an IPv6 Type 0 Routing header, the number
   of segments must be between 0 and 127, inclusive. The return value
   is just the space for the Routing header. When the application uses
   ancillary data it must pass the returned length to CMSG_SPACE() to
   determine how much memory is needed for the ancillary data object
   (including the cmsghdr structure).

   If the return value is 0, then either the type of the Routing header
   is not supported by this implementation or the number of segments is
   invalid for this type of Routing header.
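
   As an illustration of the size rule above, the following is a minimal
   sketch of how the value returned by inet6_rth_space() could be
   computed for a Type 0 Routing header. It assumes the wire layout of
   that header (an 8-byte fixed part followed by one 16-byte address per
   segment). The names sketch_rth_space and SKETCH_RTHDR_TYPE_0 are
   hypothetical and are not part of this API; a real implementation may
   differ.

```c
#include <stddef.h>

/* Hypothetical constant with the same value as IPV6_RTHDR_TYPE_0. */
#define SKETCH_RTHDR_TYPE_0 0

/*
 * Hypothetical helper, not part of the API: returns the number of bytes
 * needed for a Type 0 Routing header with `segments` addresses, or 0 if
 * the type is unsupported or the segment count is out of range. This
 * mirrors the return convention described for inet6_rth_space().
 */
size_t sketch_rth_space(int type, int segments)
{
    if (type != SKETCH_RTHDR_TYPE_0)
        return 0;                 /* unsupported Routing header type */
    if (segments < 0 || segments > 127)
        return 0;                 /* invalid segment count for Type 0 */

    /* 8-byte fixed part, then one 16-byte IPv6 address per segment. */
    return 8 + 16 * (size_t)segments;
}
```

   When ancillary data is used, an application would pass such a length
   to CMSG_SPACE() to size the control buffer, as described above.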

   (Note: This function returns the size but does not allocate the space
   required for the ancillary data. This allows an application to
   allocate a larger buffer, if other ancillary data objects are
   desired, since all the ancillary data objects must be specified to
   sendmsg() as a single msg_control buffer.)

7.2. inet6_rth_init

      void *inet6_rth_init(void *bp, socklen_t bp_len, int type,
                           int segments);

   This function initializes the buffer pointed to by bp to contain a
   Routing header of the specified type and sets ip6r_len based on the
   segments parameter. bp_len is only used to verify that the buffer is
   large enough. The ip6r_segleft field is set to zero; inet6_rth_add()
   will increment it.

   When the application uses ancillary data the application must
   initialize any cmsghdr fields.

   The caller must allocate the buffer and its size can be determined by
   calling inet6_rth_space().

   Upon success the return value is the pointer to the buffer (bp), and
   this is then used as the first argument to the inet6_rth_add()
   function. Upon an error the return value is NULL.

7.3. inet6_rth_add

      int inet6_rth_add(void *bp, const struct in6_addr *addr);

   This function adds the IPv6 address pointed to by addr to the end of
   the Routing header being constructed.

   If successful, the segleft member of the Routing Header is updated to
   account for the new address in the Routing header and the return
   value of the function is 0. Upon an error the return value of the
   function is -1.

7.4. inet6_rth_reverse

      int inet6_rth_reverse(const void *in, void *out);

   This function takes a Routing header extension header (pointed to by
   the first argument) and writes a new Routing header that sends
   datagrams along the reverse of that route.
   The function reverses the
   order of the addresses and sets the segleft member in the new Routing
   header to the number of segments. Both arguments are allowed to
   point to the same buffer (that is, the reversal can occur in place).

   The return value of the function is 0 on success, or -1 upon an
   error.

7.5. inet6_rth_segments

      int inet6_rth_segments(const void *bp);

   This function returns the number of segments (addresses) contained in
   the Routing header described by bp. On success the return value is
   zero or greater. The return value of the function is -1 upon an
   error.

7.6. inet6_rth_getaddr

      struct in6_addr *inet6_rth_getaddr(const void *bp, int index);

   This function returns a pointer to the IPv6 address specified by
   index (which must have a value between 0 and one less than the value
   returned by inet6_rth_segments()) in the Routing header described by
   bp. An application should first call inet6_rth_segments() to obtain
   the number of segments in the Routing header.

   Upon an error the return value of the function is NULL.

8. Hop-By-Hop Options

   A variable number of Hop-by-Hop options can appear in a single
   Hop-by-Hop options header. Each option in the header is TLV-encoded
   with a type, length, and value. This IPv6 API defines seven functions
   that the application calls to build and examine a Hop-by-Hop options
   header, and the ability to use sticky options or ancillary data to
   communicate this information between the application and the kernel.
   This uses the IPV6_HOPOPTS option for a Hop-by-Hop options header.

   Today several Hop-by-Hop options are defined for IPv6. Two pad
   options, Pad1 and PadN, are for alignment purposes and are
   automatically inserted by the inet6_opt_XXX() routines and ignored by
   the inet6_opt_XXX() routines on the receive side.
   This section of
   the API is therefore defined for other (and future) Hop-by-Hop
   options that an application may need to specify and receive.

   Four functions build an options header:

      inet6_opt_init()    - initialize buffer data for options header
      inet6_opt_append()  - add one TLV option to the options header
      inet6_opt_finish()  - finish adding TLV options to the options
                            header
      inet6_opt_set_val() - add one component of the option content to
                            the option

   Three functions deal with a returned options header:

      inet6_opt_next()    - extract the next option from the options
                            header
      inet6_opt_find()    - extract an option of a specified type from
                            the header
      inet6_opt_get_val() - retrieve one component of the option content

   Individual Hop-by-Hop options (and Destination options, which are
   described in Section 9 and are very similar to the Hop-by-Hop
   options) may have specific alignment requirements. For example, the
   4-byte Jumbo Payload length should appear on a 4-byte boundary, and
   IPv6 addresses are normally aligned on an 8-byte boundary. These
   requirements and the terminology used with these options are
   discussed in Section 4.2 and Appendix B of [RFC-2460]. The alignment
   of the first byte of each option is specified by two values, called x
   and y, written as "xn + y". This states that the option must appear
   at an integer multiple of x bytes from the beginning of the options
   header (x can have the values 1, 2, 4, or 8), plus y bytes (y can
   have a value between 0 and 7, inclusive). The Pad1 and PadN options
   are inserted as needed to maintain the required alignment. The
   functions below need to know the alignment of the end of the option
   (which is always in the form "xn," where x can have the values 1, 2,
   4, or 8) and the total size of the data portion of the option. These
   are passed as the "align" and "len" arguments to inet6_opt_append().
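
   The "xn + y" rule can be made concrete with a small calculation. The
   following is a minimal sketch, assuming the caller knows the current
   byte offset from the beginning of the options header; it computes how
   many pad bytes are needed before an option whose first byte must
   satisfy "xn + y". The name sketch_pad_for_alignment is hypothetical
   and is not part of this API, which performs this padding internally.

```c
/*
 * Hypothetical helper, not part of the API: returns the number of pad
 * bytes to insert at `offset` (bytes from the start of the options
 * header) so that the next option starts at a position of the form
 * x*n + y. Assumes x is 1, 2, 4, or 8 and y is between 0 and 7.
 */
int sketch_pad_for_alignment(int offset, int x, int y)
{
    int rem = (offset - y) % x;   /* distance past the last xn + y slot */

    if (rem < 0)
        rem += x;                 /* C's % can be negative; normalize */
    return rem == 0 ? 0 : x - rem;
}
```

   For example, an option with a "4n + 2" requirement placed right after
   the 2-byte fixed part of the header (offset 2) needs no padding,
   while an "8n" option at the same offset needs 6 pad bytes. A single
   pad byte corresponds to a Pad1 option and two or more to a PadN
   option.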

   Multiple Hop-by-Hop options must be specified by the application by
   placing them in a single extension header.

   Finally, we note that use of some Hop-by-Hop options or some
   Destination options might require special privilege. That is,
   normal applications (without special privilege) might be forbidden
   from setting certain options in outgoing packets, and might never see
   certain options in received packets.

8.1. Receiving Hop-by-Hop Options

   To receive a Hop-by-Hop options header the application must enable
   the IPV6_RECVHOPOPTS socket option:

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPOPTS, &on, sizeof(on));

   When using ancillary data a Hop-by-Hop options header is passed
   between the application and the kernel as follows: The cmsg_level
   member will be IPPROTO_IPV6 and the cmsg_type member will be
   IPV6_HOPOPTS. These options are then processed by calling the
   inet6_opt_next(), inet6_opt_find(), and inet6_opt_get_val()
   functions, described in Section 10.

8.2. Sending Hop-by-Hop Options

   To send a Hop-by-Hop options header, the application specifies the
   header either as ancillary data in a call to sendmsg() or using
   setsockopt().

   The application can remove any sticky Hop-by-Hop options header by
   calling setsockopt() for IPV6_HOPOPTS with a zero option length.

   All the Hop-by-Hop options must be specified by a single ancillary
   data object. The cmsg_level member is set to IPPROTO_IPV6 and the
   cmsg_type member is set to IPV6_HOPOPTS. The option is normally
   constructed using the inet6_opt_init(), inet6_opt_append(),
   inet6_opt_finish(), and inet6_opt_set_val() functions, described in
   Section 10.

   Additional errors may be possible from sendmsg() and setsockopt() if
   the specified option is in error.

9.
Destination Options

   A variable number of Destination options can appear in one or more
   Destination options headers. As defined in [RFC-2460], a Destination
   options header appearing before a Routing header is processed by the
   first destination plus any subsequent destinations specified in the
   Routing header, while a Destination options header that is not
   followed by a Routing header is processed only by the final
   destination. As with the Hop-by-Hop options, each option in a
   Destination options header is TLV-encoded with a type, length, and
   value.

9.1. Receiving Destination Options

   To receive a Destination options header the application must enable
   the IPV6_RECVDSTOPTS socket option:

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_RECVDSTOPTS, &on, sizeof(on));

   Each Destination options header is returned as one ancillary data
   object described by a cmsghdr structure with cmsg_level set to
   IPPROTO_IPV6 and cmsg_type set to IPV6_DSTOPTS.

   These options are then processed by calling the inet6_opt_next(),
   inet6_opt_find(), and inet6_opt_get_val() functions.

9.2. Sending Destination Options

   To send a Destination options header, the application specifies it
   either as ancillary data in a call to sendmsg() or using
   setsockopt().

   The application can remove any sticky Destination options header by
   calling setsockopt() for IPV6_RTHDRDSTOPTS/IPV6_DSTOPTS with a zero
   option length.

   This API assumes the ordering of extension headers as described in
   [RFC-2460]. Thus, one set of Destination options can only appear
   before a Routing header, and one set can only appear after a Routing
   header (or in a packet with no Routing header). Each set can consist
   of one or more options but each set is a single extension header.

   Today all destination options that an application may want to specify
   can be put after (or without) a Routing header. Thus, applications
   should usually need IPV6_DSTOPTS only and should avoid using
   IPV6_RTHDRDSTOPTS whenever possible.

   When using ancillary data a Destination options header is passed
   between the application and the kernel as follows: The set preceding
   a Routing header is specified with the cmsg_level member set to
   IPPROTO_IPV6 and the cmsg_type member set to IPV6_RTHDRDSTOPTS. Any
   setsockopt or ancillary data for IPV6_RTHDRDSTOPTS is silently
   ignored when sending packets unless a Routing header is also
   specified. Note that the "Routing header" here means the one
   specified by this API. Even when the kernel inserts a routing header
   in its internal routine (e.g. in a mobile IPv6 stack), the
   Destination options header specified by IPV6_RTHDRDSTOPTS will still
   be ignored unless the application explicitly specifies its own
   Routing header.

   The set of Destination options after a Routing header, which is also
   used when no Routing header is present, is specified with the
   cmsg_level member set to IPPROTO_IPV6 and the cmsg_type member set to
   IPV6_DSTOPTS.

   The Destination options are normally constructed using the
   inet6_opt_init(), inet6_opt_append(), inet6_opt_finish(), and
   inet6_opt_set_val() functions, described in Section 10.

   Additional errors may be possible from sendmsg() and setsockopt() if
   the specified option is in error.

10. Hop-by-Hop and Destination Options Processing

   Building and parsing the Hop-by-Hop and Destination options is
   complicated for the reasons given earlier. We therefore define a set
   of functions to help the application. These functions assume the
   formatting rules specified in Appendix B in [RFC-2460], i.e., that
   the largest field is placed last in the option.

   The function prototypes for these functions are defined as a result
   of including <netinet/in.h>.

   The first 3 functions (init, append, and finish) are used both to
   calculate the needed buffer size for the options, and to actually
   encode the options once the application has allocated a buffer for
   the header. In order to only calculate the size the application must
   pass a NULL extbuf and a zero extlen to those functions.

10.1. inet6_opt_init

      int inet6_opt_init(void *extbuf, socklen_t extlen);

   This function returns the number of bytes needed for the empty
   extension header, i.e., without any options. If extbuf is not NULL it
   also initializes the extension header to have the correct length
   field. In that case if the extlen value is not a positive (i.e.,
   non-zero) multiple of 8 the function fails and returns -1.

   (Note: since the return value on success is based on a "constant"
   parameter, i.e. the empty extension header, an implementation may
   return a constant value. However, this specification does not
   require the value be constant, and leaves it as implementation
   dependent. The application should not assume a particular constant
   value as a successful return value of this function.)

10.2. inet6_opt_append

      int inet6_opt_append(void *extbuf, socklen_t extlen, int offset,
                           uint8_t type, socklen_t len, uint_t align,
                           void **databufp);

   Offset should be the length returned by inet6_opt_init() or a
   previous inet6_opt_append(). This function returns the updated total
   length taking into account adding an option with length 'len' and
   alignment 'align'. If extbuf is not NULL then, in addition to
   returning the length, the function inserts any needed pad option,
   initializes the option (setting the type and length fields) and
   returns a pointer to the location for the option content in databufp.
   If the option does not fit in the extension header buffer the
   function returns -1.

   Type is the 8-bit option type. Len is the length of the option data
   (i.e. excluding the option type and option length fields).

   Once inet6_opt_append() has been called the application can use the
   pointer returned in databufp directly, or use inet6_opt_set_val() to
   specify the content of the option.

   The option type must have a value from 2 to 255, inclusive. (0 and 1
   are reserved for the Pad1 and PadN options, respectively.)

   The option data length must have a value between 0 and 255,
   inclusive, and is the length of the option data that follows.

   The align parameter must have a value of 1, 2, 4, or 8. The align
   value cannot exceed the value of len.

10.3. inet6_opt_finish

      int inet6_opt_finish(void *extbuf, socklen_t extlen, int offset);

   Offset should be the length returned by inet6_opt_init() or
   inet6_opt_append(). This function returns the updated total length
   taking into account the final padding of the extension header to make
   it a multiple of 8 bytes. If extbuf is not NULL the function also
   initializes the option by inserting a Pad1 or PadN option of the
   proper length.

   If the necessary pad does not fit in the extension header buffer the
   function returns -1.

10.4. inet6_opt_set_val

      int inet6_opt_set_val(void *databuf, int offset, void *val,
                            socklen_t vallen);

   Databuf should be a pointer returned by inet6_opt_append(). This
   function inserts data items of various sizes in the data portion of
   the option. Val should point to the data to be inserted. Offset
   specifies where in the data portion of the option the value should be
   inserted; the first byte after the option type and length is accessed
   by specifying an offset of zero.
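
   The copy semantics described here can be sketched as follows. This
   is only an illustration under stated assumptions: the name
   sketch_opt_set_val is hypothetical, its parameter types differ
   slightly from the prototype above (const source pointer, size_t
   length), and it performs no bounds or alignment checks. It returns
   offset + vallen, the convention this section gives for the return
   value of inet6_opt_set_val().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for inet6_opt_set_val(), not the real function:
 * copies vallen bytes from val into the option data area at `offset`
 * (offset 0 is the first byte after the option type and length) and
 * returns the offset of the next field.
 */
int sketch_opt_set_val(void *databuf, int offset, const void *val,
                       size_t vallen)
{
    /* A plain byte copy works regardless of the field's alignment. */
    memcpy((unsigned char *)databuf + offset, val, vallen);
    return offset + (int)vallen;
}
```

   An option with several fields can then be composed by feeding each
   returned offset into the next call, starting from offset zero.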

   The caller should ensure that each field is aligned on its natural
   boundaries as described in Appendix B of [RFC-2460], but the function
   must not rely on the caller's behavior. Even when the alignment
   requirement is not satisfied, inet6_opt_set_val should just copy the
   data as required.

   The function returns the offset for the next field (i.e., offset +
   vallen) which can be used when composing option content with multiple
   fields.

10.5. inet6_opt_next

      int inet6_opt_next(void *extbuf, socklen_t extlen, int offset,
                         uint8_t *typep, socklen_t *lenp,
                         void **databufp);

   This function parses received option extension headers returning the
   next option. Extbuf and extlen specify the extension header.
   Offset should either be zero (for the first option) or the length
   returned by a previous call to inet6_opt_next() or inet6_opt_find().
   It specifies the position where to continue scanning the extension
   buffer. The next option is returned by updating typep, lenp, and
   databufp. Typep stores the option type, lenp stores the length of
   the option data (i.e. excluding the option type and option length
   fields), and databufp points to the data field of the option. This
   function returns the updated "previous" length computed by advancing
   past the option that was returned. This returned "previous" length
   can then be passed to subsequent calls to inet6_opt_next(). This
   function does not return any Pad1 or PadN options. When there are no
   more options or if the option extension header is malformed the
   return value is -1.

10.6.
inet6_opt_find

      int inet6_opt_find(void *extbuf, socklen_t extlen, int offset,
                         uint8_t type, socklen_t *lenp,
                         void **databufp);

   This function is similar to the previously described inet6_opt_next()
   function, except this function lets the caller specify the option
   type to be searched for, instead of always returning the next option
   in the extension header.

   If an option of the specified type is located, the function returns
   the updated "previous" total length computed by advancing past the
   option that was returned and past any options that didn't match the
   type. This returned "previous" length can then be passed to
   subsequent calls to inet6_opt_find() for finding the next occurrence
   of the same option type.

   If an option of the specified type is not located, the return value
   is -1. If the option extension header is malformed, the return value
   is -1.

10.7. inet6_opt_get_val

      int inet6_opt_get_val(void *databuf, int offset, void *val,
                            socklen_t vallen);

   Databuf should be a pointer returned by inet6_opt_next() or
   inet6_opt_find(). This function extracts data items of various sizes
   from the data portion of the option. Val should point to the
   destination for the extracted data. Offset specifies from where in
   the data portion of the option the value should be extracted; the
   first byte after the option type and length is accessed by specifying
   an offset of zero.

   It is expected that each field is aligned on its natural boundaries
   as described in Appendix B of [RFC-2460], but the function must not
   rely on the alignment.

   The function returns the offset for the next field (i.e., offset +
   vallen) which can be used when extracting option content with
   multiple fields.

11. Additional Advanced API Functions

11.1.
Sending with the Minimum MTU

   Unicast applications should usually let the kernel perform path MTU
   discovery [RFC-1981], as long as the kernel supports it, and should
   not care about the path MTU. Some applications, however, might not
   want to incur the overhead of path MTU discovery, especially if the
   applications only send a single datagram to a destination. A
   potential example is a DNS server.

   [RFC-1981] describes how path MTU discovery works for multicast
   destinations. From practice in using IPv4 multicast, however, many
   careless applications that send large multicast packets on the wire
   have caused an implosion of ICMPv4 error messages. The situation can
   be worse when there is a filtering node that blocks the ICMPv4
   messages. Though the filtering issue applies to unicast as well, the
   impact is much larger in the multicast cases.

   Thus, applications sending multicast traffic should explicitly enable
   path MTU discovery only when they understand that the benefit of
   possibly larger MTU usage outweighs the possible impact of MTU
   discovery for active sources across the delivery tree(s). This
   default behavior is based on today's practice with IPv4 multicast
   and path MTU discovery. The behavior may change in the future once
   it is found that path MTU discovery effectively works with actual
   multicast applications and network configurations.

   This specification defines a mechanism to avoid path MTU discovery by
   sending at the minimum IPv6 MTU [RFC-2460]. If the packet is larger
   than the minimum MTU and this feature has been enabled the IP layer
   will fragment to the minimum MTU. To control the policy about path
   MTU discovery, applications can use the IPV6_USE_MIN_MTU socket
   option.

   As described above, the default policy should depend on whether the
   destination is unicast or multicast.
   For unicast destinations path
   MTU discovery should be performed by default. For multicast
   destinations path MTU discovery should be disabled by default. This
   option thus takes the following three types of integer arguments:

      -1: perform path MTU discovery for unicast destinations but do
          not perform it for multicast destinations. Packets to
          multicast destinations are therefore sent with the minimum
          MTU.
       0: always perform path MTU discovery.
       1: always disable path MTU discovery and send packets at the
          minimum MTU.

   The default value of this option is -1. Values other than -1, 0, and
   1 are invalid, and an error EINVAL will be returned for those values.

   As an example, if a unicast application intentionally wants to
   disable path MTU discovery, it will add the following lines:

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_USE_MIN_MTU, &on, sizeof(on));

   Note that this API intentionally excludes the case where the
   application wants to perform path MTU discovery for multicast but to
   disable it for unicast. This is because such usage is not feasible
   considering a scale of performance issues around whether to do path
   MTU discovery or not. When path MTU discovery makes sense to a
   destination but not to a different destination, regardless of whether
   the destination is unicast or multicast, applications either need to
   toggle the option between sending such packets on the same socket, or
   use different sockets for the two classes of destinations.

   This option can also be sent as ancillary data. In the cmsghdr
   structure containing this ancillary data, the cmsg_level member will
   be IPPROTO_IPV6, the cmsg_type member will be IPV6_USE_MIN_MTU, and
   the first byte of cmsg_data[] will be the first byte of the integer.

11.2.
Sending without fragmentation

   In order to provide for easy porting of existing UDP and raw socket
   applications, IPv6 implementations will, when originating packets,
   automatically insert a fragment header in the packet if the packet is
   too big for the path MTU.

   Some applications might not want this behavior. An example is
   traceroute, which might want to discover the actual path MTU.

   This specification defines a mechanism to turn off the automatic
   insertion of a fragment header for UDP and raw sockets. This can be
   enabled using the IPV6_DONTFRAG socket option.

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_DONTFRAG, &on, sizeof(on));

   By default, this socket option is disabled. Setting the value to 0
   also disables the option, i.e., reverts to the default behavior of
   automatic insertion. This option can also be sent as ancillary data.
   In the cmsghdr structure containing this ancillary data, the
   cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be
   IPV6_DONTFRAG, and the first byte of cmsg_data[] will be the first
   byte of the integer. This API only specifies the use of this option
   for UDP and raw sockets, and does not define the usage for TCP
   sockets.

   When the data size is larger than the MTU of the outgoing interface,
   the packet will be discarded. Applications can learn the result by
   enabling the IPV6_RECVPATHMTU option described below and receiving
   the corresponding ancillary data items. An additional error EMSGSIZE
   may also be returned in some implementations. Note, however, that
   some other implementations might not be able to return this
   additional error when sending a message.

11.3.
Path MTU Discovery and UDP

   UDP and raw socket applications need to be able to determine the
   "maximum send transport-message size" (Section 5.1 of [RFC-1981]) to
   a given destination so that those applications can participate in
   path MTU discovery. This lets those applications send smaller
   datagrams to the destination, avoiding fragmentation.

   This is accomplished using a new ancillary data item (IPV6_PATHMTU)
   which is delivered to recvmsg() without any actual data. The
   application can enable the receipt of IPV6_PATHMTU ancillary data
   items by setting the IPV6_RECVPATHMTU socket option.

      int on = 1;
      setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPATHMTU, &on, sizeof(on));

   By default, this socket option is disabled. Setting the value to 0
   also disables the option. This API only specifies the use of this
   option for UDP and raw sockets, and does not define the usage for TCP
   sockets.

   When the application is sending packets too big for the path MTU,
   recvmsg() will return zero (indicating no data) but there will be a
   cmsghdr with cmsg_type set to IPV6_PATHMTU, and cmsg_len will
   indicate that cmsg_data is sizeof(struct ip6_mtuinfo) bytes long.
   This can happen when the sending node receives a corresponding ICMPv6
   packet too big error, or when the packet is sent from a socket with
   the IPV6_DONTFRAG option being on and the packet size is larger than
   the MTU of the outgoing interface. This indication is treated as
   an ancillary data item for a separate (empty) message. Thus, when
   there are buffered messages (i.e., messages that the application has
   not received yet) on the socket, the application will first receive
   the buffered messages and then receive the indication.

   The first byte of cmsg_data[] will point to a struct ip6_mtuinfo
   carrying the path MTU to use together with the IPv6 destination
   address.
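   As a rough illustration of the receive path described above, the
   following sketch walks the control buffer of a message returned by
   recvmsg() and copies out the first IPV6_PATHMTU item. The helper
   name find_pathmtu_cmsg() is hypothetical and is not defined by this
   API. The sketch assumes a platform whose <netinet/in.h> declares
   IPV6_PATHMTU and struct ip6_mtuinfo (the structure laid out in this
   section); the _GNU_SOURCE definition is an assumption about what
   glibc needs in order to expose them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: lets glibc expose struct ip6_mtuinfo */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not part of this API): scan the ancillary data
 * of a message filled in by recvmsg() for an IPV6_PATHMTU item.
 * Returns 1 and fills *out if such an item is found, 0 otherwise.
 */
int find_pathmtu_cmsg(struct msghdr *msg, struct ip6_mtuinfo *out)
{
    struct cmsghdr *cm;

    assert(msg != NULL && out != NULL);
    for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IPV6 &&
            cm->cmsg_type == IPV6_PATHMTU &&
            cm->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            /* Copy instead of casting: CMSG_DATA() may be unaligned. */
            memcpy(out, CMSG_DATA(cm), sizeof(*out));
            return 1;
        }
    }
    return 0;
}
```

   A caller would typically invoke this after recvmsg() returns zero,
   and then compare out->ip6m_addr against the destinations it has
   actually sent to before acting on out->ip6m_mtu.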
      struct ip6_mtuinfo {
        struct sockaddr_in6 ip6m_addr; /* dst address including zone ID */
        uint32_t            ip6m_mtu;  /* path MTU in host byte order */
      };

   This cmsghdr will be passed to every socket that sets the
   IPV6_RECVPATHMTU socket option, even if the socket is non-connected.
   Note that this also means an application that sets the option may
   receive an IPV6_PATHMTU ancillary data item for each ICMP too big
   error the node receives, including such ICMP errors caused by other
   applications on the node. Thus, an application that wants to perform
   the path MTU discovery by itself needs to keep a history of
   destinations that it has actually sent to and to compare the address
   returned in the ip6_mtuinfo structure to the history. An
   implementation may choose not to deliver data to a connected socket
   that has a foreign address that is different from the address
   specified in the ip6m_addr structure.

   When an application sends a packet with a routing header, the final
   destination stored in the ip6m_addr member does not necessarily
   contain complete information of the entire path.

11.4. Determining the current path MTU

   Some applications might need to determine the current path MTU, e.g.,
   applications using IPV6_RECVPATHMTU might want to pick a good
   starting value.

   This specification defines a get-only socket option to retrieve the
   current path MTU value for the destination of a given connected
   socket. If the IP layer does not have a cached path MTU value it
   will return the interface MTU for the interface that will be used
   when sending to the destination address.

   This information is retrieved using the IPV6_PATHMTU socket option.
   This option takes a pointer to the ip6_mtuinfo structure as the
   fourth argument, and the size of the structure should be passed as a
   value-result parameter in the fifth argument.
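   A slightly fuller sketch of this call, wrapped in a helper that
   checks the return value, might look as follows. The name
   query_path_mtu() is hypothetical and is not defined by this API.
   The sketch assumes that <netinet/in.h> declares IPV6_PATHMTU and
   struct ip6_mtuinfo, and the _GNU_SOURCE definition is an assumption
   about what glibc needs in order to expose them. The bare call as
   given by this specification follows the sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* assumption: lets glibc expose struct ip6_mtuinfo */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical wrapper (not part of this API): fetch the current path
 * MTU for the destination of a connected socket.  Returns 0 and stores
 * the MTU in *mtu on success; returns -1 with errno set by getsockopt()
 * on failure, in which case *mtu is left untouched.
 */
int query_path_mtu(int fd, uint32_t *mtu)
{
    struct ip6_mtuinfo mtuinfo;
    socklen_t infolen = sizeof(mtuinfo);   /* value-result argument */

    assert(mtu != NULL);
    memset(&mtuinfo, 0, sizeof(mtuinfo));
    if (getsockopt(fd, IPPROTO_IPV6, IPV6_PATHMTU, &mtuinfo, &infolen) < 0)
        return -1;              /* e.g. the socket is not connected */
    *mtu = mtuinfo.ip6m_mtu;    /* ip6m_addr is deliberately ignored */
    return 0;
}
```

   Keeping the failure case separate matters here because the call is
   specified to fail on a non-connected socket, so a caller needs a
   fallback such as starting from the minimum IPv6 MTU.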
      struct ip6_mtuinfo mtuinfo;
      socklen_t infolen = sizeof(mtuinfo);

      getsockopt(fd, IPPROTO_IPV6, IPV6_PATHMTU, &mtuinfo, &infolen);

   When the call succeeds, the path MTU value is stored in the ip6m_mtu
   member of the ip6_mtuinfo structure. Since the socket is connected,
   the ip6m_addr member is meaningless and should not be referred to by
   the application.

   This option can only be used for a connected socket, because a
   non-connected socket does not have the information of the destination
   and there is no way to pass the destination via getsockopt(). When
   getsockopt() for this option is issued on a non-connected socket, the
   call will fail. Despite this limitation, this option is still useful
   from a practical point of view, because applications that care about
   the path MTU tend to send a lot of packets to a single destination
   and to connect the socket to the destination for performance reasons.
   If the application needs to get the MTU value in a more generic way,
   it should use a more generic interface, such as routing sockets
   [TCPIPILLUST].

12. Ordering of Ancillary Data and IPv6 Extension Headers

   Three IPv6 extension headers can be specified by the application and
   returned to the application using ancillary data with sendmsg() and
   recvmsg(): the Routing header, Hop-by-Hop options header, and
   Destination options header. When multiple ancillary data objects are
   transferred via recvmsg() and these objects represent any of these
   three extension headers, their placement in the control buffer is
   directly tied to their location in the corresponding IPv6 datagram.
   For example, when the application has enabled the IPV6_RECVRTHDR and
   IPV6_RECVDSTOPTS options and later receives an IPv6 packet with
   extension headers in the following order:

      The IPv6 header
      A Hop-by-Hop options header
      A Destination options header (1)
      A Routing header
      An Authentication header
      A Destination options header (2)
      A UDP header and UDP data

   then the application will receive three ancillary data objects in the
   following order:

      an object with cmsg_type set to IPV6_DSTOPTS, which represents
      the destination options header (1)
      an object with cmsg_type set to IPV6_RTHDR, which represents the
      Routing header
      an object with cmsg_type set to IPV6_DSTOPTS, which represents the
      destination options header (2)

   This example follows the header ordering described in [RFC-2460], but
   the receiving side of this specification does not assume the
   ordering. Applications may receive any number of objects in any
   order according to the ordering of the received IPv6 datagram.

   For the sending side, however, this API imposes some ordering
   constraints according to [RFC-2460]. Applications using this API
   cannot make a packet with extension headers that do not follow the
   ordering. Note, however, that this does not mean applications must
   always follow the restriction. This is just a limitation in this API
   in order to give application programmers a guideline to construct
   headers in a practical manner. Should an application need to make an
   outgoing packet with the extension headers in an arbitrary order,
   some other technique, such as the datalink interfaces BPF or DLPI,
   must be used.

   The following are more details about the constraints:

   - Each IPV6_xxx ancillary data object for a particular type of
     extension header can be specified at most once in a single control
     buffer.

   - IPV6_xxx ancillary data objects can appear in any order in a
     control buffer, because there is no ambiguity of the ordering.

   - Each set of IPV6_xxx ancillary data objects and sticky options
     will be put in the outgoing packet according to the header ordering
     described in [RFC-2460].

   - An ancillary data object or a sticky option of IPV6_RTHDRDSTOPTS
     will affect the outgoing packet only when a Routing header is
     specified as an ancillary data object or a sticky option.
     Otherwise, the specified value for IPV6_RTHDRDSTOPTS will be
     ignored.

   For example, when an application sends a UDP datagram with a control
   data buffer containing ancillary data objects in the following order:

      an object with cmsg_type set to IPV6_DSTOPTS
      an object with cmsg_type set to IPV6_RTHDRDSTOPTS
      an object with cmsg_type set to IPV6_HOPOPTS

   and the sending socket does not have any sticky options, then the
   outgoing packet would be constructed as follows:

      The IPv6 header
      A Hop-by-Hop options header
      A Destination options header
      A UDP header and UDP data

   where the destination options header corresponds to the ancillary
   data object with the type IPV6_DSTOPTS.

   Note that the constraints above do not necessarily mean that the
   outgoing packet sent on the wire always follows the header ordering
   specified in this API document. The kernel may insert additional
   headers that break the ordering as a result. For example, if the
   kernel supports Mobile IPv6, an additional destination options header
   may be inserted before an authentication header, even without a
   routing header.

   This API does not provide access to any other extension headers than
   the supported three types of headers. In particular, no information
   is provided about the IP security headers on an incoming packet, nor
   can such headers be specified for an outgoing packet.
   This API is for applications that do not care about the existence of
   IP security headers.

13. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses

   The various socket options and ancillary data specifications defined
   in this document apply only to true IPv6 sockets. It is possible to
   create an IPv6 socket that actually sends and receives IPv4 packets,
   using IPv4-mapped IPv6 addresses, but the mapping of the options
   defined in this document to an IPv4 datagram is beyond the scope of
   this document.

   In general, attempting to specify an IPv6-only option, such as the
   Hop-by-Hop options, Destination options, or Routing header on an IPv6
   socket that is using IPv4-mapped IPv6 addresses, will probably result
   in an error. Some implementations, however, may provide access to
   the packet information (source/destination address, send/receive
   interface, and hop limit) on an IPv6 socket that is using IPv4-mapped
   IPv6 addresses.

14. Extended interfaces for rresvport, rcmd and rexec

   Library functions that support the "r" commands hide the creation of
   a socket and the name resolution procedure from an application. When
   the libraries return an AF_INET6 socket to an application that does
   not support the address family, the application may encounter an
   unexpected result when, e.g., calling getpeername() for the socket.
   In order to support AF_INET6 sockets for the "r" commands while
   keeping backward compatibility, this section defines some extensions
   to the libraries.

14.1. rresvport_af

   The rresvport() function is used by the rcmd() function, and this
   function is in turn called by many of the "r" commands such as
   rlogin. While new applications are not being written to use the
   rcmd() function, legacy applications such as rlogin will continue to
   use it and these will be ported to IPv6.

   rresvport() creates an IPv4/TCP socket and binds a "reserved port" to
   the socket. Instead of defining an IPv6 version of this function we
   define a new function that takes an address family as its argument.

      #include

      int rresvport_af(int *port, int family);

   This function behaves the same as the existing rresvport() function,
   but instead of creating an AF_INET TCP socket, it can also create an
   AF_INET6 TCP socket. The family argument is either AF_INET or
   AF_INET6, and a new error return is EAFNOSUPPORT if the address
   family is not supported.

   (Note: There is little consensus on which header defines the
   rresvport() and rcmd() function prototypes. 4.4BSD defines it in
   , others in , and others don't define the function
   prototypes at all.)

14.2. rcmd_af

   The existing rcmd() function cannot transparently use AF_INET6
   sockets since an application would not be prepared to handle AF_INET6
   addresses returned by, e.g., getpeername() on the file descriptor
   created by rcmd(). Thus a new function is needed.

      int rcmd_af(char **ahost, unsigned short rport, const char *locuser,
                  const char *remuser, const char *cmd, int *fd2p, int af);

   This function behaves the same as the existing rcmd() function, but
   instead of creating an AF_INET TCP socket, it can also create an
   AF_INET6 TCP socket. The af argument is AF_INET, AF_INET6, or
   AF_UNSPEC. When either AF_INET or AF_INET6 is specified, this
   function will create a socket of the specified address family. When
   AF_UNSPEC is specified, it will try all possible address families
   until a connection can be established, and will return the associated
   socket of the connection. A new error EAFNOSUPPORT will be returned
   if the address family is not supported.

14.3.
rexec_af

   The existing rexec() function cannot transparently use AF_INET6
   sockets since an application would not be prepared to handle AF_INET6
   addresses returned by, e.g., getpeername() on the file descriptor
   created by rexec(). Thus a new function is needed.

      int rexec_af(char **ahost, unsigned short rport, const char *name,
                   const char *pass, const char *cmd, int *fd2p, int af);

   This function behaves the same as the existing rexec() function, but
   instead of creating an AF_INET TCP socket, it can also create an
   AF_INET6 TCP socket. The af argument is AF_INET, AF_INET6, or
   AF_UNSPEC. When either AF_INET or AF_INET6 is specified, this
   function will create a socket of the specified address family. When
   AF_UNSPEC is specified, it will try all possible address families
   until a connection can be established, and will return the associated
   socket of the connection. A new error EAFNOSUPPORT will be returned
   if the address family is not supported.

15. Summary of New Definitions

   The following list summarizes the constants and structure
   definitions discussed in this memo, sorted by header.

      ICMP6_DST_UNREACH
      ICMP6_DST_UNREACH_ADDR
      ICMP6_DST_UNREACH_ADMIN
      ICMP6_DST_UNREACH_BEYONDSCOPE
      ICMP6_DST_UNREACH_NOPORT
      ICMP6_DST_UNREACH_NOROUTE
      ICMP6_ECHO_REPLY
      ICMP6_ECHO_REQUEST
      ICMP6_INFOMSG_MASK
      ICMP6_PACKET_TOO_BIG
      ICMP6_PARAMPROB_HEADER
      ICMP6_PARAMPROB_NEXTHEADER
      ICMP6_PARAMPROB_OPTION
      ICMP6_PARAM_PROB
      ICMP6_ROUTER_RENUMBERING
      ICMP6_RR_FLAGS_FORCEAPPLY
      ICMP6_RR_FLAGS_PREVDONE
      ICMP6_RR_FLAGS_REQRESULT
      ICMP6_RR_FLAGS_SPECSITE
      ICMP6_RR_FLAGS_TEST
      ICMP6_RR_PCOUSE_FLAGS_DECRPLTIME
      ICMP6_RR_PCOUSE_FLAGS_DECRVLTIME
      ICMP6_RR_PCOUSE_RAFLAGS_AUTO
      ICMP6_RR_PCOUSE_RAFLAGS_ONLINK
      ICMP6_RR_RESULT_FLAGS_FORBIDDEN
      ICMP6_RR_RESULT_FLAGS_OOB
      ICMP6_TIME_EXCEEDED
      ICMP6_TIME_EXCEED_REASSEMBLY
      ICMP6_TIME_EXCEED_TRANSIT
      MLD_LISTENER_QUERY
      MLD_LISTENER_REDUCTION
      MLD_LISTENER_REPORT
      ND_NA_FLAG_OVERRIDE
      ND_NA_FLAG_ROUTER
      ND_NA_FLAG_SOLICITED
      ND_NEIGHBOR_ADVERT
      ND_NEIGHBOR_SOLICIT
      ND_OPT_MTU
      ND_OPT_PI_FLAG_AUTO
      ND_OPT_PI_FLAG_ONLINK
      ND_OPT_PREFIX_INFORMATION
      ND_OPT_REDIRECTED_HEADER
      ND_OPT_SOURCE_LINKADDR
      ND_OPT_TARGET_LINKADDR
      ND_RA_FLAG_MANAGED
      ND_RA_FLAG_OTHER
      ND_REDIRECT
      ND_ROUTER_ADVERT
      ND_ROUTER_SOLICIT

      struct icmp6_filter{};
      struct icmp6_hdr{};
      struct icmp6_router_renum{};
      struct mld_hdr{};
      struct nd_neighbor_advert{};
      struct nd_neighbor_solicit{};
      struct nd_opt_hdr{};
      struct nd_opt_mtu{};
      struct nd_opt_prefix_info{};
      struct nd_opt_rd_hdr{};
      struct nd_redirect{};
      struct nd_router_advert{};
      struct nd_router_solicit{};
      struct rr_pco_match{};
      struct rr_pco_use{};
      struct rr_result{};

      IPPROTO_AH
      IPPROTO_DSTOPTS
      IPPROTO_ESP
      IPPROTO_FRAGMENT
      IPPROTO_HOPOPTS
      IPPROTO_ICMPV6
      IPPROTO_IPV6
      IPPROTO_NONE
      IPPROTO_ROUTING
      IPV6_CHECKSUM
      IPV6_DONTFRAG
      IPV6_DSTOPTS
      IPV6_HOPLIMIT
      IPV6_HOPOPTS
      IPV6_NEXTHOP
      IPV6_PATHMTU
      IPV6_PKTINFO
      IPV6_RECVDSTOPTS
      IPV6_RECVHOPLIMIT
      IPV6_RECVHOPOPTS
      IPV6_RECVPKTINFO
      IPV6_RECVRTHDR
      IPV6_RECVTCLASS
      IPV6_RTHDR
      IPV6_RTHDRDSTOPTS
      IPV6_RTHDR_TYPE_0
      IPV6_RECVPATHMTU
      IPV6_TCLASS
      IPV6_USE_MIN_MTU
      struct in6_pktinfo{};
      struct ip6_mtuinfo{};

      IP6F_MORE_FRAG
      IP6F_OFF_MASK
      IP6F_RESERVED_MASK
      IP6OPT_JUMBO
      IP6OPT_JUMBO_LEN
      IP6OPT_MUTABLE
      IP6OPT_NSAP_ADDR
      IP6OPT_PAD1
      IP6OPT_PADN
      IP6OPT_ROUTER_ALERT
      IP6OPT_TUNNEL_LIMIT
      IP6OPT_TYPE_DISCARD
      IP6OPT_TYPE_FORCEICMP
      IP6OPT_TYPE_ICMP
      IP6OPT_TYPE_SKIP
      IP6_ALERT_AN
      IP6_ALERT_MLD
      IP6_ALERT_RSVP
      struct ip6_dest{};
      struct ip6_frag{};
      struct ip6_hbh{};
      struct ip6_hdr{};
      struct ip6_opt{};
      struct ip6_opt_jumbo{};
      struct ip6_opt_nsap{};
      struct ip6_opt_router{};
      struct ip6_opt_tunnel{};
      struct ip6_rthdr{};
      struct ip6_rthdr0{};

   The following list summarizes the function and macro prototypes
   discussed in this memo, sorted by header.

      void ICMP6_FILTER_SETBLOCK(int, struct icmp6_filter *);
      void ICMP6_FILTER_SETBLOCKALL(struct icmp6_filter *);
      void ICMP6_FILTER_SETPASS(int, struct icmp6_filter *);
      void ICMP6_FILTER_SETPASSALL(struct icmp6_filter *);
      int  ICMP6_FILTER_WILLBLOCK(int,
                                  const struct icmp6_filter *);
      int  ICMP6_FILTER_WILLPASS(int,
                                 const struct icmp6_filter *);

      int IN6_ARE_ADDR_EQUAL(const struct in6_addr *,
                             const struct in6_addr *);

      int inet6_opt_append(void *, socklen_t, int,
                           uint8_t, socklen_t, uint_t, void **);
      int inet6_opt_get_val(void *, int, void *, socklen_t);
      int inet6_opt_find(void *, socklen_t, int, uint8_t,
                         socklen_t *, void **);
      int inet6_opt_finish(void *, socklen_t, int);
      int inet6_opt_init(void *, socklen_t);
      int inet6_opt_next(void *, socklen_t, int, uint8_t *,
                         socklen_t *, void **);
      int inet6_opt_set_val(void *, int, void *, socklen_t);

      int inet6_rth_add(void *,
                        const struct in6_addr *);
      struct in6_addr *inet6_rth_getaddr(const void *,
                                         int);
      void *inet6_rth_init(void *, socklen_t, int, int);
      int inet6_rth_reverse(const void *, void *);
      int inet6_rth_segments(const void *);
      socklen_t inet6_rth_space(int, int);

      int IP6OPT_TYPE(uint8_t);

      socklen_t CMSG_LEN(socklen_t);
      socklen_t CMSG_SPACE(socklen_t);

      int rresvport_af(int *, int);
      int rcmd_af(char **, unsigned short, const char *,
                  const char *, const char *, int *, int);
      int rexec_af(char **, unsigned short, const char *,
                   const char *, const char *, int *, int);

16. Security Considerations

   The setting of certain Hop-by-Hop options and Destination options may
   be restricted to privileged processes. Similarly some Hop-by-Hop
   options and Destination options may not be returned to non-privileged
   applications.

   The ability to specify an arbitrary source address using IPV6_PKTINFO
   must be prevented, at least for non-privileged processes.

17. Change History

   Changes from RFC 2292:

   - Removed the IPV6_PKTOPTIONS socket option by allowing sticky
     options to be set with individual setsockopt calls. This
     simplifies the protocol stack implementation by not having to
     handle options within options and also clarifies the failure
     semantics when some option is incorrectly formatted.

   - Added the IPV6_RTHDRDSTOPTS for a Destination header before the
     Routing header. This is necessary to allow setting these
     Destination headers without IPV6_PKTOPTIONS.

   - Removed the ability to specify Hop-by-Hop and Destination options
     using multiple ancillary data items. The application, using the
     inet6_option_*() routines, is responsible for formatting the whole
     extension header. This removes the need for the protocol stack to
     somehow guess the alignment restrictions on options when
     concatenating them together.

   - Added separate IPV6_RECVxxx options to enable the receipt of the
     corresponding ancillary data items. This makes the API cleaner
     since it allows the application to retrieve with getsockopt the
     sticky options it has set with setsockopt.

   - Clarified how sticky options are turned off.

   - Clarified how and when TCP returns ancillary data.

   - Removed the support for the loose/strict Routing header since that
     has been removed from the IPv6 specification.

   - Modified the inet6_rthdr_XXX() functions to not assume a cmsghdr
     structure in order to work with both sticky options and ancillary
     data. Renamed the functions to inet6_rth_XXX() to allow
     implementations to provide both the old and new functions.

   - Modified the inet6_option_XXX() functions to not assume a cmsghdr
     structure in order to work with both sticky options and ancillary
     data. Renamed the functions to inet6_opt_XXX() to allow
     implementations to provide both the old and new functions.

   - The new inet6_opt_XXX() functions were made different from the old
     ones so as not to require structure declarations but instead use
     functions to add the individual fields to the option.

   - Changed inet6_rthdr_getaddr() to operate on index 0 through N-1
     (used to be 1 through N).

   - Changed the comments in the struct ip6_hdr from "priority" to
     "traffic class".

   - Clarified the alignment issues involving ancillary data to allow
     for separate alignment of cmsghdr structures and the data. Made
     CMSG_SPACE() return an upper bound on the needed space.

   - Added rcmd_af() and rexec_af().

   Changes since -00:

   - Changed ICMP unreachable code 2 name to be "beyond scope of source
     address".

   - Added motivation for rcmd_af() and rexec_af().

   - Added option definitions (IP6OPT_PAD1 etc) to ip6.h.

   - Added MLD and router renumbering definitions to icmp6.h.

   - Removed ip6r0_addr field - replaced by a comment.

   - Made the content of IPV6_RTHDR, IPV6_HOPOPTS etc be specified as
     the extension header format (struct ip6_rthdr etc) instead of the
     previous "implementation dependent".

   - Removed attempt at RFC 2292 compatibility.

   - Excluded pad options from inet6_opt_next().

   - Added IPV6_USE_MIN_MTU socket option for applications to avoid
     fragmentation by sending at the minimum IPv6 MTU.

   - Added MTU notification so that UDP and raw socket applications can
     participate in path MTU discovery.

   - Added Reachability confirmation for UDP and raw socket
     applications.

   - Clarified that if the application asks for, e.g., IPV6_RTHDR and a
     received datagram does not contain a Routing header, an
     implementation will exclude the IPV6_RTHDR ancillary data item.

   - Removed the constraints for jumbo option.

   - Moved the new CMSG macros and changes from the appendix.

   - Added text about inet6_opt_ depending on 2460 appendix B formatting
     rules, i.e., largest field last in the option.

   - Specified that getsockopt() of a sticky option returns what was
     set with setsockopt().

   - Updated the summary of new definitions to make it current.

   Changes since -01:

   - Added a note about the minor threat for DoS attacks using
     IPV6_REACHCONF.

   - Clarified checksum and other receive side verification for RAW
     ICMP sockets.

   - Editorial clarifications.

   Changes since -02:

   - Changed IPV6_PATHMTU to carry an ip6_mtuinfo data structure.

   - Added the ability to do a getsockopt with IPV6_PATHMTU.

   - Added IPV6_DONTFRAG socket option.

   - Incorporated IPV6_TCLASS and friends, from
     draft-itojun-ipv6-tclass-api-03.txt.

   - Removed definitions and descriptions about ongoing stuff,
     including ones for mobile IPv6 and site prefixes.

   - Revised the overriding mechanism of sticky options so that a
     sticky option can only be overridden by an ancillary data item of
     the same option name.

   - Loosened requirements on the size of option fields in
     inet6_opt_get_val() and inet6_opt_set_val().

   - Added credits for the original idea of IPV6_USE_MIN_MTU.

   - Clarified how to clear sticky options and the ICMPv6 filter.

   - Clarified alignment issues on inet6_opt_set_val().

   - Clarified that IPV6_CHECKSUM assumes an even positive offset and
     that the checksum field is aligned on a 16-bit boundary.

   - Removed the ip6_ext structure, which was intended to be used as a
     "generic" extension header prototype.
     But the working group consensus was that we should NOT include
     such stuff.

   - Revised the "TCP implications" section; for the receiving side,
     reverted to a RFC2292-style getsockopt instead of using recvmsg()
     and ancillary data.

   - Disabled the use of the IPV6_HOPLIMIT sticky option.

   - Clarified the ordering between IPV6_MULTICAST_IF and the
     IPV6_PKTINFO sticky option for multicast packets.

   - Added considerations on Specifying/Receiving Source/Destination
     Address, mainly about TCP implications.

   - Clarified the scoped-address case of the source address
     specification; the kernel must first determine the appropriate
     scope zone.

   - Added a summary about the interface selection rule.

   - Clarified that IPV6_NEXTHOP should be ignored for a multicast
     destination and that it should not contradict with the specified
     outgoing interface.

   - Added clarifications on extension headers ordering; for the
     sending side, assume the recommended ordering described in
     RFC2460. For the receiving side, do not assume any ordering and
     pass all headers to the application in the received order.

   - Clarified return values of getsockopt() for IPV6_xxx sticky
     options when no sticky option value has been set by setsockopt().

   - Described TCP implications about "additional advanced API
     functions."

   - Updated text about Hop-by-Hop and Destination options headers that
     was based on the old RFC version of this API document.

   - Updated the URL for protocol numbers.

   - Clarified that access to the flow label field was not provided.

   - Removed IPV6_RECVRTHDRDSTOPTS. Since we have loosened the
     ordering restriction for the receiving side, it is not meaningful
     to separate the two cases.

   - Clearly stated that this API only handles the case where
     IPV6_NEXTHOP takes sockaddr_in6.

   Changes since -03:

   - Changed the member name for icmp6_hdr in the mld_hdr structure in
     order to avoid conflict between the structure name and a member
     name of the structure, which is forbidden in ANSI C++.

   - Clearly stated that the IPV6_CHECKSUM socket option would fail for
     non-raw sockets.

   - Added more precise description about some arguments to
     inet6_opt_next() to make the semantics clear.

   - Made the way to get received information on TCP sockets
     unspecified, based on a discussion in the working group mailing
     list. Removed references to IPV6_PKTOPTIONS, which was revived in
     the previous revision, accordingly.

   Changes since -04:

   - Allowed AF_UNSPEC for rcmd_af() and rexec_af().

   - Clarified that buffered messages would be received before a path
     MTU indication as an IPV6_PATHMTU ancillary data item.

   - Described why an error code was not defined when the application
     sets IPV6_DONTFRAG and data size is larger than the MTU of the
     outgoing interface.

   - Added a note that an application that wants to perform the path
     MTU discovery by itself needs to keep a history of destinations.

   - Mentioned the routing socket, with a reference, as a generic
     interface to get the path MTU for arbitrary destinations.

   Changes since -05:

   - Disallowed specifying a non-unspecified address with the
     IPV6_PKTINFO option for a TCP socket.

   - Added EMSGSIZE as an additional error when the outgoing packet is
     larger than the MTU of the outgoing interface with IPV6_DONTFRAG
     being enabled.

   - Moved description about the ordering between IPV6_PKTINFO and
     IPV6_MULTICAST_IF to Section 6.7, which summarized the ordering
     among various options.

   - Removed the section for IPV6_REACHCONF and all references to this
     option, based on a discussion after 04.

   - Clarified the header ordering issue much more, to make it clear
     that the ordering is just for this particular API.

   Changes since -06:

   - Revised the "minimum MTU" section so that path MTU discovery would
     be disabled for multicast by default. A new (default) value "-1"
     as an argument was introduced accordingly.

   Changes since -07:

   - Changed the type of some function arguments and return values from
     size_t to int or socklen_t to be aligned with the latest POSIX.
     Revised code examples accordingly.

   - Used PF_xxx instead of AF_xxx.

   - Replaced MUST with must.

   - Made the URL of assigned numbers less specific so that it would be
     more robust for future changes.

   - Changed the reference to the basic API from RFC2553 to the latest
     Internet Draft.

   - Added a sentence to mention that this document is intended to
     replace RFC2292.

   - Revised abstract to be more clear and concise, particularly
     concentrating on differences from RFC2292.

   - Removed traceroute as a usage of returning the received hop limit.

   - Moved new definitions about CMSG_xxx from appendices to the
     document body.

   - Added a reference to the latest POSIX standard.

   - Clarified that inet6_opt_init() may return a constant, but this
     document left it as implementation dependent.

   - Changed the argument name "prevlen" in inet6_opt_xxx() function
     prototypes to "offset", which better describes the intended usage.

   - Revised the text about the minimum MTU for multicast to make it
     clear that the default behavior came from operation practices.

   - Many other style and wording improvements.

18. References

   [RFC-2460]    Deering, S., Hinden, R., "Internet Protocol, Version 6
                 (IPv6), Specification", RFC 2460, Dec. 1998.

   [BASICAPI]    Gilligan, R. E., Thomson, S., Bound, J., McCann, J.,
                 Stevens, W.
R., "Basic Socket Interface Extensions for 2911 IPv6", Internet Draft, July 2002. 2913 [POSIX] IEEE Std. 1003.1-2001 Standard for Information 2914 Technology -- Portable Operating System Interface 2915 (POSIX) 2917 Open Group Technical Standard: Base Specifications, 2918 Issue 6 December 2001 2920 ISO 9945 (pending final approval by ISO) 2922 http://www.opengroup.org/austin 2924 [RFC-1981] McCann, J., Deering, S., Mogul, J, "Path MTU Discovery 2925 for IP version 6", RFC 1981, Aug. 1996. 2927 [TCPIPILLUST] Wright, G., Stevens, W., "TCP/IP Illustrated, Volume 2: 2928 The Implementation", Addison Wesley, 1994. 2930 19. Acknowledgments 2931 Matt Thomas and Jim Bound have been working on the technical details 2932 in this draft for over a year. Keith Sklower is the original 2933 implementor of ancillary data in the BSD networking code. Craig Metz 2934 provided lots of feedback, suggestions, and comments based on his 2935 implementing many of these features as the document was being 2936 written. Mark Andrews first proposed the idea of the 2937 IPV6_USE_MIN_MTU option. Jun-ichiro Hagino contributed text for the 2938 traffic class API from a draft of his own. 2940 The following provided comments on earlier drafts: Pascal Anelli, 2941 Hamid Asayesh, Ran Atkinson, Karl Auerbach, Hamid Asayesh, Don 2942 Coolidge, Matt Crawford, Sam T. Denton, Richard Draves, Francis 2943 Dupont, Toerless Eckert, Lilian Fernandes, Bob Gilligan, Gerri 2944 Harter, Tim Hartrick, Bob Halley, Masaki Hirabaru, Yoshinobu Inoue, 2945 Mukesh Kacker, A. N. Kuznetsov, Sam Manthorpe, Pedro Marques, Jack 2946 McCann, der Mouse, John Moy, Lori Napoli, Thomas Narten, Atsushi 2947 Onoe, Steve Parker, Charles Perkins, Ken Powell, Tom Pusateri, Pedro 2948 Roque, Sameer Shah, Peter Sjodin, Stephen P. Spackman, Jinmei Tatuya, 2949 Karen Tracey, Sowmini Varadhan, Quaizar Vohra, Carl Williams, Steve 2950 Wise, Eric Wong, Farrell Woods, Kazu Yamamoto, Vladislav Yasevich, 2951 and YOSHIFUJI Hideaki. 2953 20. 
Authors' Addresses 2955 W. Richard Stevens (deceased) 2957 Matt Thomas 2958 3am Software Foundry 2959 8053 Park Villa Circle 2960 Cupertino, CA 95014 2961 Email: matt@3am-software.com 2963 Erik Nordmark 2964 Sun Microsystems Laboratories, Europe 2965 29 Chemin du Vieux Chene 2966 38240 Meylan, France 2967 Email: Erik.Nordmark@sun.com 2969 Tatuya JINMEI 2970 Corporate Research & Development Center, Toshiba Corporation 2971 1 Komukai Toshiba-cho, Kawasaki-shi 2972 Kanagawa 212-8582, Japan 2973 Email: jinmei@isl.rdc.toshiba.co.jp 2975 21. Appendix A: Ancillary Data Overview 2977 4.2BSD allowed file descriptors to be transferred between separate 2978 processes across a UNIX domain socket using the sendmsg() and 2979 recvmsg() functions. Two members of the msghdr structure, 2980 msg_accrights and msg_accrightslen, were used to send and receive the 2981 descriptors. When the OSI protocols were added to 4.3BSD Reno in 2982 1990 the names of these two fields in the msghdr structure were 2983 changed to msg_control and msg_controllen, because they were used by 2984 the OSI protocols for "control information", although the comments in 2985 the source code call this "ancillary data". 2987 Other than the OSI protocols, the use of ancillary data has been 2988 rare. In 4.4BSD, for example, the only use of ancillary data with 2989 IPv4 is to return the destination address of a received UDP datagram 2990 if the IP_RECVDSTADDR socket option is set. With Unix domain sockets 2991 ancillary data is still used to send and receive descriptors. 2993 Nevertheless the ancillary data fields of the msghdr structure 2994 provide a clean way to pass information in addition to the data that 2995 is being read or written. The inclusion of the msg_control and 2996 msg_controllen members of the msghdr structure along with the cmsghdr 2997 structure that is pointed to by the msg_control member is required by 2998 the Posix sockets API standard. 3000 21.1. 
The msghdr Structure 3002 The msghdr structure is used by the recvmsg() and sendmsg() 3003 functions. Its Posix definition is: 3005 struct msghdr { 3006 void *msg_name; /* ptr to socket address structure */ 3007 socklen_t msg_namelen; /* size of socket address structure */ 3008 struct iovec *msg_iov; /* scatter/gather array */ 3009 int msg_iovlen; /* # elements in msg_iov */ 3010 void *msg_control; /* ancillary data */ 3011 socklen_t msg_controllen; /* ancillary data buffer length */ 3012 int msg_flags; /* flags on received message */ 3013 }; 3015 The structure is declared as a result of including . 3017 (Note: Before Posix the two "void *" pointers were typically "char 3018 *", and the two socklen_t members were typically integers. Earlier 3019 drafts of Posix had the two socklen_t members as size_t, but it then 3020 changed these to socklen_t to simplify binary portability for 64-bit 3021 implementations and to align Posix with X/Open's Networking Services, 3022 Issue 5. The change in msg_control to a "void *" pointer affects any 3023 code that increments this pointer.) 3025 Most Berkeley-derived implementations limit the amount of ancillary 3026 data in a call to sendmsg() to no more than 108 bytes (an mbuf). 3027 This API requires a minimum of 10240 bytes of ancillary data, but it 3028 is recommended that the amount be limited only by the buffer space 3029 reserved by the socket (which can be modified by the SO_SNDBUF socket 3030 option). (Note: This magic number 10240 was picked as a value that 3031 should always be large enough. 108 bytes is clearly too small as the 3032 maximum size of a Routing header is 2048 bytes.) 3034 21.2. The cmsghdr Structure 3036 The cmsghdr structure describes ancillary data objects transferred by 3037 recvmsg() and sendmsg(). 
         Its Posix definition is:

3039        struct cmsghdr {
3040          socklen_t  cmsg_len;   /* #bytes, including this header */
3041          int        cmsg_level; /* originating protocol */
3042          int        cmsg_type;  /* protocol-specific type */
3043                     /* followed by unsigned char cmsg_data[]; */
3044        };

3046     This structure is declared as a result of including <sys/socket.h>.

3048     (Note: Before Posix the cmsg_len member was an integer, and not a
3049     socklen_t. See the Note in the previous section for why socklen_t is
3050     used here.)

3052     As shown in this definition, normally there is no member with the
3053     name cmsg_data[]. Instead, the data portion is accessed using the
3054     CMSG_xxx() macros, as described in Section 21.3. Nevertheless, it is
3055     common to refer to the cmsg_data[] member.

3057     When ancillary data is sent or received, any number of ancillary data
3058     objects can be specified by the msg_control and msg_controllen
3059     members of the msghdr structure, because each object is preceded by a
3060     cmsghdr structure defining the object's length (the cmsg_len member).
3061     Historically, Berkeley-derived implementations have passed only one
3062     object at a time, but this API allows multiple objects to be passed
3063     in a single call to sendmsg() or recvmsg(). The following example
3064     shows two ancillary data objects in a control buffer.

3066    |<--------------------------- msg_controllen -------------------------->|
3067    |                                 OR                                    |
3068    |<--------------------------- msg_controllen ----------------------->|
3069    |                                                                       |
3070    |<----- ancillary data object ----->|<----- ancillary data object ----->|
3071    |<------ min CMSG_SPACE() --------->|<------ min CMSG_SPACE() --------->|
3072    |                                   |                                   |
3073    |<---------- cmsg_len ---------->|  |<--------- cmsg_len ----------->|  |
3074    |<--------- CMSG_LEN() --------->|  |<-------- CMSG_LEN() ---------->|  |
3075    |                                |  |                                |  |
3076    +-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+
3077    |cmsg_|cmsg_|cmsg_|XX|           |XX|cmsg_|cmsg_|cmsg_|XX|           |XX|
3078    |len  |level|type |XX|cmsg_data[]|XX|len  |level|type |XX|cmsg_data[]|XX|
3079    +-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+
3080    ^
3081    |
3082    msg_control
3083    points here

3085     The fields shown as "XX" are possible padding, between the cmsghdr
3086     structure and the data, and between the data and the next cmsghdr
3087     structure, if required by the implementation. While sending, an
3088     application may or may not include padding at the end of the last
3089     ancillary data in msg_controllen, and implementations must accept both
3090     as valid. On receiving, a portable application must provide space for
3091     padding at the end of the last ancillary data, as implementations may
3092     copy out the padding at the end of the control message buffer and
3093     include it in the received msg_controllen. When recvmsg() is called,
3094     if msg_controllen is too small for all the ancillary data items,
3095     including any trailing padding after the last item, an implementation
3096     may set MSG_CTRUNC.

3098  21.3. Ancillary Data Object Macros

3100     To aid in the manipulation of ancillary data objects, three macros
3101     from 4.4BSD are defined by Posix: CMSG_DATA(), CMSG_NXTHDR(), and
3102     CMSG_FIRSTHDR(). Before describing these macros, we show the
3103     following example of how they might be used with a call to recvmsg().

3105        struct msghdr msg;
3106        struct cmsghdr *cmsgptr;

3108        /* fill in msg */

3110        /* call recvmsg() */

3112        for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL;
3113             cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) {
3114            if (cmsgptr->cmsg_len == 0) {
3115                /* Error handling */
3116                break;
3117            }
3118            if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) {
3119                u_char *ptr;

3121                ptr = CMSG_DATA(cmsgptr);
3122                /* process data pointed to by ptr */
3123            }
3124        }

3126     We now describe the three Posix macros, followed by two more that are
3127     new with this API: CMSG_SPACE() and CMSG_LEN(). All these macros are
3128     defined as a result of including <sys/socket.h>.

3130  21.3.1. CMSG_FIRSTHDR

3132        struct cmsghdr *CMSG_FIRSTHDR(const struct msghdr *mhdr);

3134     CMSG_FIRSTHDR() returns a pointer to the first cmsghdr structure in
3135     the msghdr structure pointed to by mhdr. The macro returns NULL if
3136     there is no ancillary data pointed to by the msghdr structure (that
3137     is, if either msg_control is NULL or if msg_controllen is less than
3138     the size of a cmsghdr structure).

3140     One possible implementation could be:

3142        #define CMSG_FIRSTHDR(mhdr) \
3143            ( (mhdr)->msg_controllen >= sizeof(struct cmsghdr) ? \
3144              (struct cmsghdr *)(mhdr)->msg_control : \
3145              (struct cmsghdr *)NULL )

3147     (Note: Most existing implementations do not test the value of
3148     msg_controllen, and just return the value of msg_control. The value
3149     of msg_controllen must be tested, because if the application asks
3150     recvmsg() to return ancillary data, by setting msg_control to point
3151     to the application's buffer and setting msg_controllen to the length
3152     of this buffer, the kernel indicates that no ancillary data is
3153     available by setting msg_controllen to 0 on return. It is also
3154     easier to put this test into this macro than to make the application
3155     perform the test.)

3157  21.3.2. CMSG_NXTHDR

3159     As described in Section 5.1, CMSG_NXTHDR has been extended to handle
3160     a NULL 2nd argument to mean "get the first header". This provides an
3161     alternative way of coding the processing loop shown earlier:

3163        struct msghdr msg;
3164        struct cmsghdr *cmsgptr = NULL;

3166        /* fill in msg */

3168        /* call recvmsg() */

3170        while ((cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) != NULL) {
3171            if (cmsgptr->cmsg_len == 0) {
3172                /* Error handling */
3173                break;
3174            }
3175            if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) {
3176                u_char *ptr;

3178                ptr = CMSG_DATA(cmsgptr);
3179                /* process data pointed to by ptr */
3180            }
3181        }

3183     One possible implementation could be:

3185        #define CMSG_NXTHDR(mhdr, cmsg) \
3186          (((cmsg) == NULL) ? CMSG_FIRSTHDR(mhdr) : \
3187           (((u_char *)(cmsg) + ALIGN_H((cmsg)->cmsg_len) \
3188                              + ALIGN_D(sizeof(struct cmsghdr)) > \
3189             (u_char *)((mhdr)->msg_control) + (mhdr)->msg_controllen) ? \
3190            (struct cmsghdr *)NULL : \
3191            (struct cmsghdr *)((u_char *)(cmsg) + ALIGN_H((cmsg)->cmsg_len))))

3193     The macros ALIGN_H() and ALIGN_D(), which are implementation
3194     dependent, round their arguments up to the next even multiple of
3195     whatever alignment is required for the start of the cmsghdr structure
3196     and the data, respectively. (This is probably a multiple of 4 or 8
3197     bytes.) They are often the same macro on platforms where the
3198     alignment requirements for header and data are chosen to be
3199     identical.

3201  21.3.3. CMSG_DATA

3203        unsigned char *CMSG_DATA(const struct cmsghdr *cmsg);

3205     CMSG_DATA() returns a pointer to the data (what is called the
3206     cmsg_data[] member, even though such a member is not defined in the
3207     structure) following a cmsghdr structure.

3209     One possible implementation could be:

3211        #define CMSG_DATA(cmsg) ( (u_char *)(cmsg) + \
3212                                  ALIGN_D(sizeof(struct cmsghdr)) )

3214  21.3.4. CMSG_SPACE

3216     CMSG_SPACE is new with this API (see Section 5.2).
         It is used to
3217     determine how much space needs to be allocated for an ancillary data
3218     item.

3220     One possible implementation could be:

3222        #define CMSG_SPACE(length) ( ALIGN_D(sizeof(struct cmsghdr)) + \
3223                                     ALIGN_H(length) )

3225  21.3.5. CMSG_LEN

3227     CMSG_LEN is new with this API (see Section 5.3). It returns the
3228     value to store in the cmsg_len member of the cmsghdr structure,
3229     taking into account any padding needed to satisfy alignment
3230     requirements.

3232     One possible implementation could be:

3234        #define CMSG_LEN(length) ( ALIGN_D(sizeof(struct cmsghdr)) + (length) )

3236  22. Appendix B: Examples using the inet6_rth_XXX() functions

3238     Here we show an example for both sending Routing headers and
3239     processing and reversing a received Routing header.

3241  22.1. Sending a Routing Header

3243     As an example of the Routing header functions defined in this
3244     document, we go through the function calls for the example on p. 17
3245     of [RFC-2460]. The source is S, the destination is D, and the three
3246     intermediate nodes are I1, I2, and I3.

3248                S -----> I1 -----> I2 -----> I3 -----> D

3250        src:    *    S         S         S         S   S
3251        dst:    D   I1        I2        I3         D   D
3252        A[1]:  I1   I2        I1        I1        I1  I1
3253        A[2]:  I2   I3        I3        I2        I2  I2
3254        A[3]:  I3    D         D         D        I3  I3
3255        #seg:   3    3         2         1         0   3

3257     src and dst are the source and destination IPv6 addresses in the IPv6
3258     header. A[1], A[2], and A[3] are the three addresses in the Routing
3259     header. #seg is the Segments Left field in the Routing header.

3261     The six values in the column beneath node S are the values in the
3262     Routing header specified by the sending application using sendmsg()
3263     or setsockopt().
         The function calls by the sender would look like:

3265        void *extptr;
3266        socklen_t extlen;
3267        struct msghdr msg;
3268        struct cmsghdr *cmsgptr;
3269        int cmsglen;
3270        struct sockaddr_in6 I1, I2, I3, D;

3272        extlen = inet6_rth_space(IPV6_RTHDR_TYPE_0, 3);
3273        cmsglen = CMSG_SPACE(extlen);
3274        cmsgptr = malloc(cmsglen);
3275        cmsgptr->cmsg_len = CMSG_LEN(extlen);
3276        cmsgptr->cmsg_level = IPPROTO_IPV6;
3277        cmsgptr->cmsg_type = IPV6_RTHDR;

3279        extptr = CMSG_DATA(cmsgptr);
3280        extptr = inet6_rth_init(extptr, extlen, IPV6_RTHDR_TYPE_0, 3);

3282        inet6_rth_add(extptr, &I1.sin6_addr);
3283        inet6_rth_add(extptr, &I2.sin6_addr);
3284        inet6_rth_add(extptr, &I3.sin6_addr);

3286        msg.msg_control = cmsgptr;
3287        msg.msg_controllen = cmsglen;

3289        /* finish filling in msg{}, msg_name = D */
3290        /* call sendmsg() */

3292     We also assume that the source address for the socket is not
3293     specified (i.e., the asterisk in the figure).

3295     The four columns of six values that are then shown between the five
3296     nodes are the values of the fields in the packet while the packet is
3297     in transit between the two nodes. Notice that before the packet is
3298     sent by the source node S, the source address is chosen (replacing
3299     the asterisk), I1 becomes the destination address of the datagram,
3300     the two addresses A[2] and A[3] are "shifted up", and D is moved to
3301     A[3].

3303     The column of values shown beneath the destination node holds
3304     the values returned by recvmsg(), assuming the application has
3305     enabled both the IPV6_RECVPKTINFO and IPV6_RECVRTHDR socket options.
3306     The source address is S (contained in the sockaddr_in6 structure
3307     pointed to by the msg_name member), the destination address is D
3308     (returned as an ancillary data object in an in6_pktinfo structure),
3309     and the ancillary data object specifying the Routing header will
3310     contain three addresses (I1, I2, and I3).
         The number of segments in
3311     the Routing header is known from the Hdr Ext Len field in the Routing
3312     header (a value of 6, indicating 3 addresses).

3314     The return value from inet6_rth_segments() will be 3 and
3315     inet6_rth_getaddr(0) will return I1, inet6_rth_getaddr(1) will return
3316     I2, and inet6_rth_getaddr(2) will return I3.

3318     If the receiving application then calls inet6_rth_reverse(), the
3319     order of the three addresses will become I3, I2, and I1.

3321     We can also show what an implementation might store in the ancillary
3322     data object as the Routing header is being built by the sending
3323     process. If we assume a 32-bit architecture where sizeof(struct
3324     cmsghdr) equals 12, with a desired alignment of 4-byte boundaries,
3325     then inet6_rth_space(IPV6_RTHDR_TYPE_0, 3) returns 56 bytes for the
3326     Routing header (8 + 3*16), and CMSG_SPACE(56) gives 68 once the 12
         bytes for the cmsghdr structure are added.

3328     The call to inet6_rth_init() initializes the ancillary data object to
3329     contain a Type 0 Routing header:

3331       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3332       |       cmsg_len = 20                                           |
3333       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3334       |       cmsg_level = IPPROTO_IPV6                               |
3335       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3336       |       cmsg_type = IPV6_RTHDR                                  |
3337       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3338       |  Next Header  | Hdr Ext Len=6 | Routing Type=0|  Seg Left=0   |
3339       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3340       |                           Reserved                            |
3341       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3343     The first call to inet6_rth_add() adds I1 to the list.

3345       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3346       |       cmsg_len = 36                                           |
3347       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3348       |       cmsg_level = IPPROTO_IPV6                               |
3349       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3350       |       cmsg_type = IPV6_RTHDR                                  |
3351       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3352       |  Next Header  | Hdr Ext Len=6 | Routing Type=0|  Seg Left=1   |
3353       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3354       |                           Reserved                            |
3355       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3356       |                                                               |
3357       +                                                               +
3358       |                                                               |
3359       +                        Address[1] = I1                        +
3360       |                                                               |
3361       +                                                               +
3362       |                                                               |
3363       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3365     cmsg_len is incremented by 16, and the Segments Left field is
3366     incremented by 1.

3368     The next call to inet6_rth_add() adds I2 to the list.

3370       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3371       |       cmsg_len = 52                                           |
3372       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3373       |       cmsg_level = IPPROTO_IPV6                               |
3374       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3375       |       cmsg_type = IPV6_RTHDR                                  |
3376       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3377       |  Next Header  | Hdr Ext Len=6 | Routing Type=0|  Seg Left=2   |
3378       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3379       |                           Reserved                            |
3380       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3381       |                                                               |
3382       +                                                               +
3383       |                                                               |
3384       +                        Address[1] = I1                        +
3385       |                                                               |
3386       +                                                               +
3387       |                                                               |
3388       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3389       |                                                               |
3390       +                                                               +
3391       |                                                               |
3392       +                        Address[2] = I2                        +
3393       |                                                               |
3394       +                                                               +
3395       |                                                               |
3396       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3398     cmsg_len is incremented by 16, and the Segments Left field is
3399     incremented by 1.

3401     The last call to inet6_rth_add() adds I3 to the list.

3403       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3404       |       cmsg_len = 68                                           |
3405       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3406       |       cmsg_level = IPPROTO_IPV6                               |
3407       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3408       |       cmsg_type = IPV6_RTHDR                                  |
3409       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3410       |  Next Header  | Hdr Ext Len=6 | Routing Type=0|  Seg Left=3   |
3411       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3412       |                           Reserved                            |
3413       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3414       |                                                               |
3415       +                                                               +
3416       |                                                               |
3417       +                        Address[1] = I1                        +
3418       |                                                               |
3419       +                                                               +
3420       |                                                               |
3421       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3422       |                                                               |
3423       +                                                               +
3424       |                                                               |
3425       +                        Address[2] = I2                        +
3426       |                                                               |
3427       +                                                               +
3428       |                                                               |
3429       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3430       |                                                               |
3431       +                                                               +
3432       |                                                               |
3433       +                        Address[3] = I3                        +
3434       |                                                               |
3435       +                                                               +
3436       |                                                               |
3437       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3439     cmsg_len is incremented by 16, and the Segments Left field is
3440     incremented by 1.

3442  22.2. Receiving Routing Headers

3444     This example assumes that the application has enabled the
3445     IPV6_RECVRTHDR socket option. The application prints and reverses a
3446     source route and uses that to echo the received data.

3448        struct sockaddr_in6 addr;
3449        struct msghdr msg;
3450        struct iovec iov;
3451        struct cmsghdr *cmsgptr;
3452        socklen_t cmsgspace;
3453        void *extptr;
3454        int extlen;

3456        int segments;
3457        int i, n;
3458        char databuf[8192];

3460        segments = 100;        /* Enough */
3461        extlen = inet6_rth_space(IPV6_RTHDR_TYPE_0, segments);
3462        cmsgspace = CMSG_SPACE(extlen);
3463        cmsgptr = malloc(cmsgspace);
3464        if (cmsgptr == NULL) {
3465            perror("malloc");
3466            exit(1);
3467        }
3468        extptr = CMSG_DATA(cmsgptr);

3470        msg.msg_control = cmsgptr;
3471        msg.msg_controllen = cmsgspace;
3472        msg.msg_name = (struct sockaddr *)&addr;
3473        msg.msg_namelen = sizeof (addr);
3474        msg.msg_iov = &iov;
3475        msg.msg_iovlen = 1;
3476        iov.iov_base = databuf;
3477        iov.iov_len = sizeof (databuf);
3478        msg.msg_flags = 0;
3479        if ((n = recvmsg(s, &msg, 0)) == -1) {
3480            perror("recvmsg");
3481            return;
3482        }
3483        if (msg.msg_controllen != 0 &&
3484            cmsgptr->cmsg_level == IPPROTO_IPV6 &&
3485            cmsgptr->cmsg_type == IPV6_RTHDR) {
3486            struct in6_addr *in6;
3487            char asciiname[INET6_ADDRSTRLEN];
3488            struct ip6_rthdr *rthdr;

3490            rthdr = (struct ip6_rthdr *)extptr;
3491            segments = inet6_rth_segments(extptr);
3492            printf("route (%d segments, %d left): ",
3493                   segments, rthdr->ip6r_segleft);
3494            for (i = 0; i < segments; i++) {
3495                in6 = inet6_rth_getaddr(extptr, i);
3496                if (in6 == NULL)
3497                    printf("<NULL> ");

3499                else
3500                    printf("%s ", inet_ntop(AF_INET6,
3501                           (void *)in6->s6_addr,
3502                           asciiname, INET6_ADDRSTRLEN));
3503            }
3504            if (inet6_rth_reverse(extptr, extptr) == -1) {
3505                printf("reverse failed");
3506                return;
3507            }
3508        }
3509        iov.iov_base = databuf;
3510        iov.iov_len = n;       /* echo only the bytes received */
3511        if (sendmsg(s, &msg, 0) == -1)
3512            perror("sendmsg");
3513        if (cmsgptr != NULL)
3514            free(cmsgptr);

3516     Note: The above example is a simple illustration. It skips some
3517     error checks, including those involving the MSG_TRUNC and MSG_CTRUNC
3518     flags. It also leaves some type mismatches in favor of brevity.

3520  23. Appendix C: Examples using the inet6_opt_XXX() functions

3522     This appendix shows how Hop-by-Hop and Destination options can be
3523     both built and parsed using the inet6_opt_XXX() functions. These
3524     examples assume that values are defined for OPT_X and OPT_Y.

3526     Note: The example is a simple illustration. It skips some error
3527     checks and leaves some type mismatches in favor of brevity.

3529  23.1. Building options

3531     We now provide an example that builds two Hop-by-Hop options using
3532     the example in Appendix B of [RFC-2460].

3534        void *extbuf;
3535        socklen_t extlen;
3536        int currentlen;
3537        void *databuf;
3538        int offset;
3539        uint8_t value1;
3540        uint16_t value2;
3541        uint32_t value4;
3542        uint64_t value8;

3544        /* Estimate the length */
3545        currentlen = inet6_opt_init(NULL, 0);
3546        if (currentlen == -1)
3547            return (-1);
3548        currentlen = inet6_opt_append(NULL, 0, currentlen, OPT_X, 12, 8, NULL);
3549        if (currentlen == -1)
3550            return (-1);
3551        currentlen = inet6_opt_append(NULL, 0, currentlen, OPT_Y, 7, 4, NULL);
3552        if (currentlen == -1)
3553            return (-1);
3554        currentlen = inet6_opt_finish(NULL, 0, currentlen);
3555        if (currentlen == -1)
3556            return (-1);
3557        extlen = currentlen;

3559        extbuf = malloc(extlen);
3560        if (extbuf == NULL) {
3561            perror("malloc");
3562            return (-1);
3563        }
3564        currentlen = inet6_opt_init(extbuf, extlen);
3565        if (currentlen == -1)
3566            return (-1);

3568        currentlen = inet6_opt_append(extbuf, extlen, currentlen,
3569                                      OPT_X, 12, 8, &databuf);
3570        if (currentlen == -1)
3571            return (-1);
3572        /* Insert value 0x12345678 for 4-octet field */
3573        offset = 0;
3574        value4 = 0x12345678;
3575        offset = inet6_opt_set_val(databuf, offset, &value4, sizeof (value4));
3576        /* Insert value 0x0102030405060708 for 8-octet field */
3577        value8 = 0x0102030405060708;
3578        offset = inet6_opt_set_val(databuf, offset, &value8, sizeof (value8));

3580        currentlen = inet6_opt_append(extbuf, extlen, currentlen,
3581                                      OPT_Y, 7, 4, &databuf);
3582        if (currentlen == -1)
3583            return (-1);
3584        /* Insert value 0x01 for 1-octet field */
3585        offset = 0;
3586        value1 = 0x01;
3587        offset = inet6_opt_set_val(databuf, offset, &value1, sizeof (value1));
3588        /* Insert value 0x1331 for 2-octet field */
3589        value2 = 0x1331;
3590        offset = inet6_opt_set_val(databuf, offset, &value2, sizeof (value2));
3591        /* Insert value 0x01020304 for 4-octet field */
3592        value4 = 0x01020304;
3593        offset = inet6_opt_set_val(databuf, offset, &value4, sizeof (value4));

3595        currentlen = inet6_opt_finish(extbuf, extlen, currentlen);
3596        if (currentlen == -1)
3597            return (-1);
3598        /* extbuf and extlen are now completely formatted */

3600  23.2. Parsing received options

3602     This example parses and prints the content of the two options in the
3603     previous example.

3605        int
3606        print_opt(void *extbuf, socklen_t extlen)
3607        {
3608            struct ip6_dest *ext;
3609            int currentlen;
3610            uint8_t type;
3611            socklen_t len;
3612            void *databuf;
3613            int offset;
3614            uint8_t value1;
3615            uint16_t value2;
3616            uint32_t value4;
3617            uint64_t value8;

3619            ext = (struct ip6_dest *)extbuf;
3620            printf("nxt %u, len %u (bytes %d)\n", ext->ip6d_nxt,
3621                   ext->ip6d_len, (ext->ip6d_len + 1) * 8);

3623            currentlen = 0;
3624            while (1) {
3625                currentlen = inet6_opt_next(extbuf, extlen, currentlen,
3626                                            &type, &len, &databuf);
3627                if (currentlen == -1)
3628                    break;
3629                printf("Received opt %u len %u\n",
3630                       type, len);
3631                switch (type) {
3632                case OPT_X:
3633                    offset = 0;
3634                    offset = inet6_opt_get_val(databuf, offset,
3635                                               &value4, sizeof (value4));
3636                    printf("X 4-byte field %x\n", value4);
3637                    offset = inet6_opt_get_val(databuf, offset,
3638                                               &value8, sizeof (value8));

3640                    printf("X 8-byte field %llx\n",
                               (unsigned long long)value8);
3641                    break;
3642                case OPT_Y:
3643                    offset = 0;
3644                    offset = inet6_opt_get_val(databuf, offset,
3645                                               &value1, sizeof (value1));
3646                    printf("Y 1-byte field %x\n", value1);
3647                    offset = inet6_opt_get_val(databuf, offset,
3648                                               &value2, sizeof (value2));
3649                    printf("Y 2-byte field %x\n", value2);
3650                    offset = inet6_opt_get_val(databuf, offset,
3651                                               &value4, sizeof (value4));
3652                    printf("Y 4-byte field %x\n", value4);
3653                    break;
3654                default:
3655                    printf("Unknown option %u\n", type);
3656                    break;
3657                }
3658            }
3659            return (0);
3660        }
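
   The parsing loop above relies on inet6_opt_next() to step over
   padding and to stop at the end of the header. As a rough
   illustration of what such a walk involves, the following sketch
   steps through a Hop-by-Hop options header by hand. It is not part
   of this API: walk_next_opt(), print_sample_opts(), sample_hbh, and
   the numeric values chosen for OPT_X and OPT_Y are hypothetical and
   introduced only for this example. The sample buffer assumes option
   X followed by option Y, separated and terminated by PadN options,
   with the Pad1 and PadN encodings taken from [RFC-2460].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical option types; the document leaves OPT_X and OPT_Y undefined. */
#define OPT_X    0x1e
#define OPT_Y    0x3e

#define OPT_PAD1 0x00   /* Pad1: a single zero octet, no length octet */
#define OPT_PADN 0x01   /* PadN: type, length, then that many octets */

/* Assumed 32-octet sample header: X (12 octets of data), PadN, Y (7), PadN. */
static const uint8_t sample_hbh[32] = {
    59, 3,                                  /* Next Header, Hdr Ext Len */
    OPT_X, 12,
    0x12, 0x34, 0x56, 0x78,                 /* 4-octet field */
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,  /* 8-octet field */
    OPT_PADN, 1, 0,
    OPT_Y, 7,
    0x01,                                   /* 1-octet field */
    0x13, 0x31,                             /* 2-octet field */
    0x01, 0x02, 0x03, 0x04,                 /* 4-octet field */
    OPT_PADN, 2, 0, 0
};

/*
 * Hypothetical helper: find the next non-padding option at or after
 * 'offset' (pass 0 on the first call).  Returns the offset just past
 * that option, or -1 at the end of the header or on a malformed buffer.
 */
static int
walk_next_opt(const uint8_t *ext, size_t extlen, int offset,
              uint8_t *typep, uint8_t *lenp, const uint8_t **datap)
{
    size_t hdrlen, pos;

    if (ext == NULL || extlen < 2 || offset < 0)
        return -1;
    hdrlen = ((size_t)ext[1] + 1) * 8;  /* length in octets */
    if (hdrlen > extlen)
        return -1;
    pos = (offset == 0) ? 2 : (size_t)offset;  /* skip the two fixed octets */

    while (pos < hdrlen) {
        if (ext[pos] == OPT_PAD1) {
            pos++;
            continue;
        }
        if (pos + 2 > hdrlen || pos + 2 + ext[pos + 1] > hdrlen)
            return -1;                  /* option overruns the header */
        if (ext[pos] == OPT_PADN) {
            pos += 2 + ext[pos + 1];
            continue;
        }
        *typep = ext[pos];
        *lenp = ext[pos + 1];
        *datap = ext + pos + 2;
        return (int)(pos + 2 + ext[pos + 1]);
    }
    return -1;
}

/* Print every non-padding option in the sample; returns how many were seen. */
static int
print_sample_opts(void)
{
    uint8_t type, len;
    const uint8_t *data;
    int off = 0, count = 0;

    while ((off = walk_next_opt(sample_hbh, sizeof sample_hbh, off,
                                &type, &len, &data)) != -1) {
        printf("opt 0x%02x len %u first octet 0x%02x\n", type, len, data[0]);
        count++;
    }
    return count;
}
```

   Calling print_sample_opts() reports options X and Y and silently
   skips both PadN options. An application using this API would call
   inet6_opt_next() and inet6_opt_get_val() instead, as print_opt()
   does; the sketch only makes the padding and bounds handling visible.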