idnits 2.17.1 

draft-ietf-ipngwg-bsd-api-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-18) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 6 instances of too long lines in the document, the longest one
     being 4 characters in excess of 72.

  ** The abstract seems to contain references ([1]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 213 has weird spacing: '... u_long  s6_ad...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 13, 1995) is 10629 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: '0' is mentioned on line 634, but not defined

  -- Looks like a reference, but probably isn't: 'N-1' on line 646

  -- Looks like a reference, but probably isn't: 'N' on line 650

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'


     Summary: 11 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                   R. E. Gilligan (Sun)
3	INTERNET-DRAFT                                   S. Thomson (Bellcore)
4	                                                    J. Bound (Digital)

6	                                                        March 13, 1995

8	                IPv6 Program Interfaces for BSD Systems
9	                   <draft-ietf-ipngwg-bsd-api-00.txt>

11	Abstract

13	In order to implement the version 6 Internet Protocol (IPv6) [1] in an
14	operating system based on Berkeley Unix (4.x BSD), changes must be made
15	to the application program interface (API).  TCP/IP applications written
16	for BSD-based operating systems have in the past enjoyed a high degree
17	of portability because most of the systems derived from BSD provide the
18	same API, known informally as "the socket interface".  We would like the
19	same portability with IPv6.  This memo presents a set of extensions to
20	the BSD socket API to support IPv6.  The changes include a new data
21	structure to carry IPv6 addresses, new name to address translation
22	library functions, new address conversion functions, and some new
23	setsockopt() options.  The extensions are designed to provide access to
24	IPv6 features, while introducing a minimum of change into the system and
25	providing complete compatibility for existing IPv4 applications.

27	Status of this Memo

29	This document is an Internet Draft.  Internet Drafts are working
30	documents of the Internet Engineering Task Force (IETF), its Areas, and
31	its Working Groups.  Note that other groups may also distribute working
32	documents as Internet Drafts.

34	Internet Drafts are draft documents valid for a maximum of six months.
35	This Internet Draft expires on September 13, 1995.  Internet Drafts may
36	be updated, replaced, or obsoleted by other documents at any time.  It
37	is not appropriate to use Internet Drafts as reference material or to
38	cite them other than as a "working draft" or "work in progress."

40	To learn the current status of any Internet-Draft, please check the
41	1id-abstracts.txt listing contained in the Internet-Drafts Shadow
42	Directories on ds.internic.net, nic.nordu.net, ftp.isi.edu, or
43	munnari.oz.au.

45	Distribution of this memo is unlimited.

47	1.  Introduction.

49	While IPv4 addresses are 32 bits long, IPv6 nodes are identified by
50	128-bit addresses.  The socket interface API makes the size of an IP
51	address quite visible to an application; virtually all TCP/IP
52	applications for BSD-based systems have knowledge of the size of an IP
53	address.  Those parts of the API that expose the addresses need to be
54	extended to accommodate the larger IPv6 address size.  This paper
55	defines a set of extensions to the socket interface API to support IPv6.
56	This specification is preliminary.  The API extensions are expected to
57	evolve as we gain more implementation experience.

59	2.  Design Considerations

61	There are a number of important considerations in designing changes to
62	this well-worn API:

64	   -    The extended API should provide both source and binary
65	        compatibility for programs written to the original API.  That
66	        is, existing program binaries should continue to operate when
67	        run on a system supporting the new API.  In addition, existing
68	        applications that are re-compiled and run on a system supporting
69	        the new API should continue to operate.  Simply put, the API
70	        changes for IPv6 should not break existing programs.

72	   -    The changes to the API should be as small as possible in order
73	        to simplify the task of converting existing IPv4 applications to
74	        IPv6.

76	   -    Where possible, applications should be able to use the extended
77	        API to interoperate with both IPv6 and IPv4 hosts.  Applications
78	        should not need to know which type of host they are communicating
79	        with.

81	   -    IPv6 addresses carried in data structures should be 64-bit
82	        aligned.  This is necessary in order to obtain optimum
83	        performance on 64-bit machine architectures.

85	Because of the importance of providing IPv4 compatibility in the API,
86	our extensions are explicitly designed to operate on machines that
87	provide complete support for both IPv4 and IPv6.  A subset of this API
88	could probably be designed for operation on systems that support only
89	IPv6.  However, this is not addressed in this document.

91	2.1.  Overview of Changes
92	The socket interface API consists of a few distinct components:

94	   -    Core socket functions.

96	   -    Address data structures.

98	   -    Name-to-address translation functions.

100	   -    Address conversion functions.

102	The core socket functions -- those functions that deal with such things
103	as setting up and tearing down TCP connections, and sending and
104	receiving UDP packets -- were designed to be transport independent.
105	Where protocol addresses are passed as function arguments, they are
106	carried via opaque pointers.  A protocol specific address data structure
107	is defined for each protocol that the socket functions support.
108	Applications must cast these protocol specific address structures into
109	the generic "sockaddr" data type when using the socket functions.  These
110	functions need not change for IPv6, but a new IPv6 specific address data
111	structure is needed.

113	The "sockaddr_in" structure is the protocol specific data structure for
114	IPv4.  This data structure actually includes 8-octets of unused space,
115	and it is tempting to try to use this space to adapt the sockaddr_in
116	structure to IPv6.  Unfortunately, the sockaddr_in structure is not
117	large enough to hold the 16-octet IPv6 address as well as the other
118	information (2-octet address family and 2-octet port number) that is
119	needed.  So a new address data structure must be defined for IPv6.

121	The name-to-address translation functions in the socket interface are
122	gethostbyname() and gethostbyaddr().  Gethostbyname() does not provide
123	enough flexibility to accommodate more than one protocol family.  To
124	solve this problem, we introduced a new name-to-address translation
125	function which is analogous to gethostbyname(), but supports addresses
126	in both the IPv4 and IPv6 address families.  Gethostbyaddr() does not,
127	strictly speaking, need to be replaced since it carries an address
128	family argument and can be extended to support both address families
129	without introducing compatibility problems.  However, we have chosen to
130	introduce a new function to maintain symmetry with the replacement to
131	gethostbyname().  The new functions both carry an address family
132	parameter, so they can be extended to operate with other protocol
133	families in addition to IPv4 and IPv6.

135	The address conversion functions -- inet_ntoa() and inet_addr() --
136	convert IPv4 addresses between binary and printable form.  These
137	functions are quite specific to 32-bit IPv4 addresses.  We have designed
138	two analogous functions which convert both IPv4 and IPv6 addresses, and
139	carry an address type parameter so that they can be extended to other
140	protocol families as well.

142	Finally, a few miscellaneous features are needed to support IPv6.  A new
143	interface is needed in order to support the IPv6 flow label.  New
144	interfaces are needed in order to receive IPv6 multicast packets and
145	control the sending of multicast packets.  And an interface is necessary
146	in order to pass IPv6 source route information between the application
147	and the system.

149	3.  Implementation Experience

151	A few issues exposed in experimenting with prototype implementations
152	of IPv6 helped to guide the design of this API.

154	First, we discovered that, by providing a way to represent the
155	addresses of IPv4 nodes as IPv6 addresses, we could greatly simplify
156	the applications' task of providing IPv4 compatibility.  New
157	applications could interoperate with IPv4 nodes by using the new API
158	and expressing the addresses of IPv4 nodes they interoperate with as
159	IPv6 addresses.  For example, a client application could open a TCP
160	connection to an IPv4 server by giving the IPv6 representation of the
161	server's IPv4 address in the connect() call.  Most applications do not
162	even need to know whether the peer is an IPv4 or IPv6 node.  Such
163	applications can simply treat IPv6 addresses as opaque values; they
164	need not understand the "structure" by which IPv4 addresses are
165	encoded within IPv6 addresses.  Yet the structure can be decoded by
166	those applications that do need to know whether the peer is IPv6 or
167	IPv4.  This should prove to be a significant simplification since most
168	applications will need to interoperate with both IPv4 and IPv6 nodes
169	for some time to come.

171	Second, we learned that existing applications written to the IPv4 API
172	could be made to interoperate with IPv6 nodes to a limited degree.  This
173	technique does not work for all applications, but does for certain
174	applications, such as those that do not "look at" the peer address that
175	is provided by the API (e.g., the source address provided by the
176	recvfrom() function when a UDP packet is received, or the client address
177	returned by the accept() function).

179	Third, we learned that the common application practice of passing open
180	socket descriptors between processes across an exec() call can cause
181	problems.  It is possible, for example, for an application using the
182	extended API to pass an open socket to an older application using the
183	original API.  The old application could be confused if the socket
184	functions return IPv6 address structures to it.  The solution designed
185	was to provide a mechanism by which applications could have explicit
186	control over what form of addresses are returned.

188	4.  Interface Specification

190	4.1.  New Address Family

192	A new address family macro, named AF_INET6, is defined in
193	<sys/socket.h>.  The AF_INET6 definition is used to distinguish between
194	the original sockaddr_in address data structure, and the new
195	sockaddr_in6 data structure.

197	A new protocol family macro, named PF_INET6, is defined in
198	<sys/socket.h>.  Like most of the other protocol family macros, this
199	will usually be defined to have the same value as the corresponding
200	address family macro:

202	        #define PF_INET6        AF_INET6

204	The PF_INET6 macro is used in the first argument to the socket() function to
205	indicate that an IPv6 socket is being created.

207	4.2. IPv6 Address Data Structure

209	A new data structure to hold a single IPv6 address is defined in
210	<netinet/in.h>:

212	        struct in_addr6 {
213	                u_long  s6_addr[4];     /* IPv6 address */
214	        };

216	This data structure contains an array of four 32-bit elements, which
217	make up one 128-bit IPv6 address.

219	The IPv6 address is stored in network byte order.

221	4.3.  Socket Address Structure for 4.3 BSD-Based Systems

223	In the socket interface, a different protocol-specific data structure
224	is defined to carry the addresses for each protocol suite.
225	Each protocol-specific data structure is designed so it can be cast
226	into a protocol-independent data structure -- the "sockaddr"
227	structure.  Each has a "family" field which overlays the "sa_family"
228	of the sockaddr data structure.  This field can be used to identify
229	the type of the data structure.
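
The overlay of the family field can be illustrated with a short C
sketch.  It assumes a present-day POSIX system whose headers already
declare struct sockaddr_in and struct sockaddr_in6.  The helper name
sockaddr_kind() is hypothetical and is not part of this specification.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: identify a protocol-specific address structure
 * by reading the family field through the generic sockaddr view.
 */
const char *sockaddr_kind(const struct sockaddr *sa)
{
    assert(sa != NULL);

    switch (sa->sa_family) {
    case AF_INET:
        return "IPv4";      /* buffer is really a sockaddr_in  */
    case AF_INET6:
        return "IPv6";      /* buffer is really a sockaddr_in6 */
    default:
        return "other";
    }
}
```

A caller would cast its protocol-specific structure, for example
sockaddr_kind((const struct sockaddr *) &sin6), in the same way it
casts addresses passed to the socket functions.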

231	The sockaddr_in structure is the protocol-specific address data
232	structure for IPv4.  It is used to pass addresses between applications
233	and the system in the socket functions.  We have defined the following
234	structure in <netinet/in.h> to carry IPv6 addresses:

236	        struct sockaddr_in6 {
237	                u_short         sin6_family;    /* AF_INET6 */
238	                u_short         sin6_port;      /* Transport layer port # */
239	                u_long          sin6_flowlabel; /* IPv6 flow label */
240	                struct in_addr6 sin6_addr;      /* IPv6 address */
241	        };

243	This structure is designed to be compatible with the sockaddr data
244	structure used in the 4.3 BSD release.

246	The sin6_family field is used to identify this as a sockaddr_in6
247	structure.  This field is designed to overlay the sa_family field when
248	the buffer is cast to a sockaddr data structure.  The value of this
249	field must be AF_INET6.

251	The sin6_port field is used to store the 16-bit UDP or TCP port
252	number.  This field is used in the same way as the sin_port field of
253	the sockaddr_in structure.  The port number is stored in network byte
254	order.

256	The sin6_flowlabel field is a 32-bit field that is used to store the
257	28-bit IPv6 flow label.  The IPv6 flow label is represented as the
258	low-order 28-bits of a 32-bit value, which is stored in network byte
259	order in the sin6_flowlabel field.  The use of this field is explained
260	in section 4.9.
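
The encoding of the flow label described above might be sketched as
follows.  This is an illustration only: the helper names and the
DRAFT_FLOWLABEL_MASK macro are hypothetical, and the 28-bit width is
the one given in this memo.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Low-order 28 bits of the 32-bit field, per this memo (assumed name). */
#define DRAFT_FLOWLABEL_MASK 0x0FFFFFFFu

/* Host-order flow label -> value to store in sin6_flowlabel. */
uint32_t flowlabel_to_field(uint32_t label)
{
    return htonl(label & DRAFT_FLOWLABEL_MASK);
}

/* Value read from sin6_flowlabel -> host-order flow label. */
uint32_t flowlabel_from_field(uint32_t field)
{
    return ntohl(field) & DRAFT_FLOWLABEL_MASK;
}
```

Masking on both paths means that stray bits in the upper four bits of
the field never leak into the label seen by the application.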

262	The sin6_addr field is a single in_addr6 structure (defined in the
263	previous section).  This field holds one 128-bit IPv6 address.  The
264	address is stored in network byte order.

266	The ordering of elements in this structure is specifically designed so
267	that the sin6_addr field will be aligned on a 64-bit boundary.  This
268	is done for optimum performance on 64-bit architectures.
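
The alignment property can be checked with offsetof().  The sketch
below mirrors the 4.3 BSD layout using fixed-width types; the names
draft_in_addr6, draft_sockaddr_in6 and s6_words are hypothetical
stand-ins chosen so that they do not clash with the definitions in a
real <netinet/in.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for in_addr6: four 32-bit words (assumed field name). */
struct draft_in_addr6 {
    uint32_t s6_words[4];
};

/* Stand-in for the 4.3 BSD sockaddr_in6 layout given above. */
struct draft_sockaddr_in6 {
    uint16_t              sin6_family;     /* 2 octets  */
    uint16_t              sin6_port;       /* 2 octets  */
    uint32_t              sin6_flowlabel;  /* 4 octets  */
    struct draft_in_addr6 sin6_addr;       /* 16 octets */
};

/* Offset of the address from the start of the structure. */
size_t sin6_addr_offset(void)
{
    return offsetof(struct draft_sockaddr_in6, sin6_addr);
}

/* Non-zero when that offset is a multiple of 8 octets. */
int sin6_addr_is_64bit_aligned(void)
{
    return sin6_addr_offset() % 8 == 0;
}
```

The family, port and flow label fields occupy exactly 8 octets, so
the address starts at offset 8.  Whether the address is 64-bit aligned
in memory also depends on where the system places the structure
itself.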

270	The data types of the structure elements given here and in the
271	previous section are intended as examples only.  System
272	implementations may use other types if they are appropriate for the
273	system they are used on.

275	4.4. Socket Address Structure for 4.4 BSD-Based Systems

277	The 4.4 BSD release includes a small, but incompatible change to the
278	socket interface.  The "sa_family" field of the sockaddr data
279	structure was changed from a 16-bit value to an 8-bit value, and the
280	space saved was used to hold a length field, named "sa_len".  The
281	sockaddr_in6 data structure given in the previous section can not be
282	correctly cast into the newer sockaddr data structure.  For this
283	reason, we have defined the following alternative IPv6 address data
284	structure to be used on systems based on 4.4 BSD:

286	        #define SIN6_LEN

288	        struct sockaddr_in6 {
289	                u_char          sin6_len;       /* length of this struct */
290	                u_char          sin6_family;    /* AF_INET6 */
291	                u_short         sin6_port;      /* Transport layer port # */
292	                u_long          sin6_flowlabel; /* IPv6 flow label */
293	                struct in_addr6 sin6_addr;      /* IPv6 address */
294	        };

296	This structure is defined in the <netinet/in.h> header file.  The only
297	differences between this data structure and the 4.3 BSD variant are
298	the inclusion of the length field, and the change of the family field
299	to an 8-bit data type.  The definitions of all the other fields are
300	identical to the 4.3 BSD variant defined in the previous section.

302	Systems that provide this version of the sockaddr_in6 data structure
303	must include the SIN6_LEN macro definition in <netinet/in.h>.  This
304	macro allows applications to determine whether they are being built on
305	a system that supports the 4.3 BSD or 4.4 BSD variants of the data
306	structure.  Applications can be written to run on both systems by
307	simply making their assignments and use of the sin6_len field
308	conditional on the SIN6_LEN macro.  For example, to fill in an IPv6
309	address structure in an application, one might write:

311	        struct sockaddr_in6 sin6;

313	        bzero((char *) &sin6, sizeof(struct sockaddr_in6));
314	        #ifdef SIN6_LEN
315	        sin6.sin6_len = sizeof(struct sockaddr_in6);
316	        #endif
317	        sin6.sin6_family = AF_INET6;
318	        sin6.sin6_port = htons(23);     /* network byte order */

320	4.5.  The Socket Functions

322	Applications use the socket() function to create a socket descriptor
323	that represents a communication endpoint.  The arguments to the socket()
324	function tell the system which protocol to use, and what format address
325	structure will be used in subsequent functions.  For example, to create
326	an IPv4/TCP socket, applications make the call:

328	        s = socket (PF_INET, SOCK_STREAM, 0);

330	To create an IPv4/UDP socket, applications make the call:

332	        s = socket (PF_INET, SOCK_DGRAM, 0);

334	Applications may create IPv6/TCP and IPv6/UDP sockets by simply using
335	the constant PF_INET6 instead of PF_INET in the first argument.  For
336	example, to create an IPv6/TCP socket, applications make the call:

338	        s = socket (PF_INET6, SOCK_STREAM, 0);

340	To create an IPv6/UDP socket, applications make the call:

342	        s = socket (PF_INET6, SOCK_DGRAM, 0);

344	Once the application has created a PF_INET6 socket, it must use the
345	sockaddr_in6 address structure when passing addresses in to the system.
346	The functions which the application uses to pass addresses into the
347	system are:

349	           bind()
350	           connect()
351	           sendto()

353	The system will use the sockaddr_in6 address structure to return
354	addresses to applications that are using PF_INET6 sockets.  The
355	functions that return an address from the system to an application
356	are:

358	           accept()
359	           recvfrom()
360	           getpeername()
361	           getsockname()

363	No changes to the syntax of the socket functions are needed to support
364	IPv6, since all of the "address carrying" functions use an opaque
365	address pointer, and carry an address length as a function argument.
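
As an illustration of passing a sockaddr_in6 through the opaque
pointer and length arguments, the following sketch binds a PF_INET6
socket to the wildcard address.  It assumes a present-day POSIX system
that provides struct sockaddr_in6 and in6addr_any, which postdate this
memo.  The function name bind_ipv6_wildcard() is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Bind socket s to the IPv6 wildcard address on the given port
 * (host byte order).  Returns the result of bind(): 0 or -1.
 */
int bind_ipv6_wildcard(int s, unsigned short port)
{
    struct sockaddr_in6 sin6;

    memset(&sin6, 0, sizeof(sin6));
#ifdef SIN6_LEN
    sin6.sin6_len = sizeof(sin6);       /* 4.4 BSD variant only */
#endif
    sin6.sin6_family = AF_INET6;
    sin6.sin6_port = htons(port);       /* network byte order */
    sin6.sin6_addr = in6addr_any;

    /* Same bind() syntax as IPv4: generic pointer plus length. */
    return bind(s, (struct sockaddr *) &sin6, sizeof(sin6));
}
```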

367	4.6.  Compatibility with IPv4 Applications

369	In order to support the large base of applications using the original
370	API, system implementations must provide complete source and binary
371	compatibility with the original API.  This means that systems must
372	continue to support PF_INET sockets and the sockaddr_in address
373	structure.  Applications must be able to create IPv4/TCP and IPv4/UDP
374	sockets using the PF_INET constant in the socket() function, as
375	described in the previous section.  Applications should be able to hold
376	a combination of IPv4/TCP, IPv4/UDP, IPv6/TCP and IPv6/UDP sockets
377	simultaneously within the same process.

379	Applications using the original API should continue to operate as they
380	did on systems supporting only IPv4.  That is, they should continue to
381	interoperate with IPv4 nodes.  It is not clear, though, how, or even if,
382	those IPv4 applications should interoperate with IPv6 nodes.  The open
383	issues section (section 7) discusses some of the alternatives.

385	4.7.  Compatibility with IPv4 Nodes

387	The API also provides a different type of compatibility: the ability for
388	applications using the extended API to interoperate with IPv4 nodes.
389	This feature uses the IPv4-mapped IPv6 address format defined in the
390	IPv6 addressing architecture specification [3].  This address format
391	allows the IPv4 address of an IPv4 node to be represented as an IPv6
392	address.  The IPv4 address is encoded into the low-order 32-bits of the
393	IPv6 address, and the high-order 96-bits hold the fixed prefix
394	0:0:0:0:0:FFFF.  IPv4-mapped addresses are written as follows:

396	        ::FFFF:<IPv4-address>

398	Applications may use PF_INET6 sockets to open TCP connections to IPv4
399	nodes, or send UDP packets to IPv4 nodes, by simply encoding the
400	destination's IPv4 address as an IPv4-mapped IPv6 address, and passing
401	that address, within a sockaddr_in6 structure, in the connect() or
402	sendto() call.  When applications use PF_INET6 sockets to accept TCP
403	connections from IPv4 nodes, or receive UDP packets from IPv4 nodes, the
404	system returns the peer's address to the application in the accept(),
405	recvfrom(), or getpeername() call using a sockaddr_in6 structure encoded
406	this way.

408	We expect that few applications will need to know which type of node
409	they are interoperating with.  However, for those applications that do
410	need to know, the following function is provided:

412	        int is_ipv4_addr (const struct in_addr6 *ap);

414	The "ap" argument to this function points to a buffer holding an IPv6
415	address in network byte order.  The function returns true (non-zero)
416	if that address is an IPv4-mapped address, and returns 0 otherwise.
417	When an application using the extended API accepts a TCP connection,
418	or receives a UDP packet, it may determine whether the peer is an IPv4
419	node by applying the is_ipv4_addr() function to the address returned
420	by accept() or recvfrom().
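
One possible way to build and recognize an IPv4-mapped address is
sketched below.  It is not a definitive implementation of
is_ipv4_addr().  The structure draft_in_addr6, its s6_words field and
both function names are hypothetical stand-ins for the definitions in
this memo.  Present-day POSIX systems offer a comparable test in the
IN6_IS_ADDR_V4MAPPED() macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Stand-in for in_addr6: four 32-bit words in network byte order. */
struct draft_in_addr6 {
    uint32_t s6_words[4];
};

/* Build ::FFFF:<IPv4-address>; ipv4_net is in network byte order. */
void make_ipv4_mapped(struct draft_in_addr6 *out, uint32_t ipv4_net)
{
    assert(out != NULL);
    out->s6_words[0] = 0;
    out->s6_words[1] = 0;
    out->s6_words[2] = htonl(0x0000FFFFu);  /* end of 96-bit prefix */
    out->s6_words[3] = ipv4_net;            /* low-order 32 bits */
}

/* Return non-zero if *ap carries the 0:0:0:0:0:FFFF prefix. */
int is_ipv4_addr_sketch(const struct draft_in_addr6 *ap)
{
    assert(ap != NULL);
    return ap->s6_words[0] == 0 &&
           ap->s6_words[1] == 0 &&
           ap->s6_words[2] == htonl(0x0000FFFFu);
}
```

Only the 96-bit prefix is compared; the low-order 32 bits are the
IPv4 address itself and may hold any value.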

422	4.8.  Sockets Passed Across exec()
423	Unix allows open sockets to be passed across an exec() call.  It is a
424	relatively common application practice to pass open sockets across
425	exec() calls.  Because of this, it is possible for an application
426	using the original API to pass an open PF_INET socket to an
427	application that is expecting to receive a PF_INET6 socket.
428	Similarly, it is possible for an application using the extended API to
429	pass an open PF_INET6 socket to an application using the original API,
430	which would be equipped only to deal with PF_INET sockets.  Either of
431	these cases could cause problems, because the application which is
432	passed the open socket might not know how to decode the address
433	structures returned in subsequent socket functions.

435	To remedy this problem, we have defined a new setsockopt() option that
436	allows an application to "transform" a PF_INET6 socket into a PF_INET
437	socket and vice-versa.

439	An IPv6 application that is passed an open socket from an unknown
440	process may use the IP_ADDRFORM setsockopt() option to "convert" the
441	socket to PF_INET6.  Once that has been done, the system will return
442	sockaddr_in6 address structures in subsequent socket functions.
443	Similarly, an IPv6 application that is about to pass an open PF_INET6
444	socket to a program that may not be IPv6 capable may "downgrade" the
445	socket to PF_INET before calling exec().  After that, the system will
446	return sockaddr_in address structures to the application that was
447	exec()'ed.

449	The macro definition for IP_ADDRFORM is in <netinet/in.h>.

451	The IP_ADDRFORM option is at the IPPROTO_IP level.  The only valid
452	option values are PF_INET6 and PF_INET.  For example, to convert a
453	PF_INET6 socket to PF_INET, a program would call:

455	        int addrform = PF_INET;

457	        if (setsockopt(s, IPPROTO_IP, IP_ADDRFORM, (char *) &addrform,
458	                sizeof(addrform)) == -1)
459	                perror("setsockopt IP_ADDRFORM");

461	An application may use IP_ADDRFORM in the getsockopt() function to learn
462	whether an open socket is a PF_INET or PF_INET6 socket.  For example:

464	        int addrform;
465	        int len = sizeof(int);

467	        if (getsockopt(s, IPPROTO_IP, IP_ADDRFORM, (char *) &addrform,
468	                &len) == -1)
469	                perror("getsockopt IP_ADDRFORM");
470	        if (addrform == PF_INET)
471	                printf("This is an IPv4 socket.\n");
472	        else if (addrform == PF_INET6)
473	                printf("This is an IPv6 socket.\n");
474	        else
475	                printf("This system is broken.\n");
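
On a system that does not provide IP_ADDRFORM, an application can
learn the address family of an inherited socket in a different way, by
asking for the local address.  The sketch below uses that substitute
technique, not the option defined in this memo.  It assumes a
present-day POSIX system with struct sockaddr_storage, and the function
name socket_family() is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>

/*
 * Return the address family (e.g. AF_INET or AF_INET6) of the local
 * address of socket s, or -1 if getsockname() fails.
 */
int socket_family(int s)
{
    struct sockaddr_storage ss;     /* large enough for any family */
    socklen_t len = sizeof(ss);

    memset(&ss, 0, sizeof(ss));
    if (getsockname(s, (struct sockaddr *) &ss, &len) == -1)
        return -1;
    return ss.ss_family;
}
```

Unlike IP_ADDRFORM, this only reports the form of address in use; it
cannot convert the socket from one form to the other.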

477	4.9.  Flow Label

479	The IPv6 header has a 28-bit field to hold a "flow label".  Applications
480	have control over what flow label value is used in packets that they
481	originate, and have access to the flow label value of packets that they
482	receive.

484	The sin6_flowlabel field of the sockaddr_in6 structure is used to
485	carry the flow label between the application and the system.  An
486	application may specify a flow label to use in the transmitted packets
487	of an actively opened TCP connection by setting the sin6_flowlabel
488	field of the destination address sockaddr_in6 structure passed in the
489	connect() function.  An application may specify the flow label to use
490	in transmitted UDP packets by setting the sin6_flowlabel field of the
491	destination address sockaddr_in6 structure passed in the sendto()
492	function.  If an application does not care what flow label is used, it
493	should set the flowlabel value to zero.

495	An application may specify the flow label to use in transmitted packets
496	of a passively accepted TCP connection, by setting the sin6_flowlabel
497	field of the address passed in the bind() function.

499	The flow label that appeared in received UDP packets is passed up to
500	the application in the sin6_flowlabel field of the source address
501	sockaddr_in6 structure that is returned in the recvfrom() call.  The
502	flow label that appeared in the received SYN segment of a passively
503	accepted TCP connection is returned to the application in the
504	sin6_flowlabel field of the source address sockaddr_in6 structure that
505	is filled in by the accept() call.

507	4.10.  Handling IPv6 Source Routes

509	IPv6 makes more use of the source routing mechanism than IPv4.  In order
510	for source routing to operate properly, the node receiving a request
511	packet that bears a source route must reverse that source route when
512	sending the reply.  In the case of TCP, the reversal can be done in the
513	transport protocol implementation transparently to the application.  But
514	in the case of UDP, the application must perform the reversal itself.
515	The transport protocol code can not perform the reversal for UDP packets
516	because a UDP application may receive a number of requests and generate
517	replies asynchronously.  A "reply" sent by an application may not match
518	the "request" most recently passed up to the application.

520	The API for source routing has two components: providing a source route
521	to be used with originated traffic -- actively opened TCP connections
522	and UDP packets being sent -- and retrieving the source route of
523	received traffic -- passively accepted TCP connections and received UDP
524	packets.  An application may always provide a source route with TCP
525	connections being originated and UDP packets being sent.  But to receive
526	source routes, the application must enable an option.

528	To provide a source route, an application simply provides an array of
529	sockaddr_in6 data structures in the address argument of the sendto()
530	function (when sending a UDP packet), or the connect() function (when
531	actively opening a TCP connection).  The length argument of the function
532	is the total length, in octets, of the array.  The elements of the array
533	represent the full source route, including both source and destination
534	identifying addresses.  The elements of the array are ordered from
535	destination to source.  That is, the first element of the array
536	represents the destination identifying address, and the last element of
537	the array represents the source identifying address.  If the application
538	provides a source route, the source identifying address can not be
539	omitted.  The sin6_addr field of the source identifying address may be
540	set to zero, however, in which case the system will select an
541	appropriate source address.  The sin6_port field of the destination
542	identifying address must be assigned.  The sin6_port field of the source
543	identifying address may be set to zero, in which case the system will
544	select an appropriate source port number.  The sin6_port and
545	sin6_flowlabel fields of the intermediate addresses must be set to zero.

547	The arrangement of the address structures in the address buffer passed
548	to connect() or sendto() is shown in the figure below:

550	        +--------------------+
551	        |                    |
552	        |  sockaddr_in6[0]   |  Destination Identifying Address
553	        |                    |
554	        +--------------------+
555	        |                    |
556	        |  sockaddr_in6[1]   |  Last Source-Route Hop Address
557	        |                    |
558	        +--------------------+
559	        .                    .
560	        .                    .
561	        .                    .
562	        +--------------------+
563	        |                    |
564	        | sockaddr_in6[N-1]  |  First Source-Route Hop Address
565	        |                    |
566	        +--------------------+
567	        |                    |
568	        |  sockaddr_in6[N]   |  Source Identifying Address
569	        |                    |
570	        +--------------------+

572	               Address buffer when sending a source route
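
As a rough illustration of the ordering rules above, the following
sketch fills an address buffer for a route with intermediate hops.  It
is not part of this specification: build_send_route() and its
parameters are hypothetical names, and the code assumes the
sockaddr_in6 layout of present-day systems, which has no
sin6_flowlabel member (zeroing the whole structure stands in for
clearing that field).  Passing such an array to sendto() or connect()
is meaningful only on a system that implements the interface described
here.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: fill buf[] in the order required for sending,
 * destination first and source last.  hops[] lists the intermediate
 * addresses in travel order (first hop first).  Returns the total
 * length of the array in octets, or 0 if buf is too small.
 */
size_t build_send_route(struct sockaddr_in6 *buf, size_t buf_elems,
                        const struct in6_addr *dst, in_port_t dst_port,
                        const struct in6_addr *hops, size_t nhops)
{
        size_t n = nhops + 2;   /* destination + hops + source */
        size_t i;

        if (buf_elems < n)
                return 0;

        memset(buf, 0, n * sizeof(buf[0]));
        for (i = 0; i < n; i++)
                buf[i].sin6_family = AF_INET6;

        /* Element 0: destination identifying address; port is required. */
        buf[0].sin6_addr = *dst;
        buf[0].sin6_port = dst_port;    /* network byte order */

        /* Elements 1..nhops: last hop first; ports stay zero. */
        for (i = 0; i < nhops; i++)
                buf[1 + i].sin6_addr = hops[nhops - 1 - i];

        /* Element n-1: source identifying address.  It is left with a
         * zero address and port so that the system selects both. */
        return n * sizeof(buf[0]);
}
```

The return value is what the application would pass as the length
argument, since that argument is the total length of the array in
octets.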

574	The IP_RCVSRCRT setsockopt() option controls the reception of source
575	routes.  The option is disabled by default.  Applications must
576	explicitly enable the option using the setsockopt() function in order to
577	receive source routes.

579	The macro definition for IP_RCVSRCRT is in <netinet/in.h>.

581	The IP_RCVSRCRT option is at the IPPROTO_IP level.  An example of how an
582	application might use this option is:

584	        int on = 1;             /* value == 1 means enable the option */

586	        if (setsockopt(s, IPPROTO_IP, IP_RCVSRCRT, (char *) &on,
587	                sizeof(on)) == -1)
588	                perror("setsockopt IP_RCVSRCRT");

590	When the IP_RCVSRCRT option is disabled, only a single sockaddr_in6
591	address structure is returned to applications in the address argument
592	of the recvfrom() and accept() functions.  This address represents the
593	source identifying address of the UDP packet received or the TCP
594	connection accepted.

596	When the IP_RCVSRCRT option is enabled, the address argument of the
597	recvfrom() function (when receiving UDP packets) and the accept()
598	function (when passively accepting TCP connections) points to an array
599	of sockaddr_in6 structures.  When the function returns, the array will
600	hold two elements -- source and destination address -- when the received
601	UDP packet or TCP SYN packet does not carry a source route.  The array
602	will hold more than two elements when the received packet carries a
603	source route.

605	The addresses in the array are ordered from source to destination.  That
606	is, the first element of the array holds the source identifying address of
607	the received packet.  Following this in the array are the intermediary
608	hops.  And the last element of the array holds the destination
609	identifying address.  Note that this is the opposite of the order
610	specified for sending.  This ordering was chosen so that the address
611	array received in a recvfrom() call can be used in a subsequent sendto()
612	call without requiring the application to re-order the addresses in the
613	array.  Similarly, the address array received in an accept() call can be
614	used unchanged in a subsequent connect() call.

616	The address length argument of the recvfrom() and accept() functions
617	indicates the length, in octets, of the full address array.  This
618	argument is a value-result parameter.  The application sets the maximum
619	size of the address buffer when it makes the call, and the system
620	modifies the value to return the actual size of the buffer to the
621	application.

623	The sin6_port field of the first and last array elements (source and
624	destination identifying addresses) will hold the source and destination
625	UDP or TCP port numbers of the received packet.  The sin6_port field of
626	the intermediate elements of the array will be zero.

628	The address buffer returned to the application in the recvfrom() or
629	accept() functions when the IP_RCVSRCRT option is enabled is shown
630	below:

632	        +--------------------+
633	        |                    |
634	        |  sockaddr_in6[0]   |  Source Identifying Address
635	        |                    |
636	        +--------------------+
637	        |                    |
638	        |  sockaddr_in6[1]   |  First Source-Route Hop Address
639	        |                    |
640	        +--------------------+
641	        .                    .
642	        .                    .
643	        .                    .
644	        +--------------------+
645	        |                    |
646	        | sockaddr_in6[N-1]  |  Last Source-Route Hop Address
647	        |                    |
648	        +--------------------+
649	        |                    |
650	        |  sockaddr_in6[N]   |  Destination Identifying Address
651	        |                    |
652	        +--------------------+

654	              Address buffer when receiving a source route

656	Since IPv6 allows the number of elements in a source route to be very
657	large, it is impractical for all applications that have enabled the
658	reception of source routes to provide buffer space to hold the maximum
659	number of elements.  Some applications may choose a buffer size that is
660	appropriate for their own use.  This means that it is possible that a
661	received source route may be too large to fit into the buffer provided
662	by the application.  In this circumstance, the system should return only
663	a single address element -- the source identifying address -- to the
664	application.  This case is clearly distinguishable to the application
665	because in all other cases, the system returns at least two address
666	elements -- the source and destination identifying addresses.
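
The rule above lets an application classify the result purely from the
returned address length.  The following is a minimal sketch of that
check; received_route_hops() and its return codes are hypothetical, and
the element size assumes the sockaddr_in6 structure of the local
system.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: interpret the address length returned by
 * recvfrom() or accept() when IP_RCVSRCRT is enabled.
 * Returns the number of intermediate hops (0 when the packet carried
 * no source route), -1 if only the source identifying address was
 * returned because the route did not fit in the buffer, or -2 if the
 * length is not a whole, non-zero number of sockaddr_in6 structures.
 */
int received_route_hops(size_t addrlen)
{
        size_t n;

        if (addrlen == 0 || addrlen % sizeof(struct sockaddr_in6) != 0)
                return -2;
        n = addrlen / sizeof(struct sockaddr_in6);
        if (n == 1)
                return -1;      /* route too large: source only */
        return (int)(n - 2);    /* exclude source and destination */
}
```

An application that receives -1 still has a usable peer address in the
first array element, but it cannot reply along the original route.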

668	4.11.  Unicast Hop Limit

670	A new setsockopt() option is used to control the hop limit used in
671	outgoing unicast IPv6 packets.  The name of this option is
672	IP_UNICAST_HOPS, and it is used at the IPPROTO_IP layer.  The macro
673	definition for IP_UNICAST_HOPS resides in the <netinet/in.h> header
674	file.  The following example illustrates how it is used:

676	        int hoplimit = 10;

678	        if (setsockopt(s, IPPROTO_IP, IP_UNICAST_HOPS, (char *) &hoplimit,
679	                sizeof(hoplimit)) == -1)
680	                perror("setsockopt IP_UNICAST_HOPS");

682	When the IP_UNICAST_HOPS option is set with setsockopt(), the option
683	value given is used as the hop limit for all subsequent unicast packets
684	sent via that socket.  If the option is not set, the system selects a
685	default value.

687	The IP_UNICAST_HOPS option may be used in the getsockopt() function to
688	determine the hop limit value that the system will use for subsequent
689	unicast packets sent via that socket.  For example:

691	        int hoplimit;
692	        int len = sizeof(hoplimit);

694	        if (getsockopt(s, IPPROTO_IP, IP_UNICAST_HOPS, (char *) &hoplimit,
695	                &len) == -1)
696	                perror("getsockopt IP_UNICAST_HOPS");
697	        else
698	                printf("Using %d for hop limit.\n", hoplimit);
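
The option name IP_UNICAST_HOPS at the IPPROTO_IP level is specific to
this draft.  As a sketch only, and assuming a system that implements
the later standardized IPv6 socket interface (where the corresponding
option is IPV6_UNICAST_HOPS at the IPPROTO_IPV6 level), the set and get
calls can be combined as follows.  roundtrip_hop_limit() is a
hypothetical helper name.

```c
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Set the unicast hop limit on a fresh UDP socket and read it back.
 * Returns the value reported by getsockopt(), or -1 on any failure
 * (for example, if the host has no IPv6 support).
 */
int roundtrip_hop_limit(int hoplimit)
{
        int got = -1;
        socklen_t len = sizeof(got);
        int s = socket(AF_INET6, SOCK_DGRAM, 0);

        if (s == -1)
                return -1;
        if (setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                       &hoplimit, sizeof(hoplimit)) == -1 ||
            getsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                       &got, &len) == -1)
                got = -1;
        close(s);
        return got;
}
```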

700	4.12.  Sending and Receiving Multicast Packets

702	IPv6 applications may send UDP multicast packets by simply specifying an
703	IPv6 multicast address in the address argument of the sendto() function.

705	A few setsockopt() options at the IPPROTO_IP layer are used to control
706	some of the parameters of sending multicast packets.  Use of these
707	options is not required: applications may send multicast packets
708	without them.  The setsockopt() options for controlling the sending of
709	multicast packets are summarized below:

711	        IP_MULTICAST_IF         Set the interface to use for outgoing
712	                                multicast packets.

714	        IP_MULTICAST_HOPS       Set the hop limit to use for outgoing
715	                                multicast packets.  (Note a separate
716	                                option - IP_UNICAST_HOPS - is provided
717	                                to set the hop limit to use for outgoing
718	                                unicast packets.)

720	        IP_MULTICAST_LOOP       Control whether outgoing multicast
721	                                packets are delivered back to the
722	                                local application.  A toggle.

724	The reception of multicast packets is controlled by the two setsockopt()
725	options summarized below:

727	        IP_ADD_MEMBERSHIP       Join a multicast group.  Requests
728	                                that multicast packets sent to a
729	                                particular multicast address
730	                                be delivered to this socket.

732	        IP_DROP_MEMBERSHIP      Leave a multicast group.  Requests that
733	                                multicast packets sent to a particular
734	                                multicast address no longer be delivered
735	                                to this socket.

737	4.13.  Name-to-Address Translation Functions

739	We have defined two new functions analogous to gethostbyname() and
740	gethostbyaddr() which support addresses in both the IPv4 and IPv6
741	address families.  The names of the new functions are hostname2addr()
742	and addr2hostname().  These functions were designed to have semantics
743	similar to gethostbyname() and gethostbyaddr(), so that existing IPv4
744	applications can be easily ported to IPv6.

746	Hostname2addr() is defined similarly to gethostbyname(), but enables
747	applications to specify the type of address to be looked up:

749	          struct hostent *hostname2addr(const char *name, int af);

751	This new function looks up the given name in the name service and
752	returns the completed hostent structure if the lookup succeeds, and NULL
753	otherwise.  The name argument is the domain name of the host to look up.
754	The af argument specifies the type of the address -- IPv4 (AF_INET) or
755	IPv6 (AF_INET6) -- to return to the caller in the h_addr_list field of
756	the hostent structure.

758	If the af argument is AF_INET, hostname2addr() queries the name service
759	for IPv4 addresses and, if any are found, returns a hostent structure
760	that includes an array of IPv4 addresses.  Each IPv4 address is encoded
761	in network byte order.

763	If the af argument is AF_INET6, the processing is as follows: the
764	hostname2addr() function first queries the name service for IPv6
765	addresses. If IPv6 addresses are found, they are returned in an array in
766	the hostent structure.  If no IPv6 addresses are found, the function
767	queries the name service for IPv4 addresses. If IPv4 addresses are
768	found, they are returned as IPv4-mapped IPv6 addresses.  As in IPv4,
769	each IPv6 address returned in the hostent structure is encoded in
770	network byte order.
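
To make the fallback case concrete, the following sketch shows how an
IPv4 address can be converted into the IPv4-mapped IPv6 form
(::ffff:a.b.c.d).  It is an illustration rather than part of the
interface: make_v4mapped() is a hypothetical name, and the code uses
the in6_addr structure of present-day systems in place of this
document's in_addr6.

```c
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: build the IPv4-mapped IPv6 address
 * ::ffff:a.b.c.d from an IPv4 address.  Both addresses are in
 * network byte order, so the four octets are copied unchanged.
 */
void make_v4mapped(const struct in_addr *v4, struct in6_addr *v6)
{
        memset(v6, 0, sizeof(*v6));             /* octets 0..9 are zero */
        v6->s6_addr[10] = 0xff;
        v6->s6_addr[11] = 0xff;
        memcpy(&v6->s6_addr[12], v4, sizeof(*v4));
}
```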

772	The second new function, called addr2hostname(), is defined in exactly
773	the same way as the gethostbyaddr() function, except that it now
774	supports both the IPv4 and IPv6 address families:

776	        struct hostent *addr2hostname(const void *addr, int len, int af);

778	addr2hostname() performs an address-to-name lookup on the address
779	specified, returning a completed hostent structure if the lookup
780	succeeds, or NULL if the lookup fails.  This function supports both the
781	AF_INET and AF_INET6 address families.  If the af argument is AF_INET,
782	then len must be specified as 4 octets and addr must refer to an IPv4
783	address.  If af is AF_INET6, then len must be specified as 16 octets and
784	addr must refer to an IPv6 address.  If the addr argument is an
785	IPv4-mapped IPv6 address, an IPv4 address-to-name lookup is performed on
786	the embedded IPv4 address.
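
The argument checks and the IPv4-mapped special case described above
might be sketched as follows.  reverse_lookup_family() is a
hypothetical helper, not a function defined by this document; it only
decides which kind of lookup applies and extracts the IPv4 address
where one is needed.

```c
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: decide which reverse lookup applies.
 * Returns AF_INET and copies the 4-octet address into v4_out when an
 * IPv4 lookup applies (af is AF_INET, or addr is an IPv4-mapped IPv6
 * address), returns AF_INET6 for any other IPv6 address, and returns
 * -1 if af and len do not agree.
 */
int reverse_lookup_family(const void *addr, int len, int af,
                          unsigned char v4_out[4])
{
        const unsigned char *p = addr;
        static const unsigned char prefix[12] =
                { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

        if (af == AF_INET && len == 4) {
                memcpy(v4_out, p, 4);
                return AF_INET;
        }
        if (af == AF_INET6 && len == 16) {
                if (memcmp(p, prefix, sizeof(prefix)) != 0)
                        return AF_INET6;
                memcpy(v4_out, p + 12, 4);      /* embedded IPv4 address */
                return AF_INET;
        }
        return -1;
}
```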

788	A new name-to-address translation library function is now under
789	development at Berkeley [2].  This new function, named getconninfo(),
790	will subsume the functionality of gethostbyname() and hostname2addr(), as
791	well as the getservbyname() and getservbyport() functions.  The new
792	function is specifically designed to be "transport independent", so it
793	should be directly usable by IPv6 applications.

795	System implementations should provide the addr2hostname() and
796	hostname2addr() functions in order to simplify the porting of existing
797	IPv4 applications to IPv6.  System implementations may also provide the
798	getconninfo() function, once it is defined, so that newly written
799	applications can be transport independent.

801	The getconninfo() function is expected to be published as a separate
802	specification document, not included in this spec.

804	Implementations must retain the BSD gethostbyname() and gethostbyaddr()
805	functions in order to provide source and binary compatibility for
806	existing applications.

808	4.14.  Address Conversion Functions

810	BSD Unix provides two functions, inet_addr() and inet_ntoa(), to convert
811	an IPv4 address between binary and printable form.  IPv6 applications
812	need similar functions.  We have defined the following two functions to
813	convert both IPv6 and IPv4 addresses:

815	        int ascii2addr(int af, const char *cp, void *ap);

817	and

819	        char *addr2ascii(int af, const void *ap, int len, char *cp);

821	The first function converts an ascii string to an address in the address
822	family specified by the af argument.  Currently AF_INET and AF_INET6
823	address families are supported.  The cp argument points to the ascii
824	string being passed in.  The ap argument points to a buffer into which
825	the function stores the address.  Ascii2addr() returns the length of the
826	address in octets if the conversion succeeds, and -1 otherwise. The
827	function does not modify the storage pointed to by ap if the conversion
828	fails. The application must ensure that the buffer referred to by ap is
829	large enough to hold the converted address.

831	If the af argument is AF_INET, the function accepts a string in the
832	standard IPv4 dotted decimal form:

834	        ddd.ddd.ddd.ddd

836	where ddd is a one to three digit decimal number between 0 and 255.

838	If the af argument is AF_INET6, then the function accepts a string in
839	one of the standard IPv6 printing forms defined in the addressing
840	architecture specification [3].

842	The second function converts an address into a printable string.  The af
843	argument specifies the form of the address.  This can be AF_INET or
844	AF_INET6.  The ap argument points to a buffer holding an IPv4 address if
845	the af argument is AF_INET, and an IPv6 address if the af argument is
846	AF_INET6.  The len field specifies the length in octets of the address
847	pointed to by ap, and must be 4 if af is AF_INET, or 16 if af is
848	AF_INET6.  The cp argument points to a buffer that the function can use
849	to store the ascii string.  If the cp argument is NULL, the function
850	uses its own private static buffer.  If the application specifies a cp
851	argument, it must be large enough to hold the ascii conversion of the
852	address specified as an argument, including the terminating null octet.
853	For IPv6 addresses, the buffer must be at least 46 octets.  For IPv4
854	addresses, the buffer must be at least 16 octets.

856	The addr2ascii() function returns a pointer to the buffer containing the
857	ascii string if the conversion succeeds, and NULL otherwise.  The
858	function does not modify the storage pointed to by cp if the conversion
859	fails.
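
One way to picture the semantics of these two functions is as thin
wrappers around the POSIX inet_pton() and inet_ntop() functions.  Those
functions postdate this document and are an assumption of the sketch
below, as are the sketch_ name prefixes.  Scratch buffers are used so
that the caller's storage is left untouched when a conversion fails.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Sketch of ascii2addr(): returns the address length in octets, or -1. */
int sketch_ascii2addr(int af, const char *cp, void *ap)
{
        unsigned char tmp[sizeof(struct in6_addr)];
        int len;

        if (af == AF_INET)
                len = 4;
        else if (af == AF_INET6)
                len = 16;
        else
                return -1;
        /* Convert into a scratch buffer so ap is untouched on failure. */
        if (inet_pton(af, cp, tmp) != 1)
                return -1;
        memcpy(ap, tmp, (size_t)len);
        return len;
}

/* Sketch of addr2ascii(): returns the string buffer, or NULL. */
char *sketch_addr2ascii(int af, const void *ap, int len, char *cp)
{
        static char private_buf[INET6_ADDRSTRLEN]; /* used if cp is NULL */
        char tmp[INET6_ADDRSTRLEN];                /* 46 octets */

        if (!((af == AF_INET && len == 4) ||
              (af == AF_INET6 && len == 16)))
                return NULL;
        if (inet_ntop(af, ap, tmp, sizeof(tmp)) == NULL)
                return NULL;
        if (cp == NULL)
                cp = private_buf;
        strcpy(cp, tmp);        /* caller guarantees cp is large enough */
        return cp;
}
```

As with the private static buffer described above, the string returned
when cp is NULL is overwritten by the next such call.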

861	5.  Security Considerations

863	IPv6 provides a number of new security mechanisms, many of which need to
864	be accessible to applications.  A companion document detailing the
865	extensions to the socket interfaces to support IPv6 security is being
866	written [4].  At some point in the future, that document and this one
867	may be merged into a single API specification.

869	6.  Changes from October 1994 Edition
870	   -    Added variant of sockaddr_in6 for 4.4 BSD-based systems (sa_len
871	        compatibility).

873	   -    Removed references to SIT transition specification, and added
874	        reference to addressing architecture document, for definition of
875	        IPv4-mapped addresses.

877	   -    Added a solution to the problem of the application not providing
878	        enough buffer space to hold a received source route.

880	   -    Moved discussion of IPv4 applications interoperating with IPv6
881	        nodes to open issues section.

883	   -    Added length parameter to addr2ascii() function to be consistent
884	        with addr2hostname().

886	   -    Changed IP_MULTICAST_TTL to IP_MULTICAST_HOPS to match IPv6
887	        terminology, and added IP_UNICAST_HOPS option to match
888	        IP_MULTICAST_HOPS.

890	   -    Removed specification of numeric values for AF_INET6,
891	        IP_ADDRFORM, and IP_RCVSRCRT, since they need not be the same on
892	        different implementations.

894	   -    Added a definition for the in_addr6 IPv6 address data
895	        structure.  Added this so that applications could use
896	        sizeof(struct in_addr6) to get the size of an IPv6 address,
897	        and so that a structured type could be used in
898	        is_ipv4_addr().

900	7.  Open Issues

902	A few open issues for IPv6 socket interface API specification remain,
903	including:

905	   -    The multicast API needs to be documented in more detail.

907	   -    Should we add a timeout parameter to hostname2addr() and
908	        addr2hostname()?  DNS lookups need to be given some finite
909	        timeout interval, so it might be nice to let the application
910	        specify that interval.

912	   -    Can existing IPv4 applications interoperate with IPv6 nodes?

914	7.1.  IPv4 Applications Interoperating with IPv6 Nodes

916	This problem primarily has to do with how IPv4 applications
917	represent addresses of IPv6 nodes.  What address should be returned to
918	the application when an IPv6/UDP packet is received, or an IPv6/TCP
919	connection is accepted?  The peer's address could be any arbitrary
920	128-bit IPv6 address.  But the application is only equipped to deal with
921	32-bit IPv4 addresses encoded in sockaddr_in data structures.

923	We have not discovered any solution that provides completely transparent
924	interoperability with IPv6 nodes for applications using the original
925	IPv4 API.  However, two techniques that partially solve the problem are:

927	   1)   Prohibit communication between IPv4 applications and IPv6 nodes.
928	        Only UDP packets received from IPv4 nodes would be passed up to
929	        the application, and only TCP connections received from IPv4
930	        nodes would be accepted.  UDP packets from IPv6 nodes would be
931	        dropped, and TCP connections from IPv6 nodes would be refused.

933	   2)   The system could generate a local 32-bit cookie to represent the
934	        full 128-bit IPv6 address, and pass this value to the
935	        application.  The system would maintain a mapping from cookie
936	        value into the 128-bit IPv6 address that it represents.  When
937	        the application passed a cookie back into the system (for
938	        example, in a sendto() or connect() call) the system would use
939	        the 128-bit IPv6 address that the cookie represents.

941	        The cookie would have to be chosen so as to be an invalid IPv4
942	        address (e.g. an address on net 127.0.0.0), and the system would
943	        have to make sure that these cookie values did not escape into
944	        the Internet as the source or destination addresses of IPv4
945	        packets.
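
The cookie technique in item 2 amounts to a small translation table
inside the system.  The following sketch shows one possible shape for
it.  Everything in it is an assumption made for illustration: the
function names, the fixed table size, and the choice of 127.0.1.0/24
as the cookie range.  It does not address cookie lifetime or the
filtering needed to keep cookies out of transmitted packets.

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

#define COOKIE_TABLE_SIZE 256           /* arbitrary size for the sketch */
#define COOKIE_BASE       0x7f000100u   /* 127.0.1.0, host byte order */

static struct in6_addr cookie_table[COOKIE_TABLE_SIZE];
static unsigned int cookie_count;

/*
 * Return the 32-bit cookie (host byte order, in 127.0.1.0/24) that
 * stands for addr, allocating a new one if needed.  Returns 0 when
 * the table is full.
 */
uint32_t cookie_for_addr(const struct in6_addr *addr)
{
        unsigned int i;

        for (i = 0; i < cookie_count; i++)
                if (memcmp(&cookie_table[i], addr, sizeof(*addr)) == 0)
                        return COOKIE_BASE + i;
        if (cookie_count == COOKIE_TABLE_SIZE)
                return 0;
        cookie_table[cookie_count] = *addr;
        return COOKIE_BASE + cookie_count++;
}

/* Map a cookie back to its IPv6 address; NULL if it was never issued. */
const struct in6_addr *addr_for_cookie(uint32_t cookie)
{
        if (cookie < COOKIE_BASE || cookie - COOKIE_BASE >= cookie_count)
                return NULL;
        return &cookie_table[cookie - COOKIE_BASE];
}
```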

947	Both of these techniques have drawbacks.  This is an area for further
948	study.  System implementors may use one of these techniques or implement
949	another solution.

951	Acknowledgments

953	Thanks to the many people who made suggestions and provided feedback to
954	earlier revisions of this document.  Comments were provided by: Richard
955	Stevens, Dan McDonald, Christian Huitema, Steve Deering, Andrew
956	Cherenson, Charles Lynn, Ran Atkinson, Erik Nordmark, Glenn Trewitt,
957	Fred Baker, Robert Elz, Dean D. Throop, and Francis Dupont.  Craig
958	Partridge suggested the addr2ascii() and ascii2addr() functions.

960	Ramesh Govindan made a number of contributions and co-authored an
961	earlier version of this paper.

963	References

965	  [1]   R. Hinden. "Internet Protocol, Version 6 (IPv6) Specification".
966	        Internet Draft.  October 1994.

968	  [2]   K. Sklower. Private communication.

970	  [3]   R. Hinden. "IP Next Generation Addressing Architecture".
971	        Internet Draft. October 1994.

973	  [4]   D. McDonald. "IPv6 Security API for BSD Sockets".  Internet
974	        Draft. 30 January 1995.

976	Authors' Address

978	        Jim Bound
979	        Digital Equipment Corporation
980	        110 Spitbrook Road ZK3-3/U14
981	        Nashua, NH 03062-2698
982	        Phone: +1 603 881 0400
983	        Email: bound@zk3.dec.com

985	        Susan Thomson
986	        Bell Communications Research
987	        MRE 2P-343, 445 South Street
988	        Morristown, NJ 07960
989	        Telephone: +1 201 829 4514
990	        Email: set@thumper.bellcore.com

992	        Robert E. Gilligan
993	        Sun Microsystems, Inc.
994	        2550 Garcia Avenue
995	        Mailstop UMTV05-44
996	        Mountain View, CA 94043-1100
997	        Phone: +1 415 336 1012
998	        Email: bob.gilligan@eng.sun.com