Internet Draft                                             R. L. Ullmann
                                            Process Software Corporation
                                                          March 18, 1993

                       TCP/IP: Internet Version 7

1 Status of this Memo

This memo describes a proposal for the next version of the Internet
protocols.

The protocol described is experimental, and does not represent a
protocol on the formal internet standards track at this writing.

The first version of this memo, describing a possible Internet Version
7 protocol was written by the present author in the summer and fall of
1989, and circulated informally, including to the IESG, in December
1989. A further informal note on the addressing, called "Toasternet
Part II", was circulated on the IETF mail list during March of 1992.

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. (Note that other groups may also distribute
working documents as Internet Drafts).

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts
as reference material or to cite them other than as a "working draft"
or "work in progress."

Please check the I-D abstract listing contained in each Internet Draft
directory to learn the current status of this or any other Internet
Draft.

This draft expires on or before September 18, 1993.

Ullmann           DRAFT: expires September 18, 1993             [page 1]

2 Contents

   1 Status of this Memo
   2 Contents
   3 Introduction
     3.1 Objectives
     3.2 Philosophy
   4 Internet numbers
     4.1 Is 64 Bits Enough?
     4.2 Why version 7?
     4.3 The version 7 IP address
     4.4 AD numbers
     4.5 Mapping of version 4 numbers
   5 IP: Internet datagram protocol
     5.1 IP datagram header format
       5.1.1 Version
       5.1.2 Header length
       5.1.3 Time to live
       5.1.4 Total datagram length
       5.1.5 Forward route identifier
       5.1.6 Destination
       5.1.7 Source
       5.1.8 Protocol
       5.1.9 Checksum
       5.1.10 Options
     5.2 Option Format
       5.2.1 Class (C)
       5.2.2 Copy on fragmentation (F)
       5.2.3 Type
       5.2.4 Length
       5.2.5 Option data
     5.3 IP options
       5.3.1 Null
       5.3.2 Fragment
       5.3.3 Last Fragment
       5.3.4 Don't Fragment
       5.3.5 Don't Convert
     5.4 Forward route identifier
       5.4.1 Procedure description
       5.4.2 Flows
       5.4.3 Mobile hosts
   6 TCP: Transport protocol
     6.1 TCP segment header format
       6.1.1 Data offset
       6.1.2 MBZ
       6.1.3 Flags
       6.1.4 Checksum
       6.1.5 Source port
       6.1.6 Destination port
       6.1.7 Sequence
       6.1.8 Acknowledgement
       6.1.9 Window
       6.1.10 Options
     6.2 Port numbers
     6.3 TCP options
       6.3.1 Option Format
       6.3.2 Null
       6.3.3 Maximum Segment Size
       6.3.4 Urgent Pointer
       6.3.5 32 Bit rollover
   7 UDP: User Datagram protocol
     7.1 UDP header format
       7.1.1 Data offset
       7.1.2 MBZ
       7.1.3 Checksum
       7.1.4 Source port
       7.1.5 Destination port
       7.1.6 Options
   8 ICMP
     8.1 ICMP header format
     8.2 Conversion failed ICMP message
   9 Notes on the domain system
     9.1 A records
     9.2 PTR zone
   10 Conversion between version 4 and version 7
     10.1 Version 4 IP address extension option
       10.1.1 Option format
     10.2 Fragmented datagrams
     10.3 Where does the conversion happen?
     10.4 Hybrid IPv4 systems
     10.5 Maximum segment size in TCP
     10.6 Forwarding and redirects
     10.7 Design considerations
     10.8 Conversion from IPv4 to IPv7
     10.9 Conversion from IPv7 to IPv4
     10.10 Conversion from TCPv4 to TCPv7
     10.11 Conversion from TCPv7 to TCPv4
     10.12 ICMP conversion
   11 Postscript
   12 References
   13 Author's Address

3 Introduction

This memo presents the specification for version 7 of the Internet
Protocol, as well as version 7 of the TCP and the user datagram
protocol. Version 7 has been designed to address several major
problems that have arisen as version 4 has evolved and been deployed,
and to make a major step forward in the datagram switching and
forwarding architecture of the Internet.

The major problems are threefold. First, the address space of version
4 is now seen to be too small. While it was viewed as being almost
impossibly large when version 4 was designed, two things have occurred
to create a problem. The first is a success crisis: the internet
protocols have been more widely used and accepted than their designers
anticipated. Also, technology has moved forward, putting
microprocessors into devices not anticipated except as future dreams a
decade ago.

The second major problem is a perceived routing explosion.
The present routing architecture of the internet calls for routing
each organization's network independently. It is becoming increasingly
clear that this does not scale to a universal internet. While it is
possible to route several billion networks in a flat, structureless
domain, it is not desirable.

There is also the political and administrative issue of assigning
network numbers to organizations. The version 4 administrative system
calls for organizations to request network assignments from a single
authority. While to some extent this has been alleviated by reserving
blocks to delegated assignments, the address space is not large enough
to do this in the necessary general case, with large blocks allocated
to (e.g.) national authorities.

The third problem is the increasing bandwidth of the networks and of
the applications possible on the network. The TCP, while having
proven useful on an unprecedented range of network speeds, is now the
limiting factor at the highest speeds, due to restrictions of window
size, sequence-space, and port numbers. These limitations can all be
addressed by increasing the sizes of the relevant fields.

There is also an opportunity to move the technology forward, and take
advantage of a combination of the best features of the hop-by-hop
connectionless forwarding of version 4 (and CLNP) as well as the
pre-established paths of version 5 (and, e.g., the OSI CONS).

Internet Version 7 includes four major areas of improvement, while at
the same time retaining interoperation with version 4 with a small
amount of conversion knowledge imposed on version 7 hosts and routers.

   o It increases the address fields to 64 bits, with sufficient
     space for visible future expansion of the internet.

   o It adds a numbering layer for administrations, above the
     organization or network layer, as well as providing more
     space for subnetting within organizations.

   o It increases the range of speeds and network path delays over
     which the TCP will operate satisfactorily, as well as the
     number of transactions in bounded time that can be served by
     a host.

   o Finally, it provides a forward route identifier in each
     datagram, to support extremely fast path, circuit, or
     flow-based forwarding, or any desired combination, while
     preserving hop-by-hop connectivity.

The result is not just a movement sideways, deploying a new network
layer protocol to patch current problems. It is a significant step
forward for network layer technology.

3.1 Objectives

The following are some of the objectives of the design.

   o Use what has been learned from the IP version 4 protocol, fixing
     things that are troublesome, and not fixing that which is not
     broken.

   o Retain the essential "look and feel" of the Internet protocol
     suite. It has been very successful, and one doesn't argue with
     success.

   o Not introduce concepts that the Internet has shown do not belong
     in the protocol definition. Best example: we do not want to add
     any kind of routing information into the addressing, other than
     the administrative hierarchy that has sometimes proved useful.
     Note that the one feature in version 4 addressing (the class
     system) designed to aid routing is now the most serious single
     problem.

   o Allow current hosts to interoperate, if not universally, at least
     within an organization or larger area for the indefinite future.
     There will be version 4 hosts for 10-15 years into the future;
     the Internet must remain on good terms with them.

   o Likewise, we must not impose the new version, telling sites they
     must convert to stay connected. People resist imposed solutions.
     It must not be marketed as something different from IPv4; the
     differences must be down-played at every opportunity.

   o The design must allow individual hosts and routers to be upgraded
     effectively at random, with no transition plan constraints.

   o The design must not require renumbering the Internet. The
     administrative work already accomplished is immense; if it is to
     be done again, it will be in assigning NSAPs.

   o It must allow IPv4 hosts to interoperate without any reduction in
     function, without any modification to their software or
     configuration. (Universal connectivity will be lost by IPv4
     hosts, but they must be able to continue operating within their
     organization at least.)

   o It must permit network layer state-free translation of datagrams
     between IPv4 and IPv7; this is important to the previous point,
     and essential to early testing and transitional deployment.

   o It must be a competent alternative to CLNP.

   o It must not involve changing the semantics of the network layer
     service in any way that invalidates the huge amount of work that
     has gone into understanding how TCP (for example) functions in
     the net, and the implementation of that understanding.

   o It must be defined Real Soon; the window of opportunity is almost
     closed. It will take vendors 3 years to deploy from the time the
     standard is rock-solid concrete.

I believe all of these are accomplishable in a consistent,
well-engineered solution, and all are essential to the survival of the
Internet.

3.2 Philosophy

Protocols should become simpler as they evolve.

4 Internet numbers

The version 4 numbering system has proven to be very flexible,
(mostly) expandable, and simple. In short: it works. There are two
problems; neither was serious when this specification was first
developed in 1988 and 1989, but both have, as expected, become more
serious:

   o The division into network, and then subnet, is insufficient.
     Almost all sites need a network assignment large enough to
     subnet. At the top of the hierarchy, there is a need to
     assign administrative domains.

   o As bit-packing is done to accomplish the desired network
     structure, the 32 bit limit causes more and more aggravation.

4.1 Is 64 Bits Enough?

Consider: (thought experiment) 32 bits presently numbers "all" of the
computers in the world, and another 32 bits could be used to number
all of the bytes of on-line storage on each computer. (Most have a
lot less than 4 gigabytes on-line; the ones that have more could be
notionally assigned more than one address.)

So: 64 bits is enough to number every byte of online storage in
existence today, in a hierarchical structured numbering plan.

Another way of looking at 64 bits: it is more than 2 billion
addresses for each person on the planet. Even if I have
microprocessors in my shirt buttons I'm not going to have that many.
32 bits, on the other hand, was never going to be sufficient: there
are more than 2^32 people.

4.2 Why version 7?

It was clearly recognized at the start of this project in 1988 that
making the address 64 bits implies a new IP header format, which was
called either "TP/IX" or "IP version 7"; there wasn't anything magic
about the number 7, I made it up. Version 4 is the familiar current
version of IP. Version 5 is the experimental ST (Stream) protocol.
ST-II, a newer version of ST, uses the same version number, something
I was not aware of until recently; I suspected it might have been
allocated 6. Besides, I liked 7.

Apparently (as reported by Bob Braden) the IAB followed much the same
logic, and may have had the idea planted by the mention of version 7
in the "Toasternet Part II" memo. The IAB in June 1992 floated a
proposal that CLNP, or a CLNP-based design, be Internet Version 7.
(And promptly got themselves toasted.) However, close inspection of
the bits shows that CLNP is clearly version 8.

4.3 The version 7 IP address

The Version 7 IP 64 bit address looks like:

+-------+-------+-------+-------+-------+-------+-------+-------+
|     Admin Domain      |        Network        |     Host      |
+-------+-------+-------+-------+-------+-------+-------+-------+

Note: the boundary between "network" and "host" is no more fixed than
it is today; each (sub)network will have its own mask. Just as the
mask today can be anywhere from FF00 0000 (8/24) to FFFF FFFC (30/2),
the mask for the 64 bit address can reasonably be FFFF FF00 0000 0000
(24/40) to FFFF FFFF FFFF FFFC (62/2).

The AD (Administrative domain) identifies an administration which may
be a service provider, a national administration, or a large
multi-organization (e.g. a government). The idea is that there
should not be more than a few hundred of these at first, and
eventually thousands or tens of thousands at most. (But note that we
do not introduce a hard limit of 2^16 here; this estimate may be off
by a few orders of magnitude.) Since only 1/4th of the address space
is initially used (first two bits are 10), the remainder can then be
allocated in the future with more information available.

Most individual organizations would not be ADs.
In the short term, ADs are known to the "core routing"; it pays to
keep the number smallish, a few thousand given current routing
technology. In the long term, this is not necessary. Big
administrations (i.e. with tens of millions of networks) get small
blocks where needed, or additional single AD numbers when needed.

While the AD may be used for last resort routing (with a 24/40 mask),
it is primarily only an administrative device. Most routing will be
done on the entire 48 bit AD+network number, or sub and super-sets of
those numbers. (I.e. masks between about 32/32 and 56/8.)

Some ADs (e.g. NSF) may make permanent assignments; others (such as a
telephone company defining a network number for each subscriber line)
may tie the assignment to such a subscription. But in no case does
this require traffic to be routed via the AD.

4.4 AD numbers

AD numbers are allocated out of the same numbering space as network
numbers. This means that a version 4 address can be distinguished
from the first 32 bits of a version 7 address. This is useful to help
prevent the inadvertent use of the first half of the longer address by
a version 4 host.

There is a non-trivial amount of software that assumes that an "int"
is the same size and shape as an IP address, and does things like
"ipaddr = *(int *)ptr". This usage has always been incorrect, but
does occur with disturbing frequency. As IPv7 8 byte addresses appear
in the application layers, this software will find those addresses
unreachable; this is preferable to interacting with a random host.

ADs are allocated in the range 96.0.0 to 126.255.255, using the top
1/4 of the version 4 class A space. It is probably best to allocate
the first component downwards from 126, so that the boundary between
class A and AD can be moved if desired later.
This initial allocation provides for 2031616 ADs, many more than there
should be even in full deployment.

Eventually, both AD and network will use the full 24 bit space
available to them. Knowledge of the AD range should not be coded into
software. If it were coded in, that software would break when the
entire 24 bit space is used for ADs. (This lesson should have been
learned from CIDR.)

4.5 Mapping of version 4 numbers

Initially, all existing Internet numbers are defined as belonging to
the NSF/Internet AD, number 126.0.0.

The mapping from/to version 4 IP addresses:

+-------+-------+-------+-------+-------+-------+-------+-------+
|     Admin Domain      |        Network        |     Host      |
+-------+-------+-------+-------+-------+-------+-------+-------+
[  fixed at 7E 00 00  ] [ 1st 24 bits of V4 IP] [  1  ] [last 8]

So, for example, 192.42.95.15 (V4) becomes 126.0.0.192.42.95.1.15.

And the "standard" loopback interface address becomes
126.0.0.127.0.0.1.1. (I can see explaining that in 2015 to someone
born in 1995.)

The present protocol multicast (126.0.0.224.x.y.1.z) and loopback
addresses are permanently allocated in the NSF AD.

5 IP: Internet datagram protocol

The Internet datagram protocol is revised to expand some fields (most
notably the addresses), while removing and relegating to options all
fields not universally useful (imperative) in every datagram in every
environment.

This results in some simplification, a length less than twice the size
of IPv4 even though most fields are doubled in size, and an expanded
space for options.

There is also a change in the option philosophy from IPv4. IPv4
specified that implementation of options was not optional; what was
optional was the existence of options in any given datagram. This is
changed in IPv7: no option need be implemented to be fully
conformant.
However, implementations must understand the option classes; and a
future Host Requirements specification for hosts and routers used in
the "connected Internet" may require some options in its profile,
e.g. Fragment would probably be required.

Digression: In IPv4, options are often "considered harmful". It is
the opinion of the present author that this is because they are rarely
needed, and not designed to be processed rapidly on most
architectures. This leads to little or no attempt to improve
performance in implementations, while at the same time enormous effort
is dedicated to optimization of the no-option case. IPv7 is expected
to be different on both counts.

Fields are always aligned on their own size; the 64 bit fields on 64
bit intervals from the start of the datagram.

Options are all 32 bit aligned, and the null option can be used to
push a subsequent option (or the transport layer header) into 64 bit
or 64+32 off-phase alignment as desired.
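
The alignment rule can be made concrete with a small sketch. It
assumes the option header described in 5.2 (32 bits of class, F, type
and length), so a null option with no data is simply four zero bytes;
inserting one moves whatever follows by one 32 bit word, switching it
between 64 bit alignment and off-phase alignment. The helper name
align_with_null is invented for illustration and is not part of the
proposal.

```python
NULL_OPTION = bytes(4)  # class 0, F 0, type 0, length 0: header only


def align_with_null(offset, off_phase=False):
    """Return the filler needed to put the next item at the wanted phase.

    offset is the byte offset from the start of the datagram and must
    already be a multiple of 4 (options are always 32 bit aligned).
    off_phase=False asks for 64 bit alignment, True for 64+32.
    """
    if offset % 4:
        raise ValueError("offset is not 32 bit aligned")
    want = 4 if off_phase else 0
    return b"" if offset % 8 == want else NULL_OPTION
```

With the 36 byte fixed header of 5.1, the first option naturally
starts off-phase, so align_with_null(36) returns one null option when
64 bit alignment is wanted and nothing when off-phase is wanted.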

5.1 IP datagram header format

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|version|     header length     |         time to live          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     total datagram length                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   forward route identifier                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      destination address                      +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                        source address                         +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           protocol            |           checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            options                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A description of each field follows.

5.1.1 Version

This document describes version 7 of the protocol.

5.1.2 Header length

The header length is a 12 bit count of the number of 32 bit words in
the IPv7 header. This allows a header to be (theoretically at least)
up to 16380 bytes in length.

5.1.3 Time to live

The time to live is a 16 bit count, nominally in 1/16 seconds. Each
hop is required to decrement TTL by at least one.

This definition should allow continuation of the useful (even though
not entirely valid) interpretation of TTL as a hop count, while we
move to faster networks and routers. (The most familiar use is by
"traceroute", which really ought to be directly implemented by one or
more ICMP messages.)

The scale factor converts the usual version 4 default TTL into a
larger number of hops. This is desirable because the forward route
architecture of version 7 enables the construction of simpler, faster
switches, and this may cause the network diameter to increase.
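
As an illustration only, the fixed part of the header drawn in 5.1 can
be packed and unpacked as sketched below. This is not part of the
specification: the names FIXED, map_v4, pack_header and unpack_header
are invented here, bit 0 of the diagram is taken to be the most
significant bit, and the checksum is left at zero rather than
computed. The helper map_v4 follows the version 4 mapping given in
4.5.

```python
import struct

# Fixed header from the diagram in 5.1, in network byte order:
#   4 bits version | 12 bits header length (32 bit words) | 16 bits TTL
#   32 bits total datagram length
#   64 bits forward route identifier
#   64 bits destination address, 64 bits source address
#   16 bits protocol | 16 bits checksum
FIXED = struct.Struct("!HHIQQQHH")  # 36 bytes, i.e. 9 words


def map_v4(dotted):
    """Map a version 4 address into the 64 bit form of section 4.5."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    # AD 126.0.0 (7E 00 00), first 24 bits of V4, a fixed 1, last 8 bits
    return bytes([0x7E, 0, 0, a, b, c, 1, d])


def pack_header(ttl, payload_len, fri, dst, src, protocol, options=b""):
    if len(options) % 4:
        raise ValueError("options must be padded to 32 bits")
    words = (FIXED.size + len(options)) // 4
    first = (7 << 12) | words  # version 7 in the top 4 bits
    total = FIXED.size + len(options) + payload_len
    # The checksum field is written as 0; computing it is out of scope.
    return FIXED.pack(first, ttl, total, fri,
                      int.from_bytes(dst, "big"),
                      int.from_bytes(src, "big"),
                      protocol, 0) + options


def unpack_header(data):
    first, ttl, total, fri, dst, src, proto, cksum = FIXED.unpack_from(data)
    return {
        "version": first >> 12,
        "header_bytes": (first & 0x0FFF) * 4,
        "ttl": ttl,
        "total_length": total,
        "fri": fri,
        "dst": dst.to_bytes(8, "big"),
        "src": src.to_bytes(8, "big"),
        "protocol": proto,
        "checksum": cksum,
    }
```

The fixed part comes to 36 bytes, which agrees with the remark in
section 5 that the header is less than twice the size of the 20 byte
IPv4 header.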

5.1.4 Total datagram length

The 32 bit length of the entire datagram in octets. A datagram can
therefore be up to 4294967295 bytes in overall length. Particular
networks will normally impose lower limits.

5.1.5 Forward route identifier

The identifier from the routing protocol to be used by the next hop
router to find its next hop. (A more complete description is given
below.)

5.1.6 Destination

The 64 bit IPv7 destination address.

5.1.7 Source

The 64 bit IPv7 source address.

5.1.8 Protocol

The transport layer protocol, e.g. TCP is 6. The present code space
for this layer of demultiplexing is about half full. Expanding it to
16 bits, allowing 65535 registered "transport" layers seems prudent.

5.1.9 Checksum

The checksum is a 16 bit checksum of the entire IP header, using the
familiar algorithm used in IPv4.

5.1.10 Options

Options may follow. They are variable length, and always 32 bit
aligned, as discussed previously.

5.2 Option Format

Each option begins with a 32 bit header:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| C |F|          type           |            length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                option data ...                |   padding     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A description of each field:

5.2.1 Class (C)

This field tells implementations what to do with datagrams containing
options they do not understand. No implementation is required to
implement (i.e. understand) any given option by the TCP/IP
specification itself.

Classes:

   0  use or forward and include this option unmodified
   1  use this datagram, but do not forward the datagram
   2  discard, or forward and include this option unmodified
   3  discard this datagram

A host receiving a datagram addressed to itself will use it if there
are no unknown options of class 2 or 3. A router receiving a datagram
not addressed to it will forward the datagram if and only if there are
no unknown options of class 1 or 3. (The astute reader will note that
the bits can also be seen as having individual interpretations, one
allowing use even if unknown, one allowing forwarding if unknown.)

Note that classes 0 and 2 are imperative: if the datagram is
forwarded, the unknown option must be included.

Class and type are entirely orthogonal; different implementations
might use different classes for the same option, except where
restricted by the option definition.

Also note that for options that are known to (implemented by) the host
or router, the class has no meaning; the option definition totally
determines the behavior. (Although it should be noted that the option
might explicitly define a class dependent behavior.)

5.2.2 Copy on fragmentation (F)

If the F bit is set, this option must be copied into all fragments
when a datagram is fragmented. If the F bit is reset (zero), the
option must only be copied into the first (zero-offset) fragment.

5.2.3 Type

The type field identifies the particular option, types being
registered as well known values in the internet. A few of the options
with their types are described below.

5.2.4 Length

Length of the option data, in bytes.

5.2.5 Option data

Variable length specified by the length field, plus 0-3 bytes of zeros
to pad to a 32 bit boundary.
Fields within the option data that are 609 64 bits long are normally placed on the assumption that the option 610 header is off-phase aligned, the usual case when the option is the 611 only one present, and immediately follows the IP header. 613 5.3 IP options 615 The following sections describe the options defined to emulate IPv4 616 features, or necessary in the basic structure of the protocol. 618 5.3.1 Null 620 The null option, type 0, provides for a space filler in the option 621 area. The data may be of any size, including 0 bytes (perhaps the 622 most useful case.) 624 It may be used to change alignment of the following options or to 625 replace an option being deleted, by setting type to 0 and class to 0, 626 leaving the length and content of the data unmodified. (Note that 627 this implies that options must not contain "secret" data, relying on 628 class 3 to prevent the data from leaving the domain of routers that 629 understand the option.) 631 Null is normally class 0, and need not be implemented to serve its 632 function. 634 5.3.2 Fragment 636 Fragment (type 1) indicates that the datagram is part of a complete IP 637 datagram. It is always class 2. 639 The data consists of (one of) the 64 bit IP address(es) of the router 640 doing the fragmentation, a 64 bit datagram ID generated by that 641 router, and a 32 bit fragment offset. The IDs should be generated so 642 as to be very likely unique over a period of time larger than the TCP 643 MSL (maximum segment lifetime). (The TCP ISN (initial sequence 644 number) generator might be used to initialize the ID generator in a 645 router.) 
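As an illustration only, the Fragment option data described above (64 bit router address, 64 bit datagram ID, 32 bit offset) can be packed and unpacked as in the following minimal Python sketch. It is not part of the specification: the names FRAGMENT_DATA, pack_fragment_data and unpack_fragment_data are hypothetical, network byte order is assumed, and the 32 bit option header word (class, F, type, length) is deliberately left out.

```python
import struct

# Fragment option data: 64 bit fragmenting-router address, 64 bit datagram
# ID, 32 bit fragment offset. Network byte order is an assumption here.
FRAGMENT_DATA = struct.Struct("!QQI")


def pack_fragment_data(router_addr, datagram_id, offset):
    """Build the 20 octets of option data that follow the option header."""
    return FRAGMENT_DATA.pack(router_addr, datagram_id, offset)


def unpack_fragment_data(data):
    """Return ((router_addr, datagram_id), offset) from 20 octets of data."""
    router_addr, datagram_id, offset = FRAGMENT_DATA.unpack(data)
    return (router_addr, datagram_id), offset


# Hypothetical values: the address is 126.0.0.192.42.95.1.68 as an integer.
example_addr = int.from_bytes(bytes([126, 0, 0, 192, 42, 95, 1, 68]), "big")
example_data = pack_fragment_data(example_addr, 0x0123456789ABCDEF, 1024)
```

The (address, ID) pair returned by unpack_fragment_data is the 128 bit value that identifies all fragments of one original datagram.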
647 0 1 2 3 648 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 649 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 650 | C |F| type | length | 651 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 652 | | 653 + fragmenting router IP address + 654 | | 655 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 656 | | 657 + datagram ID + 658 | | 659 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 660 | offset | 661 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 663 If a datagram must be refragmented, the original 128 bit address+ID is 664 preserved, so that the datagram can be reassembled from any sufficient 665 set of the resulting fragments. The 64 bit fields are positioned so 666 that they are aligned in the usual case of the fragment option 667 following the IP header. 669 A router implementing Fragment (doing fragmentation) must recognize 670 the Don't Fragment option. 672 5.3.3 Last Fragment 674 Last Fragment (type 2) has the same format as Fragment, but implies 675 that this datagram is the last fragment needed to reassemble the 676 original datagram. 678 Note that an implementation can reasonably add arriving datagrams with 679 Fragment to a cache, and then attempt a reassembly when a datagram 680 with Last Fragment arrives (and the total length is known); this 681 will work well when datagrams are not reordered in the network. 683 5.3.4 Don't Fragment 685 This option (type 3, class 0) indicates that the datagram must not be 686 fragmented. If it cannot be forwarded without fragmentation, it is 687 discarded, and the appropriate ICMP message sent. (Unless, of course, 688 the datagram is an ICMP message.) There is no data present. 690 5.3.5 Don't Convert 692 The Don't Convert option prohibits conversion from IPv7 to IPv4 693 protocol, requiring instead that the datagram be discarded and an ICMP 694 message sent (conversion failed/don't convert set).
It is type 4, 695 usually class 0, and must be implemented by any router implementing 696 conversion. A host is under no such constraint; like any protocol 697 specification, only the "bits on the wire" can be specified, the host 698 receiving the datagram may convert it as part of its procedure. There 699 is no data present in this option. 701 5.4 Forward route identifier 703 Each IP datagram carries a 64 bit field, called "forward route 704 identifier", that is updated (if the information is available) at each 705 hop. This field's value is derived from the routing protocol (e.g. 706 RAP [ID-RAP]). It is used to expedite routing decisions by preserving 707 knowledge where possible between consecutive routers. It can also be 708 used to make datagrams stay within reserved flows and mobile-host 709 tunnels where required. 711 5.4.1 Procedure description 713 Consider 3 routers, A, B, and C. Traffic is passing through them, 714 between two other hosts (or networks), X and Y, packets are going 715 XABCY and YCBAX. Consider only one direction: routing info flowing 716 from C to A, to provide a route from A to C. The same thing will be 717 happening in the other direction. 719 An explanation of the notation: 721 R(r,d,i,h) A route that means: "from router r, to go toward final 722 destination d, replace the forward route identifier in 723 the packet with i, and take next hop h." 725 Ri(r,d) An opaque (outside of router r) identifier, that can be 726 used by r to find R(r,d,...). 728 Flowi(r,rt) An opaque (outside of router r) identifier, that router 729 r can use to find a flow or tunnel with which the datagram 730 is associated, and from that the route rt on which the 731 flow or tunnel is built, as well as the Flowi() for the 732 subsequent hop. 734 Ri(Dgram) The forward route identifier in a datagram. 736 Router C announces a route R(C,Y,0,Y) to router B. It includes in it 737 an identifier Ri(C,Y) internal to C, that will allow C to find the 738 route rapidly. 
(A table index, or an actual memory address.) 740 Router B creates a route R(B,Y,Ri(C,Y),C) via router C; it announces 741 it to A, including an identifier Ri(B,Y), internal to B, and used by A 742 as an opaque object. 744 Router A creates a route R(A,Y,Ri(B,Y),B) via router B. It has no one 745 to announce it to. 747 Now: X originates a datagram addressed to Y. It has no routing 748 information, and sets Ri(Dgram) to zero. It forwards the datagram to 749 router A (X's default gateway). 751 A finds no valid Ri(Dgram), and looks up the destination (Y) in its 752 routing tables. It finds R(A,Y,Ri(B,Y),B), sets Ri(Dgram) <- Ri(B,Y), 753 and forwards the datagram to B. 755 Router B looks at Ri(Dgram) which directly identifies the next hop 756 route R(B,Y,Ri(C,Y),C), sets Ri(Dgram) <- Ri(C,Y) and forwards it to 757 router C. 759 Router C looks at Ri(Dgram) which directly locates R(C,Y,0,Y), sets 760 Ri(Dgram) <- 0 and forwards to Y. 762 Y recognizes its own address in Dest(Dgram), and ignores Ri(Dgram). 764 Of course, the routers will validate the Ri's received, particularly 765 if they are memory addresses (e.g. M(a) < Ri < M(b), Ri mod N == 0), 766 and probably check that the route in fact describes the destination of 767 the datagram. If the Ri is invalid, the router must use the ordinary 768 method of finding a route (i.e. what it would have done if Ri was 0), 769 and silently ignore the invalid Ri. 771 When a route has been aggregated at some router, implicitly or 772 explicitly, it will find that the incoming Ri(Dgram) at most can 773 identify the aggregation, and it must make a decision; the forwarded 774 datagram then contains the Ri for the specific route. (Note this may 775 happen well upstream of the point at which the routes actually 776 diverge.) 778 This allows all cooperating routers to make immediate forwarding 779 decisions, without any searching of tables or caches once the datagram 780 has entered the routing domain.
If the host participates in the 781 routing, at least to the extent of acquiring the initial Ri required 782 from the first router, then only routers that have done aggregations 783 need make decisions. (If the routing changes with datagrams in 784 flight, some router will be required to make a decision to re-rail 785 each datagram.) 787 5.4.2 Flows 789 If a "flow" is to be set up, the identifiers are replaced by 790 Flowi(router,route), where each router's structure for the flow 791 contains a pointer to the route on which the flow is built. Datagrams 792 can drop out of the flow at some point, and can be inserted either by 793 the originating host or by a cooperating router near the originator. 794 Since the forward route identifier field is opaque to the sending 795 router, and implicitly meaningful only to the next hop router, use for 796 flows (or similar optimizations) need not be otherwise defined by the 797 protocol. (One presumes that a router issuing both Ri's and Flowi's 798 will take care to make sure that it can distinguish them by some 799 private method.) 801 If a flow has been set up by a restricted target RAP route 802 announcement, it is no different from a route in the implementation. 803 If this announcement originates from the host itself, the Ri in 804 incoming datagrams can be used to determine whether they followed the 805 flow, or to optimize delivery of the datagrams to the next layer 806 protocol. 808 5.4.3 Mobile hosts 810 First, a definition: A "mobile host" is a host that can move around, 811 connecting via different networks at different times, while 812 maintaining open TCP connections. It is distinguished from a 813 "portable host", which is simply a host that can appear in various 814 places in the net, without continuity. A portable host can be 815 implemented by assigning a new address for each location (more or less 816 automatically), and arranging to update the domain system. 
Supporting 817 truly mobile hosts is the more interesting problem. 819 To implement mobile host support in a general way, either some layer 820 of the protocol suite must provide network-wide routing, or the 821 datagrams must be tunnelled from the "home" network of the host to its 822 present location. In the real network, some combination of these is 823 probable: most of the net will forward datagrams toward the home 824 network, and then the datagrams will follow a specific host route to 825 the mobile host. 827 The requirement on the routing system is that it must be able to 828 propagate a host route at least to the home network; any other 829 distribution is useful optimization. When a host route is propagated 830 by RAP as a targeted route, and the routers use the resulting Ri's, 831 the datagram follows an effective tunnel to the mobile host. (Not a 832 real tunnel, in the strict sense; the datagrams are following an 833 actual route at the network protocol layer.) 835 As explained in RAP [ID-RAP], a targeted route can be issued when 836 desired; in particular, it can be triggered by the establishment of a 837 TCP connection or by the arrival of datagrams that do not carry an Ri 838 indicating that they have followed a (non-tunnel) route. 840 6 TCP: Transport protocol 842 Internet version 7 expands the sizes of the sequence and 843 acknowledgement fields, the window, and the port numbers. This is to 844 remove limitations in version 4 that begin to restrict throughput at 845 (for example) the bandwidth of FDDI and round trip delay of more than 846 60 milliseconds. At gigabit speeds and delays typical of 847 international links, the version 4 TCP would be a serious limitation. 849 The port numbers are also expanded. This alleviates the problem of 850 going through the entire port number range with a rapid sequence of 851 transactions in less than the lifetime of datagrams in the network. 
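To make the limitation concrete, the following worked arithmetic (a Python sketch, not part of the specification) assumes that a sender can have at most one offered window in flight per round trip, and that the sequence space is consumed one number per octet. The function names are hypothetical; the 16 bit window and 32 bit sequence number are those of TCPv4, and 100 Mbit/s is the nominal FDDI rate.

```python
def max_throughput_bps(window_octets, rtt_seconds):
    """Upper bound on throughput with one full window per round trip."""
    return window_octets * 8 / rtt_seconds


def sequence_wrap_seconds(sequence_bits, link_bps):
    """Time to send 2**sequence_bits octets at the given link rate."""
    return (2 ** sequence_bits) * 8 / link_bps


FDDI_BPS = 100e6          # nominal FDDI bandwidth
RTT = 0.060               # the 60 millisecond round trip used in the text

v4_limit = max_throughput_bps(2 ** 16 - 1, RTT)   # 16 bit TCPv4 window
v7_limit = max_throughput_bps(2 ** 32 - 1, RTT)   # 32 bit TCPv7 window

# How quickly a 32 bit sequence space is used up at 1 Gbit/s.
wrap_32 = sequence_wrap_seconds(32, 1e9)
```

Under these assumptions the unscaled version 4 window caps a connection at a little under 9 Mbit/s on a 60 millisecond path, far below the FDDI rate, and a 32 bit sequence space is exhausted in roughly half a minute at one gigabit per second.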
853 6.1 TCP segment header format 855 The 64 bit fields (sequence and acknowledgement) in the TCP header are 856 off-phase aligned, in anticipation of the usual case of the TCP header 857 following the 7 32-bit word IP header. If IP options add up to an odd 858 number of 32 bit words, a null option may be added to push the 859 transport header to off-phase alignment. 861 0 1 2 3 862 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 | data offset | MBZ |A|P|R|S|F| checksum | 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 866 | source port | 867 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 868 | destination port | 869 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 870 | | 871 + sequence number + 872 | | 873 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 874 | | 875 + acknowledgement number + 876 | | 877 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 878 | window | 879 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 880 | options ... | 881 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 883 A description of each field: 885 6.1.1 Data offset 887 An 8 bit count of the number of 32 bit words in the TCP header, 888 including any options. 890 6.1.2 MBZ 892 Reserved bits, must be zero, and must be ignored. 894 6.1.3 Flags 896 These are the protocol state flags, use exactly as in TCPv4, except 897 that there is no urgent data flag. 899 6.1.4 Checksum 901 This is a 16 bit checksum of the segment. The pseudo-header used in 902 the checksum consists of the destination address, the source address, 903 the protocol field (constant 6 for TCP), and the 32 bit length of the 904 TCP segment. 906 6.1.5 Source port 908 The source port number, a 32 bit identifier. See the section on port 909 numbers below. 911 6.1.6 Destination port. 
913 The 32 bit destination port number. 915 6.1.7 Sequence 917 A 64 bit sequence number, the sequence number of the first octet of 918 user data in the segment. 920 The ISN (Initial Sequence Number) generator used in TCPv4 is used in 921 TCPv7, with the 32 bit value loaded into both the high and low 32 bits 922 of the TCPv7 sequence number. This provides reasonable behavior when 923 the 32 bit rollover option is used (see below) for TCPv4 924 interoperation. V7 hosts must implement the full 64 bit sequence 925 number rollover. 927 6.1.8 Acknowledgement 929 The 64 bit acknowledgement number, acknowledging receipt of octets up 930 to but not including the octet identified. Valid if the A flag is 931 set, if A is reset (0), this field should be zero, and must be 932 ignored. 934 6.1.9 Window 936 The 32 bit offered window. 938 6.1.10 Options 940 TCP options, some of which are documented below. 942 6.2 Port numbers 944 Port numbers are divided into several ranges: (all numbers are 945 decimal) 947 0 reserved 948 1-32767 Internet registered ("well-known") protocols 949 32768-98303 reserved, to allow TCPv7-TCPv4 conversion 950 98304 up dynamic assignment 952 It must also be remembered that hosts are free to dynamically assign 953 for active connections any port not actually in use by that host: 954 hosts must not reject connections because the "client" port is in the 955 registered range. 957 6.3 TCP options 959 6.3.1 Option Format 961 Each option begins with a 32 bit header: 963 0 1 2 3 964 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 965 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 966 | type | length | 967 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 968 | option data ... | padding | 969 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 971 6.3.2 Null 973 The null option (type = 0), is to be ignored. 
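The port ranges of section 6.2, together with the port rules given in sections 10.10 and 10.11, can be sketched as follows. This is an illustrative Python fragment only; the function names are hypothetical, and raising ValueError merely stands in for sending the ICMP conversion failed message.

```python
MAX_PORT = 2 ** 32 - 1


def classify_port(port):
    """Name the section 6.2 range that a 32 bit TCPv7 port falls in."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError("port does not fit in 32 bits")
    if port == 0:
        return "reserved"
    if port <= 32767:
        return "registered"
    if port <= 98303:
        return "reserved for conversion"
    return "dynamic"


def port_v4_to_v7(port):
    """Section 10.10: copy ports below 32768, otherwise add 65536."""
    if not 0 <= port <= 65535:
        raise ValueError("not a 16 bit port")
    return port if port < 32768 else port + 65536


def port_v7_to_v4(port):
    """Section 10.11: subtract 65536 from ports above 65535, fail if the
    result still does not fit in 16 bits."""
    if port > 65535:
        port -= 65536
        if port > 65535:
            raise ValueError("port conversion out of range")
    return port
```

Note that a version 4 port in 32768-65535 lands in the dynamic range (98304 and up) after conversion, which is why the range 32768-98303 is kept out of use in version 7.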
975 6.3.3 Maximum Segment Size 977 Maximum segment size (type = 1) specifies the largest segment that the 978 other TCP should send, in terms of the number of data octets. When 979 sent on a SYN segment, it is mandatory; if sent on any other segment 980 it is advisory. 982 Data is one 32 bit word specifying the size in octets. 984 6.3.4 Urgent Pointer 986 The urgent pointer (type = 2) emulates the urgent field in TCPv4. Its 987 presence is equivalent to the TCPv4 urgent (U) flag being set. The data is a 64 bit 988 sequence number identifying the last octet of urgent data. (Not an 989 offset, as in v4.) 990 6.3.5 32 Bit rollover 992 The 32 bit rollover option (type = 3) indicates that only the low 993 order 32 bits of the sequence and acknowledgement numbers are 994 significant in the segment. 996 This is necessary because a converting internet layer gateway has no 997 retained state, and cannot properly set the high order bits. This 998 option must be implemented by version 7 hosts that want to 999 interoperate with version 4 hosts. 1001 7 UDP: User Datagram protocol 1003 The user datagram protocol is also expanded to include larger port 1004 numbers, for reasons similar to those for TCP. 1006 7.1 UDP header format 1008 0 1 2 3 1009 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | data offset | MBZ | checksum | 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1013 | source port | 1014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1015 | destination port | 1016 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1017 | options ... | 1018 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1020 A description of each field: 1022 7.1.1 Data offset 1024 An 8 bit count of the number of 32 bit words in the UDP header, 1025 including any options. 1027 7.1.2 MBZ 1029 Reserved bits, must be zero, and must be ignored.
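The header layout of section 7.1 can be sketched in Python as below. This is an illustration under stated assumptions, not a definitive encoding: the MBZ field is taken to be 8 bits (what remains of the first word after the 8 bit data offset and the 16 bit checksum), fields are in network byte order, and the names UDP7_FIXED, pack_udp7_header and parse_udp7_header are hypothetical.

```python
import struct

# data offset (8), MBZ (8, assumed width), checksum (16),
# source port (32), destination port (32)
UDP7_FIXED = struct.Struct("!BBHII")


def pack_udp7_header(src_port, dst_port, checksum=0, options=b""):
    """Build a header; the data offset counts 32 bit words incl. options."""
    if len(options) % 4:
        raise ValueError("options must be a whole number of 32 bit words")
    words = (UDP7_FIXED.size + len(options)) // 4
    if words > 255:
        raise ValueError("header too long for an 8 bit data offset")
    return UDP7_FIXED.pack(words, 0, checksum, src_port, dst_port) + options


def parse_udp7_header(buf):
    """Parse a header; the MBZ bits are read but ignored, as required."""
    if len(buf) < UDP7_FIXED.size:
        raise ValueError("truncated header")
    words, _mbz, checksum, src_port, dst_port = UDP7_FIXED.unpack_from(buf)
    header_len = words * 4
    if header_len < UDP7_FIXED.size or header_len > len(buf):
        raise ValueError("bad data offset")
    return {
        "header_len": header_len,
        "checksum": checksum,
        "src_port": src_port,
        "dst_port": dst_port,
        "options": bytes(buf[UDP7_FIXED.size:header_len]),
    }
```

With no options the header is three 32 bit words (12 octets), so the data offset is 3.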
1031 7.1.3 Checksum 1033 This is a 16 bit checksum of the datagram. The pseudo-header used in 1034 the checksum consists of the destination address, the source address, 1035 the protocol field (constant 17 for UDP), and the 32 bit length of 1036 the user datagram. 1038 7.1.4 Source port 1040 The source port number, a 32 bit identifier. See the section on TCP 1041 port numbers above. 1043 7.1.5 Destination port. 1045 The 32 bit destination port number. 1047 7.1.6 Options 1049 UDP options; none are presently defined. 1051 8 ICMP 1053 The ICMP protocol is very similar to ICMPv4, in some cases not 1054 requiring any conversion. 1056 The complication is that IP datagrams are nested within ICMP messages, 1057 and must be converted. This is discussed later. 1059 8.1 ICMP header format 1061 0 1 2 3 1062 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1063 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1064 | type | code | checksum | 1065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1066 | type-specific parameter | 1067 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1068 | type-specific data ... | 1069 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1071 Type and code are well-known values, defined in [RFC792]. The codes 1072 have meaning only within a particular type; they are not orthogonal. 1074 The next 32 bit word is usually defined for the specific type; 1075 sometimes it is unused. 1077 For many types, the data consists of a nested IP datagram, usually 1078 truncated, which is a copy of the datagram causing the event being 1079 reported. In IPv4, the nested datagram consists of the IP header, and 1080 another 64 bits (at least) of the original datagram. 1082 For IPv7, the nested datagram must include the IP header plus 96 bits 1083 of the remaining datagram (thus including the port numbers within TCP 1084 and UDP), and should include the first 256 bytes of the datagram.
1085 I.e. in most cases where the original datagram was not large, it will 1086 return the entire datagram. 1088 8.2 Conversion failed ICMP message 1090 The introduction of network layer conversion requires a new message 1091 type, to report conversion errors. Note that an invalid datagram 1092 should result in the sending of some other ICMP message (e.g. 1093 parameter problem) or the silent discarding of the datagram. This 1094 message is only sent when a valid datagram cannot be converted. 1096 Note: implementations are not expected to, and should not, check the 1097 validity of incoming datagrams just to accomplish this; it simply 1098 means that an error detected during conversion that is known to be an 1099 actual error in the incoming datagram should be reported as such, not 1100 as a conversion failure. 1102 Note that the conversion failed ICMP message may be sent in either the 1103 IPv4 or IPv7 domain; it is a valid ICMP message type for IPv4. 1105 0 1 2 3 1106 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1107 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1108 | type | code | checksum | 1109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1110 | pointer to problem area | 1111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1112 | copy of datagram that could not be converted .... | 1113 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1115 The type for Conversion Failed is 31. 
1117 The codes are: 1119 0 Unknown/unspecified error 1120 1 Don't Convert option present 1121 2 Unknown mandatory option present 1122 3 Known unsupported option present 1123 4 Unsupported transport protocol 1124 5 Overall length exceeded 1125 6 IP header length exceeded 1126 7 Transport protocol > 255 1127 8 Port conversion out of range 1128 9 Transport header length exceeded 1129 10 32 Bit Rollover missing and ACK set 1130 11 Unknown mandatory transport option present 1132 The use of code 0 should be avoided; any other condition found by 1133 implementors should be assigned a new code requested from IANA. When 1134 code 0 is used, it is particularly important that the pointer be set 1135 properly. 1137 The pointer is an offset from the start of the original datagram to 1138 the beginning of the offending field. 1140 The data is part of the datagram that could not be converted. It must 1141 be at least the IP and transport headers, and must include the field 1142 indicated by the pointer. For code 4, the transport 1143 header is probably not identifiable; the data should include 256 bytes 1144 of the original datagram. 1146 9 Notes on the domain system 1148 9.1 A records 1150 Address records will be added to the IN (Internet) zone with IPv7 1151 addresses for all hosts as IPv7 is deployed. Eventually the IPv4 1152 addresses will be removed. As mentioned above, the AD space is 1153 initially assigned so that the first 4 octets of a v7 address cannot 1154 be confused with a v4 address (or, rather, the confusion will be to no 1155 effect). 1157 For example: 1159 DELTA.Process.COM. A 192.42.95.68 1160 A 126.0.0.192.42.95.1.68 1162 It is important that the A record be used, to avoid the cache 1163 consistency problem that would arise when different records had 1164 different remaining TTLs.
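The relationship between the two A records in the example can be sketched as below. This Python fragment is illustrative only: it assumes the AD 126.0.0 and a 7th octet of 1, as in the example and in the conversion steps of section 10.8, and the function names are hypothetical. The reverse mapping shown simply undoes the forward one and is an assumption of this sketch.

```python
import ipaddress

AD_PREFIX = bytes([126, 0, 0])   # AD used in the example
SUBNET_EXTENSION = 1             # 7th octet used for converted v4 addresses


def v4_to_v7(v4_text, ad=AD_PREFIX, extension=SUBNET_EXTENSION):
    """AD, first three v4 octets, subnet extension, last v4 octet."""
    octets = ipaddress.IPv4Address(v4_text).packed
    return ad + octets[:3] + bytes([extension]) + octets[3:]


def v7_to_v4(v7):
    """Drop the AD and the 7th octet, keeping octets 4-6 and 8."""
    if len(v7) != 8:
        raise ValueError("an IPv7 address is 8 octets")
    return str(ipaddress.IPv4Address(v7[3:6] + v7[7:8]))


def format_v7(v7):
    """Dotted decimal form, as written in the A record."""
    return ".".join(str(octet) for octet in v7)
```

For instance, format_v7(v4_to_v7("192.42.95.68")) yields the second A record of the example.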
1166 Note that if an unmodified version of the more popular public domain 1167 nameserver is a secondary for a zone containing IPv7 addresses, it 1168 will erroneously issue RRs with only the first four bytes. (I.e. 1169 126.0.0.192 in the example.) This is another reason to ensure that the 1170 AD numbers are initially reserved out of the IPv4 network number 1171 space. Eventually, zones with IPv7 addresses would be expected to be 1172 served only by upgraded servers. 1174 9.2 PTR zone 1176 The inverse (PTR) zone is .#, with the IPv7 address (reversed). I.e. 1177 just like .IN-ADDR.ARPA, but with .# instead. 1179 This respects the difference in actual authority: the NSF/DDN NIC is 1180 the authority for the entire space rooted in .IN-ADDR.ARPA. in the v4 1181 Internet, while in the new Internet it holds the authority only for 1182 the AD 0.0.126.#. (Plus, of course, any other ADs assigned to it over 1183 time.) 1184 10 Conversion between version 4 and version 7 1186 As noted in the description of datagram format, it is possible to 1187 provide a mostly-transparent bridge between version 4 and version 7. 1189 This discusses TCP and ICMP at the session/transport layer; UDP is a 1190 subset of the TCP conversion. Most protocols at this layer will 1191 probably need no translation; however it will probably be necessary to 1192 specify exactly which will have translations done. 1194 New protocols at the session/transport layer defined over IPv7 should 1195 have protocol numbers greater than 255, and will not be translated to 1196 IPv4. 1198 Most of the translations should consist of copying various fields, 1199 verifying fixed values in the datagram being translated, and setting 1200 fixed values in the datagram being produced. In general, the checksum 1201 must be verified first, and then a new checksum computed for the 1202 generated datagram. 
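The verify-then-recompute step relies on the familiar 16 bit one's complement checksum of IPv4. A minimal Python sketch of that algorithm follows; the function names are hypothetical, and the sample header used in the usage note is an ordinary IPv4 header chosen for illustration, not taken from this document.

```python
def ones_complement_sum(data):
    """16 bit one's complement sum of big-endian words, zero padded."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data):
    """Checksum to store, computed with the checksum field set to zero."""
    return ~ones_complement_sum(data) & 0xFFFF


def checksum_ok(data):
    """True if data, including its stored checksum, sums to all ones."""
    return ones_complement_sum(data) == 0xFFFF


# Sample IPv4 header with its checksum field (octets 10-11) zeroed.
sample = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
stored = internet_checksum(sample)
sample_full = sample[:10] + stored.to_bytes(2, "big") + sample[12:]
```

A converter would call checksum_ok on the incoming header, build the new header with a zero checksum field, and then store internet_checksum of that new header.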
1204 10.1 Version 4 IP address extension option 1206 A new option is defined for IP version 4, to carry the extended 1207 addresses of IPv7. This will be particularly useful in the initial 1208 testing of IPv7, during a time when most of the fabric of the internet 1209 is IPv4. An IPv7 host will be able to connect to another IPv7 host 1210 anywhere in the internet even though most of the paths and routers are 1211 IPv4, and still use the full addressing. This will continue to work 1212 until non-unique network numbers are assigned, by which time most of 1213 the infrastructure should be IPv7. 1215 10.1.1 Option format 1217 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1218 | type (147) | length = 10 | source IPv7 AD number | 1219 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1220 | ... | src 7th octet | destination IPv7 AD | 1221 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1222 | number ... | dst 7th octet | 1223 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1225 The source and destination are in IPv4 order (source first), for 1226 consistency. The type code is 147. 1228 10.2 Fragmented datagrams 1230 Datagrams that have been fragmented must be reassembled by the 1231 converting host or router before conversion. Where the conversion is 1232 being done by the destination host (i.e. the case of a v7 host 1233 receiving v4 datagrams), this is similar to the present fragmentation 1234 model. 1236 When it is being done by an intermediate router (acting as an 1237 internetwork layer gateway) the router should use all of source, 1238 destination, and datagram ID for identification of IPv4 fragments; 1239 note that destination is used implicitly in the usual reassembly at 1240 the destination. When reassembling an IPv7 datagram, the 128 bit 1241 fragment ID is used as usual. 1243 If the fragments take different paths through the net, and arrive at 1244 different conversion points, the datagram is lost.
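Reassembly before conversion can be sketched as a cache keyed on the fragment identification described above: (source, destination, datagram ID) for IPv4 fragments, or the 128 bit (router address, datagram ID) pair for IPv7. The Python fragment below is a minimal illustration, not a definitive implementation; the class and method names are hypothetical, offsets are assumed to be in octets, and timeouts and resource limits are omitted.

```python
class Reassembler:
    """Collects fragments per key until the whole datagram is present."""

    def __init__(self):
        self.cache = {}   # key -> {"pieces": {offset: bytes}, "total": int or None}

    def add(self, key, offset, payload, last=False):
        """Add one fragment; return the full payload once it is complete."""
        entry = self.cache.setdefault(key, {"pieces": {}, "total": None})
        entry["pieces"][offset] = payload
        if last:
            # The last fragment fixes the total length of the datagram.
            entry["total"] = offset + len(payload)
        if entry["total"] is None:
            return None
        out = bytearray()
        for off in sorted(entry["pieces"]):
            if off > len(out):
                return None          # a gap remains, keep waiting
            chunk = entry["pieces"][off]
            out[off:off + len(chunk)] = chunk
        if len(out) < entry["total"]:
            return None
        del self.cache[key]
        return bytes(out[:entry["total"]])


# Hypothetical keys: an IPv4 triple and an IPv7 (router address, ID) pair.
v4_key = ("192.42.95.68", "192.42.95.1", 0x1234)
v7_key = (0x7E0000C02A5F0144, 0x0123456789ABCDEF)
```

Because the key never includes the converting router, fragments that reach different conversion points end up in different caches and can never be combined, which is the loss case noted above.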
1246 10.3 Where does the conversion happen? 1248 The objective of conversion is to be able to upgrade systems, both 1249 hosts and routers, in whatever order desired by their owners. 1250 Organizations must be able to upgrade any given system without 1251 reconfiguration or modification of any other; and IPv4 hosts must be 1252 able to interoperate essentially forever. (IPv4 routers will probably 1253 be effectively eliminated at some point, except where they exist in 1254 their own remote or isolated corners.) 1256 Each TCP/IP v7 system, whether host or router, must be able to 1257 recognize adjacent systems in the topology that are (only) v4, and 1258 call the appropriate conversion routine just before sending the 1259 datagram. 1261 Digression: I believe v7 hosts will get much better performance by 1262 doing everything internally in v7, and using conversion to filter 1263 datagrams when necessary. This keeps the usual code path simple, with 1264 only a "hook" right after receiving to convert incoming IPv4 1265 datagrams, and just before sending to convert to IPv4. Routers may 1266 prefer to keep datagrams in their incoming version, at least until 1267 after the routing decision is made, and then doing the conversion only 1268 if necessary. In either case, this is an implementation specific 1269 decision. 1271 It must be noted that any forwarding system may convert datagrams to 1272 IPv7, then back to IPv4, even if that loses information such as 1273 unknown options. The reverse is not acceptable: a system that 1274 receives an IPv7 datagram should not convert it to IPv4, then back to 1275 IPv7 on forwarding. 1277 The preferred method for identifying which hosts require conversion is 1278 to ARP first for the IPv7 address, and then again if no response is 1279 received, for the IPv4 address. The reservation of ADs out of the v4 1280 network number space is useful again here, protecting hosts that fail 1281 to properly use the ARP address length fields. 
1283 On networks where ARP is not normally used, the method is to assume 1284 that a remote system is v7. If an IPv7 datagram is received from it, 1285 the assumption is confirmed. If, after a short time, no IPv7 datagram 1286 is received, a v7 ICMP echo is sent. If a reply is received (in 1287 either version) the assumption is confirmed. 1289 If no reply is received, the remote system is assumed not to 1290 understand IPv7, and datagrams are converted to IPv4 just before 1291 transmitting them. 1293 Implementations should also provide for explicit configuration where 1294 desired. 1296 10.4 Hybrid IPv4 systems 1298 In the course of implementing IPv7, especially in constrained 1299 environments such as small terminal servers, it may be useful to 1300 implement the IPv4 address extension option directly, thereby 1301 regaining universal connectivity. 1303 This may also be a useful interim step for vendors not prepared to do 1304 a major rework of an implementation; but it is important not to get 1305 stalled in this step. 1307 A hybrid IPv4 + address extension system does not have to implement 1308 the conversion; it places this onus on its neighbors. It may itself 1309 have an address with the subnet extension (7th byte) not equal to 1. 1311 The implication of hybrid systems is that it is not valid to assume 1312 that a host with an IPv7 address is a native IPv7 implementation. 1314 10.5 Maximum segment size in TCP 1316 It is probably advisable for IPv4 implementations to reduce the MSS 1317 offered by a small amount where possible, to avoid fragmentation when 1318 datagrams are converted to IPv7. This arises when IPv4 hosts are 1319 communicating through an IPv7 infrastructure, with the same MTU as the 1320 local networks of the hosts. 1322 10.6 Forwarding and redirects 1324 It may be important for a router not to send ICMP redirects when it 1325 finds that it must do a conversion as part of forwarding the datagram.
1326 In this case, the hosts involved may not be able to interact directly. 1327 The IPv7 host could ignore the redirect, but this results in an 1328 unpleasant level of noise as the sequence continually recurs. 1330 10.7 Design considerations 1332 The conversion is designed to be fairly efficient in implementation, 1333 especially on RISC architectures, assuming they can either do a 1334 conditional move (or store), or do a short forward branch without 1335 losing the instruction cache. The other conditional branches in the 1336 body of the code are usually not-taken out to the failure/discard 1337 case. 1339 Handling options does involve a loop and a dispatch (case) operation. 1340 The options in IPv4 are more difficult to handle, not being designed 1341 for speed on a 32 bit aligned RISCish architecture, but they do not 1342 occur often, except perhaps the address extension option. 1344 For CISC machines, the same considerations will lead to fairly 1345 efficient code. 1347 The conversion code must be extremely careful to be robust when 1348 presented with invalid input; in particular, it may be presented with 1349 truncated transport layer headers when called recursively from the 1350 ICMP conversion. 1352 10.8 Conversion from IPv4 to IPv7 1354 Individual steps in the conversion; the order is in most cases not 1355 significant. 1357 o Verify checksum. 1359 o Verify fragment offset is 0, MF flag is 0. 1361 o Verify version is 4. 1363 o Extend TTL to 16 bits, multiply by 16. 1365 o Set forward route identifier to 0. 1367 o Set first 3 octets of destination to AD (i.e. 126.0.0), copy 1368 first three octets from v4 address, set next octet to 1, copy 1369 last octet. (This can be done with shift/mask/or operations 1370 on most architectures.) 1372 o Do the same translation on source address. 1374 o Copy protocol, set high 8 bits to zero. 1376 o If DF flag set, add Don't Fragment option. 

      o   If Address Extension option present, copy ADs and subnet
          extension numbers into destination and source.

      o   Convert other options where possible. If an unknown option
          with copy-on-fragment is found, fail. If copy-on-fragment is
          not set, ignore the option. I.e. the flag is (ab)used as an
          indicator of whether the option is mandatory.

      o   Compute new IP header length.

      o   Convert session/transport layer (TCP) header and data.

      o   Compute new overall datagram length.

      o   Calculate IPv7 checksum.

10.9 Conversion from IPv7 to IPv4

   The steps to convert IPv7 to IPv4 follow. Note that the converting
   router or host is partly in the role of destination host; it checks
   both bits of class in IP options, and (as in the other direction) must
   reassemble fragmented datagrams.

      o   Verify checksum.

      o   Verify version is 7.

      o   Set type-of-service to 0 (there may be an option defined,
          that will be handled later).

      o   If length is greater than (about) 65563, fail. (That number
          is not a typographical error. Note that the IPv7+TCPv7
          headers add up to 28 bytes more than the corresponding v4
          headers in the usual case.) This check is only to avoid
          useless work; the precise check comes later.

      o   Generate an ID (using an ISN based sequence generator,
          possibly also based on destination or source or both).

      o   Set flags and fragment field to 0.

      o   Divide TTL by 16, if zero, fail (send ICMP Time Exceeded).
          If greater than 255, set to 255.

      o   If next layer protocol is greater than 255, fail. Else copy.

      o   Copy first 3 octets and 8th octet of destination to
          destination address.

      o   Same for source address.

      o   Generate v4 address extension option. (If enabled; this
          probably should be a configuration option, and should default
          to on.)

      o   Process v7 options. If any unknown options of class not 0
          found, fail.

      o   If Don't Fragment option found, set DF flag.

      o   If Don't Convert option found, fail.

      o   Convert other options where possible, or fail.

      o   Compute new IP header length. This may fail (too large);
          fail the conversion if so.

      o   Convert session/transport layer (e.g. TCP).

      o   Compute new overall datagram length. If greater than 65535,
          fail.

      o   Compute IPv4 checksum.

10.10 Conversion from TCPv4 to TCPv7

      o   Subtract header words from v4 checksum. (Note that this is
          actually done with one's complement addition.)

      o   Copy flags (except for Urgent).

      o   If source port is less than 32768 (a sign condition test will
          suffice on most architectures), copy it. If equal or
          greater, add 65536.

      o   Same operation on destination port.

      o   Copy sequence to low 32 bits, set high to 0.

      o   Copy acknowledgement to low 32 bits, set high to 0.

      o   Copy window. (The TCPv4 performance extension [RFC1323]
          window-scale cannot be used, as it would require state; we
          use the basic window offered.)

      o   Add 32 bit rollover option.

      o   Convert maximum segment size option if present.

      o   Compute data offset and copy data.

      o   Add header words into saved checksum. It is important not to
          recompute the checksum over the data; it must remain an
          end-to-end checksum.

      o   Return to IP layer conversion.

10.11 Conversion from TCPv7 to TCPv4

      o   Subtract header from v7 checksum.

      o   If source port is greater than 65535, subtract 65536. If
          result is still greater than 65535, fail. (Send ICMP
          conversion failed/port conversion out of range. The sending
          host may then reset its port number generator to 98304.)

      o   Same translation for destination port.

      o   Copy low 32 bits of sequence number.

      o   If A bit set, copy low 32 bits of acknowledgement.

      o   Copy flags.

      o   If window is greater than 61440, set it to 24576. If less,
          copy it unchanged.
          (Rationale for the 24K figure: this has been found to be a
          good default for IPv4 hosts. If the IPv7 host is offering a
          very large window, the IPv4 host probably isn't prepared to
          play at that level.)

      o   Process options. If 32 Bit Rollover is not present, and A
          flag is set, fail. (Send ICMP conversion failed/32 bit
          Rollover missing.)

      o   If Urgent is present, compute offset. If in segment, set U
          flag and offset field. If not, ignore.

      o   Convert Maximum Segment Size option. If greater than 16384,
          set to 16384.

      o   Compute new data offset.

      o   Add header words into v4 checksum.

      o   Return to IP layer conversion.

10.12 ICMP conversion

   ICMP messages are converted by copying the type and code into the new
   packet, and copying the other type-specific fields directly.

   If the message contains an encapsulated, and usually truncated, IP
   datagram, the conversion routine is called recursively to translate it
   as far as possible. There are some special considerations:

      o   The encapsulated datagram is less likely to be valid, given
          that it did generate an error of some kind.

      o   The conversion should attempt to complete all fields
          available, even if some would cause failures in the general
          case. Note, in particular, that in the course of converting
          a datagram, when a failure occurs, an ICMP message
          (conversion failed) is sent; this message itself may
          immediately require conversion. Part of that conversion will
          involve converting the original datagram.

      o   Conditions such as overall datagram length too large are not
          checked.

      o   The AD and subnet byte assumed in the nested conversion may
          not be sensible if the IPv4 address extension option is not
          present and the datagram has strayed from the expected AD.
          (Not unlikely, given that we know a priori that some error
          occurred.)

      o   The conversion must be very sure not to make another
          recursive call if the nested datagram is an ICMP message.
          (This should not occur, but obviously may.)

      o   It is probably impossible to generate a correct transport
          layer checksum in the nested datagram. The conversion may
          prefer to just zero the checksum field. Likewise, validating
          the original checksum is pointless.

   It may be best in a given implementation to have a separate code path
   for the nested conversion, that handles these issues out of the
   optimized usual path.

11 Postscript

   The present version of TCP/IP has been a success partly by accident,
   for reasons that weren't really designed in. Perhaps the most
   significant is the low level of network integration required to make
   it work.

   We must be careful to retain the successful ingredients, even where we
   may be unaware of them. Tread lightly, and use all that we have
   learned, especially about not changing things that work.

   This document has described a fairly conservative step forward, with
   clear extensibility for future developments, but without jumping into
   the abyss.

12 References

   [RFC768]    Jon Postel. User Datagram Protocol. August, 1980.

   [RFC791]    Jon Postel, editor. Internet Protocol. DARPA Internet
               Program Protocol Specification. ISI/USC. September,
               1981.

   [RFC792]    Jon Postel, editor. Internet Control Message Protocol.
               DARPA Internet Program Protocol Specification.
               ISI/USC. September, 1981.

   [RFC793]    Jon Postel, editor. Transmission Control Protocol.
               DARPA Internet Program Protocol Specification.
               ISI/USC. September, 1981.

   [RFC801]    Jon Postel. NCP/TCP transition plan. November, 1981.

   [RFC1287]   D. Clark, L. Chapin, V. Cerf, R. Braden, R. Hobby.
               Towards the Future Internet Architecture. December,
               1991.

   [RFC1323]   V. Jacobson, R. T. Braden, D. A. Borman.
               TCP extensions for high performance. May, 1992.

   [RFC1335]   Z. Wang, J. Crowcroft. Two-tier address structure for the
               Internet: A solution to the problem of address space
               exhaustion. May, 1992.

   [RFC1338]   V. Fuller, T. Li, J. Yu, K. Varadhan. Supernetting: an
               Address Assignment and Aggregation Strategy. June,
               1992.

   [RFC1347]   R. W. Callon. TCP and UDP with Bigger Addresses (TUBA),
               A simple proposal for Internet addressing and routing.
               June, 1992.

   [ID-RAP]    Robert Ullmann. RAP: Internet Route Access Protocol.
               Internet draft: unpublished at this writing. January,
               1993.

13 Author's Address

   Robert Ullmann
   Process Software Corporation
   959 Concord Street
   Framingham, MA 01701
   USA

   Phone: +1 508 879 6994 x226
   Email: Ariel@Process.COM

   This draft expires on or before July 4, 1993.
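
   The scalar mappings listed in sections 10.8 through 10.11 (address,
   TTL, TCP port and TCP window) can be sketched as follows. This is an
   informal illustration only, not part of the specification: the
   function names and the ConversionFailed exception are hypothetical,
   addresses are modelled as plain byte strings, and the header parsing,
   option handling and checksum adjustment steps are left out entirely.

```python
# Illustrative sketch of a few conversion steps from sections 10.8-10.11.
# All names here are hypothetical; they do not come from the draft.

AD_DEFAULT = bytes([126, 0, 0])  # the AD value quoted in section 10.8


class ConversionFailed(Exception):
    """Raised wherever the step lists say "fail" (hypothetical name)."""


def addr_v4_to_v7(v4: bytes, ad: bytes = AD_DEFAULT) -> bytes:
    """AD (3 octets), first 3 v4 octets, subnet extension 1, last v4 octet."""
    if len(v4) != 4 or len(ad) != 3:
        raise ValueError("expected a 4-octet address and a 3-octet AD")
    return ad + v4[:3] + bytes([1]) + v4[3:]


def ttl_v4_to_v7(ttl: int) -> int:
    """Extend the TTL to 16 bits and multiply by 16."""
    return ttl * 16


def ttl_v7_to_v4(ttl: int) -> int:
    """Divide by 16; fail on zero, clamp to 255."""
    ttl4 = ttl // 16
    if ttl4 == 0:
        raise ConversionFailed("TTL exhausted (ICMP Time Exceeded)")
    return min(ttl4, 255)


def port_v4_to_v7(port: int) -> int:
    """Copy ports below 32768; add 65536 to the rest."""
    return port if port < 32768 else port + 65536


def port_v7_to_v4(port: int) -> int:
    """Subtract 65536 from ports above 65535; fail if still out of range."""
    if port > 65535:
        port -= 65536
    if port > 65535:
        raise ConversionFailed("port conversion out of range")
    return port


def window_v7_to_v4(window: int) -> int:
    """Replace windows above 61440 with the 24K default; else copy."""
    return 24576 if window > 61440 else window
```

   Under these assumptions a TCPv4 port of 32768 or more maps to the
   range starting at 98304, the same value section 10.11 suggests for
   resetting the port number generator, and port_v7_to_v4 maps it back.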