Internet Draft                                            J. Noel Chiappa
Expires: January 21, 1995                                    July 21, 1994

                       IPng Technical Requirements
            Of the Nimrod Routing and Addressing Architecture

Status of this Memo

   This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.
   Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a 'working draft' or
'work in progress.'
   Please check the Internet Draft abstract listing (in the file
1id-abstracts.txt) contained in the Internet Drafts Shadow Directories
(cd internet-drafts, on nic.ddn.mil, nnsc.nsf.net, ftp.nisc.sri.com,
nic.nordu.net, or munnari.oz.au) to learn the current status of any
Internet Draft.

   This draft document will be submitted to the RFC Editor as an
Informational RFC. Distribution of this document is unlimited. Please
send comments to jnc@lcs.mit.edu.

1.1 Abstract

   This document presents the requirements that the Nimrod routing and
addressing architecture has upon the internetwork layer protocol. To be
most useful to Nimrod, any protocol selected as the IPng should satisfy
these requirements. Also presented is some background information,
consisting of i) information about architectural and design principles
which might apply to the design of a new internetworking layer, and ii)
some details of the logic and reasoning behind particular requirements.

1.2 Introduction

   It is important to note that this document is not "IPng Requirements
for Routing", as other proposed routing and addressing designs may need
different support; this document is specific to Nimrod, and doesn't claim
to speak for other efforts.

   However, although I don't wish to assume that the particular designs
being worked on by the Nimrod WG will be widely adopted by the Internet
(if for no other reason, they have not yet been deployed and tried and
tested in practice, to see if they really work, an absolutely necessary
hurdle for any protocol), there are reasons to believe that any routing
architecture for a large, ubiquitous global Internet will have many of the
same basic, fundamental principles as the Nimrod architecture, and the
requirements that these generate.
   While current-day routing technologies do not yet have the
characteristics and capabilities that generate these requirements, they
also do not seem to be completely suited to routing in the next-generation
Internet.
As routing technology moves towards what is needed for the next-generation
Internet, the underlying fundamental laws and principles of routing will
almost inevitably drive the design, and hence the requirements, toward
things which look like the material presented here.
   Therefore, even if Nimrod is not the routing architecture of the
next-generation Internet, the basic routing architecture of that Internet
will have requirements that, while differing in detail, will almost
inevitably be similar to these.

   In a similar, but more general, context, note that, by and large, the
general analysis of sections 3.1 ("Interaction Architectural Issues") and
3.2 ("State and Flows in the Internetwork Layer") will apply to other
areas of a new internetwork layer, not just routing.

   I will tackle the internetwork packet format first (which is simpler),
and then the whole issue of the interaction with the rest of the
internetwork layer (which is a much more subtle topic).

2.1 Packet Format Issues

   As a general rule, the design philosophy of Nimrod is "maximize the
lifetime (and flexibility) of the architecture". Design tradeoffs (i.e.
optimizations) that will adversely affect the flexibility, adaptability,
and lifetime of the design are not necessarily wise choices; they may cost
more than they save. Such optimizations might be the correct choices in a
stand-alone system, where the replacement costs are relatively small; in
the global communication network, the replacement costs are very much
higher.

   Providing the Nimrod functionality requires the carrying of certain
information in the packets. The design principle noted above has a number
of corollaries in specifying the fields to contain that information.
   First, the design should be "simple and straightforward", which means
that various functions should be handled by completely separate mechanisms
and fields in the packets. It may seem that an opportunity exists to save
space by overloading two functions onto one mechanism or field, but
general experience is that, over time, this attempt at optimization costs
more, by restricting flexibility and adaptability.
   Second, field lengths should be specified to be somewhat larger than
can conceivably be used; the history of system architecture is replete
with examples (processor address size being the most notorious) where
fields became too short over the lifetime of the system. This document
indicates what the smallest reasonable "adequate" lengths are, but this is
more of a "critical floor" than a recommendation. A "recommended" length
is also given, which is the length which corresponds to the application of
this principle. The wise designer would pick this length.
   It is important to note that this does *not* mean that implementations
must support the maximum value possible in a field of that size. I imagine
that system-wide administrative limits will be placed on the maximum
values which must be supported. Then, as the need arises, we can increase
the administrative limit. This allows an easy, and completely
interoperable (with no special mechanisms) path to upgrade the capability
of the network.
   If the maximum supported value of a field needs to be increased from M
to N, an announcement is made that this is coming; during the interim
period, the system continues to operate with M, but new implementations
are deployed; while this is happening, interoperation is automatic, with
no transition mechanisms of any kind needed. When things are "ready" (i.e.
the proportion of old equipment is small enough), use of the larger value
commences.
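   To illustrate the mechanics of such an administrative limit, here is a
minimal sketch (purely illustrative; the field width, the limit, and the
names are invented here, not taken from any Nimrod specification). Moving
from M to N is then just a matter of raising the configured limit once
enough deployed implementations support the larger value:

      FIELD_BITS = 16   # width of the field as actually carried in packets
      ADMIN_MAX = 255   # hypothetical system-wide limit currently in force

      def check_field(value, admin_max=ADMIN_MAX):
          # The wire field is wider than any value currently permitted, so
          # raising the limit later is a policy change, not a format change.
          if not 0 <= value < (1 << FIELD_BITS):
              raise ValueError("value does not fit in the wire field at all")
          if value > admin_max:
              raise ValueError("value exceeds the administrative limit")
          return value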

   Also, in speaking of the packet format, you first need to distinguish
between the host-router part of the path and the router-router part; a
format that works OK for one may not do for the other.
   The issue is complicated by the fact that Nimrod can be made to work,
albeit not in optimal form, with information/fields missing from the
packet in the host to "first hop router" section of the packet's path. The
missing fields and information can then be added by the first hop router.
(This capability will be used to allow deployment and operation with
unmodified IPv4 hosts, although similar techniques could be used with
other internetworking protocols.) Access to the full range of Nimrod
capabilities will require upgrading of hosts to include the necessary
information in the packets they exchange with the routers.
   Second, Nimrod currently has three planned forwarding modes (flows,
datagram, and source-routed packets), and a format that works for one may
not work for another; some modes use fields that are not used by other
modes. The presence or absence of these fields will make a difference.

2.2 Packet Format Fields

   What Nimrod would like to see in the internetworking packet is the
following (a sketch of one possible layout appears after this list):

- Source and destination endpoint identification. There are several
  possibilities here.

  One is "UIDs", which are "shortish", fixed-length fields which appear in
  each packet, in the internetwork header, and which contain globally
  unique, topologically insensitive identifiers for either i) endpoints
  (if you aren't familiar with endpoints, think of them as hosts), or ii)
  multicast groups. (In the former instance, the UID is an EID; in the
  latter, a "set ID", or SID. An SID is an identifier which looks just
  like an EID, but it refers to a group of endpoints. The semantics of
  SIDs are not completely defined.) For each of these, 48 bits is
  adequate, but we would recommend 64 bits. (IPv4 will be able to operate
  with smaller ones for a while, but will eventually need either a new
  packet format, or the difficult and not wholly satisfactory technique
  known as Network Address Translators, which allows the contents of these
  fields to be only locally unique.)

  Another possibility is some shorter field, named an "endpoint selector",
  or ESEL, which contains a value which is not globally unique, but only
  unique in mapping tables on each end, tables which map from the small
  value to a globally unique value, such as a DNS name.

  Finally, it is possible to conceive of overall networking designs which
  do not include any endpoint identification in the packet at all, but
  transfer it at the start of a communication, and from then on infer it.
  This alternative would have to have some other means of telling which
  endpoint a given packet is for, if there are several endpoints at a
  given destination. Some coordination on allocation of flow-ids, or
  higher-level port numbers, etc., might do this.

- Flow identification.
  There are two basic approaches here, depending on whether flows are
  aggregated (in intermediate switches) or not. It should be emphasized at
  this point that it is not yet known whether flow aggregation will be
  needed. The only reason to do it is to control the growth of state in
  intermediate routers, but there is no hard case made either that this
  growth will be unmanageable, or that aggregating flows will be feasible
  practically.

  For the non-aggregated case, a single "flow-id" field will suffice. This
  *must not* use one of the two previous UID fields, as in datagram mode
  (and probably source-routed mode as well) the flow-id will be
  over-written during transit of the network. It could most easily be
  constructed by adding a UID to a locally unique flow-id, which will
  provide a globally unique flow-id. It is possible to use non-globally
  unique flow-ids (which would allow a shorter length for this field),
  although this would mean that collisions would result, and would have to
  be dealt with. An adequate length for the local part of a globally
  unique flow-id would be 12 bits (which would be my "out of thin air"
  guess), but we recommend 32. For a non-globally unique flow-id, 24 bits
  would be adequate, but I recommend 32.

  For the aggregated case, three broad classes of mechanism are possible.

  - Option 1: The packet contains a sequence (sort of like a source route)
    of flow-ids. Whenever you aggregate or de-aggregate, you move along
    the list to the next one. This takes the most space, but is otherwise
    the least work for the routers.

  - Option 2: The packet contains a stack of flow-ids, with the current
    one on the top. When you aggregate, you push a new one on; when you
    de-aggregate, you take one off. This takes more work, but less space
    in the packet than the complete "source-route". Encapsulating packets
    to do aggregation does basically this, but you're stacking entire
    headers, not just flow-ids. The clever way to do this flow-id
    stacking, without doing encapsulation, is to find out from flow-setup
    how deep the stack will get, and allocate the space in the packet when
    it's created. That way, all you ever have to do is insert a new
    flow-id, or "remove" one; you never have to make room for more
    flow-ids.

  - Option 3: The packet contains only the "base" flow-id (i.e. the one
    with the finest granularity), and the current flow-id. When you
    aggregate, you just bash the current flow-id. The tricky part comes
    when you de-aggregate; you have to put the right value back. To do
    this, you have to have state in the router at the end of the
    aggregated flow, which tells you what the de-aggregated flow for each
    base flow is. The downside here is obvious: we get away without
    individual flow state for each of the constituent flows in all the
    routers along the path of that aggregated flow, *except* for the last
    one.

  Other than encapsulation, which incurs significant space overhead fairly
  quickly (after just a few layers of aggregation), there appears to be no
  way to do it with just one flow-id in the packet header.
  Even if you don't touch the packets, but do the aggregation by mapping
  some number of "base" flow-ids to a single aggregated flow in the
  routers along the path of the aggregated flow, the table that does the
  mapping is still going to have to have a number of entries directly
  proportional to the number of base flows going through the switch.

- A looping packet detector. This is any mechanism that will detect a
  packet which is "stuck" in the network; a timeout value in packets,
  together with a check in routers, is an example. If this is a hop-count,
  it has to be more than 8 bits; 12 bits would be adequate, and I
  recommend 16 (which also makes it easy to update). This is not to say
  that I think networks with diameters larger than 256 are good, or that
  we should design such nets, but I think limiting the maximum path
  through the network to 256 hops is likely to bite us down the road the
  same way making "infinity" 16 in RIP did (as it did, eventually). When
  we hit that ceiling, it's going to hurt, and there won't be an easy fix.
  I will note in passing that we are already seeing path lengths of over
  30 hops.

- Optional source and destination locators. These are structured,
  variable-length items which are topologically sensitive identifiers for
  the place in the network from which the traffic originates or to which
  the traffic is destined. The locator will probably contain internal
  separators which divide up the fields, so that a particular field can be
  enlarged without creating a great deal of upheaval. An adequate value
  for the maximum length supported would be up to 32 bytes per locator,
  and longer would be even better; I would recommend up to 256 bytes per
  locator.

- Perhaps (paired with the above), an optional pointer into the locators.
  This is optional "forwarding state" (i.e. state in the packet which
  records something about its progress across the network) which is used
  in the datagram forwarding mode to help ensure that the packet does not
  loop. It can also improve the forwarding processing efficiency. It is
  thus not absolutely essential, but is very desirable from a real-world
  engineering viewpoint. It needs to be large enough to identify locations
  in either locator; e.g. if locators can be up to 256 bytes, it would
  need to be 9 bits.

- An optional source route. This is used to support the "source routed
  packet" forwarding mode. Although not designed in detail yet, we can
  discuss two possible approaches.

  In one, used with "semi-strict" source routing (in which a contiguous
  series of entities is named, albeit perhaps at a high layer of
  abstraction), the syntax will likely look much like source routes in
  PIP; in Nimrod they will be a sequence of Nimrod entity identifiers
  (i.e. locator elements, not complete locators), along with clues as to
  the context in which each identifier is to be interpreted (e.g. up,
  down, across, etc). Since those identifiers themselves are variable
  length (although probably most will be two bytes or less, otherwise the
  routing overhead inside the named object would be excessive), and the
  hop count above contemplates the possibility of paths of over 256 hops,
  it would seem that these might possibly some day exceed 512 bytes, if a
  lengthy path was specified in terms of the actual physical assets used.
  An adequate length would be 512 bytes; the recommended length would be
  2^16 bytes (although this length would probably not be supported in
  practice; rather, the field length would allow it).

  In the other, used with classical "loose" source routes, the source
  route consists of a number of locators. It is not yet clear if this mode
  will be supported. If so, the header would need to be able to store a
  sequence of locators (as described above). Space might be saved by not
  repeating locator prefixes that match that of the previous locator in
  the sequence; Nimrod will probably allow use of such "locally useful"
  locators. It is hard to determine what an adequate length would be for
  this case; the recommended length would be 2^16 bytes (again, with the
  previous caveat).

- Perhaps (paired with the above), an optional pointer into the source
  route. This is also optional "forwarding state". It needs to be large
  enough to identify locations anywhere in the source route; e.g. if the
  source route can be up to 1024 bytes, it would need to be 10 bits.

- An internetwork header length. I mention this since the above fields
  could easily exceed 256 bytes; if they are all to be carried in the
  internetwork header (see comments below as to where to carry all this
  information), the header length field needs to be more than 8 bits; 12
  bits would be adequate, and I recommend 16 bits. The approach of putting
  some of the data items above into an interior header, to limit the size
  of the basic internetworking header, does not really seem optimal, as
  this data is for use by the intermediate routers, and it needs to be
  easily accessible.

- Authentication of some sort is needed. See the recent IAB document which
  was produced as a result of the IAB architecture retreat on security
  (draft-iab-sec-arch-workshop-00.txt), section 4, and especially section
  4.3. There is currently no set way of doing "denial/theft of service" in
  Nimrod, but this topic is well explored in that document; Nimrod would
  use whatever mechanism(s) seem appropriate to those knowledgeable in
  this area.

- A version number. Future forwarding mechanisms might need other
  information (i.e. fields) in the packet header; a version number would
  allow the header to be modified to contain what's needed. (This would
  not necessarily be information that is visible to the hosts, so this
  does not necessarily mean that the hosts would need to know about this
  new format.) 4 bits is adequate; it's not clear if a larger value needs
  to be recommended.
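   To pull the above inventory together, here is a sketch of one possible
header layout. This is purely illustrative, not a wire format from the
Nimrod specification; the class name and field names are invented, and the
widths noted in the comments are the "recommended" (not merely "adequate")
values from the list above:

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class NimrodHeader:                    # name is invented
          version: int                       # 4 bits
          header_length: int                 # 16 bits; may exceed 256 bytes
          hop_limit: int                     # 16 bits; looping packet detector
          source_uid: int                    # 64 bits; EID of source endpoint
          dest_uid: int                      # 64 bits; EID, or SID (multicast)
          flow_id: int                       # 64-bit UID + 32-bit local part,
                                             #  giving a globally unique flow-id
          source_locator: Optional[bytes] = None  # structured; up to 256 bytes
          dest_locator: Optional[bytes] = None    # up to 256 bytes
          locator_ptr: Optional[int] = None       # 9 bits: 512 positions cover
                                                  #  two 256-byte locators
          source_route: Optional[bytes] = None    # length field allows 2^16 bytes
          route_ptr: Optional[int] = None         # indexes into the source route
          auth: Optional[bytes] = None            # authentication data, form TBD

   The optional fields reflect the point made earlier: an unmodified host
may omit the locators, pointers, and source route, and a first-hop router
may supply them.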

2.3 Field Requirements and Addition Methods

   As noted above, it's possible to use Nimrod in a limited mode where
needed information/fields are added by the first-hop router. It's thus
useful to ask "which of the fields must be present in the host-router
header, and which could be added by the router?" The only fields which are
absolutely necessary in all packets are the endpoint identification fields
(provided that some means is available to map them into locators; this
would obviously be most useful with UIDs which are EIDs).
   As to the others, if the user wishes to use flows, and wants to
guarantee that their packets are assigned to the correct flows, the
flow-id field is needed. If the user wishes efficient use of the datagram
mode, it's probably necessary to include the locators in the packet sent
to the router. If the user wishes to specify the route for the packets,
and does not wish to set up a flow, they need to include the source route.

   How would additional information/fields be added to the packet, if the
packet is emitted from the host in incomplete form? (By this, I mean the
simple question of how, mechanically, not the more complex issue of where
any needed information comes from.)
   This question is complex, since all the IPng candidates (and, in fact,
any reasonable internetworking protocol) are extensible protocols; those
extension mechanisms could be used. Also, it would be possible to carry
some of the required information as user data in the internetworking
packet, with the original user's data encapsulated further inside.
Finally, a private inter-router packet format could be defined.
   It's not clear which path is best, but we can talk about which fields
the Nimrod routers need access to, and how often; less-used ones could be
placed in harder-to-get-to locations (such as in an encapsulated header).
The fields to which the routers need access on every hop are the flow-id
and the looping packet detector. The locator/pointer fields are only
needed at intervals (in what datagram forwarding mode calls "active"
routers), as is the source route (the latter at every object which is
named in the source route).
   Depending on how access control is done, and which forwarding mode is
used, the UIDs and/or locators might be examined for access control
purposes, wherever that function is performed.
   This is not a complete exploration of the topic, but it should give a
rough idea of what's going on.
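   As a purely mechanical illustration of the first-hop completion just
described (reusing the illustrative header class sketched above), the
router might do something like the following; the lookup function here is
a hypothetical stand-in for whatever mapping service answers the separate,
harder question of where the information comes from:

      def complete_header(hdr, locator_for_uid):
          # A first-hop router fills in fields an unmodified host omitted.
          if hdr.dest_locator is None:
              # Map the topologically-insensitive UID to a locator.
              hdr.dest_locator = locator_for_uid(hdr.dest_uid)
          if hdr.locator_ptr is None:
              hdr.locator_ptr = 0   # fresh forwarding state, datagram mode
          return hdr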

3.1 Interaction Architectural Issues

   The topic of the interaction with the rest of the internetwork layer
is more complex. Nimrod springs in part from a design vision which sees
the entire internetwork layer, distributed across all the hosts and
routers of the internetwork, as a single system, albeit a distributed
system.

   Approached from that angle, one naturally falls into a typical system
designer's point of view, where you start to think of the modularization
of the system: choosing the functional boundaries which divide the system
up into functional units, and defining the interactions between the
functional units. As we all know, that modularization is the key part of
the system design process.
   It's rare that a group of completely independent modules forms a
system; there's usually a fairly strong internal interaction. Those
interactions have to be thought about and understood as part of the
modularization process, since it affects the placement of the functional
boundaries. Poor placement leads to complex interactions, or to desired
interactions which cannot be realized.
   These are all more important issues with a system which is expected to
have a long lifetime; correct placement of the functional boundaries, so
as to clearly and simply break up the system into truly fundamental units,
is a necessity if the system is to endure and serve well.

3.1.1 The Internetwork Layer Service Model

   To return to the view of the internetwork layer as a system, that
system provides certain services to its clients; i.e. it instantiates a
service model. To begin with, lacking a shared view of the service model
that the internetwork layer is supposed to provide, it's reasonable to
suppose that it will prove impossible to agree on mechanisms at the
internetwork level to provide that service.
   To answer the question of what the service model ought to be, one can
view the internetwork layer itself as a subsystem of an even larger
system, the entire internetwork itself. (That system is quite likely the
largest and most complex system we will ever build, as it is the largest
system we can possibly build; it is the system which will inevitably
contain almost all other systems.)
   From that point of view, the issue of the service model of the
internetwork layer becomes a little clearer. The services provided by the
internetwork layer are no longer purely abstract, but can be thought of as
the external module interface of the internetwork layer module. If
agreement can be reached on where to put the functional boundaries of the
internetwork layer, and on what overall service the internet as a whole
should provide, the service model of the internetwork layer should be
easier to agree on.
   In general terms, it seems that the unreliable packet ought to remain
the fundamental building block of the internetwork layer. The design
principle that says that we can take any packet and throw it away with no
warning or other action, or take any router and turn it off with no
warning, and have the system still work, seems very powerful. The
component design simplicity (since routers don't have to stand on their
heads to retain a packet of which they have the only copy), and the
overall system robustness, resulting from these two assumptions are
absolutely critical.
   In detail, however, particularly in areas which are still the subject
of research and experimentation (such as resource allocation, security,
etc.), it seems difficult to provide a finished definition of exactly what
the service model of the internetwork layer ought to be.

3.1.2 The Subsystems of the Internetwork Layer

   In any event, by viewing the internetwork layer as a large system, one
starts to think about what subsystems are needed, and what the
interactions among them should look like. Nimrod is simply a number of the
subsystems of this larger system, the internetwork layer. It is *not*
intended to be a purely standalone set of subsystems, but to work together
in close concert with the other subsystems of the internetwork layer
(resource allocation, security, charging, etc.) to provide the
internetwork layer service model.
   One reason that Nimrod is not simply a monolithic subsystem is that
some of the interactions with the other subsystems of the internetwork
layer, for instance the resource allocation subsystem, are much clearer
and easier to manage if the routing is broken up into several subsystems,
with the interactions between them open.
   It is important to realize that Nimrod was initially broken up into
separate subsystems for purely internal reasons. It so happens that,
considered as a separate problem, the fundamental boundary lines for
dividing routing up into subsystems are the same boundaries that make
interaction with other subsystems cleaner; this provides added evidence
that these boundaries are in fact the right ones.

   The subsystems which comprise the functionality covered by Nimrod are
i) routing information distribution (in the case of Nimrod, topology map
distribution, along with the attributes [policy, QOS, etc.] of the
topology elements), ii) route selection (strictly speaking, not part of
the Nimrod spec per se, but functional examples will be produced), and
iii) user traffic handling.
   The first can fairly well be defined without reference to other
subsystems, but the second and third are necessarily more involved. For
instance, route selection might involve finding out which links have the
resources available to handle some required level of service. For user
traffic handling, if a particular application needs a resource
reservation, getting that resource reservation to the routers is as much a
part of getting the routers ready as making sure they have the correct
routing information, so here too, routing is tied in with other
subsystems.

   In any event, although we can talk about the relationship between the
Nimrod subsystems and the other functional subsystems of the internetwork
layer, until the service model of the internetwork layer is more clearly
visible, along with the functional boundaries within that layer, such a
discussion is necessarily rather nebulous.

3.2 State and Flows in the Internetwork Layer

   The internetwork layer as a whole contains a variety of information,
of varying lifetimes. This information we can refer to as the internetwork
layer's "state". Some of this state is stored in the routers, and some is
stored in the packets.
   In the packet, I distinguish between what I call "forwarding state",
which records something about the progress of this individual packet
through the network (such as the hop count, or the pointer into a source
route), and other state, which is information about what service the user
wants from the network (such as the destination of the packet), etc.

3.2.1 User and Service State

   I call state which reflects the desires and service requests of the
user "user state". This is information which could be sent in each packet,
or which can be stored in the router and applied to multiple packets
(depending on which makes the most engineering sense). It is still called
user state, even when a copy is stored in the routers.
   User state can be divided into two classes: "critical" (such as
destination addresses), without which the packets cannot be forwarded at
all, and "non-critical" (such as a resource allocation class), without
which the packets can still be forwarded, just not quite in the way the
user would most prefer.
   There is a range of possible mechanisms for getting this user state to
the routers; it may be put in every packet, or placed there by a setup. In
the latter case, you have a whole range of possibilities for how to get it
back when you lose it, such as placing a copy in every Nth packet.

   However, other state is needed which cannot be stored in each packet;
it's state about the longer-term (i.e. across the life of many packets)
situation; i.e. state which is inherently associated with a number of
packets over some time-frame (e.g. a resource allocation). I call this
state "service state".
   This apparently changes the "stateless" model of routers somewhat, but
this change is more apparent than real.
The routers already contain state, such as routing table entries: state
without which it is virtually impossible to handle user traffic. All that
is being changed is the amount, granularity, and lifetime of state in the
routers.
   Some of this service state may need to be installed in a fairly
reliable fashion; e.g. if there is service state related to billing, or to
allocation of resources for a critical application, one more or less needs
to be guaranteed that this service state has been correctly installed.

   To the extent that you have state in the routers (either service state
or user state), you have to be able to associate that state with the
packets it goes with. The fields in the packets that allow you to do this
are "tags".
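   A sketch of the tag idea (illustrative only; the names are invented):
the tag is simply the key under which previously installed state is found
when a packet arrives.

      installed_state = {}   # tag (e.g. a flow-id) -> user/service state

      def install(tag, state):
          # Done once (e.g. by an explicit setup), not repeated per packet.
          installed_state[tag] = state

      def state_for(tag):
          # The per-packet association; None means no state was installed
          # (or it has been lost), so only default handling is possible.
          return installed_state.get(tag)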

3.2.2 Flows

   It is useful to step back for a bit here, and think about the traffic
in the network. Some of it will be from applications which are basically
transactions; i.e. they require only a single packet, or a very small
number of packets. (I tend to use the term "datagram" to refer to such
applications, and use the term "packet" to describe the unit of
transmission through the network.) However, other packets are part of
longer-lived communications, which have been termed "flows".

   A flow, from the user's point of view, is a sequence of packets which
are associated, usually by being from a single application instance. In an
internetwork layer which has a more complex service model (e.g. supports
resource allocation, etc.), the flow would have service requirements to
pass on to some or all of the subsystems which provide those services.
   To the internetworking layer, a flow is a sequence of packets that
share all the attributes that the internetworking layer cares about. This
includes, but is not limited to: source/destination, path, resource
allocation, accounting/authorization, authentication/security, etc., etc.
   There isn't necessarily a one-to-one mapping from flows to *anything*
else, be it a TCP connection, or an application instance, or whatever. A
single flow might contain several TCP connections (e.g. with FTP, where
you have the control connection and a number of data connections), or a
single application might have several flows (e.g. multi-media
conferencing, where you'd have one flow for the audio, another for a
graphic window, etc., with different resource requirements in terms of
bandwidth, delay, etc. for each).
   Flows may also be multicast constructs, i.e. have multiple sources and
destinations; they are not inherently unicast. Multicast flows are more
complex than unicast ones (there is a large pool of state which must be
made coherent), but the concepts are similar.

   There's an interesting architectural issue here. Let's assume we have
all these different internetwork-level subsystems (routing, resource
allocation, security/access-control, accounting, etc.). Now, we have two
choices.
   First, we could allow each individual subsystem which uses the concept
of flows to define for itself what it thinks a "flow" is, and define which
values in which fields in the packet define a given "flow" for it. Now,
presumably, we have to allow 2 flows for subsystem X to map onto 1 flow
for subsystem Y to map onto 3 flows for subsystem Z; i.e. you can mix and
match to your heart's content.
   Second, we could define a standard "flow" mechanism for the
internetwork layer, along with a way of identifying the flow in the
packet, etc. Then, if you have two things which wish to differ in *any*
subsystem, you have to have a separate flow for each.
   The former has the advantages that it's a little easier to deploy
incrementally, since you don't have to agree on a common flow mechanism.
It may save on replicated state (if I have 3 flows, and they are the same
for subsystem X, and different for Y, I only need one set of X state). It
also has a lot more flexibility. The latter is simple and straightforward,
and given the complexity of what is being proposed, it seems that any
place we can make things simpler, we should.
   The choice is not trivial; it all depends on things like "what
percentage of flows will want to share the same state in certain
subsystems with other flows". I don't know how to quantify those, but as
an architect, I prefer simple, straightforward things. This system is
pretty complex already, and I'm not sure the benefits of being able to mix
and match are worth the added complexity. So, for the moment, I'll assume
a single, system-wide definition of flows.

   The packets which belong to a flow could be identified by a tag
consisting of a number of fields (such as addresses, ports, etc.), as
opposed to a specialized field. However, it may be more straightforward,
and foolproof, to simply identify the flow a packet belongs to by means of
a specialized tag field (the "flow-id") in the internetwork header. Given
that you can always find situations where the existing fields alone don't
do the job, and you *still* need a separate field to do the job correctly,
it seems best to take the simple, direct approach, and say "the flow a
packet belongs to is named by a flow-id in the packet header".
   The simplicity of globally unique flow-ids (or at least of a flow-id
which is unique along the path of the flow) is also desirable; they take
more bits in the header, but then you don't have to worry about all the
mechanism needed to remap locally unique flow-ids, etc. From the
perspective of designing something with a long lifetime, and which is to
be deployed widely, simplicity and directness is the only way to go. For
me, that translates into flows being named solely by globally unique
flow-ids, rather than by some complex semantics on existing fields.
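   The two recognition strategies can be contrasted in a few lines (an
illustrative sketch; the field names are invented):

      def flow_inferred(pkt):
          # Infer the flow from existing fields; this breaks when two
          # flows share all these values, or one application instance
          # wants several distinct flows.
          return (pkt.source_uid, pkt.dest_uid, pkt.ports)

      def flow_named(pkt):
          # The simple, direct approach: a dedicated field names the flow.
          return pkt.flow_id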

   However, the issue of how to recognize which packets belong to flows
is somewhat orthogonal to the issue of whether the internetwork level
recognizes flows at all. Should it?

3.2.3 Flows and State

   To the extent that you have service state in the routers, you have to
be able to associate that state with the packets it goes with. This is a
fundamental reason for flows. Access to service state is one reason to
explicitly recognize flows at the internetwork layer, but it is not the
only one.
   If the user has requirements in a number of areas (e.g. routing and
access control), they can theoretically communicate these to the routers
by placing a copy of all the relevant information in each packet (in the
internetwork header). If many subsystems of the internetwork are involved,
and the requirements are complex, this could be a lot of bits.
   (As a final aside, there's clearly no point in storing in the routers
any user state about packets which are providing datagram service; the
datagram service has usually come and gone in the same packet, and this
discussion is all about state retention.)

   There are two schools of thought as to how to proceed. The first says
that, for reasons of robustness and simplicity, all user state ought to be
repeated in each packet. For efficiency reasons, the routers may cache
such user state, probably along with precomputed data derived from the
user state. (It makes sense to store such cached user state along with any
applicable service state, of course.)

   The second school says that if something is going to generate lots of
packets, it makes engineering sense to give all this information to the
routers once, and from then on have a tag (the flow-id) in the packet
which tells the routers where to find that information. It's simply going
to be too inefficient to carry all the user state around all the time.
This is purely an engineering efficiency reason, but it's a significant
one.
   There is a slightly deeper argument, which says that the routers will
inevitably come to contain more user state, and it's simply a question of
whether that state is installed by an explicit mechanism, or whether the
routers infer that state from watching the packets which pass through
them. To the extent that it is inevitable anyway, there are obvious
benefits to be gained from recognizing that, and an explicit design of the
installation is more likely to give satisfactory results (as opposed to an
ad-hoc mechanism).
   It is worth noting that although the term "flow" is often used to
refer to this state in the routers along the path of the flow, it is
important to distinguish between i) a flow as a sequence of packets (i.e.
the definition given in 3.2.2 above), and ii) a flow as the thing which is
set up in the routers. They are different, and although the particular
meaning is usually clear from the context, they are not the same thing at
all.

   I'm not sure how much use there is to any intermediate position, in
which one subsystem installs user state in the routers, and another
carries a copy of its user state in each packet.
   (There are other intermediate positions. First, one flow might use a
given technique for all its subsystems, and another flow might use a
different technique for its subsystems; there is potentially some use to
this, although I'm not sure the cost in complexity of supporting both
mechanisms is worth the benefits. Second, one flow might use one mechanism
with one router along its path, and another mechanism with a different
router. A number of different reasons exist as to why one might do this,
including the fact that not all routers may support the same mechanisms
simultaneously.)
   It seems to me that to have one internetwork layer subsystem (e.g.
resource allocation) carry user state in all the packets (perhaps with the
use of a "hint" in the packets to find potentially cached copies in the
router), and have a second (e.g. routing) use a direct installation, and
use a tag in the packets to find it, makes little sense. We should do one
or the other, based on a consideration of the efficiency/robustness
tradeoff.
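   The two schools can be caricatured in a few lines (an illustrative
sketch, not a proposed mechanism; the names are invented):

      # School 1: all user state rides in every packet; routers may cache.
      def school_one(pkt, cache):
          state = cache.get(pkt.flow_id)   # cache is purely an optimization
          if state is None:
              state = cache[pkt.flow_id] = pkt.user_state  # always in packet
          return state

      # School 2: user state is installed once; packets carry only the tag.
      def school_two(pkt, installed):
          # None means the state must be reinstalled, e.g. after a router
          # crash; how locally that recovery can be done matters (below).
          return installed.get(pkt.flow_id)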

   Also, if there is a way of installing such flow-associated state, it
makes sense to have only one, which all subsystems use, instead of
building a separate one for each subsystem.

   It's a little difficult to make the choice between installation and
carrying a copy in each packet without more information about exactly how
much user state the network is likely to have in the future. (For
instance, we might wind up with 500-byte headers if we include the full
source route, resource reservation, etc., in every header.)
   It's also difficult without consideration of the actual mechanisms
involved. As a general principle, we wish to make recovery from a loss of
state as local as possible, to limit the number of entities which have to
become involved. (For instance, when a router crashes, traffic is rerouted
around it without needing to open a new TCP connection.) The option of
installation looks a lot more attractive if it's simple, and relatively
cheap, to reinstall the user state when a router crashes, without
otherwise causing a lot of hassle.

   However, given the likely growth in user state, the necessity for
service state, the requirement for reliable installation, and a number of
similar considerations, it seems that direct installation of user state,
and explicit recognition of flows, through a unified definition and tag
mechanism in the packets, is the way to go, and this is the path that
Nimrod has chosen.

3.3 Specific Interaction Issues

   Here is a very incomplete list of the things which Nimrod would like
to see from the internetwork layer as a whole:

- A unified definition of flows in the internetwork layer, and a unified
  way of identifying, through a separate flow-id field, which packets
  belong to a given flow.

- A unified mechanism (potentially distributed) for installing state about
  flows (including multicast flows) in routers.

- A method for getting information about whether a given resource
  allocation request has failed along a given path; this might be part of
  the unified flow setup mechanism.

- An interface to a (potentially distributed) mechanism for maintaining
  the membership of a multicast group.

- Support for multiple interfaces, i.e. multi-homing. Nimrod does this by
  decoupling transport identification (done via EIDs) from interface
  identification (done via locators); e.g., a packet with any valid
  destination locator should be accepted by the TCP of an endpoint, if the
  destination EID is the one assigned to that endpoint. (A sketch of this
  check appears after this list.)

- Support for multiple locators ("addresses") per network interface. This
  is needed for a number of reasons, among them to allow for less painful
  transitions in the locator abstraction hierarchy as the topology
  changes.

- Support for multiple UIDs ("addresses") per endpoint (roughly, per
  host). This would definitely include both multiple multicast SIDs, and
  at least one unicast EID (the need for multiple unicast EIDs per
  endpoint is not obvious).

- Support for a distinction between a multicast group as a named entity,
  and a multicast flow which may not reach all the members.

- A distributed, replicated, user name translation system (DNS?) that maps
  such user names into (EID, locator0, ... locatorN) bindings.
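   Finally, a sketch of the EID/locator decoupling mentioned in the
multi-homing item above (illustrative only; the entry format is a guess at
the (EID, locator0, ... locatorN) binding, and the names and values are
invented):

      # A name service entry binds a user name to an EID plus all of the
      # endpoint's current locators (there may be several, per the items
      # above).
      name_bindings = {
          "host.example": {"eid": 0x0000123456789abc,
                           "locators": [b"locator-0", b"locator-1"]},
      }

      def transport_accepts(packet_dest_eid, my_eid):
          # Multi-homing: the packet may have arrived via any valid
          # destination locator; the transport keys only on the EID.
          return packet_dest_eid == my_eid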