Internet Draft                                            J. Noel Chiappa
Expires: January 21, 1995                                    July 21, 1994

                       IPng Technical Requirements
            Of the Nimrod Routing and Addressing Architecture

Status of this Memo

   This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.
   Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a 'working draft' or
'work in progress.'
   Please check the Internet Draft abstract listing (in the file
1id-abstracts.txt) contained in the Internet Drafts Shadow Directories
(cd internet-drafts, on nic.ddn.mil, nnsc.nsf.net, ftp.nisc.sri.com,
nic.nordu.net, or munnari.oz.au) to learn the current status of any
Internet Draft.

   This draft document will be submitted to the RFC Editor as an
Informational RFC. Distribution of this document is unlimited. Please
send comments to jnc@lcs.mit.edu.

1.1 Abstract

   This document presents the requirements that the Nimrod routing and
addressing architecture has upon the internetwork layer protocol. To be
most useful to Nimrod, any protocol selected as the IPng should satisfy
these requirements. Also presented is some background information,
consisting of i) information about architectural and design principles
which might apply to the design of a new internetworking layer, and ii)
some details of the logic and reasoning behind particular requirements.

1.2 Introduction

   It is important to note that this document is not "IPng Requirements
for Routing", as other proposed routing and addressing designs may need
different support; this document is specific to Nimrod, and doesn't claim
to speak for other efforts.

   However, although I don't wish to assume that the particular designs
being worked on by the Nimrod WG will be widely adopted by the Internet
(if for no other reason, they have not yet been deployed and tried and
tested in practice, to see if they really work, an absolutely necessary
hurdle for any protocol), there are reasons to believe that any routing
architecture for a large, ubiquitous global Internet will have many of the
same basic, fundamental principles as the Nimrod architecture, and the
requirements that these generate.
   While current-day routing technologies do not yet have the
characteristics and capabilities that generate these requirements, they
also do not seem to be completely suited to routing in the next-generation
Internet.
As routing technology moves towards what is needed for the next-generation
Internet, the underlying fundamental laws and principles of routing will
almost inevitably drive the design, and hence the requirements, toward
things which look like the material presented here.
   Therefore, even if Nimrod is not the routing architecture of the
next-generation Internet, the basic routing architecture of that Internet
will have requirements that, while differing in detail, will almost
inevitably be similar to these.

   In a similar, but more general, context, note that, by and large, the
general analysis of sections 3.1 ("Interaction Architectural Issues") and
3.2 ("State and Flows in the Internetwork Layer") will apply to other
areas of a new internetwork layer, not just routing.

   I will tackle the internetwork packet format first (which is simpler),
and then the whole issue of the interaction with the rest of the
internetwork layer (which is a much more subtle topic).

2.1 Packet Format Issues

   As a general rule, the design philosophy of Nimrod is "maximize the
lifetime (and flexibility) of the architecture". Design tradeoffs (i.e.
optimizations) that will adversely affect the flexibility, adaptability,
and lifetime of the design are not necessarily wise choices; they may cost
more than they save. Such optimizations might be the correct choices in a
stand-alone system, where the replacement costs are relatively small; in
the global communication network, the replacement costs are very much
higher.

   Providing the Nimrod functionality requires the carrying of certain
information in the packets. The design principle noted above has a number
of corollaries in specifying the fields to contain that information.
   First, the design should be "simple and straightforward", which means
that various functions should be handled by completely separate mechanisms
and fields in the packets. It may seem that an opportunity exists to save
space by overloading two functions onto one mechanism or field, but
general experience is that, over time, this attempt at optimization costs
more, by restricting flexibility and adaptability.
   Second, field lengths should be specified to be somewhat larger than
can conceivably be used; the history of system architecture is replete
with examples (processor address size being the most notorious) where
fields became too short over the lifetime of the system. This document
indicates what the smallest reasonable "adequate" lengths are, but this is
more of a "critical floor" than a recommendation. A "recommended" length
is also given, which is the length which corresponds to the application of
this principle. The wise designer would pick this length.
   It is important to note that this does *not* mean that implementations
must support the maximum value possible in a field of that size. I imagine
that system-wide administrative limits will be placed on the maximum
values which must be supported. Then, as the need arises, we can increase
the administrative limit. This allows an easy, and completely
interoperable (with no special mechanisms) path to upgrade the capability
of the network.
   If the maximum supported value of a field needs to be increased from M
to N, an announcement is made that this is coming; during the interim
period, the system continues to operate with M, but new implementations
are deployed; while this is happening, interoperation is automatic, with
no transition mechanisms of any kind needed. When things are "ready" (i.e.
the proportion of old equipment is small enough), use of the larger value
commences.
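   To illustrate the mechanics of such an administrative limit, here is a
minimal sketch (purely illustrative; the field width, the limit, and the
names are invented here, not taken from any Nimrod specification). Moving
from M to N is then just a matter of raising the configured limit once
enough deployed implementations support the larger value:

      FIELD_BITS = 16   # width of the field as actually carried in packets
      ADMIN_MAX = 255   # hypothetical system-wide limit currently in force

      def check_field(value, admin_max=ADMIN_MAX):
          # The wire field is wider than any value currently permitted, so
          # raising the limit later is a policy change, not a format change.
          if not 0 <= value < (1 << FIELD_BITS):
              raise ValueError("value does not fit in the wire field at all")
          if value > admin_max:
              raise ValueError("value exceeds the administrative limit")
          return value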

   Also, in speaking of the packet format, you first need to distinguish
between the host-router part of the path and the router-router part; a
format that works OK for one may not do for the other.
   The issue is complicated by the fact that Nimrod can be made to work,
albeit not in optimal form, with information/fields missing from the
packet in the host to "first hop router" section of the packet's path. The
missing fields and information can then be added by the first hop router.
(This capability will be used to allow deployment and operation with
unmodified IPv4 hosts, although similar techniques could be used with
other internetworking protocols.) Access to the full range of Nimrod
capabilities will require upgrading of hosts to include the necessary
information in the packets they exchange with the routers.
   Second, Nimrod currently has three planned forwarding modes (flows,
datagram, and source-routed packets), and a format that works for one may
not work for another; some modes use fields that are not used by other
modes. The presence or absence of these fields will make a difference.

2.2 Packet Format Fields

   What Nimrod would like to see in the internetworking packet is the
following (a sketch of one possible layout appears after this list):

- Source and destination endpoint identification. There are several
  possibilities here.

  One is "UIDs", which are "shortish", fixed-length fields which appear in
  each packet, in the internetwork header, and which contain globally
  unique, topologically insensitive identifiers for either i) endpoints
  (if you aren't familiar with endpoints, think of them as hosts), or ii)
  multicast groups. (In the former instance, the UID is an EID; in the
  latter, a "set ID", or SID. An SID is an identifier which looks just
  like an EID, but it refers to a group of endpoints. The semantics of
  SIDs are not completely defined.) For each of these, 48 bits is
  adequate, but we would recommend 64 bits. (IPv4 will be able to operate
  with smaller ones for a while, but will eventually need either a new
  packet format, or the difficult and not wholly satisfactory technique
  known as Network Address Translators, which allows the contents of these
  fields to be only locally unique.)

  Another possibility is some shorter field, named an "endpoint selector",
  or ESEL, which contains a value which is not globally unique, but only
  unique in mapping tables on each end, tables which map from the small
  value to a globally unique value, such as a DNS name.

  Finally, it is possible to conceive of overall networking designs which
  do not include any endpoint identification in the packet at all, but
  transfer it at the start of a communication, and from then on infer it.
  This alternative would have to have some other means of telling which
  endpoint a given packet is for, if there are several endpoints at a
  given destination. Some coordination on allocation of flow-ids, or
  higher-level port numbers, etc., might do this.

- Flow identification.
  There are two basic approaches here, depending on whether flows are
  aggregated (in intermediate switches) or not. It should be emphasized at
  this point that it is not yet known whether flow aggregation will be
  needed. The only reason to do it is to control the growth of state in
  intermediate routers, but there is no hard case made either that this
  growth will be unmanageable, or that aggregating flows will be feasible
  practically.

  For the non-aggregated case, a single "flow-id" field will suffice. This
  *must not* use one of the two previous UID fields, as in datagram mode
  (and probably source-routed mode as well) the flow-id will be
  over-written during transit of the network. It could most easily be
  constructed by adding a UID to a locally unique flow-id, which will
  provide a globally unique flow-id. It is possible to use non-globally
  unique flow-ids (which would allow a shorter length for this field),
  although this would mean that collisions would result, and would have to
  be dealt with. An adequate length for the local part of a globally
  unique flow-id would be 12 bits (which would be my "out of thin air"
  guess), but we recommend 32. For a non-globally unique flow-id, 24 bits
  would be adequate, but I recommend 32.

  For the aggregated case, three broad classes of mechanism are possible.

  - Option 1: The packet contains a sequence (sort of like a source route)
    of flow-ids. Whenever you aggregate or de-aggregate, you move along
    the list to the next one. This takes the most space, but is otherwise
    the least work for the routers.

  - Option 2: The packet contains a stack of flow-ids, with the current
    one on the top. When you aggregate, you push a new one on; when you
    de-aggregate, you take one off. This takes more work, but less space
    in the packet than the complete "source-route". Encapsulating packets
    to do aggregation does basically this, but you're stacking entire
    headers, not just flow-ids. The clever way to do this flow-id
    stacking, without doing encapsulation, is to find out from flow-setup
    how deep the stack will get, and allocate the space in the packet when
    it's created. That way, all you ever have to do is insert a new
    flow-id, or "remove" one; you never have to make room for more
    flow-ids.

  - Option 3: The packet contains only the "base" flow-id (i.e. the one
    with the finest granularity), and the current flow-id. When you
    aggregate, you just bash the current flow-id. The tricky part comes
    when you de-aggregate; you have to put the right value back. To do
    this, you have to have state in the router at the end of the
    aggregated flow, which tells you what the de-aggregated flow for each
    base flow is. The downside here is obvious: we get away without
    individual flow state for each of the constituent flows in all the
    routers along the path of that aggregated flow, *except* for the last
    one.

  Other than encapsulation, which incurs significant space overhead fairly
  quickly (after just a few layers of aggregation), there appears to be no
  way to do it with just one flow-id in the packet header.
  Even if you don't touch the packets, but do the aggregation by mapping
  some number of "base" flow-ids to a single aggregated flow in the
  routers along the path of the aggregated flow, the table that does the
  mapping is still going to have to have a number of entries directly
  proportional to the number of base flows going through the switch.

- A looping packet detector. This is any mechanism that will detect a
  packet which is "stuck" in the network; a timeout value in packets,
  together with a check in routers, is an example. If this is a hop-count,
  it has to be more than 8 bits; 12 bits would be adequate, and I
  recommend 16 (which also makes it easy to update). This is not to say
  that I think networks with diameters larger than 256 are good, or that
  we should design such nets, but I think limiting the maximum path
  through the network to 256 hops is likely to bite us down the road the
  same way making "infinity" 16 in RIP did (as it did, eventually). When
  we hit that ceiling, it's going to hurt, and there won't be an easy fix.
  I will note in passing that we are already seeing path lengths of over
  30 hops.

- Optional source and destination locators. These are structured,
  variable-length items which are topologically sensitive identifiers for
  the place in the network from which the traffic originates or to which
  the traffic is destined. The locator will probably contain internal
  separators which divide up the fields, so that a particular field can be
  enlarged without creating a great deal of upheaval. An adequate value
  for the maximum length supported would be up to 32 bytes per locator,
  and longer would be even better; I would recommend up to 256 bytes per
  locator.

- Perhaps (paired with the above), an optional pointer into the locators.
  This is optional "forwarding state" (i.e. state in the packet which
  records something about its progress across the network) which is used
  in the datagram forwarding mode to help ensure that the packet does not
  loop. It can also improve the forwarding processing efficiency. It is
  thus not absolutely essential, but is very desirable from a real-world
  engineering viewpoint. It needs to be large enough to identify locations
  in either locator; e.g. if locators can be up to 256 bytes, it would
  need to be 9 bits.

- An optional source route. This is used to support the "source routed
  packet" forwarding mode. Although not designed in detail yet, we can
  discuss two possible approaches.

  In one, used with "semi-strict" source routing (in which a contiguous
  series of entities is named, albeit perhaps at a high layer of
  abstraction), the syntax will likely look much like source routes in
  PIP; in Nimrod they will be a sequence of Nimrod entity identifiers
  (i.e. locator elements, not complete locators), along with clues as to
  the context in which each identifier is to be interpreted (e.g. up,
  down, across, etc). Since those identifiers themselves are variable
  length (although probably most will be two bytes or less, otherwise the
  routing overhead inside the named object would be excessive), and the
  hop count above contemplates the possibility of paths of over 256 hops,
  it would seem that these might possibly some day exceed 512 bytes, if a
  lengthy path was specified in terms of the actual physical assets used.
  An adequate length would be 512 bytes; the recommended length would be
  2^16 bytes (although this length would probably not be supported in
  practice; rather, the field length would allow it).

  In the other, used with classical "loose" source routes, the source
  route consists of a number of locators. It is not yet clear if this mode
  will be supported. If so, the header would need to be able to store a
  sequence of locators (as described above). Space might be saved by not
  repeating locator prefixes that match that of the previous locator in
  the sequence; Nimrod will probably allow use of such "locally useful"
  locators. It is hard to determine what an adequate length would be for
  this case; the recommended length would be 2^16 bytes (again, with the
  previous caveat).

- Perhaps (paired with the above), an optional pointer into the source
  route. This is also optional "forwarding state". It needs to be large
  enough to identify locations anywhere in the source route; e.g. if the
  source route can be up to 1024 bytes, it would need to be 10 bits.

- An internetwork header length. I mention this since the above fields
  could easily exceed 256 bytes; if they are all to be carried in the
  internetwork header (see comments below as to where to carry all this
  information), the header length field needs to be more than 8 bits; 12
  bits would be adequate, and I recommend 16 bits. The approach of putting
  some of the data items above into an interior header, to limit the size
  of the basic internetworking header, does not really seem optimal, as
  this data is for use by the intermediate routers, and it needs to be
  easily accessible.

- Authentication of some sort is needed. See the recent IAB document which
  was produced as a result of the IAB architecture retreat on security
  (draft-iab-sec-arch-workshop-00.txt), section 4, and especially section
  4.3. There is currently no set way of doing "denial/theft of service" in
  Nimrod, but this topic is well explored in that document; Nimrod would
  use whatever mechanism(s) seem appropriate to those knowledgeable in
  this area.

- A version number. Future forwarding mechanisms might need other
  information (i.e. fields) in the packet header; a version number would
  allow the header to be modified to contain what's needed. (This would
  not necessarily be information that is visible to the hosts, so this
  does not necessarily mean that the hosts would need to know about this
  new format.) 4 bits is adequate; it's not clear if a larger value needs
  to be recommended.
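   To pull the above inventory together, here is a sketch of one possible
header layout. This is purely illustrative, not a wire format from the
Nimrod specification; the class name and field names are invented, and the
widths noted in the comments are the "recommended" (not merely "adequate")
values from the list above:

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class NimrodHeader:                    # name is invented
          version: int                       # 4 bits
          header_length: int                 # 16 bits; may exceed 256 bytes
          hop_limit: int                     # 16 bits; looping packet detector
          source_uid: int                    # 64 bits; EID of source endpoint
          dest_uid: int                      # 64 bits; EID, or SID (multicast)
          flow_id: int                       # 64-bit UID + 32-bit local part,
                                             #  giving a globally unique flow-id
          source_locator: Optional[bytes] = None  # structured; up to 256 bytes
          dest_locator: Optional[bytes] = None    # up to 256 bytes
          locator_ptr: Optional[int] = None       # 9 bits: 512 positions cover
                                                  #  two 256-byte locators
          source_route: Optional[bytes] = None    # length field allows 2^16 bytes
          route_ptr: Optional[int] = None         # indexes into the source route
          auth: Optional[bytes] = None            # authentication data, form TBD

   The optional fields reflect the point made earlier: an unmodified host
may omit the locators, pointers, and source route, and a first-hop router
may supply them.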

2.3 Field Requirements and Addition Methods

   As noted above, it's possible to use Nimrod in a limited mode where
needed information/fields are added by the first-hop router. It's thus
useful to ask "which of the fields must be present in the host-router
header, and which could be added by the router?" The only fields which are
absolutely necessary in all packets are the endpoint identification fields
(provided that some means is available to map them into locators; this
would obviously be most useful with UIDs which are EIDs).
   As to the others, if the user wishes to use flows, and wants to
guarantee that their packets are assigned to the correct flows, the
flow-id field is needed. If the user wishes efficient use of the datagram
mode, it's probably necessary to include the locators in the packet sent
to the router. If the user wishes to specify the route for the packets,
and does not wish to set up a flow, they need to include the source route.

   How would additional information/fields be added to the packet, if the
packet is emitted from the host in incomplete form? (By this, I mean the
simple question of how, mechanically, not the more complex issue of where
any needed information comes from.)
   This question is complex, since all the IPng candidates (and, in fact,
any reasonable internetworking protocol) are extensible protocols; those
extension mechanisms could be used. Also, it would be possible to carry
some of the required information as user data in the internetworking
packet, with the original user's data encapsulated further inside.
Finally, a private inter-router packet format could be defined.
   It's not clear which path is best, but we can talk about which fields
the Nimrod routers need access to, and how often; less-used ones could be
placed in harder-to-get-to locations (such as in an encapsulated header).
The fields to which the routers need access on every hop are the flow-id
and the looping packet detector. The locator/pointer fields are only
needed at intervals (in what datagram forwarding mode calls "active"
routers), as is the source route (the latter at every object which is
named in the source route).
   Depending on how access control is done, and which forwarding mode is
used, the UIDs and/or locators might be examined for access control
purposes, wherever that function is performed.
   This is not a complete exploration of the topic, but it should give a
rough idea of what's going on.
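   As a purely mechanical illustration of the first-hop completion just
described (reusing the illustrative header class sketched above), the
router might do something like the following; the lookup function here is
a hypothetical stand-in for whatever mapping service answers the separate,
harder question of where the information comes from:

      def complete_header(hdr, locator_for_uid):
          # A first-hop router fills in fields an unmodified host omitted.
          if hdr.dest_locator is None:
              # Map the topologically-insensitive UID to a locator.
              hdr.dest_locator = locator_for_uid(hdr.dest_uid)
          if hdr.locator_ptr is None:
              hdr.locator_ptr = 0   # fresh forwarding state, datagram mode
          return hdr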

3.1 Interaction Architectural Issues

   The topic of the interaction with the rest of the internetwork layer
is more complex. Nimrod springs in part from a design vision which sees
the entire internetwork layer, distributed across all the hosts and
routers of the internetwork, as a single system, albeit a distributed
system.

   Approached from that angle, one naturally falls into a typical system
designer's point of view, where you start to think of the modularization
of the system: choosing the functional boundaries which divide the system
up into functional units, and defining the interactions between the
functional units. As we all know, that modularization is the key part of
the system design process.
   It's rare that a group of completely independent modules forms a
system; there's usually a fairly strong internal interaction. Those
interactions have to be thought about and understood as part of the
modularization process, since it affects the placement of the functional
boundaries. Poor placement leads to complex interactions, or to desired
interactions which cannot be realized.
   These are all more important issues with a system which is expected to
have a long lifetime; correct placement of the functional boundaries, so
as to clearly and simply break up the system into truly fundamental units,
is a necessity if the system is to endure and serve well.

3.1.1 The Internetwork Layer Service Model

   To return to the view of the internetwork layer as a system, that
system provides certain services to its clients; i.e. it instantiates a
service model. To begin with, lacking a shared view of the service model
that the internetwork layer is supposed to provide, it's reasonable to
suppose that it will prove impossible to agree on mechanisms at the
internetwork level to provide that service.
   To answer the question of what the service model ought to be, one can
view the internetwork layer itself as a subsystem of an even larger
system, the entire internetwork itself. (That system is quite likely the
largest and most complex system we will ever build, as it is the largest
system we can possibly build; it is the system which will inevitably
contain almost all other systems.)
   From that point of view, the issue of the service model of the
internetwork layer becomes a little clearer. The services provided by the
internetwork layer are no longer purely abstract, but can be thought of as
the external module interface of the internetwork layer module. If
agreement can be reached on where to put the functional boundaries of the
internetwork layer, and on what overall service the internet as a whole
should provide, the service model of the internetwork layer should be
easier to agree on.
   In general terms, it seems that the unreliable packet ought to remain
the fundamental building block of the internetwork layer. The design
principle that says that we can take any packet and throw it away with no
warning or other action, or take any router and turn it off with no
warning, and have the system still work, seems very powerful. The
component design simplicity (since routers don't have to stand on their
heads to retain a packet of which they have the only copy), and the
overall system robustness, resulting from these two assumptions are
absolutely critical.
   In detail, however, particularly in areas which are still the subject
of research and experimentation (such as resource allocation, security,
etc.), it seems difficult to provide a finished definition of exactly what
the service model of the internetwork layer ought to be.

3.1.2 The Subsystems of the Internetwork Layer

   In any event, by viewing the internetwork layer as a large system, one
starts to think about what subsystems are needed, and what the
interactions among them should look like. Nimrod is simply a number of the
subsystems of this larger system, the internetwork layer. It is *not*
intended to be a purely standalone set of subsystems, but to work together
in close concert with the other subsystems of the internetwork layer
(resource allocation, security, charging, etc.) to provide the
internetwork layer service model.
   One reason that Nimrod is not simply a monolithic subsystem is that
some of the interactions with the other subsystems of the internetwork
layer, for instance the resource allocation subsystem, are much clearer
and easier to manage if the routing is broken up into several subsystems,
with the interactions between them open.
   It is important to realize that Nimrod was initially broken up into
separate subsystems for purely internal reasons. It so happens that,
considered as a separate problem, the fundamental boundary lines for
dividing routing up into subsystems are the same boundaries that make
interaction with other subsystems cleaner; this provides added evidence
that these boundaries are in fact the right ones.

   The subsystems which comprise the functionality covered by Nimrod are
i) routing information distribution (in the case of Nimrod, topology map
distribution, along with the attributes [policy, QOS, etc.] of the
topology elements), ii) route selection (strictly speaking, not part of
the Nimrod spec per se, but functional examples will be produced), and
iii) user traffic handling.
   The first can fairly well be defined without reference to other
subsystems, but the second and third are necessarily more involved. For
instance, route selection might involve finding out which links have the
resources available to handle some required level of service. For user
traffic handling, if a particular application needs a resource
reservation, getting that resource reservation to the routers is as much a
part of getting the routers ready as making sure they have the correct
routing information, so here too, routing is tied in with other
subsystems.

   In any event, although we can talk about the relationship between the
Nimrod subsystems and the other functional subsystems of the internetwork
layer, until the service model of the internetwork layer is more clearly
visible, along with the functional boundaries within that layer, such a
discussion is necessarily rather nebulous.

3.2 State and Flows in the Internetwork Layer

   The internetwork layer as a whole contains a variety of information,
of varying lifetimes. This information we can refer to as the internetwork
layer's "state". Some of this state is stored in the routers, and some is
stored in the packets.
   In the packet, I distinguish between what I call "forwarding state",
which records something about the progress of this individual packet
through the network (such as the hop count, or the pointer into a source
route), and other state, which is information about what service the user
wants from the network (such as the destination of the packet), etc.

3.2.1 User and Service State

   I call state which reflects the desires and service requests of the
user "user state". This is information which could be sent in each packet,
or which can be stored in the router and applied to multiple packets
(depending on which makes the most engineering sense). It is still called
user state, even when a copy is stored in the routers.
   User state can be divided into two classes: "critical" (such as
destination addresses), without which the packets cannot be forwarded at
all, and "non-critical" (such as a resource allocation class), without
which the packets can still be forwarded, just not quite in the way the
user would most prefer.
   There is a range of possible mechanisms for getting this user state to
the routers; it may be put in every packet, or placed there by a setup. In
the latter case, you have a whole range of possibilities for how to get it
back when you lose it, such as placing a copy in every Nth packet.

   However, other state is needed which cannot be stored in each packet;
it's state about the longer-term (i.e. across the life of many packets)
situation; i.e. state which is inherently associated with a number of
packets over some time-frame (e.g. a resource allocation). I call this
state "service state".
   This apparently changes the "stateless" model of routers somewhat, but
this change is more apparent than real.
The routers already contain state, such as routing table entries: state
without which it is virtually impossible to handle user traffic. All that
is being changed is the amount, granularity, and lifetime of state in the
routers.
   Some of this service state may need to be installed in a fairly
reliable fashion; e.g. if there is service state related to billing, or to
allocation of resources for a critical application, one more or less needs
to be guaranteed that this service state has been correctly installed.

   To the extent that you have state in the routers (either service state
or user state), you have to be able to associate that state with the
packets it goes with. The fields in the packets that allow you to do this
are "tags".
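   A sketch of the tag idea (illustrative only; the names are invented):
the tag is simply the key under which previously installed state is found
when a packet arrives.

      installed_state = {}   # tag (e.g. a flow-id) -> user/service state

      def install(tag, state):
          # Done once (e.g. by an explicit setup), not repeated per packet.
          installed_state[tag] = state

      def state_for(tag):
          # The per-packet association; None means no state was installed
          # (or it has been lost), so only default handling is possible.
          return installed_state.get(tag)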

3.2.2 Flows

   It is useful to step back for a bit here, and think about the traffic
in the network. Some of it will be from applications which are basically
transactions; i.e. they require only a single packet, or a very small
number of packets. (I tend to use the term "datagram" to refer to such
applications, and use the term "packet" to describe the unit of
transmission through the network.) However, other packets are part of
longer-lived communications, which have been termed "flows".

   A flow, from the user's point of view, is a sequence of packets which
are associated, usually by being from a single application instance. In an
internetwork layer which has a more complex service model (e.g. supports
resource allocation, etc.), the flow would have service requirements to
pass on to some or all of the subsystems which provide those services.
   To the internetworking layer, a flow is a sequence of packets that
share all the attributes that the internetworking layer cares about. This
includes, but is not limited to: source/destination, path, resource
allocation, accounting/authorization, authentication/security, etc., etc.
   There isn't necessarily a one-to-one mapping from flows to *anything*
else, be it a TCP connection, or an application instance, or whatever. A
single flow might contain several TCP connections (e.g. with FTP, where
you have the control connection and a number of data connections), or a
single application might have several flows (e.g. multi-media
conferencing, where you'd have one flow for the audio, another for a
graphic window, etc., with different resource requirements in terms of
bandwidth, delay, etc. for each).
   Flows may also be multicast constructs, i.e. have multiple sources and
destinations; they are not inherently unicast. Multicast flows are more
complex than unicast ones (there is a large pool of state which must be
made coherent), but the concepts are similar.

   There's an interesting architectural issue here. Let's assume we have
all these different internetwork-level subsystems (routing, resource
allocation, security/access-control, accounting, etc.). Now, we have two
choices.
   First, we could allow each individual subsystem which uses the concept
of flows to define for itself what it thinks a "flow" is, and define which
values in which fields in the packet define a given "flow" for it. Now,
presumably, we have to allow 2 flows for subsystem X to map onto 1 flow
for subsystem Y to map onto 3 flows for subsystem Z; i.e. you can mix and
match to your heart's content.
   Second, we could define a standard "flow" mechanism for the
internetwork layer, along with a way of identifying the flow in the
packet, etc. Then, if you have two things which wish to differ in *any*
subsystem, you have to have a separate flow for each.
   The former has the advantages that it's a little easier to deploy
incrementally, since you don't have to agree on a common flow mechanism.
It may save on replicated state (if I have 3 flows, and they are the same
for subsystem X, and different for Y, I only need one set of X state). It
also has a lot more flexibility. The latter is simple and straightforward,
and given the complexity of what is being proposed, it seems that any
place we can make things simpler, we should.
   The choice is not trivial; it all depends on things like "what
percentage of flows will want to share the same state in certain
subsystems with other flows". I don't know how to quantify those, but as
an architect, I prefer simple, straightforward things. This system is
pretty complex already, and I'm not sure the benefits of being able to mix
and match are worth the added complexity. So, for the moment, I'll assume
a single, system-wide definition of flows.

   The packets which belong to a flow could be identified by a tag
consisting of a number of fields (such as addresses, ports, etc.), as
opposed to a specialized field. However, it may be more straightforward,
and foolproof, to simply identify the flow a packet belongs to by means of
a specialized tag field (the "flow-id") in the internetwork header. Given
that you can always find situations where the existing fields alone don't
do the job, and you *still* need a separate field to do the job correctly,
it seems best to take the simple, direct approach, and say "the flow a
packet belongs to is named by a flow-id in the packet header".
   The simplicity of globally unique flow-ids (or at least of a flow-id
which is unique along the path of the flow) is also desirable; they take
more bits in the header, but then you don't have to worry about all the
mechanism needed to remap locally unique flow-ids, etc. From the
perspective of designing something with a long lifetime, and which is to
be deployed widely, simplicity and directness is the only way to go. For
me, that translates into flows being named solely by globally unique
flow-ids, rather than by some complex semantics on existing fields.
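   The two recognition strategies can be contrasted in a few lines (an
illustrative sketch; the field names are invented):

      def flow_inferred(pkt):
          # Infer the flow from existing fields; this breaks when two
          # flows share all these values, or one application instance
          # wants several distinct flows.
          return (pkt.source_uid, pkt.dest_uid, pkt.ports)

      def flow_named(pkt):
          # The simple, direct approach: a dedicated field names the flow.
          return pkt.flow_id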

   However, the issue of how to recognize which packets belong to flows
is somewhat orthogonal to the issue of whether the internetwork level
recognizes flows at all. Should it?

3.2.3 Flows and State

   To the extent that you have service state in the routers, you have to
be able to associate that state with the packets it goes with. This is a
fundamental reason for flows. Access to service state is one reason to
explicitly recognize flows at the internetwork layer, but it is not the
only one.
   If the user has requirements in a number of areas (e.g. routing and
access control), they can theoretically communicate these to the routers
by placing a copy of all the relevant information in each packet (in the
internetwork header). If many subsystems of the internetwork are involved,
and the requirements are complex, this could be a lot of bits.
   (As a final aside, there's clearly no point in storing in the routers
any user state about packets which are providing datagram service; the
datagram service has usually come and gone in the same packet, and this
discussion is all about state retention.)

   There are two schools of thought as to how to proceed. The first says
that, for reasons of robustness and simplicity, all user state ought to be
repeated in each packet. For efficiency reasons, the routers may cache
such user state, probably along with precomputed data derived from the
user state. (It makes sense to store such cached user state along with any
applicable service state, of course.)

   The second school says that if something is going to generate lots of
packets, it makes engineering sense to give all this information to the
routers once, and from then on have a tag (the flow-id) in the packet
which tells the routers where to find that information. It's simply going
to be too inefficient to carry all the user state around all the time.
This is purely an engineering efficiency reason, but it's a significant
one.
   There is a slightly deeper argument, which says that the routers will
inevitably come to contain more user state, and it's simply a question of
whether that state is installed by an explicit mechanism, or whether the
routers infer that state from watching the packets which pass through
them. To the extent that it is inevitable anyway, there are obvious
benefits to be gained from recognizing that, and an explicit design of the
installation is more likely to give satisfactory results (as opposed to an
ad-hoc mechanism).
   It is worth noting that although the term "flow" is often used to
refer to this state in the routers along the path of the flow, it is
important to distinguish between i) a flow as a sequence of packets (i.e.
the definition given in 3.2.2 above), and ii) a flow as the thing which is
set up in the routers. They are different, and although the particular
meaning is usually clear from the context, they are not the same thing at
all.

   I'm not sure how much use there is to any intermediate position, in
which one subsystem installs user state in the routers, and another
carries a copy of its user state in each packet.
   (There are other intermediate positions. First, one flow might use a
given technique for all its subsystems, and another flow might use a
different technique for its subsystems; there is potentially some use to
this, although I'm not sure the cost in complexity of supporting both
mechanisms is worth the benefits. Second, one flow might use one mechanism
with one router along its path, and another mechanism with a different
router. A number of different reasons exist as to why one might do this,
including the fact that not all routers may support the same mechanisms
simultaneously.)
   It seems to me that to have one internetwork layer subsystem (e.g.
resource allocation) carry user state in all the packets (perhaps with the
use of a "hint" in the packets to find potentially cached copies in the
router), and have a second (e.g. routing) use a direct installation, and
use a tag in the packets to find it, makes little sense. We should do one
or the other, based on a consideration of the efficiency/robustness
tradeoff.
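   The two schools can be caricatured in a few lines (an illustrative
sketch, not a proposed mechanism; the names are invented):

      # School 1: all user state rides in every packet; routers may cache.
      def school_one(pkt, cache):
          state = cache.get(pkt.flow_id)   # cache is purely an optimization
          if state is None:
              state = cache[pkt.flow_id] = pkt.user_state  # always in packet
          return state

      # School 2: user state is installed once; packets carry only the tag.
      def school_two(pkt, installed):
          # None means the state must be reinstalled, e.g. after a router
          # crash; how locally that recovery can be done matters (below).
          return installed.get(pkt.flow_id)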

   Also, if there is a way of installing such flow-associated state, it
makes sense to have only one, which all subsystems use, instead of
building a separate one for each subsystem.

   It's a little difficult to make the choice between installation and
carrying a copy in each packet without more information about exactly how
much user state the network is likely to have in the future. (For
instance, we might wind up with 500-byte headers if we include the full
source route, resource reservation, etc., in every header.)
   It's also difficult without consideration of the actual mechanisms
involved. As a general principle, we wish to make recovery from a loss of
state as local as possible, to limit the number of entities which have to
become involved. (For instance, when a router crashes, traffic is rerouted
around it without needing to open a new TCP connection.) The option of
installation looks a lot more attractive if it's simple, and relatively
cheap, to reinstall the user state when a router crashes, without
otherwise causing a lot of hassle.

   However, given the likely growth in user state, the necessity for
service state, the requirement for reliable installation, and a number of
similar considerations, it seems that direct installation of user state,
and explicit recognition of flows, through a unified definition and tag
mechanism in the packets, is the way to go, and this is the path that
Nimrod has chosen.

3.3 Specific Interaction Issues

   Here is a very incomplete list of the things which Nimrod would like
to see from the internetwork layer as a whole:

- A unified definition of flows in the internetwork layer, and a unified
  way of identifying, through a separate flow-id field, which packets
  belong to a given flow.

- A unified mechanism (potentially distributed) for installing state about
  flows (including multicast flows) in routers.

- A method for getting information about whether a given resource
  allocation request has failed along a given path; this might be part of
  the unified flow setup mechanism.

- An interface to a (potentially distributed) mechanism for maintaining
  the membership of a multicast group.

- Support for multiple interfaces, i.e. multi-homing. Nimrod does this by
  decoupling transport identification (done via EIDs) from interface
  identification (done via locators); e.g., a packet with any valid
  destination locator should be accepted by the TCP of an endpoint, if the
  destination EID is the one assigned to that endpoint. (A sketch of this
  check appears after this list.)

- Support for multiple locators ("addresses") per network interface. This
  is needed for a number of reasons, among them to allow for less painful
  transitions in the locator abstraction hierarchy as the topology
  changes.

- Support for multiple UIDs ("addresses") per endpoint (roughly, per
  host). This would definitely include both multiple multicast SIDs, and
  at least one unicast EID (the need for multiple unicast EIDs per
  endpoint is not obvious).

- Support for a distinction between a multicast group as a named entity,
  and a multicast flow which may not reach all the members.

- A distributed, replicated, user name translation system (DNS?) that maps
  such user names into (EID, locator0, ... locatorN) bindings.
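   Finally, a sketch of the EID/locator decoupling mentioned in the
multi-homing item above (illustrative only; the entry format is a guess at
the (EID, locator0, ... locatorN) binding, and the names and values are
invented):

      # A name service entry binds a user name to an EID plus all of the
      # endpoint's current locators (there may be several, per the items
      # above).
      name_bindings = {
          "host.example": {"eid": 0x0000123456789abc,
                           "locators": [b"locator-0", b"locator-1"]},
      }

      def transport_accepts(packet_dest_eid, my_eid):
          # Multi-homing: the packet may have arrived via any valid
          # destination locator; the transport keys only on the EID.
          return packet_dest_eid == my_eid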