idnits 2.17.1 

draft-shyam-real-ip-framework-32.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 600 has weird spacing: '...lent to  the a...'

  -- The document date (April 28, 2017) is 2554 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Unused Reference: '11' is defined on line 1117, but no explicit
     reference was found in the text

  == Unused Reference: '12' is defined on line 1120, but no explicit
     reference was found in the text

  == Unused Reference: '13' is defined on line 1123, but no explicit
     reference was found in the text

  == Unused Reference: '14' is defined on line 1126, but no explicit
     reference was found in the text

  == Unused Reference: '15' is defined on line 1128, but no explicit
     reference was found in the text

  == Unused Reference: '16' is defined on line 1131, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 4893 (ref. '4') (Obsoleted by RFC 6793)

  -- Obsolete informational reference (is this intentional?): RFC 1771 (ref.
     '12') (Obsoleted by RFC 4271)

  -- Obsolete informational reference (is this intentional?): RFC 1883 (ref.
     '13') (Obsoleted by RFC 2460)

  -- Obsolete informational reference (is this intentional?): RFC 2460 (ref.
     '15') (Obsoleted by RFC 8200)


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	INTERNET DRAFT                                          S. Bandyopadhyay
3	draft-shyam-real-ip-framework-32.txt                      April 28, 2017
4	Intended status: Experimental
5	Expires: October 28, 2017

7	    An Architectural Framework of the Internet for the Real IP World
8	                  draft-shyam-real-ip-framework-32.txt

10	Abstract

12	   This document proposes an architectural framework for the Internet
13	   in the real IP world. It describes how a three-tier mesh structured
14	   hierarchy can be established in a large address space by dividing
15	   it into regions and dividing each region into sub regions. It
16	   addresses issues that could be relevant to this architecture in
17	   the context of IPv6. It shows how to make a transition from
18	   private IP to real IP without making significant changes to the
19	   existing network.

21	Status of this Memo

23	   This Internet-Draft is submitted in full conformance with the
24	   provisions of BCP 78 and BCP 79.

26	   Internet-Drafts are working documents of the Internet Engineering
27	   Task Force (IETF).  Note that other groups may also distribute
28	   working documents as Internet-Drafts.  The list of current Internet-
29	   Drafts is at http://datatracker.ietf.org/drafts/current/.

31	   Internet-Drafts are draft documents valid for a maximum of six months
32	   and may be updated, replaced, or obsoleted by other documents at any
33	   time.  It is inappropriate to use Internet-Drafts as reference
34	   material or to cite them other than as "work in progress."

36	   This Internet-Draft will expire on October 28, 2017.

38	Copyright Notice

40	   Copyright (c) 2017 IETF Trust and the persons identified as the
41	   document authors. All rights reserved.

43	   This document is subject to BCP 78 and the IETF Trust's Legal
44	   Provisions Relating to IETF Documents
45	   (http://trustee.ietf.org/license-info) in effect on the date of
46	   publication of this document. Please review these documents
47	   carefully, as they describe your rights and restrictions with respect
48	   to this document.

50	Table of Contents
51	   1. Introduction.....................................................2
52	   2. Background.......................................................3
53	   3. A Three-tier mesh structured hierarchical network................4
54	      3.1. Route propagation...........................................5
55	      3.2. Determination of prefix lengths.............................7
56	           3.2.1. A pseudo optimal distribution of prefixes in
57	                  a 64bit architecture.................................8
58	           3.2.2. Whether to go for a two-tier or three-tier hierarchy
59	                  .....................................................9
60	      3.3. Issues related to Satellite communications.................10
61	   4. Provider Independent addressing, name services and multihoming..11
62	      4.1. PI address Resolution......................................12
63	   5. Issues related to IP mobility...................................17
64	      5.1. Changes expected with the specifications related
65	           to IP mobility.............................................18
66	   6. Refinements over existing IPv6 specification....................19
67	   7. Distributed processing and Multicasting.........................21
68	   8. Transition to real IP from private IP...........................22
69	   9. IANA Consideration..............................................23
70	   10. Security Consideration.........................................23
71	   11. Acknowledgments................................................23
72	   12. Normative References...........................................23
73	   13. Informative References.........................................24
74	   14. Author's Address...............................................24

76	1. Introduction

78	   The transition from IPv4 to IPv6 is in progress. Work has been done
79	   to upgrade individual nodes (workstations) from IPv4 to IPv6. There
80	   are also established documents that describe how routers/switches
81	   can support IPv4 and IPv6 packets simultaneously in order to make
82	   the transition possible [1]. The CIDR [2] based hierarchical
83	   architecture of the existing 32-bit system is expected to continue
84	   in IPv6 with a larger address space. There are concerns that BGP
85	   table entries are becoming too large in the existing system [3].
86	   At the same time, there are proposals to extend the Autonomous
87	   System number from 16 bits to 32 bits to meet demand [4]. The
88	   challenge lies in making the transition from IPv4 to a real IP
89	   world smooth, with the fewest changes possible.

91	   The term "real IP environment" refers to an environment where
92	   hosts in a customer network possess globally unique IP addresses
93	   and communicate with the rest of the world without the help of
94	   NAT [5]. This document reflects changes required to the BSD 4.4
95	   source code wherever applicable.

97	2. Background

99	   The existing system works with an Autonomous System (AS) layer and
100	   an inter-AS layer using the CIDR approach. To meet demand within
101	   the 32-bit address space, Autonomous Systems of various sizes
102	   maintain a CIDR based hierarchical architecture. With the help of
103	   NAT [5], a stub network can maintain a user ID space as large as a
104	   class A network and can meet its need to communicate with the rest
105	   of the world with very few real IP addresses. With the combination
106	   of CIDR and NAT applied across the entire space, most of the
107	   32-bit address space is effectively used as network ID. If the
108	   same approach continues with a larger network ID, the load on the
109	   switches will become too high.

111	   With a traditional CIDR based hierarchy, a node with a shorter
112	   prefix can be divided into a number of nodes with longer prefixes.
113	   Each of those nodes can be subdivided further into nodes with
114	   still longer prefixes. This process can continue until no further
115	   division is possible. The point worth noting is that at each step
116	   the designer of the network has to anticipate future expansion,
117	   keeping in mind that the resource must not be exhausted at any
118	   point in time. This leads the designer to allocate far more
119	   resources than needed, which leaves address space unused, and the
120	   concept of the H-D (host-density) ratio comes into play. The
121	   problem is aggravated once a resource does get exhausted. For
122	   example, a node of prefix /16 can be divided into a number of
123	   nodes of prefix /24. If any one of the /24 nodes gets exhausted,
124	   the resources of the other /24 nodes cannot be used even if they
125	   are available.

127	   In the IPv4 environment, service providers rely heavily on NAT to
128	   provide internet services. For example, a large educational
129	   institute meets its current requirement with 4 real IP addresses:
130	   one for its mail server, one for its web server, one for its ftp
131	   server and one for its proxy server, which provides web based
132	   services to all of its users. These four types of services are
133	   used by organizations of any size (it may be 400 or even 40000).
134	   In the current provider network, the needs of these organizations
135	   are met with 4 IP addresses each, and the CIDR based tree has been
136	   built from these components. When private IP is replaced with real
137	   IP, each customer network will require IP addresses based on its
138	   size and requirement. Transitioning to real IP space with provider
139	   assigned addresses and the CIDR based approach, without
140	   reorganizing the existing provider network, may not be a difficult
141	   task. However, it will carry forward all the problems associated
142	   with routing and distribution. A mesh structured hierarchy is
143	   convenient for reducing the routing overhead as well as for
144	   distributing network resources in a suitable manner in the long
145	   run.

147	3. A Three-tier mesh structured hierarchical network

149	   As Autonomous Systems of various sizes are supported, Autonomous
150	   Systems and the nodes inside them can be viewed as lying on the
151	   same plane within the address space. If the network can be viewed
152	   as lying on different planes, routing issues can be made simpler.
153	   If the network is designed with a fixed prefix length for the
154	   Autonomous System everywhere, routing information for the rest is
155	   confined to the other part of the network prefix. This means that
156	   the maximum AS size is assigned to every AS irrespective of its
157	   actual size. This is made possible by using a large address space
158	   and dividing it into a number of regions of fixed size. Thus the
159	   entire network can be viewed as a network of inter-AS layer nodes.
160	   Each node in the inter-AS layer can act (a) only as a router in the
161	   inter-AS layer, (b) as a router in the inter-AS layer with an
162	   Autonomous System attached to it through a single point of
163	   attachment, or (c) as an Autonomous System with multiple Autonomous
164	   System border routers (ASBR) appearing like a mesh. Thus a two-tier
165	   mesh structured hierarchy is established between the AS layer and
166	   the inter-AS layer, with each AS having a fixed prefix length.

168	   By definition, an Autonomous System is a small area within the
169	   entire network that maintains its own independent identity and
170	   communicates with the rest of the world through specific border
171	   routers. In a similar manner, if a larger area (say, a region or
172	   state) can be considered a network of Autonomous Systems that
173	   maintains its own identity by communicating with the rest of the
174	   world through some border routers (say, state border routers), a
175	   mesh structured hierarchy can be established within the inter-AS
176	   layer. The inter-AS layer will be split into inter-AS-top and
177	   inter-AS-bottom. To maintain this hierarchy, each node of
178	   inter-AS-top needs to have multiple regional or state border
179	   routers (say, SBRs) through which it communicates with the rest of
180	   the world, in the same way that an Autonomous System maintains
181	   ASBRs. Thus, the entire network will appear as a network of nodes
182	   of the inter-AS-top layer. To maintain the hierarchy, each node of
183	   inter-AS-top needs to have a fixed prefix length, i.e. each such
184	   node will be assigned a fixed maximum number of AS nodes.

186	   Thus, with a three-tier mesh structured hierarchy in the network
187	   layer, the network ID can be viewed as A.B.C. If pA, pB and pC are
188	   the prefix lengths of the inter-AS-top, inter-AS-bottom and AS
189	   layers respectively, there can be up to 2^pA nodes at the topmost
190	   layer, 2^pB inter-AS-bottom nodes inside each of them and 2^pC AS
191	   layer nodes inside each of those. Thus the entire space is divided
192	   into a fixed number of regions and each region into a fixed number
193	   of sub regions. This division is expected to be based on
194	   geography, population density, demand and related factors.

196	   Let nMaxInterASTopNodes be the maximum possible number of nodes
197	   assigned at the topmost layer, nMaxInterASBottomNodes that at the
198	   inter-AS-bottom layer and nMaxASNodes that at the AS layer, where
199	   nMaxInterASTopNodes <= 2^pA, nMaxInterASBottomNodes <= 2^pB and
200	   nMaxASNodes <= 2^pC.
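
   The A.B.C view of the network ID can be illustrated with a small
   sketch that packs and unpacks the three fields. This is only an
   illustration, not part of the proposal: the names NetworkId, pack
   and unpack are hypothetical, and the prefix lengths 10/16/12 are
   borrowed from the distribution discussed in Section 3.2.1.

```python
from typing import NamedTuple

# Prefix lengths in bits. 10/16/12 are the values suggested in
# Section 3.2.1 of the draft; they are used here only as an example.
P_A, P_B, P_C = 10, 16, 12


class NetworkId(NamedTuple):
    a: int  # inter-AS-top node (region or state)
    b: int  # inter-AS-bottom node (an AS inside that region)
    c: int  # AS layer node (a router inside that AS)


def pack(nid: NetworkId) -> int:
    """Concatenate A, B and C into one (P_A + P_B + P_C)-bit network ID."""
    for value, bits in ((nid.a, P_A), (nid.b, P_B), (nid.c, P_C)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (nid.a << (P_B + P_C)) | (nid.b << P_C) | nid.c


def unpack(raw: int) -> NetworkId:
    """Split a packed network ID back into its A, B and C fields."""
    c = raw & ((1 << P_C) - 1)
    b = (raw >> P_C) & ((1 << P_B) - 1)
    a = (raw >> (P_B + P_C)) & ((1 << P_A) - 1)
    return NetworkId(a, b, c)
```

   Because every field has a fixed width, a router can extract A, B or
   C with a shift and a mask, without consulting a prefix length.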

202	3.1. Route propagation

204	   With the hierarchy established, routing information that arises
205	   inside a node of inter-AS-top does not need to be propagated to
206	   another node of inter-AS-top. The entire routing information of the
207	   inter-AS-top layer needs to be propagated to the inter-AS-bottom
208	   layer. So, each router of the inter-AS layer will have two tables:
209	   one for inter-AS-top and another for the inter-AS-bottom of the
210	   inter-AS-top node that it belongs to. BGP (with little
211	   modification) will work very well with a trick applied at the SBRs:
212	   an SBR will not propagate the inter-AS-bottom routing information
213	   of its domain to an SBR of a neighboring domain, i.e. the SBR of
214	   one top layer node will propagate only inter-AS-top routing
215	   information to the SBR of another top layer node. Inside a node of
216	   inter-AS-top, both inter-AS-top and inter-AS-bottom routing
217	   information need to be propagated from one ASBR to a neighboring
218	   ASBR. Inside a top layer node A, the routing information for
219	   another top layer node B will have two parts: one for the list of
220	   SBRs that a packet will traverse from top layer node A to B, and
221	   another for the list of ASBRs that the packet will traverse from
222	   one AS to another inside A. In terms of BGP, the AS_PATH attribute
223	   will be split into two parts: one for the top layer and another for
224	   the bottom layer. Within the same node A, routing information from
225	   one AS to another will not have any top layer information, i.e.
226	   the top layer information will be set to NULL.
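
   The export rule applied at an SBR can be sketched as follows. This
   is a simplified model under stated assumptions, not a BGP
   implementation: the Route type, its field names and sbr_export are
   hypothetical, the two-part AS_PATH is modeled as two tuples, and an
   empty tuple stands in for the NULL top layer information.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    dest_top: int                 # destination top layer node (A part)
    dest_bottom: Optional[int]    # destination AS (B part); None for a top layer route
    top_path: Tuple[int, ...]     # top layer part of AS_PATH; () stands in for NULL
    bottom_path: Tuple[int, ...]  # bottom layer part of AS_PATH (ASes inside this node)


def sbr_export(table: List[Route], own_top: int) -> List[Route]:
    """Routes an SBR of top layer node `own_top` advertises to a neighboring domain."""
    # The SBR always announces reachability of its own top layer node.
    exported = [Route(own_top, None, (own_top,), ())]
    for route in table:
        if route.dest_bottom is not None:
            continue  # inter-AS-bottom information stays inside the domain
        # Prepend our own node to the top layer path; drop local AS hops.
        exported.append(Route(route.dest_top, None, (own_top,) + route.top_path, ()))
    return exported
```

   A route between two ASes of the same top layer node carries an empty
   top_path in this model and is never exported by sbr_export.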

228	   Similarly, each node of the AS layer will have three tables of
229	   routing entries: one for inter-AS-top, one for inter-AS-bottom
230	   and another for the routing information inside the Autonomous
231	   System itself.

233	   Introducing hierarchy at the inter-AS layer reduces the size of
234	   the routing table substantially. If hardware resources allow a
235	   flat address space to be maintained at each layer, the problems
236	   related to CIDR can be avoided. With a flat address space, no
237	   hierarchical relationship needs to be established between any two
238	   nodes in the same layer, so all the nodes inside each layer can be
239	   used until they are exhausted. With a flat address space (i.e.
240	   without prefix reduction), BGP tables will have at most
241	   nMaxInterASTopNodes + nMaxInterASBottomNodes entries.

243	   An IGP like OSPF has a provision to divide an AS into smaller
244	   areas. OSPF hides the topology of an area from the rest of the
245	   Autonomous System. This information hiding enables a significant
246	   reduction in routing traffic. With its support for subnetting,
247	   OSPF attaches an IP address mask to each route to indicate the
248	   range of IP addresses described by that route. This reduces the
249	   volume of routing traffic, since not every node inside the range
250	   has to be described, but it introduces another level of hierarchy.
251	   If subnetting can be avoided in the AS layer (at the cost of
252	   additional computation inside the SPF tree), each area can be
253	   configured dynamically from a free pool of addresses based on its
254	   requirement. So, an AS can be divided into a number of areas of
255	   heterogeneous sizes with nodes drawn from a free pool of addresses.

257	   Similarly, the concept of an area can be introduced in the
258	   inter-AS-bottom layer the way it works in OSPF. The area border
259	   routers in the inter-AS-bottom layer have to behave exactly the way
260	   an ABR behaves in OSPF, i.e. an area border router will hide the
261	   topology inside an area from the rest of the world and will
262	   distribute the information collected inside the area to the rest.
263	   It will also distribute the routing information collected from
264	   outside to the nodes inside. To implement this, the protocol running
265	   in the inter-AS layer (say, BGP) will have to introduce a 'cost'
266	   factor. This cost factor can be interpreted as the cost of
267	   propagating a packet from one AS to another. The protocols running
268	   inside the AS layer (RIP, OSPF, etc.) will have to supply the cost
269	   for a packet to travel from one ASBR to another, and all of them
270	   must do so consistently. The cost factor is needed when a remote
271	   node sends a packet to a node inside an area and more than one area
272	   border router is equidistant from that remote node. Thus the
273	   inter-AS-bottom layer (i.e. one inter-AS-top level node) can be
274	   divided into a number of areas of heterogeneous sizes with AS nodes
275	   drawn from a free pool of address space. BGP adopts a technique
276	   called route aggregation, which reduces the routing information
277	   within a message. In a similar manner, introducing areas inside the
278	   inter-AS-bottom layer will not only reduce the complexity of the
279	   protocol, but will also reduce the size of a BGP packet
280	   substantially.

282	   With this architecture, each node (router) inside an AS is
283	   represented as A.B.C. A node may or may not have a network attached
284	   to it; such a network acts as a leaf (i.e. it will not act as a
285	   transit). To use the user-id space properly and to support customer
286	   networks of heterogeneous sizes, the user-id space needs to be
287	   divided into subnet-id and user-id. Essentially, a VLSM (variable
288	   length subnet mask) type of approach has to be adopted at each node
289	   of an AS. So, each node of the AS layer will act as the root of a
290	   tree whose leaves are small independent customer networks that act
291	   as stubs. As the routing information of the inter-AS layer and the
292	   AS layer need not be passed to any node of the VLSM tree, each
293	   router inside the tree should maintain a default route for any
294	   address outside its network. With this approach, the load on each
295	   router of the service providers will become negligible. Protocols
296	   that support VLSM with MPLS/VPN have to be implemented inside the
297	   tree (inside the VLSM tree, all the physical ports of a switch have
298	   to be configured with the subnet mask, so mere MPLS on top of a
299	   static routing table should do the rest).
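
   The forwarding decision of a router inside the VLSM tree can be
   sketched as a longest-prefix match over a small static table, with
   a default route for everything else. The function names, the port
   labels and the 24-bit user-id width (taken from Section 3.2.1) are
   assumptions made for this illustration only.

```python
from typing import List, Tuple

USER_ID_BITS = 24  # user-id width suggested in Section 3.2.1 of the draft

# A static entry: (subnet value, subnet mask length in bits, outgoing port)
Entry = Tuple[int, int, str]


def matches(user_id: int, subnet: int, mask_len: int) -> bool:
    """True if the leading `mask_len` bits of user_id equal those of subnet."""
    shift = USER_ID_BITS - mask_len
    return (user_id >> shift) == (subnet >> shift)


def next_hop(user_id: int, table: List[Entry], default: str = "uplink") -> str:
    """Pick the port of the longest matching subnet, else the default route."""
    best_len, best_port = -1, default
    for subnet, mask_len, port in table:
        if mask_len > best_len and matches(user_id, subnet, mask_len):
            best_len, best_port = mask_len, port
    return best_port
```

   The table holds only the subnets below the router, so its size does
   not depend on the size of the inter-AS or AS layers.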

301	   The fundamental assumptions on which this architecture rests can
302	   be summarized as follows:

304	   i) The entire network can be viewed as a network of regions or
305	   states, where each region or state can have its own identity by
306	   communicating with the rest of the world through some state border
307	   routers. Each region or state is a network of Autonomous Systems.
308	   Each region, as well as each Autonomous System inside it, will
309	   have a fixed (maximum) length of prefix.

311	   ii) Availability of hardware resources is such that flat address
312	   space can be maintained at the inter-AS layer.

314	   Introduction of mesh-structured hierarchy will have several
315	   advantages:

317	      o  Load at each router will get reduced substantially.
318	      o  Concept of CIDR style approach and complexity related to
319	           prefix reduction can be easily avoided.
320	      o  Mesh structured hierarchy will make traffic evenly distributed.
321	      o  Physical cable connection can be optimized.
322	      o  Administrative issues will become easier.

324	3.2. Determination of prefix lengths

326	   With this architecture, an IP address can be described as A.B.C.D,
327	   where the D part represents the user id. Each router in the
328	   inter-AS layer will have two tables of information: one for
329	   inter-AS-top and another for the inter-AS-bottom of the
330	   inter-AS-top node that it belongs to. Each node of the AS layer,
331	   in contrast, will have three tables of routing entries: one for
332	   inter-AS-top, one for inter-AS-bottom and another for the routing
333	   information inside the Autonomous System itself. In the worst
334	   case, a node inside an AS needs to maintain nMaxInterASTopNodes +
335	   nMaxInterASBottomNodes + nMaxASNodes entries in its routing table.
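
   The worst-case bound above can be put into numbers with a short
   calculation. The prefix lengths 10/16/12 come from Section 3.2.1.
   The comparison with a single flat plane of 2^(pA+pB+pC) nodes is an
   illustrative assumption and is not a figure given in this document.

```python
def hierarchical_entries(p_a: int, p_b: int, p_c: int) -> int:
    """Worst-case routing entries at an AS node with three separate tables."""
    return 2**p_a + 2**p_b + 2**p_c


def single_plane_entries(p_a: int, p_b: int, p_c: int) -> int:
    """Entries needed if every A.B.C node had to be listed in one flat table."""
    return 2 ** (p_a + p_b + p_c)


# Prefix lengths suggested in Section 3.2.1 of the draft.
worst_case = hierarchical_entries(10, 16, 12)   # 1024 + 65536 + 4096
one_plane = single_plane_entries(10, 16, 12)    # 2**38
print(worst_case, one_plane)
```

   With these values the three tables together hold at most 70656
   entries, against 2^38 for a single flat table.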

337	   Dynamic allocation of an area from a free pool of address space
338	   is more frequent at the AS layer than at the inter-AS-bottom
339	   layer. As OSPF supports all the features needed, it can be
340	   considered the default choice in the AS layer. The existing
341	   implementation of OSPF (Version 2) supports subnetting, by which
342	   an entire area can be represented as a combination of network
343	   address and subnet mask. With this approach, the routing table is
344	   reduced substantially. With subnetting removed, every node inside
345	   an area will have an entry in the routing table (OSPF Version 1).
346	   So the determining factor is the maximum number of nodes inside
347	   an AS that OSPF can support once subnetting support is removed.
348	   The prefix length of the AS layer will be determined by this limit.

350	   With the introduction of hierarchy in the inter-AS layer, the number
351	   of entries in the BGP routing table will be reduced substantially.
352	   Even if pA and pB are both chosen as 16, the number of routing
353	   entries stays within the range admissible for the existing BGP
354	   protocol. However, it is the responsibility of IANA to come up with
355	   a scheme for selecting nMaxInterASTopNodes and
356	   nMaxInterASBottomNodes. Each top level node will have
357	   nMaxInterASBottomNodes nodes. It would be a waste of address space
358	   to assign a top level node to each country (e.g. China has a
359	   population of 1,306,313,800 whereas Vatican City has only 920,
360	   according to a census of 2006). So a moderate value of
361	   nMaxInterASBottomNodes is desirable, with which larger countries
362	   will have a number of top level nodes, e.g. each state of the USA
363	   can be assigned a top level node. With the introduction of areas in
364	   the inter-AS-bottom layer, each top level node can be divided into
365	   a number of areas of heterogeneous sizes. So, a group of neighboring
366	   countries with small populations can share the address space of a
367	   top level node. Similarly, the user-id space has to be decided based
368	   on the largest area that a VLSM tree should span. All these issues
369	   are entirely geopolitical and have to be decided by IANA.

371	3.2.1. A pseudo optimal distribution of prefixes in a 64bit architecture

373	   In order to make optimal use of cable connections, the length of the
374	   VLSM tree is expected to be as short as possible. Also, any single
375	   organization may prefer to have its user-id space under the same
376	   network id. So, a 16bit user-id may be insufficient for places like
377	   a large university campus, whereas 32 bits would be too large.
378	   Hence, a 24bit user-id is a moderate choice; this is the size of the
379	   class A address space in IPv4 (also used as the space for private
380	   IP). As published in 1998 [6], OSPF can support an area with 1600
381	   routers and 30K external LSAs. So, 11 bits are needed to support
382	   this space. On the assumption that OSPF can support a much larger
383	   address space as hardware technology advances, and to keep room for
384	   future expansion, 12 bits are assigned to the AS layer. 16 bits are
385	   assigned to the inter-AS-bottom layer. So, if on average a 16bit
386	   equivalent space is used within the user-id space (i.e. one out of
387	   256) and 8bit equivalent nodes are used inside an AS (16% of 1600),
388	   a top level node (with 16bit equivalent AS nodes) will generate
389	   2^40 IP addresses, which gives 8629 IP addresses per person in
390	   Japan (with a population of 127417200; Japan ranks 10th in the
391	   world population list). So, even if every country with a population
392	   less than or equal to that of Japan is assigned a top level node and
393	   every province/state of the countries with larger populations is
394	   assigned a top level node each, the total number of nodes will come
395	   well under 1024. If a number of neighboring countries with smaller
396	   populations share a top level node, the total number of top level
397	   nodes will come down further. This suggests that a 62 bit equivalent
398	   space (10(pA)+16(pB)+12(pC)+24(user-id)) will be good enough for
399	   unicast addresses. This distribution expects OSPF to support 65K
400	   (64K+1K) external LSAs.
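
   The arithmetic in the paragraph above can be reproduced with a few
   lines. The variable names are hypothetical; every input number is
   taken from the text.

```python
# Bits needed to number the 1600 routers of an OSPF area (ids 0..1599).
ospf_routers = 1600
bits_for_area = (ospf_routers - 1).bit_length()

# Average usage assumed in the text.
used_user_ids_per_node = 2**16   # one out of 256 of the 24-bit user-id space
used_nodes_per_as = 2**8         # 256 nodes, i.e. 16% of 1600
as_nodes_per_top_node = 2**16    # 16-bit inter-AS-bottom layer

addresses_per_top_node = (
    as_nodes_per_top_node * used_nodes_per_as * used_user_ids_per_node
)

japan_population = 127_417_200
addresses_per_person = addresses_per_top_node // japan_population

unicast_bits = 10 + 16 + 12 + 24  # pA + pB + pC + user-id
external_lsas = 2**16 + 2**10     # "64K + 1K"

print(bits_for_area, addresses_per_person, unicast_bits, external_lsas)
```

   The integer division confirms the figure of 8629 addresses per
   person, and the field widths add up to 62 bits.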

402	   64bit address space may be divided into two 63bit blocks as follows:

404	   i. Global unicast addresses with the most significant bit set to 0.
405	   This space is divided equally into provider assigned (PA) address
406	   space with prefix 00 and provider independent (PI) address space
407	   with prefix 01. Provider independent address space will be used by
408	   customers who would like to retain their number even after
409	   changing their providers. As routing will be based on PA addresses,
410	   each PI address will be associated with at least one PA address.
411	   Section 4 describes issues related to PI addressing in detail.

413	   ii. Address space with the MSB set to 1 will be distributed among
414	   the remaining uses. Each of them will have a fixed prefix, to be
415	   determined in consultation with IANA. This distribution will be
416	   based on the requirements and the work that have already been done
417	   in connection with IPv6, along with the following requirements:

419	   a) Router address space: Any node in the router address space will be
420	   designated with a prefix followed by A.B.C.router-id.

422	   b) Address space for multicasting:

424	   c) Address space for private IP: A 32 bit address space should be
425	   good enough for private IP.
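
   The top-level split of the 64bit space can be sketched as a check
   on the two most significant bits. The function names and the
   returned labels are hypothetical. The prefixes for the router,
   multicast and private IP spaces are left open in the text, so the
   sketch does not distinguish them. The 10/16/12/24 field widths are
   the ones suggested in Section 3.2.1.

```python
ADDRESS_BITS = 64
P_A, P_B, P_C, P_D = 10, 16, 12, 24  # widths suggested in Section 3.2.1


def address_space(addr: int) -> str:
    """Classify a 64-bit address by its leading bits."""
    if not 0 <= addr < (1 << ADDRESS_BITS):
        raise ValueError("not a 64-bit address")
    top2 = addr >> (ADDRESS_BITS - 2)
    if top2 == 0b00:
        return "PA unicast"
    if top2 == 0b01:
        return "PI unicast"
    # MSB is 1: router, multicast, private IP or other spaces, whose
    # prefixes are to be determined in consultation with IANA.
    return "other"


def pa_unicast(a: int, b: int, c: int, d: int) -> int:
    """Build a PA unicast address: prefix 00 followed by A.B.C.D (62 bits)."""
    for value, bits in ((a, P_A), (b, P_B), (c, P_C), (d, P_D)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (((a << P_B | b) << P_C | c) << P_D) | d
```

   Because A.B.C.D occupies exactly 62 bits, the two leading bits of a
   value built by pa_unicast are always 00.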

427	3.2.2. Whether to go for a two-tier or three-tier hierarchy

429	   Establishing hierarchy in the inter-AS layer reduces the number of
430	   BGP entries to a great extent, but leads to inefficient use of
431	   address space for geopolitical reasons. If the hierarchy in the
432	   inter-AS space is removed, the entire 26bit (10+16) space becomes
433	   available to a single layer and the inter-AS space is used properly,
434	   but the number of external LSAs (and/or the number of entries in the
435	   BGP table) increases dramatically. So, the choice depends on the
436	   extent to which OSPF can support external LSAs. BGP expects the
437	   packet length to be limited to 4096 bytes. BGP manages to work
438	   within this limit through prefix reduction in the CIDR based
439	   environment. As the number of inter-AS nodes increases, BGP has to
440	   change this limit in order to work in a flat address space. The
441	   alternative is to divide the inter-AS space into a number of areas
442	   as defined in section 3.1. The area border routers will advertise
443	   the aggregated information to the rest of the world. BGP may have
444	   to incorporate both options at the same time. As the number of
445	   nodes in the inter-AS layer increases, the inter-AS space has to be
446	   split into two separate planes in order to reduce the number of
447	   entries in the routing table. So, the two-tier hierarchy can be
448	   considered an interim state on the way to the three-tier hierarchy.
449	   If currently available data show that it is good enough to support
450	   the present need, it will be worth examining to what extent it can
451	   support future needs. Assignment of inter-AS nodes in the two-tier
452	   hierarchy should be based on geographical distribution, as if it
453	   were part of the three-tier hierarchy. Otherwise, introducing the
454	   three-tier hierarchy in the future will become another difficult
455	   task. Based on a report from 2011, BGP supports ~400,000 entries in
456	   the routing table. With this growing trend, BGP may have to change
457	   the packet length limit even in a CIDR based environment. With the
458	   introduction of the two-tier hierarchy, the number of entries in the
459	   routing table will come down drastically, and with the three-tier
460	   approach it will come down further.

462	3.3. Issues related to Satellite communications

464	   Establishment of hierarchy in the inter-AS layer expects the only way
465	   any two autonomous systems in two different top level nodes
466	   communicate is through their SBRs. If two autonomous systems inside
467	   the same top level node communicate through satellite, it will be
468	   considered as a direct link between them. Whenever autonomous system
469	   'ASa' of top level node 'A' communicates with autonomous system 'ASb'
470	   of top level node 'B' through satellite, they have to go through
471	   their state border routers, i.e. a satellite port inside 'A' that
472	   communicates with a satellite port inside 'B' will be considered a
473	   state border router. If multiple such ports exist inside node 'A',
474	   all of them will be equidistant from any port inside 'B'.  This
475	   requires any satellite port inside 'B' to know in advance the list
476	   of autonomous systems that are under the purview of each port
477	   inside 'A'. So, all the satellite ports of 'A' have to exchange such
478	   group information with all the satellite ports of 'B' and vice
479	   versa.  Each such group of autonomous systems can be considered a
480	   cluster of autonomous systems inside an area of a top level node. If
481	   the number of such ports is small, some heuristics can be applied
482	   while assigning AS numbers in order to reduce the processing time
483	   during the circuit establishment phase.  It will become difficult to
484	   maintain such heuristics once the number of such ports becomes large.
485	   So, in case of satellite communication, the advantage of establishing
486	   hierarchy inside inter-AS layer diminishes as the number of satellite
487	   ports increases. If a private corporation maintains its own satellite
488	   channel to communicate between its offices at distant locations, all
489	   of these offices will be considered to be under the user-id
490	   space of its network. Service providers that provide satellite
491	   services to end-site customers can operate in the usual manner,
492	   as they will provide connections to customer networks which will act
493	   as stubs.

495	4. Provider Independent addressing, name services and multihoming

497	   Provider independent addressing can be conceived as naming a host
498	   with a number. It can be used by customer networks who would like to
499	   retain their number even after changing their service provider; also
500	   it is useful to designate a host uniquely if the customer network is
501	   multihomed. Just as in name services, where the address corresponding
502	   to a name must be resolved before communication can start, the same
503	   is required for PI addressing. Each globally unique PI address will
504	   be associated with at least one global unicast provider assigned
505	   address. For a host with a single interface, the number of such
506	   addresses will equal the number of service providers the customer
507	   network is associated with.

509	   As either source or destination or both may be multihomed, there
510	   could be multiple paths to communicate between two hosts. This is
511	   required both for name services and for PI addressing.

513	   A system call needs to be introduced to get the source address based
514	   on the destination address. If an application program needs to use
515	   the destination address directly, it needs to use this system call.

517	   int getcommaddr(int sockfd, struct in_addr *dst, struct addr_pair
518	   *endpts);

520	   'addr_pair' holds the addresses of communication end points as
521	   follows:

523	   struct addr_pair {
524	       struct in_addr src;
525	       struct in_addr dst;
526	   };

528	   'getcommaddr'[8] returns the number of source-destination pairs for
529	   communication; the parameter 'endpts' will hold the array of these
530	   address pairs. The array will be sorted by route, with the best
531	   possible route first.  'sockfd' is used to get the 'type of service'
532	   assigned. So, an application program needs to set its type of service
533	   before using this call.

535	   'getcommaddr' needs to call a routine 'getmappedaddr' to resolve the
536	   mapped provider assigned addresses of a provider independent address.

538	   int getmappedaddr(struct in_addr *piaddr, struct in_addr *mpiaddr);

540	   'getmappedaddr' will return the number of mapped addresses and
541	   'mpiaddr' will hold their values.

543	   Users may use a name instead of an IP address to reach the
544	   destination.  A new system call, 'gethostbynamewithsrcaddr', needs to
545	   be introduced as an extension to 'gethostbyname' as follows:

547	   struct hostent *gethostbynamewithsrcaddr(int sockfd,const char *name,
548	                  int *nroutes, struct addr_pair *endpts);

550	   'gethostbynamewithsrcaddr'[8] takes 'name' and 'sockfd' as input
551	   parameters and finds out the best possible route to reach the
552	   destination. It returns the pointer to the 'hostent' structure as
553	   returned by the 'gethostbyname' call.  The parameter 'nroutes'
554	   receives the number of possible routes, and the corresponding
555	   source and destination addresses are assigned to 'endpts' in sorted
556	   order. 'sockfd' is used to get the 'type of service' assigned. So,
557	   an application program needs to set its type of service before using
558	   this call.

560	   An application program needs to try these address pairs in order,
561	   starting from the top (i.e. index 0), to establish a connection with
562	   the destination. For each pair, it needs to bind the source address
563	   'src' and then connect to the destination address 'dst'.
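
   The bind-then-connect procedure above can be sketched as follows.
   This is a minimal illustration, not part of the proposed API: the
   helper name 'connect_best_pair' and the TCP port parameter are
   assumptions, and the sketch opens a fresh IPv4 socket per attempt
   rather than reusing the 'sockfd' passed to 'getcommaddr'.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Address pair as defined in this section. */
struct addr_pair {
    struct in_addr src;
    struct in_addr dst;
};

/*
 * Hypothetical helper: try each pair in order (index 0 is the best
 * route), binding to 'src' and connecting to 'dst' on TCP port 'port'.
 * Returns a connected socket descriptor, or -1 if no pair worked.
 */
int connect_best_pair(const struct addr_pair *endpts, int npairs,
                      unsigned short port)
{
    assert(npairs <= 0 || endpts != NULL);

    for (int i = 0; i < npairs; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_in local, remote;
        memset(&local, 0, sizeof local);
        local.sin_family = AF_INET;
        local.sin_addr = endpts[i].src;   /* port 0: kernel picks one */

        memset(&remote, 0, sizeof remote);
        remote.sin_family = AF_INET;
        remote.sin_port = htons(port);
        remote.sin_addr = endpts[i].dst;

        if (bind(fd, (struct sockaddr *)&local, sizeof local) == 0 &&
            connect(fd, (struct sockaddr *)&remote, sizeof remote) == 0)
            return fd;

        close(fd);   /* this pair failed; fall through to the next one */
    }
    return -1;
}
```

   A caller would pass the array filled in by 'getcommaddr' or
   'gethostbynamewithsrcaddr' together with the count those calls
   report.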

565	4.1. PI address Resolution

567	   This section tries to come up with a solution for PI address
568	   resolution with the approach of DNS[7] with necessary differences.
569	   Just like name space in DNS, entire address range with prefix 01 will
570	   be the address space used by PI addresses. Servers that will hold the
571	   information of mapping between PI addresses and corresponding PA
572	   addresses will be called PIMapServers and the programs that will
573	   be used to resolve addresses will be called PIMapResolvers.

575	   Whereas DNS uses names in a hierarchical format to resolve
576	   addresses, PI address resolution will be based on the prefix of
577	   the PI address used for resolution.  The prefix is determined based
578	   on the architectural model used for the internet.  Based on the
579	   prefix information, the addresses of a list of regional servers can
580	   be found; these servers will be used to resolve the
581	   mapped PA addresses corresponding to that PI address. A prefix will
582	   serve a fixed address space within entire PI address space. Address
583	   space belonging to a prefix will be distributed within customer
584	   networks of heterogeneous sizes. Address space allocation and the
585	   mapping of associated PA address(es) will be assigned by a regional
586	   authority. The regional authority will be fully responsible for the
587	   operation of regional servers in that region.

589	   Like DNS, there are some root servers which will have some fixed
590	   addresses, under which there are some prefixes which will act as
591	   top-level domains. In case of CIDR based hierarchy, these prefixes
592	   may be of different prefix lengths which are selected based on the
593	   requirements. Each prefix in a top level domain can further be split
594	   into a number of prefixes with the approach of CIDR. This tree
595	   structured hierarchy will keep growing until we reach prefixes
596	   associated with regional servers. Each prefix associated with a
597	   regional server will be distributed amongst customer networks of
598	   various sizes as well as prefixes that will again be associated with
599	   some regional servers with the approach of CIDR. These regional
600	   servers can be considered as equivalent to  the authoritative name
601	   servers of DNS which are associated with zones. As stated earlier,
602	   prefixes starting with "00" will be assigned for provider assigned
603	   addresses and prefixes starting with "01" will be assigned for
604	   provider independent addresses, whereas prefixes starting with "1"
605	   will be assigned for addresses of all other types.

607	   As inherent hierarchy is involved in "Mesh Structured Hierarchy",
608	   this hierarchy goes up to two levels. As usual, there will be some
609	   root servers with fixed assigned addresses. Each root server will
610	   have prefixes with "01.A" that will act like top level domain. Under
611	   each top level domain, there will be entries with prefixes "01.A.B".
612	   Within a region "A.B", every global PA address is represented as
613	   "00.A.B.C.user-id". In order to support customer networks of
614	   heterogeneous sizes with the approach of VLSM, the "user-id" portion
615	   is further divided as "subnet-id.userid". So, the effective network
616	   prefix of a customer network in PA address space is "00.A.B.C.pa-
617	   subnet-id". Within an "A.B", entire PI address space with prefix
618	   "01.A.B" will be distributed within customer networks of
619	   heterogeneous sizes. So, effective network prefix of a customer
620	   network with PI address will be "01.A.B.pi-subnet-id". A particular
621	   prefix "01.A.B.pi-subnet-id" will be mapped to at least one provider
622	   assigned prefix of the same prefix length.  A multihomed customer
623	   network within "A.B" that receives services from two service
624	   providers will have prefixes "00.A.B.C1.pa-subnet-id1" and
625	   "00.A.B.C2.pa-subnet-id2". A PI address prefix "01.A.B.pi-subnet-id"
626	   of same length will be mapped to both these prefixes of PA address
627	   space. Every region "A.B" will have a regional server and backup
628	   server(s) with net addresses "00.A.B.server1", "00.A.B.server2",
629	   "00.A.B.server3" and "00.A.B.server4".

631	   Each PIMapServer will have a database of records that will have
632	   information to resolve PI addresses. Each record will have the
633	   following format.

635	   +------------+---------+------+-----+-------+-----------+
636	   | NetAddress | NetMask | Type | TTL | NAddr | Addr(1-4) |
637	   +------------+---------+------+-----+-------+-----------+

639	   The first two fields "NetAddress/NetMask" represent the PI address
640	   range of a network. "Type" will be one of Domain/Referral/Individual/
641	   SingleEntry/Default, based on which a query and the rest of the
642	   fields of a record have to be processed. A PI address can have at
643	   most four mapped PA addresses. "Addr1", "Addr2", "Addr3", "Addr4"
644	   will hold the corresponding PA addresses and "NAddr" will hold the
645	   number of such addresses. The field "TTL" is a 32-bit integer
646	   measured in seconds which will hold the same meaning and approach as
647	   defined in the specification of DNS[7]. When a server receives a
648	   query for an address "X", it extracts the record of the network
649	   based on "NetAddress/NetMask" and "X" from its database. If no
650	   matching record is found, a negative response is sent. Based on the
651	   "Type" of the record, the query is processed in the following manner.

653	   Type=Domain:

655	   This is the most common type. A customer network that does not want
656	   to maintain a map server opts for this type. In this case there will
657	   be a one to one mapping between a PI address and the corresponding PA
658	   addresses. The fields "Addr1"/"Addr2"/"Addr3"/"Addr4" will hold the
659	   PA Net Addresses corresponding to the PI address of the network.
660	   The server will extract the user-id portion of "X" and find the
661	   corresponding mapped PA addresses based on "Addr1"/"Addr2"/...etc.

663	   Theoretically, the "A.B" portion of a PI address need not match the
664	   "A.B" portion of the corresponding PA addresses. Consider a large
665	   corporation that has its corporate office and a branch office within
666	   the same region of a particular "A.B" and some other offices with
667	   different values of "A.B". The corporation can maintain a contiguous
668	   range of PI addresses for the ease of its operation. It needs to
669	   split the entire PI address range based on its offices and assign the
670	   corresponding PA addresses. In order to minimize the path of a query,
671	   it is desirable that the "A.B" of a PI address and its corresponding
672	   mapped PA addresses belong to the same region.

674	   Type=Referral:

676	   This is used when an address within the domain "NetAddress"/"NetMask"
677	   has to be processed by another map server. The map server may itself
678	   be another regional server or a server within a customer network.

680	   When a customer network would like to have direct control over the
681	   mapping of its addresses, it needs to opt for this type.
682	   "Addr1"/"Addr2"/"Addr3"/"Addr4" of the database entry will hold the
683	   PA addresses of the map servers that need to be queried for further
684	   processing. A server may act either in recursive mode or in iterative
685	   mode based on its implementation, just as in DNS. A large corporation
686	   may have different offices and each (or some of them) may maintain a
687	   map server based on their policies.

689	   When a server needs to handle a particular address separately, it
690	   needs to set "NetAddress" to that particular address and all the
691	   bits of "NetMask" to "1". The "Type" field has to be set
692	   to "SingleEntry". If some of its addresses need to be handled
693	   separately but a common rule (like Type=Domain) applies to the rest,
694	   records of the individual entries should be processed first and then
695	   the rest. In these cases "Type" has to be set to "Default". So, a
696	   server of a customer network may have database entries with
697	   Type=Domain/Referral/SingleEntry/Default.
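
   One way to honour "individual entries first, then the rest" is to
   pick the matching record with the most mask bits set, so that a
   SingleEntry record (mask of all ones) always wins over a wider
   range. The sketch below assumes 64-bit addresses; the type and
   function names are hypothetical, and longest-mask selection is an
   assumption rather than something this document mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pimap_type { PIMAP_DOMAIN, PIMAP_REFERRAL, PIMAP_INDIVIDUAL,
                  PIMAP_SINGLE_ENTRY, PIMAP_DEFAULT };

/* One database record, following the field layout shown earlier. */
struct pimap_record {
    uint64_t net_address;
    uint64_t net_mask;
    enum pimap_type type;
    uint32_t ttl;          /* seconds */
    uint8_t  naddr;        /* number of valid entries in addr[] */
    uint64_t addr[4];
};

/* Count the 1 bits in a mask. */
static int mask_bits(uint64_t mask)
{
    int n = 0;
    while (mask != 0) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/*
 * Return the most specific record covering 'x', or NULL when no
 * record matches (the server then sends a negative response).
 */
const struct pimap_record *pimap_lookup(const struct pimap_record *db,
                                        size_t n, uint64_t x)
{
    const struct pimap_record *best = NULL;
    int best_bits = -1;

    assert(n == 0 || db != NULL);
    for (size_t i = 0; i < n; i++) {
        if ((x & db[i].net_mask) != db[i].net_address)
            continue;
        int bits = mask_bits(db[i].net_mask);
        if (bits > best_bits) {
            best = &db[i];
            best_bits = bits;
        }
    }
    return best;
}
```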

699	   For a host having multiple interfaces, each interface may be assigned
700	   PA addresses supplied by all the service providers, but it is
701	   desirable that the PI address gets mapped to only one of them
702	   (preferably the interface which will have the shortest path from the
703	   associated CE router).

705	   Type=Individual:

707	   This is meant for individual users opting for services, such as
708	   telephony services, that need to maintain a PI address. With this
709	   option a mobile user may maintain its PI address after changing its
710	   service provider. A map server needs to maintain some networks with a
711	   range of PI addresses in its database. When a query for an address
712	   "X" is received, server needs to get the corresponding record which
713	   will point to a separate database where there will be one to one
714	   mapping between PI address and its corresponding PA address of all
715	   the assigned PI addresses. These networks and assignment of
716	   individual PI addresses have to be done by the regional authority.

718	   In order to support most of the features of DNS, as well as to make
719	   use of existing source code for implementation (e.g. BIND), the
720	   message format has been kept almost the same as that of DNS. So, all
721	   the relevant fields will be processed in exactly the same manner as
722	   in DNS, and all the irrelevant ones have to be
723	   ignored. The rest of this section describes where and how changes
724	   have to be made.

726	   As defined in RFC 1035, the top level format of a message is divided
727	   into 5 sections (some of which are empty in certain cases) shown
728	   below:

730	       +---------------------+
731	       |        Header       |
732	       +---------------------+
733	       |       Question      | the question for the name server
734	       +---------------------+
735	       |        Answer       | answering part of the question
736	       +---------------------+
737	       |      Authority      | authoritative map server
738	       +---------------------+
739	       |      Additional     | additional information
740	       +---------------------+

742	   The header section has been retained without making any change.

744	                                       1  1  1  1  1  1
745	         0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
746	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
747	       |                      ID                       |
748	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
749	       |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
750	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
751	       |                    QDCOUNT                    |
752	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
753	       |                    ANCOUNT                    |
754	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
755	       |                    NSCOUNT                    |
756	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
757	       |                    ARCOUNT                    |
758	       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

760	   The question section will hold just the PI address that needs to be
761	   resolved.

763	   The answer section will have the following fields:

765	   Data Type(octet)=PA address(01)/Referral PA address(02) + TTL+ Number
766	   of Addresses(octet) + Corresponding PA addresses.
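
   The answer layout above could be serialized as in the following
   sketch. The assumptions are: 64-bit PA addresses, a 32-bit TTL, and
   network byte order for multi-octet fields (as in DNS). The function
   and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PIMAP_DATA_PA = 0x01, PIMAP_DATA_REFERRAL = 0x02 };

/* Write the low 'nbytes' octets of 'v', most significant octet first. */
static size_t put_be(uint8_t *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        p[i] = (uint8_t)(v >> (8 * (nbytes - 1 - i)));
    return nbytes;
}

/*
 * Encode: Data Type (1 octet) + TTL (4 octets) + Number of Addresses
 * (1 octet) + PA addresses (8 octets each). Returns the number of
 * octets written, or 0 if 'buf' is too small.
 */
size_t pimap_encode_answer(uint8_t *buf, size_t buflen, uint8_t data_type,
                           uint32_t ttl, const uint64_t *addrs,
                           uint8_t naddr)
{
    size_t need = 1 + 4 + 1 + (size_t)naddr * 8;
    size_t off = 0;

    assert(buf != NULL && naddr <= 4);
    assert(naddr == 0 || addrs != NULL);
    if (buflen < need)
        return 0;

    buf[off++] = data_type;
    off += put_be(buf + off, ttl, 4);
    buf[off++] = naddr;
    for (uint8_t i = 0; i < naddr; i++)
        off += put_be(buf + off, addrs[i], 8);
    return off;
}
```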

768	   The authority section will hold the information of the authoritative
769	   domain that holds the PI address. It will hold the PI address prefix
770	   and corresponding mapped prefixes in PA addresses and address of its
771	   map server. The data will appear as follows:

773	   NetAddress/NetMask of the domain in PI address space + Number of PA
774	   address prefixes + NetAddress of all the PA address prefixes +
775	   Address of the map server.

777	   From the database record, if Type=Individual/SingleEntry/Default,
778	   NetAddress will hold the corresponding PI address whereas all the
779	   bits of NetMask will be set to 1.  If mapping information is fetched
780	   from an address range with Type=Domain, the entire address range
781	   will appear in the form of NetAddress/NetMask. This is advantageous
782	   for caching: a query for one particular address yields the
783	   information for the entire address range.

785	   The additional section will hold the domain information of all the
786	   map servers through which PA addresses have been retrieved. The first
787	   entry will hold the information of the domain that is one level above
788	   the authoritative domain and the last entry will be the domain
789	   information of the map server that has been contacted to resolve the
790	   address. Data will appear in the following manner:

792	   Number of levels (octet) + domain information of each level (i.e.
793	   NetAddress/NetMask + PA address of the corresponding map server).

795	5. Issues related to IP mobility

797	   An interface of a customer network may have several IP addresses
798	   (e.g. for a multihomed customer site, each interface will have
799	   multiple global unicast addresses and may also have private
800	   addresses). A mobile node that has moved to a customer
801	   network which gets service from a service provider and maintains
802	   private IP addresses will have at least three IP addresses: a
803	   provider assigned unicast address, a private address and its
804	   permanent "Home Address". The "Home Address" will be aliased with
805	   the provider assigned address (i.e. the co-located care-of address).
806	   So the interface structure needs to have an additional field to hold
807	   the value of the care-of address. The PCB structure will have an
808	   additional field 'inp_lcladdr', which will hold the current provider
809	   assigned address that a foreign node needs to use for communication.
810	   The field 'inp_laddr' that is used to hold the value of local address
811	   will hold the value of "Home Address" of a mobile node. Similarly,
812	   PCB needs to introduce another field 'inp_fcladdr' to support the
813	   destination address to be mobile.  The existing field 'inp_faddr'
814	   which is used to address a foreign address will hold the value of
815	   "Home Address" of the mobile node. For customers with PI addresses
816	   who would like to have mobility support, the mapped address will be
817	   considered as the "Home Address" of the mobile node.

819	   An outgoing packet from a mobile node in a foreign site needs to be
820	   stacked with the associated care-of address. While initiating
821	   communication, the 'bind' system call needs to go through the
822	   interface list and fetch the associated structure to check whether
823	   the source address is aliased or not and needs to fill the value of
824	   'inp_lcladdr' of PCB accordingly.

826	   When TCP receives a SYN for connection establishment, it allocates a
827	   PCB and assigns the values for 'inp_laddr', and related fields.
828	   During this phase, TCP also needs to check whether the local address
829	   is aliased or not (based on the fields of interface structure; which
830	   is applicable for a mobile node at foreign site) and needs to fill
831	   the values of 'inp_lcladdr' accordingly. Similarly if destination
832	   address is found to be aliased, based on the stacking type, it needs
833	   to fill up the field 'inp_fcladdr'.

835	   IP address stacking can be performed with the approach introduced in
836	   section 6.4 of RFC6275[9]. RFC6275 talks about the stacking of IP
837	   addresses for a destination address (let us call it type 0
838	   stacking). Two more types of stacking need to be introduced; type 1
839	   stacking where only source address will appear in the stack and type
840	   2 stacking where both source address and destination address will
841	   appear in the stack with a particular type of ordering.

843	   A protocol output routine like 'tcp_output' or 'udp_output' needs to
844	   fill the IP packet in the following manner.

846	   If the socket contains a valid 'inp_lcladdr', use 'inp_lcladdr' as
847	   the source address and 'inp_laddr' will appear in the stack. If the
848	   socket contains a valid 'inp_fcladdr' use 'inp_fcladdr' as the
849	   destination address and 'inp_faddr' will appear in the stack. If only
850	   'inp_fcladdr' contains a valid address whereas 'inp_lcladdr' is
851	   NULL, use type 0 stacking. If only 'inp_lcladdr' contains a valid
852	   address whereas 'inp_fcladdr' is NULL, use type 1 stacking.
853	   If both 'inp_lcladdr' and 'inp_fcladdr' contain valid addresses, use
854	   type 2 stacking.

856	   A protocol input routine like 'tcp_input' or 'udp_input' needs to
857	   process the packet in the reverse order based on the type of
858	   stacking.  For type 0 stacking, use the address in the stack as the
859	   destination address; for type 1 stacking, use the address in the
860	   stack as the source address; for type 2 stacking use both source
861	   address and destination address from the stack.

863	5.1. Changes expected with the specifications related to IP mobility

865	   RFC6275 demands correspondent node binding from mobile nodes for
866	   route optimization. This binding is required when a connection gets
867	   established as well as when the mobile node changes its address
868	   space. There are applications like HTTP which open multiple very
869	   short lived connections at run time. If mobile nodes need to
870	   send binding messages for all the connections, the network will be
871	   unnecessarily congested. This congestion can be avoided by
872	   establishing the binding at the time of connection establishment
873	   itself.  So, if the TCP server happens to be mobile, it will set the
874	   value of 'inp_lcladdr' in the stack while sending SYN+ACK. The TCP
875	   client which initiates communication through 'connect' needs to set
876	   the 'inp_fcladdr' field on receiving SYN+ACK. With this approach,
877	   correspondent node binding messages need to be sent only when a
878	   mobile node changes its position from one address space to another.

880	   Route optimization is not applicable to applications which are of
881	   multicast type.  In these cases packets need to be forwarded with the
882	   mechanism of reverse tunneling with the approach of "IP Encapsulation
883	   within IP" as defined in RFC 2003.  In order to support packet
884	   delivery with route optimization method as well as with
885	   "Encapsulating Delivery Style" based on the application type the
886	   protocol control block needs to introduce another field
887	   'inp_hagentaddr' to hold the address of the home agent of the mobile
888	   node. The interface structure also needs to have the same field. The
889	   'bind' system call needs to go through the interface list to fetch
890	   'inp_hagentaddr' to the PCB along with 'inp_lcladdr' as described
891	   earlier. So, protocol output routines like 'tcp_output', 'udp_output'
892	   need to fill up the packets based on the application type. In
893	   "Encapsulating Delivery Style" packets need to be formed in the
894	   following manner.

896	   The inner IP header will contain
897	      Source Address: Home address of the mobile node
898	      (i.e. 'inp_laddr')
899	      Destination address: Address of the correspondent node
900	      (i.e. 'inp_faddr')
901	   The outer IP header will contain
902	      Source Address: co-located care of address of the mobile node
903	      (i.e. 'inp_lcladdr')
904	      Destination Address: Address of the home agent of the mobile node
905	      (i.e. 'inp_hagentaddr')
906	   Protocol field: IP in IP

908	6. Refinements over existing IPv6 specification

910	   As IPv6 was envisioned long before some of the newer technologies
911	   e.g. MPLS came into picture, some refinements can be made over the
912	   existing specification. These considerations are related to bandwidth
913	   usages and performance inside switches. Experimental results show
914	   that smaller packet size gives better result for the processing of RT
915	   packets.  So, it is desirable to have IP packet header to be as small
916	   as possible.

918	   As described earlier, evaluation of the parameters
919	   nMaxInterASTopNodes, nMaxInterASBottomNodes and nMaxASNodes is
920	   geo-political and has to be decided by IANA. Once these parameters
921	   are determined by mutual agreement, the values of pA, pB, pC and the
922	   prefix length of the user id can be determined. With a 64-bit address
923	   space, the IP header will be reduced by 16 bytes.

925	   The 'flow label' field of the IPv6 packet header may not be of any
926	   use when MPLS is in use. ATM used to have 4 priority classes. The
927	   first specification of IPv6, RFC-1883, used a 4-bit type of service
928	   field along with a 24-bit flow label field. These two were modified
929	   to an 8-bit type of service field and a 20-bit flow label field in
930	   the current spec RFC-2460.  Too many priority classes may increase
931	   the complexity of processing inside switches. If the type of service
932	   field of the IPv6 header is reduced to 4 bits, as stated in
933	   RFC-1883, and the 'flow label' field is removed, another three bytes
934	   can be removed from the IPv6 header.
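
   The header savings quoted above can be checked with a short
   calculation. The sketch below sums the fixed IPv6 header fields of
   RFC 2460 (version, type of service, flow label, payload length,
   next header, hop limit, two addresses); the function name is
   hypothetical.

```c
#include <assert.h>

/*
 * Size in bits of a fixed header with the given address width,
 * type-of-service width and flow-label width. The remaining field
 * widths are those of the RFC 2460 header.
 */
int header_bits(int addr_bits, int tos_bits, int flow_bits)
{
    assert(addr_bits > 0 && tos_bits >= 0 && flow_bits >= 0);

    const int version = 4, payload_len = 16, next_header = 8,
              hop_limit = 8;
    return version + tos_bits + flow_bits + payload_len + next_header +
           hop_limit + 2 * addr_bits;
}
```

   With 128-bit addresses, an 8-bit type of service and a 20-bit flow
   label this gives 320 bits (40 bytes). Moving to 64-bit addresses
   gives 192 bits (24 bytes, a saving of 16), and additionally using a
   4-bit type of service with no flow label gives 168 bits (21 bytes,
   a further saving of 3).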

936	   The field 'Hop Limit' has an 8-bit value in the existing spec. The
937	   role of this field needs to be discussed properly with a large
938	   address space.

940	   RFC4862[10] introduces the concept of "Stateless auto configuration"
941	   with the goal in mind that no manual configuration is required by
942	   individual machines before connecting them to the network. It
943	   generates a link local address with a link-local prefix and the link
944	   address (e.g. Ethernet/E.164 for ISDN) first. This link local address
945	   is used to configure global unicast address and any other
946	   configurable parameters based on router advertisement.  Global
947	   unicast addresses are generated by the prefix supplied by the router
948	   advertisement and the link specific interface identifier. This
949	   identifier can be as large as 64 bits. So irrespective of the
950	   size of the network (it may be 10000 or 100 or even less than that)
951	   every customer network will consume a 64-bit address space.
952	   This seems to be a huge blunder. What is expected is that the length
953	   of the interface identifier is just enough to support the number of
954	   nodes in that subnet. In order to achieve this, the router itself
955	   or a server in that subnet needs to maintain a store which will
956	   generate the interface identifier based on the request from
957	   individual hosts.  It may be desirable that interface identifiers are
958	   generated from DHCP servers. With the option of generating interface
959	   identifier through DHCP, changes in the auto configuration process
960	   can be looked at as follows:

962	   From the point of view of a host, it can be considered as a two step
963	   process. The host needs to send a Router Solicitation message to find
964	   out the presence of a router. The Router Advertisement message should
965	   include an option field which will inform whether prefix information
966	   should be configured through Router Advertisement or through DHCP.
967	   The host then needs to send a request message to get the interface
968	   identifier.  If both pieces of information need to be obtained from a
969	   DHCP server, they can be obtained through a single message.

971	   From the server's point of view, it needs to maintain a database for
972	   a mapping of the link-layer address and subnet specific interface
973	   identifier. Lifetime of an interface identifier has to be processed
974	   in the usual manner the way existing DHCP implementation treats IP
975	   addresses.

977	   There seems to be another possible danger in obtaining prefix
978	   information through Router Advertisement. As the Router Advertisement
979	   comes in the form of ICMP messages, once it is received by the ICMP
980	   layer, it loses the information about which interface the message was
981	   received on (this problem arises for hosts that have multiple
982	   interfaces, not all of which are attached to the same subnet).  So,
983	   auto configuration of a host has to be performed one interface at a
984	   time by disabling all other interfaces. Once configuration of all
985	   the interfaces is done, all of them have to be enabled.

987	   If it is expected that hosts should reconfigure their addresses
988	   dynamically based on Router Advertisement message, Router
989	   Advertisement needs to generate a special message for a certain
990	   amount of time that needs to include old prefix and the corresponding
991	   new prefix in the message.

993	   In order to support multihoming[8], prefix information needs to
994	   include the fields 'default router' and 'next hop address' to reach
995	   the default router for each of the prefixes.

997	   In a 64bit architecture, link-local address can be formed with a
998	   link-local prefix and link-layer address in a suitable manner; say it
999	   can be formed with a 16bit link-local prefix followed by a 48bit
1000	   link-layer address. For hardware that supports more than 48bit
1001	   addressing (say E.164), the least significant 48bits may be
1002	   considered to generate link-local addresses.
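
   The link-local construction above can be sketched as follows. The
   function name and the example prefix value are hypothetical, and
   the link-layer address is assumed to be given most significant
   octet first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build a 64-bit link-local address from a 16-bit link-local prefix
 * and the least significant 48 bits of the link-layer address.
 * 'lladdr' holds 'len' octets, most significant first; len >= 6.
 */
uint64_t make_link_local(uint16_t prefix, const uint8_t *lladdr,
                         size_t len)
{
    assert(lladdr != NULL && len >= 6);

    uint64_t low48 = 0;
    for (size_t i = len - 6; i < len; i++)
        low48 = (low48 << 8) | lladdr[i];
    return ((uint64_t)prefix << 48) | low48;
}
```

   For a 48-bit Ethernet address all six octets are used; for a longer
   address such as E.164, only the last six octets contribute.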

1004	7. Distributed processing and Multicasting

1006	   With the inherent hierarchy involved in this architecture,
1007	   distributed applications can also be structured in a suitable manner.
1008	   Say, for a commonly used web based application a master level server
1009	   will be there at every top level node. Any change that might happen
1010	   in the application, has to be synchronized within these master level
1011	   servers first. There might be servers at the middle layer (inside
1012	   each inter-AS-bottom) inside each top level node. Once the changes
1013	   get reflected at the master node, all the servers at the middle layer
1014	   need to update themselves with their master level node. This will
1015	   reduce network traffic substantially. The inherent hierarchy in the
1016	   architecture will also help in establishing a multicast tree in a
1017	   similar manner. Work on these issues can progress only after
1018	   this architecture gets approved.

1020	8. Transition to real IP from private IP

1022	   Both CIDR based hierarchy and Mesh structured hierarchy expect a
1023	   VLSM tree at the bottom. In VLSM, in real IP space with provider
1024	   assigned (PA) addresses, assignment of network resources has to be
1025	   associated with the address space to be used with the type of
1026	   service. Within a typical switch supporting multiple types of ports,
1027	   a line card of strength OC48 can be replaced with 4 line cards of
1028	   strength OC12. An OC12 card may also be replaced with 4 OC3 cards. An
1029	   OC12 card may be attached to another switch with DS3 ports and so on.
1030	   When it reaches to the customer network port density of a switch has
1031	   to be directly proportional to the address block that a customer
1032	   network will be assigned to. i.e. each customer network has to be
1033	   assigned a block of address space (say, 128, 256, 512, 1K, 2K etc).
1034	   Within the switch these ports have to be assigned net address/net
1035	   mask the way VLSM works.

1037	   In the IPv4 environment, providers have offered services in terms
1038	   of the bandwidth of the ports, say, 2 Mbps/4 Mbps/1 Gbps lines, etc.
1039	   If these ports were assigned addresses based on the number of users
1040	   of the customer network, the transition from private IP to real IP
1041	   is simple. Consider a switch that supplies 2 Mbps lines to a set of
1042	   customers with between 1K and 2K users each; each of them will be
1043	   assigned a block of 2K. But if the number of users is not
1044	   proportional to the bandwidth used, say, the same 2 Mbps lines serve
1045	   customers of sizes 1K, 2K, 10K and 16K respectively, reorganization
1046	   will be needed where possible. This rearrangement may be possible
1047	   within the switch itself or by connecting ports of appropriate
1048	   sizes from a different switch; otherwise each of them has to be
1049	   assigned an address block of 16K, or whatever VLSM allows that is
1050	   suitable. So, address block assignment in the VLSM tree has to grow
1051	   bottom-up.
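
   The block sizing discussed above can be illustrated with a short
   Python sketch. It is not part of the proposal: the function names
   are hypothetical, 1K is taken to mean 1024 addresses, and any
   addresses a real deployment would reserve (for example network or
   broadcast addresses) are ignored. It contrasts a per-customer VLSM
   block with a single uniform block for all ports of a switch.

```python
from typing import Iterable

K = 1024  # assumption: "1K" in the text means 1024 addresses


def block_size(users: int) -> int:
    """Smallest power-of-two block that holds `users` addresses."""
    if users < 1:
        raise ValueError("a customer network needs at least one user")
    return 1 << (users - 1).bit_length()


def host_bits(block: int) -> int:
    """Number of address bits spanned by a power-of-two block."""
    if block < 1 or block & (block - 1):
        raise ValueError("block must be a power of two")
    return block.bit_length() - 1


def uniform_block(customer_sizes: Iterable[int]) -> int:
    """Block size needed if every port on a switch gets the same block."""
    return max(block_size(n) for n in customer_sizes)


if __name__ == "__main__":
    # Customers between 1K and 2K users: one uniform 2K block fits all.
    print(uniform_block([1 * K, 1500, 2 * K]))

    # Customer sizes not proportional to bandwidth: without
    # rearrangement a uniform assignment costs 16K per customer,
    # while per-customer VLSM blocks are smaller for the 1K and 2K
    # customers.
    sizes = [1 * K, 2 * K, 10 * K, 16 * K]
    print(uniform_block(sizes))
    print([block_size(n) for n in sizes])
```

   The per-customer list shows why the block assignment grows
   bottom-up: the size of each leaf block is fixed by the customer
   network, and the parent block in the VLSM tree has to be large
   enough to cover its leaves.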

1053	   Thus, transition of an existing provider network, with little or no
1054	   rearrangement, to a real IP space with the CIDR based approach is
1055	   apparently not a difficult job. In the CIDR based approach, the
1056	   sizes of the VLSM trees are heterogeneous, which leads to a very
1057	   high number of routing entries. The Mesh structured hierarchy is
1058	   convenient for reducing the routing overhead as well as for
1059	   distributing network resources suitably in the long run. Converting
1060	   the CIDR based approach to the Mesh structured hierarchy requires
1061	   reorganization, mainly in the routing domain, and splitting of very
1062	   large trees (>24 bit address space) at the top.

1064	   Section 3.2.1 shows that in the Mesh structured hierarchy a 64-bit
1065	   architecture will be sufficient for our needs in a provider assigned
1066	   (PA) address space; the same is true for the CIDR based approach.

1068	9. IANA Considerations

1070	   This is a first level draft for proposed standard. Hence, IANA
1071	   actions should come into play at a later stage, if needed.

1073	10. Security Considerations

1075	   This document does not include any security related issues.

1077	11. Acknowledgments

1079	   The author would like to thank Professor Amitava Datta of the
1080	   University of Western Australia for his review and constructive
1081	   comments.

1083	12. Normative References

1085	   [1]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms for
1086	        IPv6 Hosts and Routers", RFC 4213, October 2005.

1088	   [2]  Fuller, V. and T. Li, "Classless Inter-Domain Routing (CIDR):
1089	        The Internet Address Assignment and Aggregation Plan",
1090	        RFC 4632, August 2006.

1092	   [3]  Huston, G., "Commentary on Inter-Domain Routing in the
1093	        Internet", RFC 3221, December 2001.

1095	   [4]  Vohra, Q. and E. Chen, "BGP Support for Four-octet AS Number
1096	        Space", RFC 4893, May 2007.

1098	   [5]  Srisuresh, P. and K. Egevang, "Traditional IP Network Address
1099	        Translator (Traditional NAT)", RFC 3022, January 2001.

1101	   [6]  Moy, J., "OSPF Standardization Report", RFC 2329, April 1998.

1103	   [7]  Mockapetris, P., "Domain names - concepts and facilities",
1104	        RFC 1034, November 1987.

1106	   [8]  Bandyopadhyay, S., "Solution for Site Multihoming in a Real IP
1107	        Environment", <draft-shyam-site-multi-41>, work in progress.

1109	   [9]  Perkins, C., Ed., Johnson, D., and J. Arkko, "Mobility Support
1110	        in IPv6", RFC 6275, July 2011.

1112	   [10] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address
1113	        Autoconfiguration", RFC 4862, September 2007.

1115	13. Informative References

1117	   [11] Postel, J., "Internet Protocol", STD 5, RFC 791,
1118	        September 1981.

1120	   [12] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4)",
1121	        RFC 1771, March 1995.

1123	   [13] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
1124	        Specification", RFC 1883, December 1995.

1126	   [14] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

1128	   [15] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
1129	        Specification", RFC 2460, December 1998.

1131	   [16] Rosen, E., Viswanathan, A. and R. Callon, "Multiprotocol
1132	        Label Switching Architecture", RFC 3031, January 2001.

1134	14. Author's Address

1136	   Shyamaprasad Bandyopadhyay
1137	   HL No 205/157/7, Kharagpur 721305, India
1138	   Phone: +91 3222 225137
1139	   e-mail: shyamb66@gmail.com