Internet Engineering Task Force                         A. Burness, Ed.
Internet-Draft                                          P. Eardley, Ed.
Intended status: Informational                                       BT
Expires: January 16, 2009                                    L. Iannone
                                                             UC Louvain
                                                           July 15, 2008

                     Locator-ID proposal evaluation
                     draft-burness-locid-evaluate-01
Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 16, 2009.

Abstract

   There are many proposals for improving the inter-domain routing
   system, most of which involve a form of locator-identity split.
   There needs to be a means to reason about the strengths of the
   different proposals against the design criteria without requiring
   large-scale implementations.  This document aims to start this
   process by drawing parallels with existing systems.  It identifies
   a number of questions that need fuller consideration whilst we
   press ahead with system development.

Table of Contents

   1.  Introduction
   2.  Design Goals
     2.1.  Router Scalability
     2.2.  Traffic Engineering
     2.3.  Multi-Homing
     2.4.  Mobility
     2.5.  Ease of changing providers
     2.6.  Routing Quality
     2.7.  Routing Security
     2.8.  Deployability
     2.9.  Unclear Requirements
     2.10. Address Shortage
     2.11. Failure Management
   3.  Related Working Options
     3.1.  NAT
     3.2.  Mobile networks and directory systems
       3.2.1.  3G Systems
       3.2.2.  Mobile IP
       3.2.3.  DNS
       3.2.4.  Summary
     3.3.  The routing system
   4.  Map and Encap Schemes
     4.1.  Routing System Scalability
     4.2.  Traffic Engineering
     4.3.  Multi-Homing
     4.4.  Mobility
     4.5.  Changing Provider
     4.6.  Route Quality
       4.6.1.  Traffic Volume Overhead
     4.7.  Routing Security
     4.8.  Deployability
     4.9.  Address Shortage
     4.10. Failure Handling
   5.  Translation Schemes
     5.1.  Routing System Scalability
     5.2.  Traffic Engineering
     5.3.  Multi-Homing
     5.4.  Mobility
     5.5.  Changing Provider
     5.6.  Route Quality
     5.7.  Deployability
     5.8.  Address Shortage
     5.9.  Failure Handling
   6.  Mapping System Design
     6.1.  Push
     6.2.  Pull
       6.2.1.  Data Collection
         6.2.1.1.  Mapping Cache Size
         6.2.1.2.  Mapping Cache Efficiency
         6.2.1.3.  Mapping Lookups
     6.3.  Route Through
   7.  Conclusions
   8.  Acknowledgements
   9.  IANA Considerations
   10. Security Considerations
   11. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The Internet routing system has problems with scalability and
   stability.  These problems are made worse by the need to support
   functionality such as multi-homing and traffic engineering [IAB].
   There have been a multitude of proposals, all aiming to solve the
   problem of routing scalability, that involve some form of locator-
   identity split.  However, without large-scale implementations it is
   very difficult to assess the relative strengths of these different
   proposals.  On the other hand, it should be possible to
   characterize the proposals against the requirements.  Further, by
   comparing the proposals against existing systems, we may also be
   able to start to understand the likely processing, storage and
   communications requirements.

   Whittle [whittle] has made a study of this type to compare some of
   the specific locator-ID split proposals.
   Here, instead of studying specific proposals, we group proposals
   into simple categories (map-and-encap schemes, which were the focus
   of the previous study; translation schemes; and directory systems)
   to enable us to understand the likely behaviour of whole groups of
   proposals at a more generic level.

   This document aims to start a process of evaluation.  It is written
   not as truth, but as the perception of the authors, which should be
   challenged.

   We begin by reviewing the requirements against which proposals
   should be assessed.  Then we highlight some existing systems which
   may have processing, communications or memory requirements similar
   to those of the proposed schemes.  Their behaviour might help to
   guide us in assessing the proposals.  This is essentially trying to
   learn from history.  We appeal here in particular to equipment
   manufacturers, who may have a better grasp of equipment
   capabilities: which limits are fundamental and which are based on
   market requirements.  We then assess the generic schemes against
   the requirements.  There are essentially two main approaches,
   commonly known as map-and-encap and translation.  The critique of
   map-and-encap is based primarily on an understanding of LISP
   (draft 5) [LISP], the apparent current leader in that set of
   schemes; similarly, the translation section is based upon Six/One
   [six-one].  The aim is to be as critical as possible in order to
   stimulate future activity, before making some conclusions.

2.  Design Goals

   In order to compare the solutions we need to understand the full
   breadth of requirements for a future routing proposal.  The first
   seven are direct echoes of the requirements in [Goals]; the
   remaining requirements we feel are not sufficiently highlighted in
   that draft.  Although this is the list of requirements for the new
   routing architecture, there is no need for all these features to be
   implemented within one protocol.  For example, making it easy for
   networks to change provider may mean that the edge network
   addresses need to be decoupled from those in the core.  However, an
   alternative approach is to develop automated tools that can
   smoothly manage address changes of hosts, routers and other
   elements (access control lists, for example) within an edge
   network.  Multi-homing may be managed by the routing system, or the
   routing system might simply(!) expose multiple paths that can be
   used by another mechanism to support multi-homing.  However, we
   feel that any routing proposal should make clear how well the
   additional features could be supported, in order to assess the
   whole solution.

2.1.  Router Scalability

   Memory and processing requirements are growing all the time;
   already, routers need to be upgraded every 2 to 5 years.  Many
   people believe the rate of growth is faster than Moore's law,
   meaning that costs could go up significantly or technology could
   start to fail.  The reason behind this growth appears to be a
   decreasing reliance on address aggregation rather than the absolute
   growth of the system itself.  Also, it is not necessarily the
   actual memory requirement that is the problem, but the need to be
   able to read and write that memory quickly, because there is a high
   rate of churn in the routing system.  The churn in the system also
   adds a processing requirement.
   Again, churn rates are increasing in line with increasing
   de-aggregation.

2.2.  Traffic Engineering

   Traffic engineering is the ability to direct traffic along non-
   default path(s).  The ability to control the path taken by inbound
   traffic is as important as the ability to control the outbound
   path.  Both are non-trivial today: control of the inbound data path
   requires manipulation of BGP messages, and control of the outbound
   path can be made difficult as a result of ingress filtering
   blocking data which appears to have been spoofed.

2.3.  Multi-Homing

   A multi-homed site can connect to the Internet via more than one
   network provider.  Today this is done by injecting multiple, more-
   specific address prefixes into the global routing table, which
   therefore impacts on BGP's scalability.  Any solution should
   therefore have a simple and effective means to manage multi-homing.
   Since one reason for multi-homing is to improve resilience, the
   multi-homing solution must make clear how failures are detected and
   repaired.  This type of edge network failure management should
   ideally not impact on the convergence and stability of the global
   routing system.  Availability requirements vary tremendously, from
   a few seconds down to as small as possible (the millisecond range),
   where running sessions should not be affected.

   Whilst multi-homing is primarily considered as a means of failure
   management, wherever multi-homing appears, the desire for policy-
   controlled routing simultaneously over all potential links soon
   follows.

2.4.  Mobility

   Increasingly, nodes and sites will be mobile.  An efficient,
   scalable means is needed to support mobility.  When a host moves,
   hosts and routers that are not in communication with the mobile
   host should not need to be informed of the mobility.  When a
   network moves, the number of routers informed of the change should
   be minimized.

2.5.  Ease of changing providers

   This is often cited as a key reason behind the increasing use of
   provider-independent (PI) addresses, and hence a key reason behind
   routing scalability issues.  Using PI addresses, end-sites can
   change providers without renumbering (or at least with much less
   disruption).  Customers may want to change service provider on a
   yearly basis.  A future routing system should make it easy for
   customers to change provider, with minimal configuration
   requirements on the customer.  The process should be as simple as
   possible, almost certainly automated.

2.6.  Routing Quality

   Quality of routes includes convergence time, stability of path,
   loss, delay and stretch.  The first four parameters are of interest
   to the network user; the last gives an indication of the efficiency
   of use of network transmission resources.

2.7.  Routing Security

   The new architecture should be at least as secure as the existing
   system.

2.8.  Deployability

   The Internet is stagnating; it is amazingly difficult to get new
   networking solutions deployed [Handley].  The solution must be:

   o  Technically deployable.

   o  Incrementally deployable.

   o  Well understood in all aspects of operation with legacy systems.
      Applications (that have not hard-coded addresses into
      themselves) would see no changes.  An updated or legacy node in
      a part of the Internet that uses the new system should be
      reachable by legacy or updated nodes operating within legacy or
      updated networks.
   o  There must be a motivation for the person or organisation to
      deploy the system, as serving the greater good is not
      sufficient.  This benefit (or a subset) should exist for an
      isolated deployment.  It seems probable that new functionality
      (rather than something faster or even cheaper) is most likely to
      motivate deployment, because any new technology has hidden
      costs, such as training people to install and manage it.
      Examples of new functionality could be a security improvement,
      reliable inbound and outbound traffic engineering, or visibility
      of alternative low-delay or highly reliable data paths.
      However, it is difficult to predict what new features or
      services will attract users.

   o  Flexible service models should be supported; in other words, a
      user, edge site or ISP should be able to deploy the service on
      behalf of others.

   o  Key players must not be disadvantaged, or they may try to
      obstruct standards or restrict deployment.  A specific aspect of
      this to highlight is how network providers today use policy
      control.  Providers are unlikely to support any scheme which
      makes policy management more difficult than today.  They are
      likely to require the ability to check that routes are as
      diverse as possible, to choose routes based on cost and
      performance, and to avoid routes leaving or entering a specific
      country or domain.

   If the constraints of operation with legacy systems and flexibility
   in the location of functionality are met, then host upgradeability
   becomes a non-issue.  However, host upgradeability is not
   impossible, and recent history suggests it might be easier than
   network evolution: recent host upgrades for ECN, IPv6 and RSVP-
   based QoS have not been matched by similar network evolution.

2.9.  Unclear Requirements

   Two other requirements are mentioned in [Goals].

   The first is that mechanisms used must be first-class elements
   within the architecture.  We are not totally sure what this means.

   The second requirement is that location and identification should
   be able to be decoupled.  It is required that a solution for
   scalable routing is compatible with (but does not require) a
   solution that separates the host identification from the host
   location name-space.  This separation should improve the
   flexibility of the Internet.  The significance of this requirement
   is unclear, perhaps because none of the proposed solutions fails to
   meet it; it may only become clearer if assumptions or requirements
   are made on the identification, such as cryptographic
   authentication requirements or the need to be able to reverse-map
   from location to identifier.

   Less often mentioned are two other requirements that we believe are
   nevertheless critical:

2.10.  Address Shortage

   Current predictions are that the unallocated IPv4 address space
   will soon be used up, with suggestions [Huston] that IANA will run
   out of addresses by 2011, and the RIRs by 2012.  Routing and
   addressing are closely related, and the impact of a scheme on the
   address shortage problem should be considered.  It may be easier if
   one major network overhaul is required rather than two.

2.11.  Failure Management

   If the routing system never encountered any changes, then it is
   likely that there would be no scalability issues.
   Minimizing connectivity disruption in the presence of failures is
   critical, as failure recovery is one of the drivers behind multi-
   homing.  Some end-sites have a target of no more than 10 ms
   downtime, although it is not clear that this would ever be
   achievable!  Other sites may be happier with a few seconds'
   disruption.  Any scheme should make it clear how failures are
   handled, and should be no less robust to failure than today's
   systems.

3.  Related Working Options

   What can we learn from running systems today?  This section is by
   no means complete, but rather presented as a starting point.  In
   particular, we do not have hard data on systems that exactly match
   what the new systems are trying to do.  We are simply placing a
   stake in the ground at a loosely justifiable point, and asking
   people to move it.  Note, however, that at this stage we are
   looking for order-of-magnitude figures and hence order-of-magnitude
   movement of the stake!

3.1.  NAT

   NAT solves the problems of address shortage and provider
   independence.  Hence, whatever we may feel about the architectural
   violations of NAT, we could imagine simply promoting the greater
   use of NAT to reduce the scalability problem, as provider-dependent
   addresses could then be more easily promoted.  It is, after all,
   clearly deployable.  Many edge sites, mobile operators and even
   some ISPs are already using NAT, often claiming increased security
   in addition to the other benefits.  For example, by hiding the
   addresses of servers and routers inside the network, it makes it a
   bit harder for an attacker to try and establish a session with
   these devices.

   A NAT box can control traffic flows over different links if it is
   multi-homed, thus providing some traffic engineering capabilities.
   In particular, for sessions that are started behind the NAT, the
   inbound and outbound data paths can be controlled by the choice of
   address that the NAT box uses for the session.  One could imagine
   enhancements to NAT that would enable widely separated NAT boxes to
   communicate in order to support different multi-homing
   architectures.

   Of course, there are issues with NAT, which is why it has never
   been proposed as a solution to the routing system scalability
   problem; most significantly, it breaks the end-to-end semantics of
   the Internet.

   However, it is interesting to note that NAT is typically not used
   by the larger sites, and it appears to be the performance, rather
   than any purist objections, that leads to this.

   The performance limitations come from the fact that NAT requires a
   high level of per-flow tracking and per-packet modification.
   Because there are so many flavours of NAT, it is hard to get
   quantifiable information on the performance.  For a port-
   translating NAT, we should expect to map one IP address to roughly
   65,000 different sessions using the port identifier.  Cisco's web
   site [Cisco] suggests that a typical NAT router would not need to
   support more than 10,000 translations and, based on the same
   source, 128,000 such sessions would take 40 Mbytes of DRAM.

   Assuming we can easily support 128,000 NAT sessions, we can then
   estimate how many users this corresponds to.  Each TCP flow is
   mapped to a different NAT session.  A peer-to-peer application may
   run 100 concurrent sessions.  Perhaps only 10% of an ISP's
   customers are peer-to-peer users; the remaining 90% will typically
   have a low number of concurrent connections, say 5.  So on average
   a customer has 14.5 active TCP sessions, meaning that the NAT as
   described can handle 8,827 users.  This might mean that
   universities and medium enterprises could all be placed behind NAT
   devices, but larger corporate bodies and large ISPs would need
   something a little different, or very many co-ordinated NAT boxes.
   If we assume that the mappings maintained are between pairs of IP
   addresses rather than individual TCP sessions, then we may be able
   to handle three times that number of users behind a single device.
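   This back-of-envelope calculation can be captured in a short sketch
   (Python is used for the sketches in this document; the 10%/90%
   traffic mix is the assumption stated above, not measured data):

      NAT_SESSIONS = 128_000   # sessions per box, from [Cisco]

      def users_behind_nat(p2p_share=0.10, p2p_flows=100, other_flows=5):
          # Average concurrent TCP flows per customer under the
          # assumed traffic mix.
          avg = p2p_share * p2p_flows + (1 - p2p_share) * other_flows
          return NAT_SESSIONS / avg, avg

      users, avg = users_behind_nat()
      print(avg)    # 14.5 flows per customer
      print(users)  # ~8,827 customers per NAT box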
   Netflow is another networking tool that supports per-flow packet
   processing.  Cisco [Cisco] claims that its NetFlow accounting tool
   can support 128,000 simultaneous connections - similar in scale to
   our NAT estimate.

   In summary, per-flow processing of each packet is likely to lead to
   limitations on how fast edge devices can operate, putting a limit
   on how many users could be behind such a box.  Routers work fast
   today because they are highly optimised towards a single, simple
   forwarding duty.

3.2.  Mobile networks and directory systems

3.2.1.  3G Systems

   GSM and 3G cellular systems already have a locator-identity split.
   For the voice system, the phone number acts as an identity.  The
   Home Location Register (HLR) contains a mapping of the phone number
   to its current location, identified as a routing area.  The HLR
   will typically hold up to tens of millions of users in this
   centralized database, without known scalability concerns.  A
   routing area will contain tens or even hundreds of cells, which
   range in size from a few metres (pico cells in buildings in cities)
   to several kilometres in the countryside.  To find a user, the HLR
   is used to discover which routing area last knowingly contained the
   phone.  All nodes in that routing area then receive a paging
   message in an attempt to discover the actual location of the user.
   This temporary location mapping is then held by the router
   responsible for that routing area.

   The HLR mapping system does not know if the end node is reachable
   (and indeed, for many data services the end node may not be
   reachable).  This is discovered during the paging process, which
   means that it can take 5 to 10 seconds to make initial contact with
   a mobile device.  When a user moves within a routing area, it is
   not necessary to update the HLR unless the node changes routing
   area.  The size of the routing area depends not on how fast the HLR
   can be updated but on how much paging is expected, as paging wastes
   the resources (battery power) of all phones in the area.

   Handover (without session disruption) is only possible within one
   service provider network, as much as anything due to the time taken
   to manage security associations, and to general business concerns.
   Handover is managed locally, with co-ordination between the
   different base stations.  A make-before-break system is used to
   minimize service disruption.

   Roaming occurs when a node changes service provider network.  Here,
   the HLR is updated to point to a Visitor Location Register in the
   visited network, which is in turn updated with the routing area
   associated with the node.  The (hand-crafted) peering arrangements
   that allow roaming are sorted out by management processes.
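   As an illustration of this two-level lookup, a toy model might look
   as follows (all numbers and names are hypothetical; note that the
   HLR sees only routing-area changes, which is what keeps its churn
   low):

      hlr = {"+441234567890": "RA-17"}          # number -> routing area
      routing_areas = {"RA-17": ["cell-1", "cell-2", "cell-3"]}
      true_location = {"+441234567890": "cell-2"}  # unknown to the HLR

      def locate(number):
          ra = hlr[number]                    # cheap directory lookup
          for cell in routing_areas[ra]:      # paging: ask every cell
              if true_location.get(number) == cell:  # in the area
                  return cell
          return None                         # no page reply: unreachable

      print(locate("+441234567890"))          # -> cell-2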
3.2.2.  Mobile IP

   Mobile IP (MIP) uses a different scheme.  Data is directed via a
   home agent, which is updated with the current location of the
   mobile device.  (In one sense this is similar to schemes where the
   data is re-directed via the mapping system.)  Mobile IP is much
   less widely deployed.  Reasons for this could include the
   performance implications of the tunnelling process and the amount
   of per-node state management at the home agent.  Designing adequate
   security mechanisms has also troubled MIP development.

3.2.3.  DNS

   Within the Internet, we already have experience with a large
   distributed database for mapping from name to address.  DNS works
   well - so well, in fact, that people are loath to change it in case
   it gets disrupted; after all, it is a critical piece of the
   communications infrastructure, and a user is unlikely to care
   whether it is a routing or a DNS problem that disrupts their
   on-line shopping trip - it will be equally broken.

   It is usually stated that DNS works well because of the hierarchy
   in the name space (although the structure is relatively flat, at
   about 3 levels) and the aggressive use of caching.  Time-to-live
   (TTL) values are typically set at about an hour.  However, recent
   studies [DNS] are beginning to suggest that caching is not as vital
   as previously thought, and that much shorter TTLs of the order of a
   few hundred seconds would not noticeably degrade DNS performance.
   This is in part because a DNS update message is only processed
   locally; there is no attempt to keep all DNS servers up to date.

   A host typically begins each transport-layer session with a DNS
   lookup.  This can take up to 2 seconds to resolve, although it is
   usually much quicker.

   The DNS system is held together by IP addresses that are hand-coded
   into the system.  A question to answer is what happens if the IP
   address is replaced by a permanent identifier and a transient
   locator.  If the DNS servers need to be identified and their
   current location found before a DNS query could be resolved, then
   the performance of the identifier resolution system will have a big
   impact here, as several DNS servers often need to be found to
   achieve a single name resolution.  Further, if the DNS system is
   the identifier resolution system, we would have a nasty circular
   dependency.

3.2.4.  Summary

   We can make some observations about the systems that work well.
   They seem to have extremely low functionality, with low rates of
   change of the data.  These changes are effectively confined
   (localized); data changes are not propagated around the system.
   Hard-wiring of directory associations is commonplace.  Perhaps an
   automated discovery and topology-building protocol may give more
   problems than it is worth for this type of system?  It is possible
   that automation is only required for systems with large amounts of
   change.

3.3.  The routing system

   So, having considered systems that work well, what are the
   characteristics of the routing system?  A mid-tier ISP network may
   contain twice as many prefixes as the core of the Internet - thus
   we must be careful of designs that move complexity from the core to
   the periphery of the network.
   Key figures still to be quantified include: how many prefixes, how
   many ASes, how many nodes, how many end sites, how many transit
   providers, and how big the DFZ is.

   The churn rate is very high and very variable.  If a site receives
   on average 400 BGP messages a minute, it may easily expect to have
   8,000 or 80,000 updates at peak periods of intense instability.

   Many of these messages are not really indicating true physical
   problems.  A site may rapidly flap its links in an effort to
   manipulate the flow of data between different multi-paths.  A site
   may perform mild policy updates.

   Churn is typically slowed by the introduction of timers to delay
   the sending of messages.  Often, however, these timers are tuned as
   low as possible to try and maximise network availability.  Ideally,
   local repair mechanisms should be used to recover from failure
   without involving the entire routing system.

4.  Map and Encap Schemes

4.1.  Routing System Scalability

   These schemes aim to encourage the use of provider-dependent
   addresses, thus removing load from the core routing system.  This
   is achieved by making addresses in the edge network independent
   from those in the core transit system, so that provider lock-in is
   avoided.

   All these schemes require a mapping system to translate between
   edge and core network locators.  The scalability of the mapping
   system is uncertain.  We shall assume that the mapping system holds
   essentially static information.  We further assume that (using LISP
   terminology) End-point Identifiers (EIDs) are aggregatable, so that
   a system of the required size could be built.  It is probable that
   this system could be built to store and return all locators
   associated with an end-point identifier prefix range.  Issues that
   would impact the probable scalability of this system are:

   o  whether the system needs to propagate this information globally,
      in which case it would become very sensitive to churn rate and
      bandwidth.  In this case, it could not sensibly be used for
      mobility management, for example.

   o  whether the system is used to propagate policy or traffic
      engineering information, as all the evidence is that this
      information changes very rapidly.

   The third item to be considered is the edge routers, which may need
   to do per-flow packet processing.  This processing may be required
   to manage reachability information (is it sufficient to hold a
   mapping to the core locator of the edge router and to know that the
   lower-layer routing system thinks this address is still valid, or
   do we need to know that the higher-layer functionality is alive?).
   Further, the LISP description of multi-homing management seems to
   imply per-flow packet processing (for example, reading the headers
   on return packets of a flow to discover which of the possible edge
   routers are prepared to handle this session).  If per-flow packet
   processing is required, we may run into scalability problems, as in
   NAT routers today.  Is the per-flow assumption fair?  If we were
   considering all flows to a specific tunnel end-point, perhaps there
   may be some way to aggregate information?  This would depend on the
   location of the tunnel end-points.  If they are near to the network
   edge, it is quite likely that there will be a limited number of
   flows heading towards a specific tunnel router.  If the edge
   routers are near the core, we then introduce a scaling problem
   behind the edge routers, where all networks now have provider-
   independent address spaces.  Since the absolute size of the mid-
   tier networks is greater than that of the DFZ, adding scaling
   pressure here is unlikely to be a good idea.
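   To make the forwarding step concrete, the following sketch shows an
   ITR consulting its mapping cache and prepending an outer header
   (this is assumed behaviour for illustration, not the LISP wire
   format; the addresses are documentation prefixes):

      import ipaddress

      mapping_cache = {  # EID prefix -> candidate ETR locators (RLOCs)
          ipaddress.ip_network("192.0.2.0/24"): ["203.0.113.1"],
      }

      def itr_forward(packet, dst_eid):
          dst = ipaddress.ip_address(dst_eid)
          for prefix, rlocs in mapping_cache.items():
              if dst in prefix:                    # cache hit
                  outer = {"outer_dst": rlocs[0]}  # encapsulate
                  return outer, packet
          # cache miss: query the mapping system, meanwhile queue or drop
          raise LookupError("no mapping for " + dst_eid)

      print(itr_forward(b"payload", "192.0.2.55"))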
   Of course, another way to consider these schemes is to assume that
   they do nothing apart from appending a new packet header at the
   edge router: in this case, a better simile would be with MPLS,
   where the primary scalability worry to date comes from the lack of
   labels (only 20 bits are available).  The main issues with MPLS
   concern the ability to verify reachability, rather than processing
   and memory requirements.  Certainly MPLS has yet to be implemented
   inter-domain, and it is not suggested as a solution itself.

   In summary, the main scalability questions may only be answerable
   once there is a clearer understanding of how multi-homing and
   traffic engineering are to be managed.

4.2.  Traffic Engineering

   Traffic engineering and policy controls may require co-ordination
   between two layers.  They require the ITR to respect ETR
   instructions.  It is probable that some policy opaqueness is lost.
   One interesting question, for example, is how peering relationships
   are managed: to be reachable by any node, the ETR must be
   advertised openly in the mapping system, and once this is done, how
   is it ensured that only networks with the appropriate peering
   relationships use the more expensive links?

4.3.  Multi-Homing

   By separating the routing system into two parts, it is expected
   that it will be possible to implement multi-homing management
   separately from the routing system.

   The mapping system may return many possible locators, and the edge
   routers manage multi-homing using edge-to-edge communications.
   LISP describes how an ITR will spray packets from a flow across the
   different possible ETRs, according to the weights associated with
   the ETR devices.  The ETRs communicate back to the ITR(s) which
   addresses they would prefer to see used.  This is used for traffic
   engineering as well as simply for reachability purposes.  If this
   information is piggybacked onto a data session, which may raise
   security questions [Bagnulo], how is this managed for UDP
   applications, which may have the return control channel in a
   different session from the data channel?  This also breaks TCP's
   model of packets typically following a single path, which may have
   unfortunate implications both for congestion control and for TCP
   performance.  If we assume that a TCP flow is kept together, but
   that packets destined to the same end site are spread amongst the
   edge routers, we now definitely have per-flow state and, unlike
   ECMP, the associated packet processing (adding the correct outer
   header).
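   A minimal sketch of this weight-based selection follows (the
   priority/weight semantics echo [LISP], but the code and values are
   our own illustration):

      import random

      rloc_set = [  # hypothetical mapping entry for one EID prefix
          {"rloc": "203.0.113.1", "priority": 1, "weight": 75},
          {"rloc": "203.0.113.9", "priority": 1, "weight": 25},
          {"rloc": "198.51.100.5", "priority": 2, "weight": 100},  # backup
      ]

      def select_etr(rlocs):
          best = min(r["priority"] for r in rlocs)   # lowest = preferred
          usable = [r for r in rlocs if r["priority"] == best]
          weights = [r["weight"] for r in usable]
          return random.choices(usable, weights=weights, k=1)[0]["rloc"]

      # Per-packet selection spreads one flow over both priority-1 ETRs
      # (roughly 75/25), which is exactly the behaviour that worries
      # TCP in the paragraph above.
      print([select_etr(rloc_set) for _ in range(4)])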
4.4.  Mobility

   As with multi-homing, by separating the routing system into two
   parts it is expected that it will be possible to implement mobility
   management as an overlay to the routing system.

   It may be possible to manage simple portability by updating the
   mapping system so that new sessions start correctly.  This assumes
   that the mapping system operates like DNS today, without the
   information needing to be distributed globally.  In-session
   mobility, however, requires the mappings to be updated dynamically.
   Discussions on the mailing lists [MailList] to date imply that this
   is difficult, with suggestions that it should be application-
   specific signalling.  It is likely that, should source and
   destination move simultaneously, the session will be dropped,
   unless the edge routers offer a forwarding functionality.

4.5.  Changing Provider

   This is, by design, extremely simple, as only the mapping system
   needs updating.  However, there may still be issues in ensuring
   that packet filters and firewalls are correctly configured.  These
   have been covered to some extent for IPv6 in RFC 4192 [RFC4192],
   where make-before-break techniques are described, but this may not
   be suitable on the whole for IPv4.

4.6.  Route Quality

   Since multiple edge routers can be associated with a name, the
   network system may have a greater choice of routes to use to reach
   a specific device (although it is not clear that this control could
   be passed back to the data sources).

   If the mapping replies take a long time, TCP session start-up may
   be disrupted.  Similarities with ARP are not necessarily relevant:
   ARP is an extremely local process that can resolve very quickly,
   and ARP entries are normally within a cache because they are used
   frequently.

   Since multi-homing requires a flow to be sent along diverse paths,
   TCP may see lots of out-of-sequence packets, and congestion control
   mechanisms may not work as expected.

4.6.1.  Traffic Volume Overhead

   It is not clear how easy it is to solve the problem of tunnel
   overheads and packet fragmentation, or indeed whether that is a
   major issue.  During the study of locator-ID cache performance
   described below in Section 6.2.1, an analysis was also made of the
   overhead in terms of traffic volume.  Table 1 and Table 2 compare
   the original traffic volume (expressed in Mbit/sec) with the volume
   obtained by encapsulating all packets, for incoming and outgoing
   traffic respectively.  As can be observed, the overhead introduced
   by the tunneling approach amounts to a few Mbit/sec.  For outgoing
   traffic this means an overhead that ranges from 7.09% up to 11.15%.
   For incoming traffic this means an overhead that ranges from 3.8%
   up to 5.75%.  What the tables also show is that, even though in
   terms of absolute bandwidth the overhead is larger during the high
   traffic load period (i.e., day), in percentage terms it is larger
   during the low traffic load period (i.e., night).

   +--------------+------------+-----------+-------------+-------------+
   | Traffic      | Min        | Max       | Avg Night   | Avg Day     |
   +--------------+------------+-----------+-------------+-------------+
   | Original     | 13.22      | 108.10    | 18.70       | 85.62       |
   | Encapsulated | 13.98      | 112.21    | 19.72       | 89.17       |
   |              | (+5.75%)   | (+3.8%)   | (+5.45%)    | (+4.15%)    |
   +--------------+------------+-----------+-------------+-------------+

            Table 1: Incoming Traffic Volume (in Mbit/sec)

   +--------------+-------------+------------+------------+------------+
   | Traffic      | Min         | Max        | Avg Night  | Avg Day    |
   +--------------+-------------+------------+------------+------------+
   | Original     | 6.28        | 48.25      | 9.75       | 32.58      |
   | Encapsulated | 6.98        | 51.67      | 10.68      | 35.63      |
   |              | (+11.15%)   | (+7.09%)   | (+9.54%)   | (+9.36%)   |
   +--------------+-------------+------------+------------+------------+

            Table 2: Outgoing Traffic Volume (in Mbit/sec)
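   The percentage figures can be reproduced by a simple model.
   Assuming a LISP-style encapsulation of roughly 36 bytes per packet
   (20-byte outer IPv4 header, 8-byte UDP header, 8-byte LISP header),
   the relative overhead depends only on the mean packet size, which
   explains why the night traffic, with its smaller packets, shows the
   larger percentage.  The packet sizes below are back-computed
   assumptions, not measurements:

      ENCAP_BYTES = 36   # assumed outer IPv4 + UDP + LISP header

      def overhead_pct(mean_packet_bytes):
          return 100.0 * ENCAP_BYTES / mean_packet_bytes

      for label, size in [("incoming, day (~870 B)", 870),
                          ("incoming, night (~630 B)", 630)]:
          print(f"{label}: +{overhead_pct(size):.1f}%")  # ~4.1%, ~5.7%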
4.7.  Routing Security

   Pending further thought.  The security analysis so far performed
   [Bagnulo] was on LISP version 1.

4.8.  Deployability

   o  Technically deployable.

   o  It is not clear how incrementally deployable this is.  If it is
      required that (PI) EID space is advertised in the legacy routing
      system to enable communication with legacy nodes, then the
      scaling pressures on the routing system will shoot up
      dramatically during the early stages of deployment.

   o  Operation with legacy systems is not well understood.

   o  There is no clear motivation for an edge system to deploy this
      scheme.  Since provider lock-in can be avoided today using
      existing, well-known techniques, there is no motivation for an
      end site to choose LISP over the familiar technology.  Traffic
      engineering and multi-homing control have been mentioned as
      possible motivations for deployment, but to date they are too
      poorly described to be able to judge whether they meet all the
      requirements well.

   o  There may be opposition, as traffic engineering and policy
      control require communications between ITR and ETR devices,
      which may reduce the opaqueness of policy control compared with
      existing techniques.  Policy control may become more
      complicated.

4.9.  Address Shortage

   Although described for IPv4, which is seen as an advantage, these
   schemes are essentially IP-version agnostic.  Unlike the NAT
   solutions of today, the EIDs in any domain must be globally unique
   for the mapping system, thus potentially making the problem worse.
   Although better allocations of addresses may become possible, it is
   unlikely that addresses can easily be recovered.

4.10.  Failure Handling

   These schemes always require an additional global database
   infrastructure.  This is therefore as critical a resource as the
   current DNS system.  All things being equal, its addition would
   decrease the resilience of the overall Internet.  Further, fault
   tracing would become yet more complex.  The underlying routing
   system takes care of path failures between the tunnel routers.
   However, tunnel routers become critical points of failure if they
   hold state.

5.  Translation Schemes

5.1.  Routing System Scalability

   Translation schemes also aim to encourage the use of provider-
   dependent addresses, removing load from the core routing system.
   They do this by providing a different way to manage multi-homing.
   Since the edge routers do not hold state, and only need to modify
   the first few packets of a flow, the scalability of these edge
   routers should be better than that of current NAT devices.

5.2.  Traffic Engineering

   Inbound and outbound traffic engineering is managed through either
   the node or the egress router setting the routing portion of the
   locator.  For inbound sessions, this only works when both ends are
   translation-aware.  Existing policy control remains possible,
   although there is motivation to move to alternative ways of
   achieving the same goal.  For example, AS prepending to indicate
   that a route should be avoided could be replaced with a translation
   to the preferred route.  Since this could work more reliably than
   AS prepending, there is a driver for change.
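   A toy illustration of the translation step (our own sketch, not the
   Six/One specification): the provider-dependent /48 prefix of an
   IPv6 address is rewritten while the identifier tail is preserved,
   so an application bound only to the identifier part is unaffected:

      import ipaddress

      PREFIX_BITS = 48   # assumed locator/identifier boundary

      def retranslate(addr, new_prefix):
          tail_bits = 128 - PREFIX_BITS
          tail = int(ipaddress.IPv6Address(addr)) & ((1 << tail_bits) - 1)
          net = ipaddress.IPv6Network(new_prefix)
          return ipaddress.IPv6Address(int(net.network_address) | tail)

      # Steer a session via provider B instead of provider A
      # (documentation prefixes stand in for two upstream providers):
      print(retranslate("2001:db8:a::1234", "2001:db8:b::/48"))
      # -> 2001:db8:b::1234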
5.3.  Multi-Homing

   For multi-homed edge networks (as opposed to multi-homed hosts),
   multi-homing can be controlled by the edge networks but is visible
   to the end hosts.  Applications bind only to the identifier part of
   the address.

5.4.  Mobility

   Since applications can tolerate the address changing, mobility
   should be simplified.  Many of the functions needed to support
   multi-homing are like those needed to support mobility, but it is
   not clear that the details and overlaps have been fully identified,
   especially with regard to security.

5.5.  Changing Provider

   This will be complicated, and additional protocol support will be
   required.  As well as DHCP re-configuration of hosts, there will be
   DNS updates and firewall and filter settings.  The intra-domain
   routing system may also be affected.  This latter problem may be
   made more manageable if internal routers can mask out the network
   address portion within the internal routing system, though this may
   make it harder to do efficient routing inside the network or to
   manage edge node failures.  An architectural viewpoint would
   suggest that this problem will remain unsolved because identifiers
   are only allocated to end systems, making the IP address the
   primary identifier used in all management systems.

5.6.  Route Quality

   The scheme adds minimal additional delay.  All data translations
   are based only on locally held, locally visible material.
   Alternative routes, as indicated by different address pairings, are
   visible to the end devices.

5.7.  Deployability

   o  Technically deployable.

   o  Proxy support, to avoid upgrading hosts, may look very like NAT,
      with a break in the end-to-end semantics.

   o  Some of the new benefits over the existing system (specifically
      inbound TE) are only evident when there is a large deployed
      base.

   o  Operation with legacy hosts is possible, provided all Six/One
      elements can identify a host as a legacy host.

   o  Motivation is based on the additional feature of inbound TE.
      The ability to see and use different routes, as identified
      through different addresses, may also be valuable.

   o  Hosts, edge devices and possibly internal networks all need to
      be upgraded.

5.8.  Address Shortage

   These schemes force an upgrade to IPv6.

5.9.  Failure Handling

   Since the edge devices are expected only to translate the first
   packets of a flow (relying on the end host to use the correct
   address once it is made aware), the edge devices become less
   critical, as they do not hold state.
   It has been suggested that, should the edge router or access link
   fail, a local mechanism (similar to handover in a cellular system)
   could be used to achieve fast recovery.

   The scheme relies on the DNS system to provide the locator mapping.
   Currently, DNS servers are found through the hard-coding of related
   DNS server addresses.  If addresses become transient, what does
   this mean for the DNS system?  Thus, although a separate resolution
   system is not required, some consideration of DNS use would still
   be needed.  Would DNS servers need to be logically within the
   transit (provider-independent address) zone?

6.  Mapping System Design

   The concept of tunnelling IP data packets across a large-scale
   network is not new.  Many years ago, much activity was put into the
   design of networks that could run IP over ATM clouds.  This
   activity failed because of the difficulty of managing the mapping
   process - hence the design of MPLS, which uses a single IP control
   plane across the entire network.  Are there any lessons to be
   learnt from this experience?

   There appear to be three basic options: push, pull or route-
   through.

6.1.  Push

   If the full database is pushed to all tunnel routers, these devices
   may end up with larger storage requirements than current routers,
   because all end sites now have provider-independent addresses and
   so no aggregation is possible.  There is also the problem of
   keeping the database securely up to date.  This is the way that
   name-to-address mappings were originally managed, before DNS was
   introduced.  This new database could, however, be smaller than DNS,
   because a locator is associated with an EID prefix (i.e., roughly
   equivalent to having a locator associated with example.com, not one
   for www.example.com, mail.example.com, etc.).  There have been
   claims that this mapping system would be easier to manage than the
   current routing system because it can be the same everywhere,
   whereas a routing table varies according to the router.  However,
   link-state protocols actually distribute a topology database which
   is the same everywhere, and they are not used for very large-scale
   networks.  Is this because there is no localisation of changes and
   they are considered unscalable, or is it because they do not
   provide suitable hooks for policy routing?  It is possible that the
   scalability concerns are out-dated [OSPF-LITE].

6.2.  Pull

   DNS is an example of a pull system.  It enables localisation of
   changes, so it could be used to carry more dynamically varying
   information, although the rate of updates should be slower than the
   cache lifetimes.  The disadvantage of this scheme, when used mid-
   flow, is the additional delay that will be introduced.  This, as
   well as being annoying, may also upset protocols such as TCP.
   Further, as the query is performed by a network element, this opens
   up the potential for a DoS attack in which a source simply sends
   initial packets to unknowable destinations.  However, the results
   of a study summarised below suggest that the mapping cache can be
   made small, with a relatively small timeout, whilst still achieving
   a high hit rate.  This could mean that the negative impacts will be
   constrained.  More results can be found in [MAPCOST].
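   The emulation described next can be pictured as a minimal timeout-
   driven cache (our reconstruction; the real emulator processed
   NetFlow records): each packet is mapped to its BGP prefix, a miss
   stands for a mapping-system lookup, and idle entries are flushed:

      TIMEOUT = 3 * 60.0   # seconds; the study also used 30 and 300 min

      cache = {}           # prefix -> time of last use
      hits = misses = 0

      def refer(prefix, now):
          global hits, misses
          for p, last in list(cache.items()):  # flush idle entries
              if now - last > TIMEOUT:
                  del cache[p]
          if prefix in cache:
              hits += 1
          else:
              misses += 1                      # would trigger a lookup
          cache[prefix] = now

      refer("192.0.2.0/24", 0.0)
      refer("192.0.2.0/24", 10.0)
      print(hits, misses)                      # 1 hit, 1 miss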
6.2.1.  Data Collection

   During the end of May and the beginning of June 2007, NetFlow
   [NETFLOW] traces were collected from the UCL [UCL] campus network,
   at the border router, a Cisco Catalyst 6509.  The UCL network has
   almost ten thousand active users per day.  It uses a class B (i.e.,
   /16) prefix block and is connected to the Internet through a border
   router that has a 1-Gigabit link toward the Belgian National
   Research Network (Belnet).  These traces have been used to emulate
   the behavior of the Loc/ID separation cache, as if a protocol such
   as LISP [I-D.farinacci-lisp] were deployed on the border router of
   the UCL campus network.

   The analysis performed assumes that the ID-to-locator mapping has
   the same granularity as the prefix blocks assigned by RIRs.  A
   first analysis of the traffic of the UCL network shows that the
   number of BGP prefixes contacted per minute ranges from 3,618 up to
   11,074, with a clear night/day cycle.  In particular, during the
   night the average number of prefixes contacted per minute is 4,000,
   while during the day the average rises to slightly more than 8,000,
   thus doubling the load.

6.2.1.1.  Mapping Cache Size

   The cache emulator uses a timeout policy in order to flush unused
   cache entries.  In particular, the analysis of the traces has been
   performed three times, with three different timeout values:
   three (3), thirty (30) and three hundred (300) minutes.

   Table 3 shows a summary of the size of the mapping cache, expressed
   in number of entries, for a daylong observation period.  The table
   also shows the average size during the night period and the day
   period.  The night period is the average of the mapping cache size
   between midnight and 6 am, which is the period with the lowest
   traffic load, while the day period is the average between 10 am and
   4 pm, which is the period with the highest traffic load.

   +----------+----------+----------+----------------+--------------+
   | Timeout  | Min Size | Max Size | Avg Night Size | Avg Day Size |
   +----------+----------+----------+----------------+--------------+
   | 3 Min.   | 7,530    | 17,804   | 8,056          | 14,093       |
   | 30 Min.  | 22,588   | 43,529   | 24,161         | 38,405       |
   | 300 Min. | 61,939   | 103,283  | 65,600         | 81,060       |
   +----------+----------+----------+----------------+--------------+

          Table 3: Mapping Cache Size (in number of entries)

   From the previous table it is easy to compute the size of the
   mapping cache in terms of bytes by using the following equation:

      S = E x (5 + N x 6 + C)                                    (1)

   where S is the size of the cache expressed in bytes and E is the
   number of entries.  N represents the number of RLOCs per EID; note
   that, due to multi-homing, an EID can be associated with more than
   one RLOC.  The number 6 represents the size of an RLOC, assuming
   four bytes for the address and two further bytes for traffic
   engineering purposes (e.g., priority and weight as in
   [I-D.farinacci-lisp]).  C represents the overhead in bytes
   necessary to build the cache data structure; assuming the cache is
   organized as a tree, C can be set to 8 bytes, just the size of a
   pair of pointers.  The number 5 represents the size of an EID:
   since we are using mappings with the granularity of BGP prefixes,
   five bytes are necessary, four for the IP prefix address and one
   for the prefix length.
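   Equation (1) is easily evaluated; for example, taking the maximum
   entry count observed with the 3-minute timeout (Table 3) gives
   figures of the same order as the first row of Table 4:

      def cache_size_bytes(entries, rlocs_per_eid, overhead=8):
          # Equation (1): 5 bytes per EID prefix, 6 per RLOC, plus
          # C bytes of data-structure overhead per entry.
          return entries * (5 + rlocs_per_eid * 6 + overhead)

      for n in (1, 2, 3):
          kb = cache_size_bytes(17_804, n) / 1024
          print(f"{n} RLOC(s): {kb:,.0f} KBytes")  # ~330, ~435, ~539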
   Table 4 shows the maximum size (in KBytes) of the mapping cache,
   assuming one, two or three RLOCs for each EID.  Depending on the
   timeout used, the size of the cache can range from a few hundred
   KBytes up to a few MBytes.

   +----------+--------+---------+---------+
   | Timeout  | 1 RLOC | 2 RLOCs | 3 RLOCs |
   +----------+--------+---------+---------+
   | 3 Min.   | 334    | 440     | 545     |
   | 30 Min.  | 807    | 1062    | 1317    |
   | 300 Min. | 1917   | 2522    | 3127    |
   +----------+--------+---------+---------+

    Table 4: Mapping Cache Maximum Size (expressed in KBytes)

6.2.1.2.  Mapping Cache Efficiency

   The cache hit rate does not increase proportionally with the cache
   size.  Table 5 shows a summary of the analysis of the hit ratio for
   a daylong observation period.  The averages presented in the table
   are calculated over the same time periods as for the cache size
   (Section 6.2.1.1).

   +----------+-------+-------+-----------+---------+
   | Timeout  | Min   | Max   | Avg Night | Avg Day |
   +----------+-------+-------+-----------+---------+
   | 3 Min.   | 91.4% | 97.5% | 93.5%     | 96.4%   |
   | 30 Min.  | 96.8% | 99.5% | 98.5%     | 99.2%   |
   | 300 Min. | 98.9% | 99.9% | 99.7%     | 99.8%   |
   +----------+-------+-------+-----------+---------+

        Table 5: Mapping Cache Efficiency (Hit Ratio)

6.2.1.3.  Mapping Lookups

   The previous sections have shown some analysis related to the
   mapping cache.  In order to build the cache, a lookup operation is
   needed each time the correct mapping is not present in the cache.
   This means that the lookup operation, which consists of a query to
   a mapping distribution system, can be triggered by a new outgoing
   flow as well as by a new incoming flow.  Table 6 shows a summary of
   the lookup operations for a daylong observation period.

   +----------+-------+-------+-----------+---------+
   | Timeout  | Min   | Max   | Avg Night | Avg Day |
   +----------+-------+-------+-----------+---------+
   | 3 Min.   | 1,301 | 4,046 | 1502.5    | 2381.4  |
   | 30 Min.  | 257   | 1,211 | 357.1     | 540.7   |
   | 300 Min. | 19    | 328   | 78.7      | 161.7   |
   +----------+-------+-------+-----------+---------+

             Table 6: Lookup queries per minute

6.3.  Route Through

   Route-through systems will increase the work expected from name
   resolution servers.  They may lead to inefficient routing.  If
   route-through is only used for the start of a data flow (and hence
   for all short sessions), then TCP flow rates will frequently be
   incorrect (too fast or too slow for the path the flow has been
   changed to).  Applications such as voice also seem to struggle to
   cope with large path changes because of the delay variation seen.
   This might also make fault tracing much more complex.

7.  Conclusions

   1.  There is no obvious correct solution.  The two classes of
       solution both aim to increase the use of aggregatable
       addresses, and essentially differ in which driver they assume
       is the more critical, i.e. provider lock-in or multi-homing
       support.  The working assumption should be that both problems
       must be adequately solved by any solution, unless one
       requirement can be proven to be irrelevant.

   2.  We are not really sure whether there is a problem, although it
       could be major, and if we leave it until we are certain, it is
       likely to be too late to solve it.
       More importantly, the exact nature of the problem (FIB size,
       RIB size, processing churn, writing FIB updates, etc.) has
       escaped definition.  A simpler solution may be possible.

   3.  Each of the different approaches deserves further research.

   4.  The area that has received the least real attention is legacy
       interworking and partial deployment.

   5.  The mapping system is a real crunch point and needs serious
       analysis.

   6.  We are focusing on the locator-ID split, but in reality there
       are two types of split: one which is recognizable as a locator-
       identity split, and another which could be termed a locator-
       locator split, involving splitting the addressing regions into
       core and edge.  The addition of an identifier has been proposed
       in other quarters for security and authentication reasons.
       What are the wider implications of a locator space split?

   7.  Compact routing is a completely different routing algorithm
       that essentially trades path stretch for router state.  At
       present there is no way to implement a distributed dynamic
       version of compact routing, so this particular approach may be
       far off.  Nevertheless, there is no apparent study of the
       potential of different routing algorithms.

   8.  Schemes such as HRA, which simply look at how we organize the
       routing system, are not included.

   9.  ROFL assumes that there is really no need for any locator at
       all, and it may be correct.  It assumes that, using modern
       techniques (based on DHTs), we could build an adequate system
       based on semantic-free identifiers.  It may be that the
       problems we face are caused by things other than scalability
       (e.g., a lack of accountability means that we get endless
       pointless update messages, and that there is no back-pressure
       on de-aggregation).

   10. We are looking at the simple schemes; complex schemes such as
       NODE-ID and HRA are not considered.  However, in considering
       small-scale changes, are we missing the point that we should
       first have a long-term target architecture with which any
       point solution should be compliant?

8.  Acknowledgements

   A preliminary version of this document was prepared for Chinacom
   with help from Sheng Jiang and Xiaohu Xu.

   We are grateful for help from Olivier Bonaventure, and to the
   mailing list discussions, especially Robin Whittle and Joel M.
   Halpern, for very useful comments.

9.  IANA Considerations

   This memo includes no request to IANA.

10.  Security Considerations

11.  References

11.1.  Normative References

11.2.  Informative References

   [Bagnulo]  Bagnulo, M., "Preliminary LISP Threat Analysis", 2007.

   [Cisco]    Cisco, "NAT FAQ", 2008.

   [DNS]      Jung, J., "DNS performance and the effectiveness of
              caching", 2001.

   [FLOWTOOLS]
              Fullmer, M., "Flow-tools - tool set for working with
              netflow data", Available Online at:
              http://www.splintered.net/sw/flow-tools/docs/
              flow-tools.html.

   [Goals]    Li, T., "Design Goals for Scalable Internet Routing",
              2007.

   [Handley]  Handley, M., "Why the Internet only just works", 2006.

   [Huston]   Huston, G., "IPv4 address report", 2007.

   [I-D.farinacci-lisp]
              Farinacci, D., Fuller, V., Oran, D., Meyer, D., and S.
              Brim, "Locator/ID Separation Protocol (LISP)",
              draft-farinacci-lisp-08 (work in progress), July 2008.
   [I-D.vogt-rrg-six-one]
              Vogt, C., "Six/One: A Solution for Routing and
              Addressing in IPv6", draft-vogt-rrg-six-one-01 (work in
              progress), November 2007.

   [IAB]      Meyer, D., "Report from the IAB workshop on Routing and
              Addressing", 2007.

   [LISP]     Farinacci, D., "Locator/ID Separation Protocol (LISP)",
              2007.

   [MAPCOST]  Iannone, L. and O. Bonaventure, "On the Cost of Caching
              Locator/ID Mappings", 3rd Annual CoNEXT Conference,
              2007.

   [MailList]
              Farinacci, D., "e-mail thread", 2007.

   [NETFLOW]  Cisco Systems, "Introduction to Cisco IOS NetFlow - A
              Technical Overview", Available Online at:
              http://www.cisco.com/en/US/products/ps6601/
              products_white_paper0900aecd80406232.shtml.

   [OSPF-LITE]
              Thomas, M., "OSPF-Lite", 2007.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day",
              RFC 4192, September 2005.

   [UCL]      "Universite Catholique de Louvain",
              http://www.uclouvain.be.

   [six-one]  Vogt, C., "Six/one: A solution for routing and
              addressing in IPv6", 2007.

   [whittle]  Whittle, R., "Comparing LISP-NERD/CONS, eFIT-APT and
              Ivip", 2007.

Authors' Addresses

   Louise Burness (editor)
   BT
   BT Adastral Park
   Martlesham Heath, Suffolk
   UK

   Phone: +44 1473 646504
   Email: louise.burness@bt.com

   Philip Eardley (editor)
   BT
   BT Adastral Park
   Martlesham Heath, Suffolk
   UK

   Phone:
   Email: philip.eardley@bt.com

   Luigi Iannone
   UC Louvain
   Place St. Barbe 2
   Louvain la Neuve, B-1348
   Belgium

   Phone: +32 10 47 87 18
   Email: luigi.iannone@uclouvain.be
   URI:   http://inl.info.ucl.ac.be

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.