idnits 2.17.1 

draft-lear-lisp-nerd-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1289.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1300.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1307.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1313.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 4 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 19, 2007) is 6054 days in the past.  Is
     this intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-12) exists of
     draft-farinacci-lisp-03

  ** Obsolete normative reference: RFC 2616 (ref. '2') (Obsoleted by RFC
     7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Obsolete normative reference: RFC 2141 (ref. '7') (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 4346 (ref.
     '8') (Obsoleted by RFC 5246)

  -- Obsolete informational reference (is this intentional?): RFC  977 (ref.
     '13') (Obsoleted by RFC 3977)


     Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                            E. Lear
3	Internet-Draft                                        Cisco Systems GmbH
4	Intended status: Experimental                         September 19, 2007
5	Expires: March 22, 2008

7	               NERD: A Not-so-novel EID to RLOC Database
8	                      draft-lear-lisp-nerd-02.txt

10	Status of this Memo

12	   By submitting this Internet-Draft, each author represents that any
13	   applicable patent or other IPR claims of which he or she is aware
14	   have been or will be disclosed, and any of which he or she becomes
15	   aware will be disclosed, in accordance with Section 6 of BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on March 22, 2008.

35	Copyright Notice

37	   Copyright (C) The IETF Trust (2007).

39	Abstract

41	   LISP is a protocol to encapsulate IP packets in order to allow end
42	   sites to multihome without injecting routes from one end of the
43	   Internet to another.  This memo specifies a database and a method to
44	   transport the mapping of EIDs to RLOCs to routers in a reliable,
45	   scalable, and secure manner.  Our analysis concludes that transport
46	   of all EID/RLOC mappings scales well to at least 10^8 entries, and
47	   that use of DNS or any approach that queries for mappings has
48	   substantial operational concerns.

50	Table of Contents

52	   1.  Introduction
53	     1.1.  Base Assumptions
54	     1.2.  What is NERD?
55	     1.3.  Glossary
56	   2.  Theory of Operation
57	     2.1.  Who are database authorities?
58	   3.  NERD Format
59	     3.1.  NERD Record Format
60	     3.2.  Database Update Format
61	   4.  NERD Distribution Mechanism
62	     4.1.  Initial Bootstrap
63	     4.2.  Retrieving Changes
64	   5.  Analysis
65	     5.1.  Database Size
66	     5.2.  Router Throughput Versus Time
67	     5.3.  Number of Servers Required
68	     5.4.  Security Considerations
69	       5.4.1.  Use of Public Key Infrastructures (PKIs)
70	       5.4.2.  Other Risks
71	   6.  Why not use XML?
72	   7.  Other Distribution Mechanisms
73	     7.1.  What About DNS as a retrieval model?
74	       7.1.1.  Perhaps use a hybrid model?
75	     7.2.  Use of BGP
76	   8.  Deployment Issues
77	     8.1.  HTTP
78	   9.  Conclusions
79	   10. IANA Considerations
80	   11. Acknowledgments
81	   12. References
82	     12.1. Normative References
83	     12.2. Informational References
84	   Appendix A.  Generating and verifying the database signature
85	                with OpenSSL
86	   Appendix B.  Changes
87	   Appendix C.  Open Questions
88	   Author's Address
89	   Intellectual Property and Copyright Statements

91	1.  Introduction

93	   Locator/ID Separation Protocol (LISP) [1] is a protocol whose primary
94	   purpose is to separate an IP address used by a host and local routing
95	   system from the locators advertised by BGP participants on the
96	   Internet in general, and in the default free zone (DFZ) in
97	   particular.  It accomplishes this by establishing a mapping between
98	   globally unique endpoint identifiers (EIDs) and routing locators
99	   (RLOCs) within the global routing table.  This reduces the amount of
100	   state change that occurs on routers within the default-free zone on
101	   the Internet, while enabling end sites to be multihomed.

103	   In early stages of LISP (1 and 1.5) the mapping is either configured
104	   into a device or it is learned via data-triggered control messages
105	   between ingress tunnel routers (ITRs) and egress tunnel routers
106	   (ETRs) under the assumption that during transition, EIDs will be
107	   present within the global routing system, as they are today.

109	   In later stages of LISP, the assumption will be that EIDs are not
110	   contained within the global routing system, but that instead the
111	   mapping from EIDs to RLOCs will be learned through some other means.
112	   This memo addresses different approaches to the problem, and
113	   specifies a Not-so-novel EID RLOC Database (NERD) and methods to both
114	   receive the database and to receive updates.

116	   LISP and NERD are both currently in experimental stages.  The NERD
117	   database is specified in such a way that the methods used to
118	   distribute or retrieve it may vary over time.  Multiple databases are
119	   supported in order to allow for multiple data sources.  An effort has
120	   been made to divorce the database from access methods so that both
121	   can evolve independently through experimentation and operational
122	   validation.

124	1.1.  Base Assumptions

126	   In order to specify a mapping it is important to understand how it
127	   will be used, and the nature of the data being mapped.  In the case
128	   of LISP, the following assumptions are pertinent:

130	   o  The data contained within the mapping changes only on provisioning
131	      or configuration operations, and is not intended to change when a
132	      link either fails or is restored.  Some other mechanism (via LISP
133	      or other) handles healing operations, particularly when a tail
134	      circuit within a service provider's aggregate goes down.
135	   o  While weight and priority are defined, these are not hop-by-hop
136	      metrics.  Hence the information contained within the mapping does
137	      not change based on where one sits within the topology.

139	   o  The purpose of LISP being to reduce control plane overhead by
140	      reducing "rate X state" complexity, updates to the mapping will be
141	      relatively rare.
142	   o  Because LISP and NERD are designed to ease interdomain routing,
143	      their use is intended within the inter-domain environment.  That
144	      is, LISP is best implemented at either the customer edge or
145	      provider edge, and there will be on the order of as many ITRs and
146	      LISP announcements as there are connections to Internet Service
147	      Providers by end customers.
148	   o  As such, LISP and NERD cannot be the sole means to implement host
149	      mobility, although they may be used in conjunction with other
150	      mechanisms.  For instance, it would be possible for a mobile node
151	      to receive a local address that is an EID and pass that to the
152	      correspondent node, who could also make use of an EID.  As such
153	      use of LISP in this case would be transparent, and no mapping
154	      entries are changed for mobility.
155	   o  As such, there is no interaction with the interior gateway
156	      protocol (IGP).

158	1.2.  What is NERD?

160	   NERD is a Not-so-novel EID to RLOC Database.  It consists of the
161	   following components:

163	   1.  a network database format;
164	   2.  a change distribution format;
165	   3.  a database retrieval/bootstrapping method;
166	   4.  a change distribution method.

168	   The network database format is compressible.  However, at this time
169	   we specify no compression method.  NERD will make use of potentially
170	   several transport methods, but most notably HTTP [2].  HTTP has
171	   restart and compression capabilities.  It is also widely deployed.

173	   There exist many methods to show differences between two versions of
174	   a database or a file, UNIX's "diff" being the classic example.  In
175	   this case, because the data is well structured and easily keyed, we
176	   can make use of a very simple format for version differences that
177	   simply provides a list of EID/RLOC mappings that have changed using
178	   the same record format as the database, and a list of EIDs that are
179	   to be removed.

181	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
182	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
183	   document are to be interpreted as described in RFC 2119 [3].

185	1.3.  Glossary

187	   The reader is once again referred to [1] for a general glossary of
188	   terms related to LISP.  The following terms are specific to this
189	   memo.

191	   Base Distribution URI:  An Absolute-URI as defined in Section 4.3 of
192	      [6] from which other references are relative.  The base
193	      distribution URI is used to construct a URI to an EID/RLOC mapping
194	      database.  If more than one NERD is known then there will be one
195	      or more base distribution URIs associated with each (although each
196	      such base distribution URI may have the same value).

198	   EID Database Authority:  The authority that will sign database files
199	      and updates.  It is the source of both.

201	   The Authority:  Shorthand for the EID Database Authority.

203	   NERD:  (N)ot-so-novel (E)ID to (R)LOC (D)atabase.

205	   AFI:  Address Family Identifier.

207	   Pull Model:  An architecture where clients pull only the information
208	      they need at any given time, such as when a packet arrives for
209	      forwarding.

211	   Push Model:  An architecture in which clients receive an entire
212	      dataset, containing data they may or may not require, such as
213	      mappings for EIDs that no host served is attempting to send to.

215	   Hybrid Model:  An architecture in which clients receive a subset of
216	      the entire dataset and query as needed for the rest.

218	2.  Theory of Operation

220	   What follows is a summary of how NERDs are generated and updated.
221	   Specifics can be found in Section 3.  The general way in which NERD
222	   works is as follows:

224	   1.  A NERD is generated by an authority that allocates provider
225	       independent (PI) addresses (e.g., IANA or an RIR) which are used
226	       by sites as EIDs.  As part of this process the authority
227	       generates a digest for the database and signs it with a private
228	       key whose public key is part of an X.509 certificate [12].  That
229	       signature along with a copy of the authority's public key is
230	       included in the NERD.
231	   2.  The NERD is distributed to a group of well known servers.
232	   3.  ITRs retrieve an initial copy of the NERD via HTTP when they come
233	       into service.
234	   4.  ITRs are preconfigured with a group of certificates whose private
235	       keys are used by database authorities to sign the NERD.  This
236	       list of certificates should be configurable by administrators.
237	   5.  ITRs next verify both the validity of the public key and the
238	       signed digest.  If either fails validation, the ITR attempts to
239	       retrieve the NERD from a different source.  The process iterates
240	       until either a valid database is found or the list of sources is
241	       exhausted.
242	   6.  Once a valid NERD is retrieved, the ITR installs it into both
243	       non-volatile and local memory.
244	   7.  At some point the authority updates the NERD and increments the
245	       database version counter.  At the same time it generates a list
246	       of changes, which it also signs, as it does with the original
247	       database.
248	   8.  Periodically ITRs will poll from their list of servers to
249	       determine if a new version of the database exists.  When a new
250	       version is found, an ITR will attempt to retrieve a change file,
251	       using its list of preconfigured servers.
252	   9.  The ITR validates a change file just as it does the original
253	       database.  Assuming the change file passes validation, the ITR
254	       installs new entries, overwrites existing ones, and removes empty
255	       entries, based on the content of the change file.
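
The retry behavior in step 5 can be sketched as follows.  This is a minimal illustration, not part of the specification: the function name and the "fetch" and "is_valid" hooks are hypothetical stand-ins for HTTP retrieval and for the certificate and digest checks.

```python
from typing import Callable, Iterable, Optional


def fetch_valid_nerd(
    sources: Iterable[str],
    fetch: Callable[[str], Optional[bytes]],
    is_valid: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first database that passes validation, or None.

    `fetch` and `is_valid` are hypothetical hooks: the first stands in
    for retrieval from a server, the second for verifying both the
    public key and the signed digest.
    """
    for uri in sources:
        blob = fetch(uri)
        if blob is not None and is_valid(blob):
            return blob
    # The list of sources is exhausted without a valid database.
    return None


# Usage with an in-memory stand-in for the servers.
servers = {"http://a.example/": b"bad", "http://b.example/": b"good"}
result = fetch_valid_nerd(servers, servers.get, lambda blob: blob == b"good")
```

An ITR that receives None here has no valid database and would need some local policy for what to do next; the memo does not specify one.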

257	   As time goes on it is quite possible that an ITR may probe a list of
258	   configured neighbors for a database or change file copy.  It is
259	   equally possible that neighbors might advertise to each other the
260	   version number of their database.  Such methods are not explored in
261	   depth in this memo, but are mentioned for future consideration.

263	2.1.  Who are database authorities?

265	   This memo does not specify who the database authority is.  That is
266	   because there are several possible operational models.  In each case
267	   the number of database authorities is meant to be small so that ITRs
268	   need only keep a small list of authorities, similar to the way a name
269	   server might cache a list of root servers.

271	   o  A single database authority exists.  In this case all entries in
272	      the database are registered to a single entity, and that entity
273	      distributes the database.  Because the EID space is provider
274	      independent address space, there is no architectural requirement
275	      that address space be hierarchically distributed to anyone, as
276	      there is with provider-assigned address space.  Hence, there is a
277	      natural affinity between the IANA function and the database
278	      authority function.
279	   o  Each region runs a database authority.  In this case, provider
280	      independent address space is allocated to either regional internet
281	      registries or to affiliates of such organizations of network
282	      operations guilds (NOGs).  The benefit of this approach is that
283	      there is no single organization that controls the database.  It
284	      allows one database authority to back up another.  One could
285	      envision as many as ten database authorities in this scenario.
286	   o  Each country runs a database authority.  This could occur should
287	      countries decide to regulate this function.  While limiting the
288	      scope of any single database authority as the previous scenario
289	      describes, this approach would introduce some overhead as the list
290	      of database authorities would grow to as many as 200, and possibly
291	      more if jurisdictions within countries attempted to regulate the
292	      function.

294	   As the number of authorities increases the amount of change on that
295	   list will also increase, requiring both an update mechanism and the
296	   potential need for a discovery mechanism, both of which would be the
297	   subject of future work (i.e., not to be found in this memo).  For
298	   this reason alone, as a starting point two database authorities are
299	   recommended, but their selection is left for others.

301	3.  NERD Format

303	   The NERD consists of a header that contains a database version and a
304	   signature that is generated by ignoring the signature field and
305	   setting the authentication block length to 0 (NULL).  The
306	   authentication block itself consists of a signature and a certificate
307	   whose private key counterpart was used to generate the signature.
308	   The exact format of the authentication block is TBD.

310	   Records are kept sorted in numeric order with AFI plus EID as primary
311	   key and mask length as secondary.  This is so that after a database
312	   update it should be possible to reconstruct the database to verify
313	   the digest signature, which may be retrieved separately from the
314	   database for verification purposes.

316	        0                   1                   2                   3
317	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
318	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
319	       | Schema Vers=1 |  DB Code      |     Database Name Size        |
320	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
321	       |                      Database Version                         |
322	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
323	       |                   Old Database Version or 0                   |
324	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
325	       |                                                               |
326	       |                        Database Name                          |
327	       |                                                               |
328	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
329	       |       PKCS#7 Block Size       |          Reserved             |
330	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
331	       |                                                               |
332	       |      PKCS#7 Block containing Certificate and Signature        |
333	       |                                                               |
334	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

336	   Database Header

338	   The DB Code is 0 if what follows is an entire database or 1 if
339	   what follows is an update.  The database file version is incremented
340	   each time the complete database is generated by the authority.  In
341	   the case of an update, the database file version indicates the new
342	   database file version, and the old database file version is indicated
343	   in the "old DB version" field.  The database file version is used by
344	   routers to determine whether or not they have the most current
345	   database.
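
As an illustration of the header layout, the sketch below parses the fields in the order shown in the diagram.  Several details are assumptions that the text does not state: fields are taken to be in network byte order, the database name is taken to be ASCII with no padding after it, and the names NerdHeader and parse_header are hypothetical.

```python
import struct
from typing import NamedTuple


class NerdHeader(NamedTuple):
    schema_version: int
    db_code: int          # 0 = entire database, 1 = update
    db_version: int
    old_db_version: int   # 0 when not an update
    db_name: str
    pkcs7_block: bytes
    body_offset: int      # where the records begin


# Schema Vers, DB Code, Database Name Size, Database Version, Old Version
_FIXED = struct.Struct("!BBHII")
# PKCS#7 Block Size, Reserved
_AUTH = struct.Struct("!HH")


def parse_header(buf: bytes) -> NerdHeader:
    schema, code, name_size, version, old_version = _FIXED.unpack_from(buf, 0)
    offset = _FIXED.size
    name = buf[offset:offset + name_size]
    if len(name) != name_size:
        raise ValueError("truncated database name")
    offset += name_size  # assumption: no padding after the name
    block_size, _reserved = _AUTH.unpack_from(buf, offset)
    offset += _AUTH.size
    block = buf[offset:offset + block_size]
    if len(block) != block_size:
        raise ValueError("truncated PKCS#7 block")
    return NerdHeader(schema, code, version, old_version,
                      name.decode("ascii"), block, offset + block_size)


# Usage: build a sample header and parse it back.
_name = b"urn:lisp:3.0:arin"
sample = (_FIXED.pack(1, 0, len(_name), 1105503, 0) + _name
          + _AUTH.pack(3, 0) + b"\x01\x02\x03")
header = parse_header(sample)
```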

347	   The database name is a Uniform Resource Name (URN) [7] of the
348	   following form:

350	       dburn  = "urn:lisp:3.0:" dbname
351	       dbname = 1*(URN Chars)  ;; URN Chars is defined in RFC 2141.

353	   The purpose of the database name is to allow for more than one
354	   database.  Such databases would be merged by the router.  It is
355	   important that an EID/RLOC mapping be listed in no more than one
356	   database, lest inconsistencies arise.  However, it may be possible to
357	   transition a mapping from one database to another.  During the
358	   transition period, the mappings MUST be identical.  When they are
359	   not, the resultant behavior will be undefined.
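
A router merging several databases might enforce the rule above along these lines.  This is only a sketch: the in-memory types and the name merge_databases are hypothetical, and raising an error on a conflict is one possible policy, since the memo leaves the behavior undefined.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[int, bytes, int]          # (AFI, EID, mask length)
Rloc = Tuple[int, int, int, bytes]    # (priority, weight, AFI, locator)
Database = Dict[Key, List[Rloc]]


def merge_databases(databases: Iterable[Database]) -> Database:
    """Merge databases, allowing a duplicate mapping only if identical."""
    merged: Database = {}
    for db in databases:
        for key, rlocs in db.items():
            if key in merged and merged[key] != rlocs:
                # Policy choice for this sketch; the memo says only that
                # the resulting behavior is undefined.
                raise ValueError(f"conflicting mappings for {key!r}")
            merged[key] = rlocs
    return merged


# Usage: the same mapping appears in both databases during a transition.
eid = (1, bytes([192, 0, 2, 0]), 24)
rloc = [(1, 100, 1, bytes([198, 51, 100, 1]))]
merged = merge_databases([{eid: rloc}, {eid: list(rloc)}])
```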

361	   The PKCS#7 [4] authentication block contains a DER encoded [5]
362	   signature and associated public key.

364	3.1.  NERD Record Format

366	   As distributed over the network, NERD records appear as follows:

368	        0                   1                   2                   3
369	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
370	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
371	       | Num. RLOCs    | EID Mask Len  |            EID AFI            |
372	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
373	       |                       End point identifier                    |
374	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
375	       | Priority 1    |    Weight 1   |             AFI 1             |
376	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
377	       |                       Routing Locator 1                       |
378	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
379	       | Priority 2    |    Weight 2   |             AFI 2             |
380	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
381	       |                       Routing Locator 2                       |
382	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
383	       | Priority 3    |    Weight 3   |             AFI 3             |
384	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
385	       |                       Routing Locator 3...                    |
386	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

388	   Priority N, Weight N, and AFI N are associated with Routing
389	   Locator N. There will always be at least one routing locator.  The
390	   minimum record size for IPv4 is 16 bytes.  Each additional IPv4 RLOC
391	   increases the record size by 8 bytes.  The purpose of this format is
392	   to keep the database compact, but somewhat easily read.  The meanings
393	   of weight and priority are described in [1].  The format of the AFI
394	   is specified by IANA as "Address Family Numbers", with the exception
395	   of how IPv6 addresses are stored.
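
To make the record sizes concrete, the following sketch packs an IPv4 record in the layout shown above.  It assumes network byte order, which the text does not state, and the function name is hypothetical.  The AFI value 1 is the IANA Address Family Number for IPv4.

```python
import socket
import struct
from typing import Iterable, Tuple

AFI_IPV4 = 1  # IANA "Address Family Numbers" value for IPv4


def pack_ipv4_record(eid: str, mask_len: int,
                     rlocs: Iterable[Tuple[int, int, str]]) -> bytes:
    """Pack one database record; rlocs is (priority, weight, locator)."""
    rlocs = list(rlocs)
    if not rlocs:
        # Database records always carry at least one routing locator.
        raise ValueError("a database record needs at least one RLOC")
    out = struct.pack("!BBH", len(rlocs), mask_len, AFI_IPV4)
    out += socket.inet_aton(eid)
    for priority, weight, locator in rlocs:
        out += struct.pack("!BBH", priority, weight, AFI_IPV4)
        out += socket.inet_aton(locator)
    return out


# Usage: one RLOC, then two RLOCs, using documentation address ranges.
one = pack_ipv4_record("192.0.2.0", 24, [(1, 100, "198.51.100.1")])
two = pack_ipv4_record("192.0.2.0", 24, [(1, 50, "198.51.100.1"),
                                         (1, 50, "203.0.113.1")])
```

The lengths of the two results match the figures in the text: 16 bytes for the minimum record and 8 more for the second RLOC.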

397	   In order to reduce storage and transmission amounts for IPv6, only
398	   the necessary number of bytes as specified by the prefix length are
399	   kept in the record, rounded up to the next four-byte (word) boundary.
400	   This is true for both EIDs and RLOCs.  For instance, if the prefix
401	   length is /49, seven bytes are needed to hold the prefix, so rounding
402	   up to the next four-byte word boundary means eight bytes are stored.
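
The rounding rule can be written as a short helper.  The function name is hypothetical; the arithmetic follows the /49 example in the text.

```python
def stored_bytes(prefix_len: int) -> int:
    """Bytes kept in a record for an IPv6 prefix of the given length."""
    if not 0 <= prefix_len <= 128:
        raise ValueError("IPv6 prefix length must be between 0 and 128")
    whole_bytes = (prefix_len + 7) // 8   # bytes needed to hold the prefix
    return ((whole_bytes + 3) // 4) * 4   # round up to a four-byte word


# Usage: the /49 case from the text, and the /64 case used in Section 5.
size_49 = stored_bytes(49)
size_64 = stored_bytes(64)
```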

404	3.2.  Database Update Format

406	   A database update contains a set of changes to an existing database.
407	   Each AFI/EID/mask-length tuple may have zero or more RLOCs associated
408	   with it.  In the case where there are no RLOCs, the EID entry is
409	   removed from the database.  Records that contain EIDs and mask
410	   lengths that were not previously listed are simply added.  Otherwise,
411	   the old record for the EID and mask length is replaced by the more
412	   current information.  The record format used by a database update
413	   is the same as described in Section 3.1.
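
The update rules amount to a keyed add, replace, or remove.  A minimal sketch, assuming the database is held in memory as a dictionary keyed by the AFI/EID/mask-length tuple (the types and the name apply_update are hypothetical):

```python
from typing import Dict, List, Tuple

Key = Tuple[int, bytes, int]          # (AFI, EID, mask length)
Rloc = Tuple[int, int, int, bytes]    # (priority, weight, AFI, locator)


def apply_update(database: Dict[Key, List[Rloc]],
                 changes: Dict[Key, List[Rloc]]) -> Dict[Key, List[Rloc]]:
    """Apply a change set in place and return the database."""
    for key, rlocs in changes.items():
        if rlocs:
            # New entries are added; existing ones are replaced.
            database[key] = list(rlocs)
        else:
            # A record with no RLOCs removes the EID entry.
            database.pop(key, None)
    return database


# Usage: replace one entry, remove another, add a third.
a = (1, bytes([192, 0, 2, 0]), 24)
b = (1, bytes([198, 51, 100, 0]), 24)
c = (1, bytes([203, 0, 113, 0]), 24)
old = [(1, 100, 1, bytes([192, 0, 2, 1]))]
new = [(2, 100, 1, bytes([192, 0, 2, 2]))]
db = apply_update({a: old, b: old}, {a: new, b: [], c: new})
```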

415	4.  NERD Distribution Mechanism

417	4.1.  Initial Bootstrap

419	   Bootstrap occurs when a router needs to retrieve the entire database.
420	   It knows it needs to retrieve the entire database because either it
421	   has none or the update would be too substantial to process, as might be the
422	   case if a router has been out of service for a substantially lengthy
423	   period of time.

425	   To bootstrap the router appends the database name plus "/current/
426	   entiredb" to a Base Distribution URI and retrieves the file via HTTP.
427	   For example, if the configured URI is
428	   "http://www.example.com/eiddb/", and assuming a database name of
429	   "arin", the router would request
430	   "http://www.example.com/eiddb/arin/current/entiredb".  Routers MUST
431	   check the signature on the database prior to installing it, and MUST
432	   check that the database schema matches a schema they understand.
433	   Once a router has a valid database it MUST store that database in
434	   some sort of non-volatile memory (e.g., disk, flash memory, etc).
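
Following the rule as stated (database name first, then "/current/entiredb"), the bootstrap URI could be built as in this sketch.  The function name is hypothetical.

```python
def bootstrap_uri(base_uri: str, db_name: str) -> str:
    """Append the database name plus "/current/entiredb" to the base URI."""
    return base_uri.rstrip("/") + "/" + db_name + "/current/entiredb"


# Usage with the base URI and database name from the text.
uri = bootstrap_uri("http://www.example.com/eiddb/", "arin")
```

Stripping a trailing slash before joining keeps the result the same whether or not the configured base URI ends in "/".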

436	   N.B., the host component for such URIs MUST NOT resolve to a LISP
437	   EID, lest a circular dependency be created.

439	4.2.  Retrieving Changes

441	   In order to retrieve a set of database changes a router will have
442	   previously retrieved the entire database.  Hence it knows the current
443	   version of the database it has.  Its first step for retrieving
444	   changes is to retrieve the current version of the database.  It does
445	   so by appending "current/version" to the base distribution URI and
446	   retrieving the file.  Its format is text and it contains the integer
447	   value of the current database version.

449	   Once a router has retrieved the current version, it compares it to the version
450	   of its local copy.  If there is no difference, then the router is up
451	   to date and need take no further actions until it next checks.

453	   If the versions differ, the router next sends a request for the
454	   appropriate change file by appending "current/changes/" and the
455	   textual representation of the version of its local copy of the
456	   database to the base distribution URI.  For example, if the current
457	   version of the database is 1105503 and the router's version is 1105500,
458	   and the base URI and database name are the same as above, the router
459	   would request
460	   "http://www.example.com/eiddb/arin/current/changes/1105500".
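
The version check and the change-file request can be sketched as two small helpers.  Both names are hypothetical, and the URI layout follows the example above, in which the database name precedes "current/changes/".

```python
def needs_update(local_version: int, version_file_text: str) -> bool:
    """Compare the local version with the text of the version file."""
    return int(version_file_text.strip()) != local_version


def changes_uri(base_uri: str, db_name: str, local_version: int) -> str:
    """Build the request for changes from the router's local version."""
    return f"{base_uri.rstrip('/')}/{db_name}/current/changes/{local_version}"


# Usage with the values from the text.
stale = needs_update(1105500, "1105503\n")
request = changes_uri("http://www.example.com/eiddb/", "arin", 1105500)
```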

462	   The server may not have that change file, either because there are
463	   too many versions between what the router has and what is current, or
464	   because no such change file was generated.  If the server has changes
465	   from the router's version to any later version, the server SHOULD
466	   issue an HTTP redirect to that change file, and the router SHOULD
467	   retrieve and process it.  Once it has done so, the router should then
468	   repeat the process until it has brought itself up to date.  It is
469	   thus important for servers to expire old change files in the order in
470	   which they were generated.

472	   By way of convention, it is suggested that the URIs issued in
473	   redirects be of the following form:

475	   {base dist.  URI}/{dbname}/{more-recent-version}/changes/
476	   {older-version}

478	   where "base dist.  URI" is the base distribution URI, "dbname" is the
479	   name of the database, and each version is the textual representation
480	   of the integer version value.

482	   For example, if the current database version was 1105503 and a router
483	   made a request for
484	   "http://www.example.com/eiddb/arin/current/changes/1105400" but there
485	   was no change file from 1105400 to 1105503, and the server had a
486	   group of change files to make the router current, it would issue a
487	   redirect to
488	   "http://www.example.com/eiddb/arin/1105450/changes/1105400" that the
489	   router would then process.  The router would then make a request for
490	   "http://www.example.com/eiddb/arin/current/changes/1105450" that the
491	   server would have.
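
The catch-up process can be modeled without any network code.  In this sketch the server's change files are represented as a dictionary from an older version to the newer version that one change file reaches; that representation, the function name, and the intermediate version 1105450 are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def plan_catch_up(local: int, current: int,
                  change_files: Dict[int, int]) -> List[Tuple[int, int]]:
    """Return the (older, newer) change files to apply, in order."""
    steps: List[Tuple[int, int]] = []
    seen = {local}
    while local != current:
        newer = change_files.get(local)
        if newer is None or newer in seen:
            # No usable change file: fall back to a full bootstrap.
            raise LookupError("retrieve the entire database instead")
        steps.append((local, newer))
        seen.add(newer)
        local = newer
    return steps


# Usage: two change files bring the router from 1105400 to 1105503.
available = {1105400: 1105450, 1105450: 1105503}
steps = plan_catch_up(1105400, 1105503, available)
```

The "seen" set guards against a server whose redirects loop, which would otherwise keep the router retrieving change files indefinitely.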

493	   While it is unlikely that database versions would wrap, as they
494	   consist of 32-bit integers, should the event occur, ITRs MUST
495	   attempt first to retrieve a change file when their current version
496	   number is within 10,000 of 2^32 and they see a version available that
497	   is less than 10,000.  Barring the availability of a change file, the
498	   ITR MUST still assume that the database version has wrapped and
499	   retrieve a new copy.
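
The wrap test reduces to two comparisons.  A sketch, reading "within 10,000 of 2^32" as a local version of at least 2^32 - 10,000 (the constant and function names are hypothetical):

```python
WRAP_WINDOW = 10_000
MODULUS = 2 ** 32


def looks_wrapped(local_version: int, available_version: int) -> bool:
    """True when the version counter appears to have wrapped."""
    near_top = local_version >= MODULUS - WRAP_WINDOW
    near_bottom = available_version < WRAP_WINDOW
    return near_top and near_bottom


# Usage: a router near the top of the range sees a small version number.
wrapped = looks_wrapped(2 ** 32 - 5, 3)
```

When this returns True, the ITR first tries for a change file and, failing that, retrieves a new copy of the database.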

501	5.  Analysis

503	   We will start our analysis by looking at how much data will be
504	   transferred to a router during bootstrap conditions.  We will then
505	   look at the bandwidth required.  Next we will turn our concerns to
506	   servers.  Finally we will ponder the effect of providing only
507	   changes.

509	   In the analysis below we treat the overhead of the database header as
510	   insignificant (because it is).  The analysis should be similar,
511	   whether a single database or multiple databases are employed, as we
512	   would assume that no entry would appear more than once.

514	5.1.  Database Size

516	   By its very nature the information to be transported is relatively
517	   static and is specifically designed to be topologically insensitive.
518	   That is, every ITR is intended to have the same set of RLOCs for a
519	   given EID.  While some processing power will be necessary to install
520	   a table, the amount required should be far less than that of a
521	   routing information database because the level of entropy is intended
522	   to be lower.

524	   For purposes of this analysis, we will assume that the world has
525	   migrated to IPv6, as this increases the size of the database, which
526	   would be our primary concern.  However, to mitigate the size
527	   increase, we have limited the size of the prefix transmitted.  For
528	   purposes of this analysis, we shall assume an average prefix length
529	   of 64 bits.

531	   Based on that assumption, Section 3.1 states that mapping information
532	   for each EID/Prefix includes a group of RLOCs, each with an
533	   associated priority and weight, and that a minimum record size with
534	   IPv6 EIDs with at least one RLOC is 24 bytes uncompressed.  Each
535	   additional IPv6 RLOC costs 12 bytes (again, assuming an average
536	   prefix length of 64 bits).

538	                 +-----------+--------+--------+---------+
539	                 | 10^n EIDs | 2 RLOC | 4 RLOC |  8 RLOC |
540	                 +-----------+--------+--------+---------+
541	                 |         4 | 360 KB | 600 KB | 1.08 MB |
542	                 |         5 | 3.6 MB | 6.0 MB | 10.8 MB |
543	                 |         6 |  36 MB |  60 MB |  108 MB |
544	                 |         7 | 360 MB | 600 MB | 1.08 GB |
545	                 |         8 | 3.6 GB | 6.0 GB | 10.8 GB |
546	                 +-----------+--------+--------+---------+

548	    Database size for IPv6 routes with average prefix length = 64 bits

550	                                  Table 1

552	   Entries in the above table are derived as follows:

554	        E * (24 + 12 * (R - 1))

556	   where E = number of EIDs (10^n), R = number of RLOCs per EID.
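
	   As an illustration, the formula can be evaluated directly to
	   reproduce Table 1.  The function name below is hypothetical; the
	   constants 24 and 12 are the per-record and per-additional-RLOC
	   byte counts given above.

```python
def database_size_bytes(eids: int, rlocs_per_eid: int) -> int:
    """Uncompressed size in bytes: E * (24 + 12 * (R - 1))."""
    return eids * (24 + 12 * (rlocs_per_eid - 1))


if __name__ == "__main__":
    # One row per power of ten, one column per RLOC count, as in Table 1.
    for n in range(4, 9):
        sizes = [database_size_bytes(10 ** n, r) for r in (2, 4, 8)]
        print(n, sizes)
```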

558	   Our scaling target is to accommodate 10^8 multihomed systems, which
559	   is one order of magnitude greater than what is discussed in [10].  At
560	   10^8 entries, a device could be expected to use between 3.6 and 10.8
561	   gigabytes of RAM for the mapping.  No matter the method of
562	   distribution, any router that sits in the core of the Internet would
563	   require near this amount of memory in order to perform the ITR
564	   function.  Large enterprise ETRs would be similarly strained, simply
565	   due to the diversity of sites that communicate with one another.
566	   The good news is that this is not our starting point, but rather our
567	   scaling target, a number that we intend to reach by the year 2050.
568	   Our starting point is more likely in the neighborhood of 10^4 or 10^5
569	   EIDs, thus requiring between 360 KB and 10.8 MB.

571	5.2.  Router Throughput Versus Time

573	        +-------------------+---------+--------+---------+-------+
574	        | Table Size (10^N) |   1Mb/s | 10Mb/s | 100Mb/s | 1Gb/s |
575	        +-------------------+---------+--------+---------+-------+
576	        |                 6 |       8 |    0.8 |    0.08 | 0.008 |
577	        |                 7 |      80 |      8 |     0.8 |  0.08 |
578	        |                 8 |     800 |     80 |       8 |   0.8 |
579	        |                 9 |   8,000 |    800 |      80 |     8 |
580	        |                10 |  80,000 |  8,000 |     800 |    80 |
581	        |                11 | 800,000 | 80,000 |   8,000 |   800 |
582	        +-------------------+---------+--------+---------+-------+

584	             Number of seconds to process a NERD of 10^N bytes

586	                                  Table 2

588	   The length of time it takes to process the database is significant in
589	   models where the device acquires the entire table.  During this
590	   period of time, either the router will be unable to route packets
591	   using LISP or it must use some sort of query mechanism for specific
592	   EIDs as it populates the rest of its table through the transfer.
593	   Table 2 shows us that at our scaling target (about 10^10 bytes), a
594	   router using 1 Mb/s of bandwidth would need about 80,000 seconds,
595	   or roughly 22 hours.  We can measure the processing time in small
596	   numbers of hours or less for any transfer speed greater than that.
597	   At 1 Gb/s, the fastest rate shown, it takes 8 seconds to process an
598	   entire table of 10^9 bytes and 80 seconds for 10^10 bytes.
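
	   The entries in Table 2 follow from dividing the table size in bits
	   by the link rate.  A minimal sketch, with a hypothetical function
	   name and ignoring protocol overhead as the table does:

```python
def transfer_seconds(table_bytes: int, link_bits_per_second: int) -> float:
    """Seconds needed to move table_bytes over a fully used link."""
    return table_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    rates = (10 ** 6, 10 ** 7, 10 ** 8, 10 ** 9)  # 1 Mb/s up to 1 Gb/s
    for n in range(6, 12):
        print(n, [transfer_seconds(10 ** n, rate) for rate in rates])
```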

600	5.3.  Number of Servers Required

602	   As easy as it may be for a router to retrieve, the aggregate
603	   information may be difficult for servers to transmit, assuming the
604	   information is transmitted in aggregate (we'll revisit that
605	   assumption later).

607	   +----------------+------------+-----------+------------+------------+
608	   | # Simultaneous | 10 Servers |       100 |      1,000 |     10,000 |
609	   |       Requests |            |   Servers |    Servers |    Servers |
610	   +----------------+------------+-----------+------------+------------+
611	   |            100 |        480 |        48 |         48 |         48 |
612	   |          1,000 |      4,800 |       480 |         48 |         48 |
613	   |         10,000 |     48,000 |     4,800 |        480 |         48 |
614	   |        100,000 |    480,000 |    48,000 |      4,800 |        480 |
615	   |      1,000,000 |  4,800,000 |   480,000 |     48,000 |      4,800 |
616	   |     10,000,000 | 48,000,000 | 4,800,000 |    480,000 |     48,000 |
617	   +----------------+------------+-----------+------------+------------+

619	     Retrieval time per number of servers in seconds.  Assumes average
620	   10^8 entries with 4 RLOCs per EID and that each server has access to
621	    1 Gb/s and 100% efficient use of that bandwidth and no compression.

623	                                  Table 3

625	   Entries in the above table were generated using the following method:

627	   For 10^8 entries with four RLOCs per EID, the table size is 6.0GB,
628	   per our previous table.  Assume 1 Gb/s transfer rates and 100%
629	   utilization.  Protocol overhead is ignored for this exercise.  Hence
630	   a single transfer X takes 48 seconds and can get no faster.

632	   With this in mind, each entry is as follows:

634	            max(X, N*X/S)

636	     where N = number of transfers, X = 48 seconds,
637	     and S = number of servers.
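
	   The method can be expressed as a short function.  The names are
	   hypothetical; the 48-second figure is derived from the 6.0 GB
	   table and the 1 Gb/s rate assumed above.

```python
# Single transfer time X: 6.0 GB at 1 Gb/s with 100% utilization.
SINGLE_TRANSFER_SECONDS = 6_000_000_000 * 8 / 1_000_000_000  # 48.0


def retrieval_seconds(transfers: int, servers: int,
                      x: float = SINGLE_TRANSFER_SECONDS) -> float:
    """max(X, N*X/S): no transfer can finish faster than X seconds."""
    return max(x, transfers * x / servers)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000):
        row = [retrieval_seconds(n, s) for s in (10, 100, 1_000, 10_000)]
        print(n, row)
```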

639	   If we have a distribution model in which every device must retrieve
640	   the mapping information upon start, Table 3 shows the length of time
641	   in seconds it will take for a given number of servers to complete a
642	   transfer to a given number of devices.  This table says, as an
643	   example, that it would take 48,000 seconds (over 13 hours) for one
644	   million ITRs to simultaneously retrieve the database from one
645	   thousand servers.  Should a cold start scenario occur, this number
646	   should be of some concern.  Hence it is important to take some
647	   measures both to avoid such a scenario, and to ease the load should
648	   it occur.  The primary defense should be for ITRs to first attempt to
649	   retrieve their databases from their peers or upstream providers.
650	   Secondary defenses could include data sanity checks within ITRs, with
651	   agreed norms for how much the database should change in any given
652	   update or over any given period of time.  As we will see below,
653	   dissemination of changes is considerably less volume.

655	     +----------------+-------------+---------------+----------------+
656	     | % Daily Change | 100 Servers | 1,000 Servers | 10,000 Servers |
657	     +----------------+-------------+---------------+----------------+
658	     |           0.1% |         200 |            20 |              2 |
659	     |           0.5% |       1,000 |           100 |             10 |
660	     |             1% |       2,000 |           200 |             20 |
661	     |             5% |      10,000 |         1,000 |            100 |
662	     |            10% |      20,000 |         2,000 |            200 |
663	     +----------------+-------------+---------------+----------------+

665	     Assuming 10 million routers and a database size of 6GB, resulting
666	    hourly transfer times are shown in seconds, given number of servers
667	                         and daily rate of change.

669	                                  Table 4

671	   This table shows us that with 10,000 servers the average transfer
672	   time with 1Gb/s links for 10,000,000 routers will be 200 seconds with
673	   10% daily change spread over 24 hourly updates.  For a 0.1% daily
674	   change, that number is 2 seconds for a database of size 6.0GB.
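
	   Table 4 can be reproduced under the following assumptions, which
	   are our reading of the caption rather than a stated method: the
	   volume of change equals the daily change fraction times the
	   database size, it is spread evenly over 24 hourly updates, and
	   routers are divided evenly among servers with 1 Gb/s links.  The
	   function name is hypothetical.

```python
def hourly_update_seconds(daily_change_fraction: float,
                          servers: int,
                          routers: int = 10_000_000,
                          database_bytes: int = 6_000_000_000,
                          link_bits_per_second: int = 1_000_000_000) -> float:
    """Seconds per hour a server spends sending one hourly update."""
    # Assumed: changed bytes scale with database size, split over 24 hours.
    bytes_per_update = database_bytes * daily_change_fraction / 24
    seconds_per_router = bytes_per_update * 8 / link_bits_per_second
    # Assumed: each server sends to its share of routers one after another.
    return seconds_per_router * routers / servers


if __name__ == "__main__":
    for fraction in (0.001, 0.005, 0.01, 0.05, 0.10):
        row = [round(hourly_update_seconds(fraction, s))
               for s in (100, 1_000, 10_000)]
        print(fraction, row)
```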

676	   The amount of change goes to the purpose of LISP.  If its purpose is
677	   to provide effective multihoming support to end customers, then we
678	   might anticipate relatively random changes.  If, on the other
679	   hand, service providers attempt to make use of LISP to provide some
680	   form of traffic engineering, we can expect the same data to change
681	   more often.  We probably cannot conclude much in this regard without
682	   additional operational experience.  The one thing we can say is that
683	   different applications of the LISP protocol may require new and
684	   different distribution mechanisms.  Such optimization is left for
685	   another day.

687	5.4.  Security Considerations

689	   Whichever the answer to our previous question, we must consider the
690	   security of the information being transported.  If an attacker can
691	   forge an update or tamper with the database, he can in effect
692	   redirect traffic to end sites.  Hence, integrity and authenticity of
693	   the NERD are critical.  In addition, a means is required to
694	   determine whether a source is authorized to modify a given database.
695	   No data privacy is required.  Quite to the contrary, this
696	   information will be necessary for any ITR.

698	   The first question one must ask is who to trust to provide the ITR a
699	   mapping.  Ultimately the owner of the EID prefix is most
700	   authoritative for the mapping to RLOCs.  However, were all owners to
701	   sign all such mappings, ITRs would need to know which owner is
702	   authorized to modify which mapping, creating a problem of O(N^2)
703	   complexity.

705	   We can reduce this problem substantially by investing some trust in a
706	   small number of entities that are allowed to sign entries.  If an
707	   authority manages EIDs much the same way a domain name registrar
708	   handles domains, then the owner of the EID would choose a database
709	   authority she or he trusts, and ITRs must trust each such authority
710	   in order to map the EIDs listed by that authority to RLOCs.  This
711	   reduces the amount of management complexity on the ETR to retaining
712	   knowledge of O(#authorities), but does require that each authority
713	   establish procedures for authenticating the owner of an EID.  Those
714	   procedures needn't be the same.

716	   There are two classic methods to ensure integrity of data:

718	   o  secure transport of the data from the source to the consumer,
719	      such as Transport Layer Security (TLS) [8]; and
720	   o  object level security applied to the data itself.

722	   These methods are not mutually exclusive, although one can argue
723	   about the need for the former, given the latter.

725	   In the case of TLS, when it is properly implemented, the objects
726	   being transported cannot easily be modified by interlopers or so-
727	   called men in the middle.  When data objects are distributed to
728	   multiple servers, each of those servers must be trusted.  As we have
729	   seen above, we could have quite a large number of servers, thus
730	   providing an attacker a large number of targets.  We conclude that
731	   some form of object level security is required.

733	   Object level security involves an authority signing an object in a
734	   way that can easily be verified by a consumer, in this case a router.
735	   In this case, we would want the mapping table and any incremental
736	   update to be signed by the originator of the update.  This implies
737	   that we cannot simply make use of a tool like CVS [11].  Instead, the
738	   originator will want to generate diffs, sign them, and make them
739	   available either directly or through some sort of content
740	   distribution or peer to peer network.

742	5.4.1.  Use of Public Key Infrastructures (PKIs)

744	   X.509 provides a certificate hierarchy that has scaled to the size of
745	   the Internet.  The system is particularly manageable when there are
746	   fewer certificates to manage.  The model proposed in this memo makes
747	   use of one current certificate per database authority.  The three
748	   pieces of information necessary to verify a signature, therefore, are
749	   as follows:

751	   o  the certificate of the database authority, which can be provided
752	      along with the database;
753	   o  the certificate authority's certificate; and
754	   o  A table of database names and distinguished names (DNs) that are
755	      allowed to update them.

757	   The latter two pieces of information must be very well known and must
758	   be configured on each ITR.  It is expected that both would change
759	   very rarely, and it would not be unreasonable for such updates to
760	   occur as part of a normal OS release process.
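
	   The third item, the table of database names and the DNs allowed
	   to update them, might be held on an ITR as a simple lookup
	   structure.  The sketch below is illustrative only: the database
	   names and DNs are invented, DNs are compared as plain strings,
	   and no signature or certificate chain is verified here.

```python
# Hypothetical configured table: database name -> DNs allowed to update it.
AUTHORIZED_SIGNERS = {
    "arin": {"CN=Example ARIN NERD Authority,O=Example"},
    "ripe": {"CN=Example RIPE NERD Authority,O=Example"},
}


def is_authorized(database_name: str, signer_dn: str) -> bool:
    """Return True if signer_dn is listed for database_name.

    This check only covers authorization. Verifying the signature and
    the certificate chain is a separate step that is not shown.
    """
    return signer_dn in AUTHORIZED_SIGNERS.get(database_name, set())


if __name__ == "__main__":
    print(is_authorized("arin", "CN=Example ARIN NERD Authority,O=Example"))
    print(is_authorized("arin", "CN=Example RIPE NERD Authority,O=Example"))
```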

762	   The tools for both signing and verifying are readily available.
763	   Openssl [20] provides tools and libraries for both signing and
764	   verifying.  Other tools commonly exist.

766	   Use of PKIs is not without implementation, operational complexity or
767	   risk.  The following risks and mitigations are identified with NERD's
768	   use of PKIs:

770	   If a NERD database authority private key is exposed:

772	      In this case an attacker could sign a false database update,
773	      either redirecting traffic, or otherwise causing havoc.  In this
774	      case, the NERD database administrator must revoke its existing key
775	      and issue a new one.  The certificate is added to a certificate
776	      revocation list (CRL), which may be distributed with both this and
777	      other databases, as well as through other channels.  Because this
778	      event is expected to be rare, and the number of database
779	      authorities is expected to be small, a CRL will be small.  When a
780	      router receives a revocation, it checks it against its existing
781	      databases, and attempts to update the one that is revoked.  This
782	      implies that prior to issuing the revocation, the database
783	      authority MUST sign an update with the new key.  Routers SHOULD
784	      discard updates they have already received that were signed after
785	      the revocation was generated.  If a router cannot confirm
786	      whether the authority's certificate was revoked before or after a
787	      particular update, it MUST retrieve a fresh copy of the
788	      database with a valid signature.

790	   The private key associated with the CA that signed the Authority's
791	   certificate is compromised:

793	      In this case, it becomes possible for an attacker to masquerade as
794	      the database authority.  To ameliorate damage, the database
795	      authority SHOULD revoke its certificate and get a new certificate
796	      issued from a CA that is not compromised.  Once it has done so,
797	      the previous procedure is followed.  The compromised certificate
798	      can be removed during the normal operating system upgrade cycle.

800	   An algorithm used in either the certificate or the signature is
801	   cracked:

803	      This is a catastrophic failure and the above forms of attack
804	      become possible.  The only mitigation is to make use of a new
805	      algorithm.  In theory this should be possible, but in practice has
806	      proven very difficult.  For this reason, additional work is
807	      recommended to make alternative algorithms available.

809	   The Database Authority loses its key or disappears:

811	      In this case nobody can update the existing database.  There are
812	      few programmatic mitigations.  If the database authority places
813	      its private keys and suitable amounts of information in escrow,
814	      then under agreed-upon circumstances, such as no updates for three
815	      days, the escrow agent would release the information to a party
816	      capable of generating a database update.

818	5.4.2.  Other Risks

820	   Because this specification does not require secure transport, if an
821	   attacker prevents updates to an ITR for the purposes of having that
822	   ITR continue to use a compromised ETR, the ITR could continue to use
823	   an old version of the database without realizing a new version has
824	   been made available.  If one is worried about such an attack, a
825	   secure channel such as SSL, with a secure chain back to the
826	   database authority, should be used.  It is possible that after some
827	   operational experience, later versions of this format will contain
828	   additional semantics to address this attack.

830	   As discussed above, a substantial risk would be a cold start
831	   scenario.  If an attacker found a bug in a common operating system
832	   that allowed it to erase an ITR's database, and was able to
833	   disseminate that bug, the collective ability of ITRs to retrieve new
834	   copies of the database could be taxed by collective demand.  The
835	   remedy is for devices to share copies of the database with their
836	   neighbors, thus making each potential requestor a potential service.

838	6.  Why not use XML?

840	   Many objects these days are distributed as either XML pages or
841	   something derived from XML [17], such as SOAP [18],[19].  Use of such
842	   well known standards allows for high level tools and library reuse.
843	   Why not, then, use these standards in this case?  There are two
844	   answers to this question.  First, the obvious concern is that XML is
845	   not known for efficiency of data transport.  Being text based, each
846	   octet of an IPv4 address can expand to as many as three, plus either
847	   an attribute and quotes or element tags and end tags.  Let us presume
848	   for the moment a very simple schema that might cause a record to be
849	   represented as follows:

851	       <r e="10.1.1.0" m="24">
852	         <l w="10" p="15">
853	           <v4>
854	           192.168.1.1
855	           </v4>
856	         </l>
857	         <l w="5" p="15">
858	           <v4>
859	           192.168.1.2
860	           </v4>
861	         </l>
862	       </r>

864	   With white space removed the uncompressed XML represents 120 bytes
865	   versus 20 bytes for the record specified in Section 3.1, representing
866	   a five fold expansion.  That brings our 920MB database to 4.6GB.

868	   The other concern about XML is that version 1.0 of the specification
869	   is silent on the order of sibling elements.  Specifications other
870	   than the base specification state that order is significant.  Order
871	   is significant to LISP and NERD because once an update is applied to
872	   the database it should be possible to verify the signature of the
873	   entire database.  Prior to applying the signature the XML generator
874	   would need to ensure the order of information.  That same sort would
875	   be required of the router.  This seems to add unnecessary fragility
876	   to a critical system without much benefit.  While there may indeed be
877	   uses of an XML representation of the database, these uses are likely
878	   to be outside of a router.

880	7.  Other Distribution Mechanisms

882	   We now consider various different mechanisms.  The problem of
883	   distributing changes in various databases is as old as databases.
884	   The author is aware of two obvious approaches that have been well
885	   used in the past.  One approach would be the wide distribution of CVS
886	   repositories.  However, for reasons mentioned in the previous
887	   section, CVS is insufficient to the task.

889	   The other tried and true approach is the use of periodic updates in
890	   the form of messages.  Good old NNTP [13] itself provides two
891	   separate mechanisms (one push and another pull) to provide a coherent
892	   update process.  This was in fact used to update molecular biology
893	   databases [14] in the early 1990s.  Netnews offers a way to determine
894	   whether articles with specified Article-Ids have been received.  In
895	   the case where the mapping file source of authority wishes to
896	   transmit updates, it can sign a change file and then post it into the
897	   network.  Routers merely need to keep a record of article ids that
898	   they have received.  Initially this is probably overkill, but it may
899	   not be so later in this process.  Some consideration should be given
900	   to a mechanism known to widely distribute vast amounts of data, as
901	   instantaneously as either the sender or the receiver wishes.

903	   To attain an additional level of hierarchy in the distribution
904	   network, service providers could retrieve information to their own
905	   local servers, and configure their routers with the host portion of
906	   the above URI.

908	   Another possibility would be for providers to establish an agreement
909	   on a small set of anycast addresses for use for this purpose.  There
910	   are limitations to the use of anycast, particularly with TCP.  In the
911	   midst of a routing flap, an anycast address can become all but
912	   unusable.  Careful study of such a use as well as appropriate use of
913	   HTTP redirects is expected.

915	7.1.  What About DNS as a retrieval model?

917	   It has been proposed that a query/response mechanism be used for this
918	   information, and that specifically the domain name system (DNS) [16]
919	   be used.  The previous models do not preclude the DNS.  DNS has the
920	   advantage that the administrative lines are well drawn, and that the
921	   ID/RLOC mapping is likely to appear very close to these boundaries.
922	   DNS also has the added benefit that an entire distribution
923	   infrastructure already exists.  There are, however, some problems
924	   that could impact end hosts when intermediate routers make queries,
925	   some of which were first pointed out in [15]:

927	   o  Any query mechanism offers an opportunity for a resource attack if
928	      an attacker can force the ITR to query for information.  In this
929	      case, all that would be necessary would be for a "botnet" (a group
930	      of computers that have been compromised and used as vehicles to
931	      attack others) to ping or otherwise contact via some normal
932	      service hosts that sit behind the ETR.  If the botnet hosts
933	      themselves are behind ETRs, the victim's ITR will need to query
934	      for each and every one of them, thus becoming part of a classic
935	      reflector attack.
936	   o  Packets will be delayed at the very least, and probably dropped in
937	      the process of a mapping query.  This could be at the beginning of
938	      a communication, but it will be impossible for a router to
939	      conclude with certainty that this is the case.
940	   o  The DNS has a backoff algorithm that presumes that applications
941	      are making queries prior to the beginning of a communication.
942	      This is appropriate for end hosts who know in fact when a
943	      communication begins.  An end user may not enjoy a router waiting
944	      seconds for a retry.
945	   o  While the administrative lines may appear to be correct, the
946	      location of name servers may not be.  If name servers sit within
947	      PI address space, thus requiring LISP to reach, a circular
948	      dependency is created.  This is precisely where many enterprise
949	      name servers sit.  The LISP experiment should not predicate its
950	      success on relocation of such name servers.

952	   Nevertheless, DNS may be able to play a role in providing the
953	   enterprise control over the mapping of its EIDs to RLOCs.  Posit a
954	   new DNS record "EID2RLOC".  This record is used by the authority to
955	   collect and aggregate mapping information so that it may be
956	   distributed through one of the other mechanisms.  As an example:

958	      $ORIGIN 0.10.PI-SPACE.
959	       128   EID2RLOC   mask 23 priority 10 weight 5 172.16.5.60
960	             EID2RLOC   mask 23 priority 15 weight 5 192.168.1.5

962	   In the above figure network 10.0.128/23 would be delegated to some
963	   end system, say EXAMPLE.COM.  They would manage the above zone
964	   information.  This would allow a DNS mechanism to work, but it would
965	   also allow someone to aggregate the information and distribute a
966	   table.
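
	   To illustrate the aggregation step, the sketch below collects
	   records of the posited EID2RLOC form into table entries.  It is
	   not a DNS implementation.  It assumes that owner labels under
	   PI-SPACE are octets in reverse order (as the example implies for
	   10.0.128/23), that a record line without an owner inherits the
	   previous one, and that each record has exactly the fields shown
	   above.  The function name is hypothetical.

```python
def parse_eid2rloc(zone_text: str):
    """Return (prefix, priority, weight, rloc) tuples from zone text."""
    origin_labels = []  # octet labels of $ORIGIN, without "PI-SPACE"
    owner = None        # most recent owner label, e.g. "128"
    records = []
    for line in zone_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "$ORIGIN":
            # "0.10.PI-SPACE." -> ["0", "10"]
            origin_labels = fields[1].rstrip(".").split(".")[:-1]
            continue
        if fields[0] != "EID2RLOC":
            owner = fields[0]
            fields = fields[1:]
        # Expected: EID2RLOC mask M priority P weight W RLOC
        _, _, mask, _, priority, _, weight, rloc = fields
        # Assumed reverse-octet naming: 128 + 0.10 -> 10.0.128
        octets = list(reversed([owner] + origin_labels))
        octets += ["0"] * (4 - len(octets))
        prefix = ".".join(octets) + "/" + mask
        records.append((prefix, int(priority), int(weight), rloc))
    return records


EXAMPLE_ZONE = """
$ORIGIN 0.10.PI-SPACE.
 128   EID2RLOC   mask 23 priority 10 weight 5 172.16.5.60
       EID2RLOC   mask 23 priority 15 weight 5 192.168.1.5
"""

if __name__ == "__main__":
    for record in parse_eid2rloc(EXAMPLE_ZONE):
        print(record)
```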

968	7.1.1.  Perhaps use a hybrid model?

970	   It would be possible to use both a prepopulated database such as NERD
971	   and query mechanism (perhaps DNS) to determine an EID/RLOC mapping.
972	   The general idea would be to receive a subset of the mappings, say,
973	   by taking only the NERD for certain regions.  This alleviates the
974	   need to drop packets for some subset of destinations under the
975	   assumption that one's business is localized to a particular region.
976	   If one did not have a local entry for a particular EID one would then
977	   make a query.

979	   One improvement on simply using DNS to query live would be to
980	   periodically walk the entire network, in search of EID2RLOC records,
981	   and cache them to non-volatile storage.  This has two benefits.

983	   First, it prevents resource attacks.  Care has to be taken in how
984	   entries are cached, to avoid an attacker causing a performance
985	   degradation by attempting to exceed memory limits through a random
986	   source attack.

988	   Second, and as important as resisting attacks, having a complete or
989	   near complete copy of the database provides for a faster recovery
990	   time when a router goes out of service, for whatever reason.  Absent
991	   such a mechanism, devices would need to repopulate their local caches
992	   through the help of another system, leading to additional system
993	   fragility.

995	7.2.  Use of BGP

997	   Border Gateway Protocol (BGP) [9] is currently used to distribute
998	   inter-domain routing throughout the Internet.  Why not, then, use BGP
999	   to distribute the mapping table?  A simple answer is that the objects
1000	   BGP best handles are routes.  While it may be possible to transmit
1001	   EID/RLOC mappings instead (because they look an awful lot like
1002	   routes) the rate of updates of EID/RLOC mappings is specifically
1003	   intended to be considerably less than routes, and would probably
1004	   require additional dampening mechanisms to ensure that this is so.

1006	   In addition, the ownership of the mapping does not flow from service
1007	   providers but rather from end users of the identifiers.  It should
1008	   not be possible for anyone to filter the mapping, other than perhaps
1009	   ITRs for local policy purposes.  The current limited security model
1010	   for BGP does not fit the general requirements of how the mapping is
1011	   to be processed.

1013	   Furthermore, as BGP is currently the lifeblood of the Internet, its
1014	   use for any means other than routing should be strongly scrutinized.

1016	   This is not to say that BGP has no role to play whatsoever.  It may
1017	   well be possible for routers to exchange database version numbers and
1018	   perhaps base distribution URIs as extensions or capabilities.  This
1019	   would allow routers to serve their copy of the database to their
1020	   neighbors, easing the load off the rest of the server infrastructure.
1021	   How this would be done is future work.

1023	8.  Deployment Issues

1025	   While LISP and NERD are intended as experiments at this point, it is
1026	   already obvious one must give serious consideration to circular
1027	   dependencies with regard to the protocols used and the elements
1028	   within them.

1030	8.1.  HTTP

1032	   In Section 7.1 we have already seen how DNS can have circular
1033	   dependencies.  Inasmuch as HTTP depends on DNS, either due to the
1034	   authority section of a URI, or due to the configured base
1035	   distribution URI, these same concerns apply.  In addition, any HTTP
1036	   server that itself makes use of provider independent addresses would
1037	   be a poor choice to distribute the database for these exact same
1038	   reasons.

1040	   One issue with using HTTP is that it is possible that a middlebox of
1041	   some form, such as a cache, may intercept and process requests.  In
1042	   some cases this might be a good thing.  For instance, if a cache
1043	   correctly returns a database, some amount of bandwidth is conserved.
1044	   On the other hand, if the cache itself fails to function properly for
1045	   whatever reason, end to end connectivity could be impaired.  For
1046	   example, if the cache itself depended on the mapping being in place
1047	   and functional, a cold start scenario might leave the cache
1048	   functioning improperly, in turn providing routers no means to update
1049	   their databases.  Some care must be given to avoid such
1050	   circumstances.

1052	9.  Conclusions

1054	   This memo has specified a database format, an update format, a URI
1055	   convention, an update method, and a validation method for EID/RLOC
1056	   mappings.  We have shown that beyond the predictions of 10^7
1057	   locators, the aggregate database size would be at most 10.8GB.  We
1058	   have considered the number of servers needed to distribute that
1059	   information and we have demonstrated the limitations of a simple
1060	   content distribution network and other well known mechanisms.  The
1061	   effort required to retrieve a database change amounts to between 2
1062	   and 20 seconds of processing time per hour at today's gigabit
1063	   speeds.  We conclude that there is no need for an off box query
1064	   mechanism today, and that there are distinct disadvantages for
1065	   having such a mechanism in the control plane.

1067	   Beyond this we have examined alternatives that allow for hybrid
1068	   models that do use query mechanisms, should our operating assumptions
1069	   prove overly optimistic.  Use of NERD today does not foreclose use
1070	   of such models in the future, and in fact both models can happily
1071	   co-exist.

1073	   We leave to future work how the list of databases is distributed, how
1074	   BGP can play a role in distributing knowledge of the databases, and
1075	   how DNS can play a role in aggregating information into these
1076	   databases.

1078	   We also leave to future work whether HTTP is the best protocol for
1079	   the job, and whether the scheme described in this document is the
1080	   most efficient.  One could easily envision that when applied in high
1081	   delay or high loss environments, a broadcast or multicast method may
1082	   prove more effective.

1084	10.  IANA Considerations

1086	   This memo makes no requests of IANA.

1088	11.  Acknowledgments

1090	   Dino Farinacci, Patrik Faltstrom, Dave Meyer, Joel Halpern, Dave
1091	   Thaler, Mohamed Boucadair, and Max Pritikin were very helpful with
1092	   their reviews of this document.  The astute will notice a lengthy
1093	   References section.  This work stands on the shoulders of many
1094	   others' efforts.

1096	12.  References

1098	12.1.  Normative References

1100	   [1]   Farinacci, D., "Locator/ID Separation Protocol (LISP)",
1101	         draft-farinacci-lisp-03 (work in progress), August 2007.

1103	   [2]   Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
1104	         Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
1105	         HTTP/1.1", RFC 2616, June 1999.

1107	   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
1108	         Levels", BCP 14, RFC 2119, March 1997.

1110	   [4]   Kaliski, B., "PKCS #7: Cryptographic Message Syntax Version
1111	         1.5", RFC 2315, March 1998.

1113	   [5]   International Telecommunications Union, "Information technology
1114	         - Open Systems Interconnection - The Directory: Public-key and
1115	         attribute certificate frameworks", ITU-T Recommendation X.509,
1116	         ISO Standard 9594-8, March 2000.

1118	   [6]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1119	         Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
1120	         January 2005.

1122	   [7]   Moats, R., "URN Syntax", RFC 2141, May 1997.

1124	12.2.  Informational References

1126	   [8]   Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS)
1127	         Protocol Version 1.1", RFC 4346, April 2006.

1129	   [9]   Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4
1130	         (BGP-4)", RFC 4271, January 2006.

1132	   [10]  Carpenter, B., "IETF Plenary Presentation: Routing and
1133	         Addressing: Where we are today", March 2007.

1135	   [11]  Grune, R., Baalbergen, E., Waage, M., Berliner, B., and J.
1136	         Polk, "CVS: Concurrent Versions System", November 1985.

1138	   [12]  International Telephone and Telegraph
1139	         Consultative Committee, "Information Technology - Open Systems
1140	         Interconnection - The Directory: Authentication Framework",
1141	         CCITT Recommendation X.509, November 1988.

1143	   [13]  Kantor, B. and P. Lapsley, "Network News Transfer Protocol",
1144	         RFC 977, February 1986.

1146	   [14]  Smith, R., Gottesman, Y., Hobbs, B., Lear, E., Kristofferson,
1147	         D., Benton, D., and P. Smith, "A mechanism for maintaining an
1148	         up-to-date GenBank database via Usenet", CABIOS, April 1991.

1150	   [15]  Huitema, C., "An Experiment in DNS Based IP Routing", RFC 1383,
1151	         December 1992.

1153	   [16]  Mockapetris, P., "Domain names - concepts and facilities",
1154	         STD 13, RFC 1034, November 1987.

1156	   [17]  Bray, T., Paoli, J., Sperberg-McQueen, C., and E. Maler,
1157	         "Extensible Markup Language (XML) 1.0 (2nd ed)", W3C REC-xml,
1158	         October 2000, <http://www.w3.org/TR/REC-xml>.

1160	   [18]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1161	         Nielsen, "SOAP Version 1.2 Part 1: Messaging Framework", W3C
1162	         Working Draft soap12-part1, June 2002,
1163	         <http://www.w3.org/TR/soap12-part1>.

1165	   [19]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1166	         Nielsen, "SOAP Version 1.2 Part 2: Adjuncts", W3C Working
1167	         Draft soap12-part2, June 2002,
1168	         <http://www.w3.org/TR/soap12-part2>.

1170	URIs

1172	   [20]  <http://www.openssl.org>

1174	Appendix A.  Generating and verifying the database signature with
1175	             OpenSSL

1177	   As previously mentioned, one goal of NERD was to use off-the-shelf
1178	   tools to both generate and retrieve the database.  To many, PKI is
1179	   magic.  This section is meant to provide at least some clarification
1180	   as to both the generation and verification process, complete with
1181	   command line examples.  Not included is how you get the entries
1182	   themselves.  We'll assume they exist, and that you're just trying to
1183	   sign the database.

1185	   To sign the database, you first need a database file that has the
1186	   database header described in Section 3.  The block size should be
1187	   zero, and there should be no PKCS#7 block at this point.  You also
1188	   need a certificate and its private key with which you will sign the
1189	   database.
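
	   For experimentation, a throwaway key and self-signed certificate
	   can be generated with the OpenSSL "req" command.  This is only a
	   sketch: the file names match those used in this appendix, while
	   the subject name "nerd-test", the key size, and the 30-day
	   lifetime are arbitrary assumptions.  A production database would
	   presumably be signed with a certificate issued by an authority
	   that ITRs already trust.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create an unencrypted 2048-bit RSA key and a self-signed certificate.
# The subject name and lifetime are placeholders for testing only.
openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout yourcert.key -out yourcert.crt \
        -days 30 -subj "/CN=nerd-test" 2>/dev/null

# Display the subject and validity period of the new certificate.
openssl x509 -in yourcert.crt -noout -subject -dates
```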

1191	   The OpenSSL "smime" command contains all the functions we need from
1192	   this point forth.  To sign the database, issue the following command:

1194	         openssl smime -binary -sign -outform DER -signer yourcert.crt \
1195	                 -inkey yourcert.key -in database-file -out signature

1197	   -binary states that no MIME canonicalization should be performed.
1198	   -sign indicates that you are signing the file given as the argument
1199	   to -in.  The output format (-outform) is binary DER; your public
1200	   certificate is provided with -signer, and your private key with
1201	   -inkey.  The file that receives the signature is named with -out.

1203	   The resulting file "signature" is then copied into the PKCS#7 block in
1204	   the database header, its size in bytes is recorded in the PKCS#7
1205	   block size field, and the resulting file is ready for distribution to
1206	   ITRs.

1208	   To verify a database file, first retrieve the PKCS#7 block from the
1209	   file by copying the appropriate number of bytes into another file,
1210	   say "signature".  Then zero this field, and set the block size field
1211	   to 0.  Next use the "smime" command to verify the signature as
1212	   follows:

1214	       openssl smime -binary -verify -inform DER \
1215	               -content database-file -out /dev/null -in signature

1217	   OpenSSL will report "Verification successful" if the signature is
	   correct.
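
	   The signing and verification steps above can be exercised end to
	   end as sketched below.  The sketch rests on several assumptions:
	   a placeholder file stands in for a database whose PKCS#7 block
	   and block size field are zeroed, the signature is kept in a
	   separate file rather than spliced into the header, and a
	   throwaway self-signed certificate is used.  Because such a
	   certificate does not chain to a trusted root, it is handed to the
	   verifier with -CAfile; that option should be unnecessary when the
	   signer's certificate chains to an authority OpenSSL already
	   trusts.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Placeholder content; a real file would carry the Section 3 header.
printf 'placeholder NERD contents\n' > database-file

# Throwaway signing key and self-signed certificate (test use only).
openssl req -x509 -newkey rsa:2048 -nodes -keyout yourcert.key \
        -out yourcert.crt -days 30 -subj "/CN=nerd-test" 2>/dev/null

# Produce the detached DER signature, as in the signing step above.
openssl smime -binary -sign -outform DER -signer yourcert.crt \
        -inkey yourcert.key -in database-file -out signature

# Size in bytes that would be recorded in the block size field.
sigsize=$(wc -c < signature)
echo "signature size: $sigsize bytes"

# Verify against the untouched file; exit status 0 means success.
if openssl smime -binary -verify -inform DER -CAfile yourcert.crt \
        -content database-file -out /dev/null -in signature 2>/dev/null
then good=ok; else good=failed; fi

# Alter the file; the old signature should no longer verify.
printf 'tampered\n' >> database-file
if openssl smime -binary -verify -inform DER -CAfile yourcert.crt \
        -content database-file -out /dev/null -in signature 2>/dev/null
then bad=ok; else bad=failed; fi

echo "original: $good, modified: $bad"
```

	   Checking the exit status rather than the printed message keeps
	   the sketch independent of the exact wording a given OpenSSL
	   release emits.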

1219	   To improve verification performance, it would make sense to modify
1220	   the program so that it takes as input the database with a null
1221	   signature and as an argument the name of the file containing the
1222	   signature.  Better yet, use a call to the appropriate library with
1223	   each block.

1225	Appendix B.  Changes

1227	   This section to be removed prior to publication.

1229	   o  02: Incorporate some of Dave Thaler's comments.  Add
1230	      authentication block detail.  Modify analysis to take IPv6 into
1231	      account, along with a more realistic number of RLOCs per EID.  Add
1232	      some comments about potential risks of a cold start.  Add S/MIME
1233	      example as appendix A and take out old ToDo.  Provide some amount
1234	      of compression of IPv6 addresses by limiting their size to
1235	      significant bytes rounded to a four byte word boundary.
1236	   o  01: Massive spelling correction, URI example correction.
1237	   o  00: Initial Revision.

1239	Appendix C.  Open Questions

1241	   This section to be removed prior to publication.

1243	   o  Should the database contain its name?  It is probably sufficient
1244	      to merely reference the database by name.
1245	   o  Should the signature portion be separated from the actual
1246	      database?  By specifying the signature we hope to reduce
1247	      interoperability issues and encourage proper security from the get
1248	      go.  On the other hand, since the object is opaque it is not clear
1249	      how much interoperability we are actually encouraging.
1250	   o  Should we specify a (perhaps compressed) tarball that treads a
1251	      middle ground for the last question, where each update tarball
1252	      contains both a signature for the update and for the entire
1253	      database, once the update is applied?
1254	   o  Should we compress?  In some initial testing of databases with 1,
1255	      5, and 10 million IPv4 EIDs and a random distribution of IPv4
1256	      RLOCs, the current format in this document compresses down by a
1257	      factor of between 35% and 36% using the Burrows-Wheeler
1258	      block-sorting text compression algorithm (bzip2).  The NERD used
1259	      random EIDs with mask lengths varying from 19 to 29, with
1260	      probability weighted toward the smaller masks.  This only very
1261	      roughly reflects reality.  A better test would be to start with
1262	      the existing prefixes found in the DFZ.

1264	Author's Address

1266	   Eliot Lear
1267	   Cisco Systems GmbH
1268	   Glatt-com
1269	   Glattzentrum, ZH  CH-8301
1270	   Switzerland

1272	   Phone: +41 1 878 7525
1273	   Email: lear@cisco.com

1275	Full Copyright Statement

1277	   Copyright (C) The IETF Trust (2007).

1279	   This document is subject to the rights, licenses and restrictions
1280	   contained in BCP 78, and except as set forth therein, the authors
1281	   retain all their rights.

1283	   This document and the information contained herein are provided on an
1284	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1285	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1286	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1287	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1288	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1289	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1291	Intellectual Property

1293	   The IETF takes no position regarding the validity or scope of any
1294	   Intellectual Property Rights or other rights that might be claimed to
1295	   pertain to the implementation or use of the technology described in
1296	   this document or the extent to which any license under such rights
1297	   might or might not be available; nor does it represent that it has
1298	   made any independent effort to identify any such rights.  Information
1299	   on the procedures with respect to rights in RFC documents can be
1300	   found in BCP 78 and BCP 79.

1302	   Copies of IPR disclosures made to the IETF Secretariat and any
1303	   assurances of licenses to be made available, or the result of an
1304	   attempt made to obtain a general license or permission for the use of
1305	   such proprietary rights by implementers or users of this
1306	   specification can be obtained from the IETF on-line IPR repository at
1307	   http://www.ietf.org/ipr.

1309	   The IETF invites any interested party to bring to its attention any
1310	   copyrights, patents or patent applications, or other proprietary
1311	   rights that may cover technology that may be required to implement
1312	   this standard.  Please address the information to the IETF at
1313	   ietf-ipr@ietf.org.

1315	Acknowledgment

1317	   Funding for the RFC Editor function is provided by the IETF
1318	   Administrative Support Activity (IASA).