idnits 2.17.1 

draft-lear-lisp-nerd-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1298.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1309.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1316.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1322.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 4 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 23, 2008) is 5938 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-12) exists of
     draft-farinacci-lisp-03

  ** Obsolete normative reference: RFC 2616 (ref. '2') (Obsoleted by RFC
     7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  -- Obsolete informational reference (is this intentional?): RFC 4346 (ref.
     '7') (Obsoleted by RFC 5246)

  -- Obsolete informational reference (is this intentional?): RFC  977 (ref.
     '12') (Obsoleted by RFC 3977)


     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                            E. Lear
3	Internet-Draft                                        Cisco Systems GmbH
4	Intended status: Experimental                           January 23, 2008
5	Expires: July 26, 2008

7	               NERD: A Not-so-novel EID to RLOC Database
8	                      draft-lear-lisp-nerd-03.txt

10	Status of this Memo

12	   By submitting this Internet-Draft, each author represents that any
13	   applicable patent or other IPR claims of which he or she is aware
14	   have been or will be disclosed, and any of which he or she becomes
15	   aware will be disclosed, in accordance with Section 6 of BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on July 26, 2008.

35	Copyright Notice

37	   Copyright (C) The IETF Trust (2008).

39	Abstract

41	   LISP is a protocol to encapsulate IP packets in order to allow end
42	   sites to multihome without injecting routes from one end of the
43	   Internet to another.  This memo specifies a database and a method to
44	   transport the mapping of EIDs to RLOCs to routers in a reliable,
45	   scalable, and secure manner.  Our analysis concludes that transport
46	   of all EID/RLOC mappings scales well to at least 10^8 entries, and
47	   that use of DNS or any approach that queries for mappings has
48	   substantial operational concerns.

50	Table of Contents

52	   1.  Introduction
53	     1.1.  Base Assumptions
54	     1.2.  What is NERD?
55	     1.3.  Glossary
56	   2.  Theory of Operation
57	     2.1.  Who are database authorities?
58	   3.  NERD Format
59	     3.1.  NERD Record Format
60	     3.2.  Database Update Format
61	   4.  NERD Distribution Mechanism
62	     4.1.  Initial Bootstrap
63	     4.2.  Retrieving Changes
64	   5.  Analysis
65	     5.1.  Database Size
66	     5.2.  Router Throughput Versus Time
67	     5.3.  Number of Servers Required
68	     5.4.  Security Considerations
69	       5.4.1.  Use of Public Key Infrastructures (PKIs)
70	       5.4.2.  Other Risks
71	   6.  Why not use XML?
72	   7.  Other Distribution Mechanisms
73	     7.1.  What About DNS as a retrieval model?
74	       7.1.1.  Perhaps use a hybrid model?
75	     7.2.  Use of BGP
76	   8.  Deployment Issues
77	     8.1.  HTTP
78	   9.  Open Questions
79	   10. Conclusions
80	   11. IANA Considerations
81	   12. Acknowledgments
82	   13. References
83	     13.1. Normative References
84	     13.2. Informational References
85	   Appendix A.  Generating and verifying the database signature
86	                with OpenSSL
87	   Appendix B.  Changes
88	   Author's Address
89	   Intellectual Property and Copyright Statements

91	1.  Introduction

93	   Locator/ID Separation Protocol (LISP) [1] is a protocol whose primary
94	   purpose is to separate an IP address used by a host and local routing
95	   system from the locators advertised by BGP participants on the
96	   Internet in general, and in the default free zone (DFZ) in
97	   particular.  It accomplishes this by establishing a mapping between
98	   globally unique endpoint identifiers (EIDs) and routing locators
99	   (RLOCs) within the global routing table.  This reduces the amount of
100	   state change that occurs on routers within the default-free zone on
101	   the Internet, while enabling end sites to be multihomed.

103	   In early stages of LISP (1 and 1.5) the mapping is either configured
104	   into a device or it is learned via data-triggered control messages
105	   between ingress tunnel routers (ITRs) and egress tunnel routers
106	   (ETRs) under the assumption that during transition, EIDs will be
107	   present within the global routing system, as they are today.

109	   In later stages of LISP, the assumption will be that EIDs are not
110	   contained within the global routing system, but that instead the
111	   mapping from EIDs to RLOCs will be learned through some other means.
112	   This memo addresses different approaches to the problem, and
113	   specifies a Not-so-novel EID RLOC Database (NERD) and methods to both
114	   receive the database and to receive updates.

116	   LISP and NERD are both currently experimental protocols.  The NERD
117	   database is specified in such a way that the methods used to
118	   distribute or retrieve it may vary over time.  Multiple databases are
119	   supported in order to allow for multiple data sources.  An effort has
120	   been made to divorce the database from access methods so that both
121	   can evolve independently through experimentation and operational
122	   validation.

124	1.1.  Base Assumptions

126	   In order to specify a mapping it is important to understand how it
127	   will be used, and the nature of the data being mapped.  In the case
128	   of LISP, the following assumptions are pertinent:

130	   o  The data contained within the mapping changes only on provisioning
131	      or configuration operations, and is not intended to change when a
132	      link either fails or is restored.  Some other mechanism such as
133	      the use of LISP Reachability Bits with mapping replies handles
134	      healing operations, particularly when a tail circuit within a
135	      service provider's aggregate goes down.  NERD can be used as a
136	      verification method to ensure that whatever operational mapping
137	      changes an ITR receives are authorized.

139	   o  While weight and priority are defined, these are not hop-by-hop
140	      metrics.  Hence the information contained within the mapping does
141	      not change based on where one sits within the topology.
142	   o  The purpose of LISP being to reduce control plane overhead by
143	      reducing "rate X state" complexity, updates to the mapping will be
144	      relatively rare.
145	   o  Because LISP and NERD are designed to ease interdomain routing,
146	      their use is intended within the inter-domain environment.  That
147	      is, LISP is best implemented at either the customer edge or
148	      provider edge, and there will be on the order of as many ITRs and
149	      LISP announcements as there are connections to Internet Service
150	      Providers by end customers.
151	   o  As such, LISP and NERD cannot be the sole means to implement host
152	      mobility, although they may be used in conjunction with other
153	      mechanisms.  For instance, it would be possible for a mobile node
154	      to receive a local address that is an EID and pass that to the
155	      correspondent node, which could also make use of an EID.  As such,
156	      use of LISP in this case would be transparent, and no mapping
157	      entries would change for mobility.
158	   o  There is no interaction with the interior gateway protocol (IGP).

160	1.2.  What is NERD?

162	   NERD is a Not-so-novel EID to RLOC Database.  It consists of the
163	   following components:

165	   1.  a network database format;
166	   2.  a change distribution format;
167	   3.  a database retrieval/bootstrapping method;
168	   4.  a change distribution method.

170	   The network database format is compressible.  However, at this time
171	   we specify no compression method.  NERD will make use of potentially
172	   several transport methods, but most notably HTTP [2].  HTTP has
173	   restart and compression capabilities.  It is also widely deployed.

175	   There exist many methods to show differences between two versions of
176	   a database or a file, UNIX's "diff" being the classic example.  In
177	   this case, because the data is well structured and easily keyed, we
178	   can make use of a very simple format for version differences that
179	   simply provides a list of EID/RLOC mappings that have changed using
180	   the same record format as the database, and a list of EIDs that are
181	   to be removed.

183	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
184	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
185	   document are to be interpreted as described in RFC 2119 [3].

187	1.3.  Glossary

189	   The reader is once again referred to [1] for a general glossary of
190	   terms related to LISP.  The following terms are specific to this
191	   memo.

193	   Base Distribution URI:  An Absolute-URI as defined in Section 4.3 of
194	      [6] from which other references are relative.  The base
195	      distribution URI is used to construct a URI to an EID/RLOC mapping
196	      database.  If more than one NERD is known then there will be one
197	      or more base distribution URIs associated with each (although each
198	      such base distribution URI may have the same value).

200	   EID Database Authority:  The authority that will sign database files
201	      and updates.  It is the source of both.

203	   The Authority:  Shorthand for the EID Database Authority.

205	   NERD:  (N)ot-so-novel (E)ID to (R)LOC (D)atabase.

207	   AFI:  Address Family Identifier.

209	   Pull Model:  An architecture where clients pull only the information
210	      they need at any given time, such as when a packet arrives for
211	      forwarding.

213	   Push Model:  An architecture in which clients receive an entire
214	      dataset, containing data they may or may not require, such as
215	      mappings for EIDs to which no served host is attempting to send.

217	   Hybrid Model:  An architecture in which clients receive a subset of
218	      the entire dataset and query as needed for the rest.

220	2.  Theory of Operation

222	   What follows is a summary of how NERDs are generated and updated.
223	   Specifics can be found in Section 3.  The general way in which NERD
224	   works is as follows:

226	   1.  A NERD is generated by an authority that allocates provider
227	       independent (PI) addresses (e.g., IANA or an RIR) which are used
228	       by sites as EIDs.  As part of this process the authority
229	       generates a digest for the database and signs it with a private
230	       key whose public key is part of an X.509 certificate [11].  That
231	       signature along with a copy of the authority's public key is
232	       included in the NERD.
233	   2.  The NERD is distributed to a group of well known servers.
234	   3.  ITRs retrieve an initial copy of the NERD via HTTP when they come
235	       into service.
236	   4.  ITRs are preconfigured with a group of certificates whose private
237	       keys are used by database authorities to sign the NERD.  This
238	       list of certificates should be configurable by administrators.
239	   5.  ITRs next verify both the validity of the public key and the
240	       signed digest.  If either fails validation, the ITR attempts to
241	       retrieve the NERD from a different source.  The process iterates
242	       until either a valid database is found or the list of sources is
243	       exhausted.
244	   6.  Once a valid NERD is retrieved, the ITR installs it into both
245	       non-volatile and local memory.
246	   7.  At some point the authority updates the NERD and increments the
247	       database version counter.  At the same time it generates a list
248	       of changes, which it also signs, as it does with the original
249	       database.
250	   8.  Periodically ITRs will poll from their list of servers to
251	       determine if a new version of the database exists.  When a new
252	       version is found, an ITR will attempt to retrieve a change file,
253	       using its list of preconfigured servers.
254	   9.  The ITR validates a change file just as it does the original
255	       database.  Assuming the change file passes validation, the ITR
256	       installs new entries, overwrites existing ones, and removes empty
257	       entries, based on the content of the change file.
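
   As an illustration only, steps 3 through 6 above can be sketched in
   Python as follows.  The callables "fetch" and "verify" are
   hypothetical stand-ins for HTTP retrieval and for validation of the
   public key and signed digest; this memo defines neither, and the
   function name is likewise an assumption.

```python
from typing import Callable, Iterable, Optional


def bootstrap_from_sources(
    sources: Iterable[str],
    fetch: Callable[[str], bytes],
    verify: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first database copy that validates, or None."""
    for uri in sources:
        try:
            candidate = fetch(uri)  # hypothetical retrieval callable
        except OSError:
            continue  # source unreachable: try the next one
        if verify(candidate):  # hypothetical key and digest check
            return candidate  # the caller installs this copy
    return None  # the list of sources is exhausted
```

   A caller would install the returned copy into both non-volatile and
   local memory, and would treat None as a failed bootstrap.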

259	   As time goes on it is quite possible that an ITR may probe a list of
260	   configured neighbors for a database or change file copy.  It is
261	   equally possible that neighbors might advertise to each other the
262	   version number of their database.  Such methods are not explored in
263	   depth in this memo, but are mentioned for future consideration.

265	2.1.  Who are database authorities?

267	   This memo does not specify who the database authority is.  That is
268	   because there are several possible operational models.  In each case
269	   the number of database authorities is meant to be small so that ITRs
270	   need only keep a small list of authorities, similar to the way a name
271	   server might cache a list of root servers.

273	   o  A single database authority exists.  In this case all entries in
274	      the database are registered to a single entity, and that entity
275	      distributes the database.  Because the EID space is provider
276	      independent address space, there is no architectural requirement
277	      that address space be hierarchically distributed to anyone, as
278	      there is with provider-assigned address space.  Hence, there is a
279	      natural affinity between the IANA function and the database
280	      authority function.
281	   o  Each region runs a database authority.  In this case, provider
282	      independent address space is allocated to either regional internet
283	      registries or to affiliates of such organizations of network
284	      operations guilds (NOGs).  The benefit of this approach is that
285	      there is no single organization that controls the database.  It
286	      allows one database authority to back up another.  One could
287	      envision as many as ten database authorities in this scenario.
288	   o  Each country runs a database authority.  This could occur should
289	      countries decide to regulate this function.  While limiting the
290	      scope of any single database authority as the previous scenario
291	      describes, this approach would introduce some overhead as the list
292	      of database authorities would grow to as many as 200, and possibly
293	      more if jurisdictions within countries attempted to regulate the
294	      function.

296	   As the number of authorities increases the amount of change on that
297	   list will also increase, requiring both an update mechanism and the
298	   potential need for a discovery mechanism, both of which would be the
299	   subject of future work (i.e., not to be found in this memo).  For
300	   this reason alone, as a starting point two database authorities are
301	   recommended, but their selection is left for others.

303	3.  NERD Format

305	   The NERD consists of a header that contains a database version and a
306	   signature that is generated by ignoring the signature field and
307	   setting the authentication block length to 0 (NULL).  The
308	   authentication block itself consists of a signature and a certificate
309	   whose private key counterpart was used to generate the signature.
310	   The exact format of the authentication block is TBD.

312	   Records are kept sorted in numeric order with AFI plus EID as primary
313	   key and mask length as secondary.  This is so that after a database
314	   update it should be possible to reconstruct the database to verify
315	   the digest signature, which may be retrieved separately from the
316	   database for verification purposes.
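
   One possible reading of this ordering is sketched below in Python.
   It assumes that the AFI is compared first, then the EID as an
   unsigned integer, then the mask length; the tuple layout and the
   function name are illustrative and not part of the format.

```python
import ipaddress


def sort_key(record):
    """Order by AFI, then numeric EID, then mask length (assumed)."""
    afi, eid_text, mask_len = record
    return (afi, int(ipaddress.ip_address(eid_text)), mask_len)


records = [
    (2, "2001:db8::", 32),
    (1, "192.0.2.0", 25),
    (1, "192.0.2.0", 24),
    (1, "198.51.100.0", 24),
]
ordered = sorted(records, key=sort_key)
```

   Because the order is deterministic, two parties that hold the same
   set of records can serialize them identically before checking the
   digest.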

318	        0                   1                   2                   3
319	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
320	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
321	       | Schema Vers=1 |  DB Code      |     Database Name Size        |
322	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
323	       |                      Database Version                         |
324	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
325	       |                   Old Database Version or 0                   |
326	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
327	       |                                                               |
328	       |                        Database Name                          |
329	       |                                                               |
330	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
331	       |       PKCS#7 Block Size       |          Reserved             |
332	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
333	       |                                                               |
334	       |      PKCS#7 Block containing Certificate and Signature        |
335	       |                                                               |
336	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

338	   Database Header

340	   The DB Code is 0 if what follows is an entire database or 1 if
341	   what follows is an update.  The database file version is incremented
342	   each time the complete database is generated by the authority.  In
343	   the case of an update, the database file version indicates the new
344	   database file version, and the old database file version is indicated
345	   in the "Old Database Version" field.  The database file version is used by
346	   routers to determine whether or not they have the most current
347	   database.

349	   The database name is a domain name.  This is the name that will
350	   appear in the Subject field of the certificate used to verify the
351	   database.  The purpose of the database name is to allow for more than
352	   one database.  Such databases would be merged by the router.  It is
353	   important that an EID/RLOC mapping be listed in no more than one
354	   database, lest inconsistencies arise.  However, it may be possible to
355	   transition a mapping from one database to another.  During the
356	   transition period, the mappings MUST be identical.  When they are
357	   not, the resultant behavior will be undefined.

359	   The PKCS#7 [4] authentication block contains a DER encoded [5]
360	   signature and associated public key.
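
   The header layout can be read with a few lines of Python, shown here
   as a sketch rather than a reference parser.  Several points are
   assumptions that the text above does not settle: fields are taken to
   be in network byte order, both size fields are taken to count bytes,
   and the database name is taken to be unpadded ASCII.  The class and
   function names are illustrative.

```python
import struct
from typing import NamedTuple


class NerdHeader(NamedTuple):
    schema_version: int
    db_code: int         # 0 = entire database, 1 = update
    db_version: int
    old_db_version: int  # 0 when this is not an update
    db_name: str
    pkcs7_block: bytes
    records_offset: int  # offset of the first record in the buffer


def parse_header(buf: bytes) -> NerdHeader:
    # Fixed part: schema (8 bits), DB code (8), name size (16),
    # database version (32), old database version (32) = 12 bytes.
    schema, code, name_size, version, old_version = struct.unpack_from(
        "!BBHII", buf, 0)
    offset = 12
    name = buf[offset:offset + name_size].decode("ascii")
    offset += name_size
    # PKCS#7 block size (16 bits) followed by 16 reserved bits.
    pkcs7_size, _reserved = struct.unpack_from("!HH", buf, offset)
    offset += 4
    block = buf[offset:offset + pkcs7_size]
    offset += pkcs7_size
    return NerdHeader(schema, code, version, old_version, name, block, offset)
```

   Verification of the PKCS#7 block itself is out of scope for this
   sketch.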

362	3.1.  NERD Record Format

364	   As distributed over the network, NERD records appear as follows:

366	        0                   1                   2                   3
367	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
368	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
369	       | Num. RLOCs    | EID Mask Len  |            EID AFI            |
370	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
371	       |                       End point identifier                    |
372	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
373	       | Priority 1    |    Weight 1   |             AFI 1             |
374	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
375	       |                       Routing Locator 1                       |
376	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
377	       | Priority 2    |    Weight 2   |             AFI 2             |
378	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
379	       |                       Routing Locator 2                       |
380	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
381	       | Priority 3    |    Weight 3   |             AFI 3             |
382	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
383	       |                       Routing Locator 3...                    |
384	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

386	   Priority N, Weight N, and AFI N are associated with Routing
387	   Locator N.  There will always be at least one routing locator.  The
388	   minimum record size for IPv4 is 16 bytes.  Each additional IPv4 RLOC
389	   increases the record size by 8 bytes.  The purpose of this format is
390	   to keep the database compact, but somewhat easily read.  The meaning
391	   of weight and priority are described in [1].  The format of the AFI
392	   is specified by IANA as "Address Family Numbers", with the exception
393	   of how IPv6 addresses are stored.
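
   To make the IPv4 sizes concrete, the following Python sketch packs a
   record as laid out above.  Network byte order is an assumption, and
   the function name and the tuple form of each RLOC are illustrative;
   the AFI value 1 is the IANA "Address Family Numbers" value for IPv4.

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA "Address Family Numbers" value for IPv4


def pack_ipv4_record(eid: str, mask_len: int, rlocs) -> bytes:
    """Pack one record; rlocs is a list of (priority, weight, rloc)."""
    # Record head: Num. RLOCs, EID Mask Len, EID AFI, then the EID.
    out = struct.pack("!BBH4s", len(rlocs), mask_len, AFI_IPV4,
                      ipaddress.IPv4Address(eid).packed)
    # One 8-byte block per RLOC: priority, weight, AFI, locator.
    for priority, weight, rloc in rlocs:
        out += struct.pack("!BBH4s", priority, weight, AFI_IPV4,
                           ipaddress.IPv4Address(rloc).packed)
    return out
```

   With one RLOC the result is 16 bytes long, and each further RLOC adds
   8 bytes, which matches the sizes given above.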

395	   In order to reduce storage and transmission amounts for IPv6, only
396	   the necessary number of bytes as specified by the prefix length are
397	   kept in the record, rounded up to the next four-byte (word) boundary.
398	   This is true for both EIDs and RLOCs.  For instance, if the prefix
399	   length is /49, seven bytes are needed, and rounding up to the next
400	   four-byte word boundary requires that eight bytes be stored.
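
   The rule can be expressed as a one-line computation.  The Python
   sketch below assumes that rounding is always upward, as the /49
   example implies; the function name is illustrative.

```python
def stored_prefix_bytes(prefix_len: int) -> int:
    """Bytes kept for an IPv6 prefix, rounded up to a 4-byte word."""
    if not 0 <= prefix_len <= 128:
        raise ValueError("IPv6 prefix length must be between 0 and 128")
    # Count whole 32-bit words needed to hold prefix_len bits.
    return ((prefix_len + 31) // 32) * 4
```

   For a /49 this yields eight bytes, and for a /64 it also yields eight
   bytes, which is the figure used in the analysis of Section 5.1.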

402	3.2.  Database Update Format

404	   A database update contains a set of changes to an existing database.
405	   Each AFI/EID/mask-length tuple may have zero or more RLOCs associated
406	   with it.  In the case where there are no RLOCs, the EID entry is
407	   removed from the database.  Records that contain EIDs and mask
408	   lengths that were not previously listed are simply added.  Otherwise,
409	   the old record for the EID and mask length is replaced by the more
410	   current information.  The record format used by a database update
411	   is the same as described in Section 3.1.
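
   The three cases can be sketched in Python as follows.  The in-memory
   representation (a dictionary keyed by AFI, EID, and mask length) and
   the function name are assumptions made for illustration; the memo
   specifies only the wire format.

```python
def apply_update(database: dict, changes: list) -> dict:
    """Apply (key, rlocs) changes; key is (afi, eid, mask_len)."""
    updated = dict(database)
    for key, rlocs in changes:
        if not rlocs:
            # No RLOCs: the EID entry is removed from the database.
            updated.pop(key, None)
        else:
            # Unknown key: added.  Known key: replaced.
            updated[key] = list(rlocs)
    return updated
```

   The original dictionary is left untouched, so a router could discard
   the result if a later validation step fails.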

413	4.  NERD Distribution Mechanism

415	4.1.  Initial Bootstrap

417	   Bootstrap occurs when a router needs to retrieve the entire database.
418	   It knows it needs to retrieve the entire database either because it
419	   has none or because a pending update is too substantial to process,
420	   as might be the case if a router has been out of service for a
421	   lengthy period of time.

423	   To bootstrap, the ITR appends the database name plus
424	   "/current/entiredb" to a Base Distribution URI and retrieves the
425	   file via HTTP.  For example, if the configured URI is
426	   "http://www.example.com/eiddb/", and assuming a database name of
427	   "nerd.arin.net", the ITR would request
428	   "http://www.example.com/eiddb/nerd.arin.net/current/entiredb".
429	   Routers MUST check the signature on the database prior to installing
430	   it, and MUST check that the database schema matches a schema they
431	   understand.  Once a router has a valid database it MUST store that
432	   database in some sort of non-volatile memory (e.g., disk or flash
433	   memory).

435	   N.B., the host component for such URIs MUST NOT resolve to a LISP
436	   EID, lest a circular dependency be created.
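
   The construction of the bootstrap URI can be sketched in Python.
   This follows the appending rule as worded in the text (database name,
   then "/current/entiredb"); the function name and the handling of a
   missing trailing slash are assumptions.

```python
def entire_db_uri(base_uri: str, db_name: str) -> str:
    """Build the URI of the complete database for one database name."""
    if not base_uri.endswith("/"):
        base_uri += "/"  # assumed: tolerate a base URI without a slash
    return base_uri + db_name + "/current/entiredb"


uri = entire_db_uri("http://www.example.com/eiddb/", "nerd.arin.net")
```

   A real implementation would also confirm that the host component does
   not resolve to a LISP EID, which this sketch does not attempt.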

438	4.2.  Retrieving Changes

440	   In order to retrieve a set of database changes an ITR will have
441	   previously retrieved the entire database.  Hence it knows the current
442	   version of the database it has.  Its first step for retrieving
443	   changes is to retrieve the current version number of the database.
444	   It does so by appending "current/version" to the base distribution
445	   URI and retrieving the file.  Its format is text and it contains the
446	   integer value of the current database version.

448	   Once an ITR has retrieved the current version it compares it to that
449	   of its local copy.  If there is no difference, then the router is up
450	   to date and need take no further actions until it next checks.

452	   If the versions differ, the router next sends a request for the
453	   appropriate change file by appending "current/changes/" and the
454	   textual representation of the version of its local copy of the
455	   database to the base distribution URI.  For example, if the current
456	   version of the database is 1105503 and the router's version is 1105500,
457	   and the base URI and database name are the same as above, the router
458	   would request
459	   "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105500".
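
   The comparison and the resulting request can be sketched in Python.
   The URI layout follows the example above, in which the database name
   precedes "current/changes/"; the function names are illustrative, and
   the version file is assumed to hold a decimal integer with optional
   surrounding whitespace.

```python
from typing import Optional


def changes_uri(base_uri: str, db_name: str, local_version: int) -> str:
    """URI of the change file starting at the router's own version."""
    if not base_uri.endswith("/"):
        base_uri += "/"
    return f"{base_uri}{db_name}/current/changes/{local_version}"


def next_request(base_uri: str, db_name: str, local_version: int,
                 version_file_text: str) -> Optional[str]:
    """Return the change-file URI to request, or None if up to date."""
    current = int(version_file_text.strip())
    if current == local_version:
        return None  # nothing to do until the next poll
    return changes_uri(base_uri, db_name, local_version)
```

   Note that the version named in the URI is the one the router already
   holds, not the current one.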

461	   The server may not have that change file, either because there are
462	   too many versions between what the router has and what is current, or
463	   because no such change file was generated.  If the server has changes
464	   from the router's version to any later version, the server SHOULD
465	   issue an HTTP redirect to that change file, and the router SHOULD
466	   retrieve and process it.  Once it has done so, the router should then
467	   repeat the process until it has brought itself up to date.  It is
468	   thus important for servers to expire old change files in the order in
469	   which they were generated.

471	   By way of convention, it is suggested that the URIs issued in
472	   redirects be of the following form:

474	   {base dist.  URI}/{dbname}/{more-recent-version}/changes/
475	   {older-version}

477	   where "base dist.  URI" is the base distribution URI, "dbname" is the
478	   name of the database, and each version is the textual representation
479	   of the integer version value.

481	   For example, if the current database version was 1105503 and a router
482	   made a request for
483	   "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105400"
484	   but there was no change file from 1105400 to 1105503, and the server
485	   had a group of change files to make the router current, it would
486	   issue a redirect to
487	   "http://www.example.com/eiddb/nerd.arin.net/1105450/changes/1105400"
488	   that the router would then process.  The router would then make a
489	   request for
490	   "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105450"
491	   that the server would have.
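
   The repeated retrieval can be modelled without any network code.  In
   the Python sketch below, the dictionary "change_files" is a
   hypothetical stand-in for the server: it maps an older version to the
   newer version that the available change file (or redirect) leads to.
   The version numbers in the usage lines, including the intermediate
   version, are illustrative.

```python
def catch_up(local_version: int, current_version: int,
             change_files: dict) -> list:
    """Return the (newer, older) change files applied, in order."""
    applied = []
    while local_version != current_version:
        newer = change_files.get(local_version)
        if newer is None:
            # No path forward from this version on the server.
            raise LookupError("no change file; retrieve entire database")
        applied.append((newer, local_version))
        local_version = newer  # now repeat from the newer version
    return applied


steps = catch_up(1105400, 1105503, {1105400: 1105450, 1105450: 1105503})
```

   If the server has already expired the change file a router needs,
   the sketch raises an error, which corresponds to falling back to the
   bootstrap procedure of Section 4.1.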

493	   While it is unlikely that database versions would wrap, as they
494	   consist of 32-bit integers, should the event occur, ITRs MUST
495	   attempt first to retrieve a change file when their current version
496	   number is within 10,000 of 2^32 and they see a version available that
497	   is less than 10,000.  Barring the availability of a change file, the
498	   ITR MUST still assume that the database version has wrapped and
499	   retrieve a new copy.
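
   The wrap test can be written as a small predicate.  The Python sketch
   below treats "within 10,000 of 2^32" as inclusive of the boundary,
   which is an assumption; the constant and function names are
   illustrative.

```python
WRAP_WINDOW = 10_000
MODULUS = 2 ** 32


def version_has_wrapped(local_version: int, available_version: int) -> bool:
    """True when the local version is near 2^32 and the new one is small."""
    near_top = MODULUS - local_version <= WRAP_WINDOW  # assumed inclusive
    near_zero = available_version < WRAP_WINDOW
    return near_top and near_zero
```

   When the predicate holds, the ITR first tries a change file and, if
   none is available, retrieves a new copy of the database.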

501	5.  Analysis

503	   We will start our analysis by looking at how much data will be
504	   transferred to a router during bootstrap conditions.  We will then
505	   look at the bandwidth required.  Next we will turn our concerns to
506	   servers.  Finally we will ponder the effect of providing only
507	   changes.

509	   In the analysis below we treat the overhead of the database header as
510	   insignificant (because it is).  The analysis should be similar,
511	   whether a single database or multiple databases are employed, as we
512	   would assume that no entry would appear more than once.

514	5.1.  Database Size

516	   By its very nature the information to be transported is relatively
517	   static and is specifically designed to be topologically insensitive.
518	   That is, every ITR is intended to have the same set of RLOCs for a
519	   given EID.  While some processing power will be necessary to install
520	   a table, the amount required should be far less than that of a
521	   routing information database because the level of entropy is intended
522	   to be lower.

524	   For purposes of this analysis, we will assume that the world has
525	   migrated to IPv6, as this increases the size of the database, which
526	   would be our primary concern.  However, to mitigate the size
527	   increase, we have limited the size of the prefix transmitted.  For
528	   purposes of this analysis, we shall assume an average prefix length
529	   of 64 bits.

531	   Based on that assumption, Section 3.1 states that mapping information
532	   for each EID/Prefix includes a group of RLOCs, each with an
533	   associated priority and weight, and that a minimum record size with
534	   IPv6 EIDs with at least one RLOC is 24 bytes uncompressed.  Each
535	   additional IPv6 RLOC costs 12 bytes (again, assuming an average
536	   prefix length of 64 bits).

538	                 +-----------+--------+--------+---------+
539	                 | 10^n EIDs | 2 RLOC | 4 RLOC |  8 RLOC |
540	                 +-----------+--------+--------+---------+
541	                 |         4 | 360 KB | 600 KB | 1.08 MB |
542	                 |         5 | 3.6 MB | 6.0 MB | 10.8 MB |
543	                 |         6 |  36 MB |  60 MB |  108 MB |
544	                 |         7 | 360 MB | 600 MB | 1.08 GB |
545	                 |         8 | 3.6 GB | 6.0 GB | 10.8 GB |
546	                 +-----------+--------+--------+---------+

548	    Database size for IPv6 routes with average prefix length = 64 bits

550	                                  Table 1

552	   Entries in the above table are derived as follows:

554	        E * (24 + 12 * (R - 1))

556	   where E = number of EIDs (10^n), R = number of RLOCs per EID.
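
	   The formula can be checked against Table 1 with a short Python
	   sketch.  The function name is ours, and sizes are printed in bytes
	   rather than in the rounded units used in the table.

```python
def database_size_bytes(num_eids: int, rlocs_per_eid: int) -> int:
    """E * (24 + 12 * (R - 1)): 24 bytes for an IPv6 EID with one RLOC,
    plus 12 bytes for each additional RLOC."""
    return num_eids * (24 + 12 * (rlocs_per_eid - 1))


# One row per power of ten, one column per RLOC count, as in Table 1.
for n in range(4, 9):
    sizes = [database_size_bytes(10 ** n, r) for r in (2, 4, 8)]
    print(f"10^{n} EIDs:", sizes)
```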

558	   Our scaling target is to accommodate 10^8 multihomed systems, which
559	   is one order of magnitude greater than what is discussed in [9].  At
560	   10^8 entries, a device could be expected to use between 3.6 and 10.8
561	   gigabytes of RAM for the mapping.  No matter the method of
562	   distribution, any router that sits in the core of the Internet would
563	   require near this amount of memory in order to perform the ITR
564	   function.  Large enterprise ETRs would be similarly strained, simply
565	   due to the diversity of sites that communicate with one another.
566	   The good news is that this is not our starting point, but rather our
567	   scaling target, a number that we intend to reach by the year 2050.
568	   Our starting point is more likely in the neighborhood of 10^4 or 10^5
569	   EIDs, thus requiring between 360 KB and 10.8 MB.

571	5.2.  Router Throughput Versus Time

573	        +-------------------+---------+--------+---------+-------+
574	        | Table Size (10^N) |   1Mb/s | 10Mb/s | 100Mb/s | 1Gb/s |
575	        +-------------------+---------+--------+---------+-------+
576	        |                 6 |       8 |    0.8 |    0.08 | 0.008 |
577	        |                 7 |      80 |      8 |     0.8 |  0.08 |
578	        |                 8 |     800 |     80 |       8 |   0.8 |
579	        |                 9 |   8,000 |    800 |      80 |     8 |
580	        |                10 |  80,000 |  8,000 |     800 |    80 |
581	        |                11 | 800,000 | 80,000 |   8,000 |   800 |
582	        +-------------------+---------+--------+---------+-------+

584	             Number of seconds to process a NERD of 10^N bytes

586	                                  Table 2

588	   The length of time it takes to process the database is significant in
589	   models where the device acquires the entire table.  During this
590	   period of time, either the router will be unable to route packets
591	   using LISP or it must use some sort of query mechanism for specific
592	   EIDs as it populates the rest of its table through the transfer.
593	   Table 2 shows us that at our scaling target, the length of time it
594	   would take for a router using 1 Mb/s of bandwidth is about 80,000
595	   seconds.  We can measure the processing time in small numbers of
596	   hours for any transfer speed greater than that.  The fastest rate
597	   shown, 1 Gb/s, takes 8 seconds to process an entire table of 10^9
598	   bytes and 80 seconds for 10^10 bytes.
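
	   The entries in Table 2 follow from dividing the table size in bits
	   by the link rate.  A small sketch, assuming (as the table does) that
	   the link is fully used and that protocol overhead is ignored; the
	   function name is ours:

```python
def transfer_seconds(table_bytes: int, link_bits_per_second: int) -> float:
    """Seconds needed to move table_bytes over a fully used link."""
    return table_bytes * 8 / link_bits_per_second


# Rows are table sizes of 10^6 .. 10^11 bytes; columns are link rates.
rates = {"1Mb/s": 10**6, "10Mb/s": 10**7, "100Mb/s": 10**8, "1Gb/s": 10**9}
for n in range(6, 12):
    row = {name: transfer_seconds(10 ** n, bps) for name, bps in rates.items()}
    print(f"10^{n} bytes:", row)
```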

600	5.3.  Number of Servers Required

602	   As easy as it may be for a router to retrieve, the aggregate
603	   information may be difficult for servers to transmit, assuming the
604	   information is transmitted in aggregate (we'll revisit that
605	   assumption later).

607	   +----------------+------------+-----------+------------+------------+
608	   | # Simultaneous | 10 Servers |       100 |      1,000 |     10,000 |
609	   |       Requests |            |   Servers |    Servers |    Servers |
610	   +----------------+------------+-----------+------------+------------+
611	   |            100 |        480 |        48 |         48 |         48 |
612	   |          1,000 |      4,800 |       480 |         48 |         48 |
613	   |         10,000 |     48,000 |     4,800 |        480 |         48 |
614	   |        100,000 |    480,000 |    48,000 |      4,800 |        480 |
615	   |      1,000,000 |  4,800,000 |   480,000 |     48,000 |      4,800 |
616	   |     10,000,000 | 48,000,000 | 4,800,000 |    480,000 |     48,000 |
617	   +----------------+------------+-----------+------------+------------+

619	     Retrieval time per number of servers in seconds.  Assumes average
620	   10^8 entries with 4 RLOCs per EID and that each server has access to
621	   1 Gb/s and 100% efficient use of that bandwidth and no compression.

623	                                  Table 3

625	   Entries in the above table were generated using the following method:

627	   For 10^8 entries with four RLOCs per EID, the table size is 6.0GB,
628	   per our previous table.  Assume 1 Gb/s transfer rates and 100%
629	   utilization.  Protocol overhead is ignored for this exercise.  Hence
630	   a single transfer X takes 48 seconds and can get no faster.

632	   With this in mind, each entry is as follows:

634	            max(1X,N*X/S)

636	     where N=number of transfers, X = 48 seconds,
637	     and S = number of servers.
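
	   As a sketch, the same calculation in Python.  The function name and
	   the default argument are ours; X defaults to the 48 seconds derived
	   above.

```python
def retrieval_seconds(num_transfers: int, num_servers: int,
                      single_transfer_seconds: float = 48.0) -> float:
    """max(1X, N*X/S): transfers queue evenly across servers, but no
    transfer can finish faster than a single transfer X."""
    queued = num_transfers * single_transfer_seconds / num_servers
    return max(single_transfer_seconds, queued)


# One million ITRs retrieving the database from one thousand servers.
print(retrieval_seconds(1_000_000, 1_000))
```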

639	   If we have a distribution model where every device must retrieve the
640	   mapping information upon start, Table 3 shows the length of time in
641	   seconds it will take for a given number of servers to complete a
642	   transfer to a given number of devices.  This table says, as an
643	   example, that it would take 48,000 seconds (over 13 hours) for one
644	   million ITRs to simultaneously retrieve the database from one
645	   thousand servers.  Should a cold start scenario occur, this number
646	   should be of some concern.  Hence it is important to take some
647	   measures both to avoid such a scenario, and to ease the load should
648	   it occur.  The primary defense should be for ITRs to first attempt to
649	   retrieve their databases from their peers or upstream providers.
650	   Secondary defenses could include data sanity checks within ITRs, with
651	   agreed norms for how much the database should change in any given
652	   update or over any given period of time.  As we will see below,
653	   dissemination of changes involves considerably less volume.

655	     +----------------+-------------+---------------+----------------+
656	     | % Daily Change | 100 Servers | 1,000 Servers | 10,000 Servers |
657	     +----------------+-------------+---------------+----------------+
658	     |           0.1% |         200 |            20 |              2 |
659	     |           0.5% |       1,000 |           100 |             10 |
660	     |             1% |       2,000 |           200 |             20 |
661	     |             5% |      10,000 |         1,000 |            100 |
662	     |            10% |      20,000 |         2,000 |            200 |
663	     +----------------+-------------+---------------+----------------+

665	     Assuming 10 million routers and a database size of 6GB, resulting
666	    hourly transfer times are shown in seconds, given number of servers
667	                         and daily rate of change.

669	                                  Table 4

671	   This table shows us that with 10,000 servers the average transfer
672	   time with 1Gb/s links for 10,000,000 routers will be 200 seconds with
673	   10% daily change spread over 24 hourly updates.  For a 0.1% daily
674	   change, that number is 2 seconds for a database of size 6.0GB.
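
	   The text does not spell out how the entries of Table 4 were
	   computed.  The sketch below is our reconstruction, not the author's
	   stated method: it assumes the daily change is split evenly over 24
	   hourly updates, that each router fetches one update per hour over a
	   fully used 1 Gb/s server link, and that transfers are spread evenly
	   across servers.  Under those assumptions it reproduces the table.

```python
def hourly_update_seconds(daily_change: float, num_servers: int,
                          num_routers: int = 10_000_000,
                          db_bytes: float = 6.0e9,
                          link_bps: float = 1.0e9) -> float:
    """Seconds per hour for all routers to fetch one hourly change set."""
    bytes_per_update = db_bytes * daily_change / 24      # even hourly split
    seconds_per_transfer = bytes_per_update * 8 / link_bps
    return num_routers * seconds_per_transfer / num_servers


for change in (0.001, 0.005, 0.01, 0.05, 0.10):
    row = [round(hourly_update_seconds(change, s)) for s in (100, 1_000, 10_000)]
    print(f"{change:.1%} daily change:", row)
```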

676	   The amount of change goes to the purpose of LISP.  If its purpose is
677	   to provide effective multihoming support to end customers, then we
678	   might anticipate relatively random changes.  If, on the other hand,
679	   service providers attempt to make use of LISP to provide some form of
680	   traffic engineering, we can expect the same data to change more
681	   often.  We can probably not conclude much in this regard without
682	   additional operational experience.  The one thing we can say is that
683	   different applications of the LISP protocol may require new and
684	   different distribution mechanisms.  Such optimization is left for
685	   another day.

687	5.4.  Security Considerations

689	   Whichever the answer to our previous question, we must consider the
690	   security of the information being transported.  If an attacker can
691	   forge an update or tamper with the database, he can in effect
692	   redirect traffic to end sites.  Hence, integrity and authenticity of
693	   the NERD are critical.  In addition, a means is required to determine
694	   whether a source is authorized to modify a given database.  No data
695	   privacy is required.  Quite to the contrary, this information will be
696	   necessary for any ITR.

698	   The first question one must ask is who to trust to provide the ITR a
699	   mapping.  Ultimately the owner of the EID prefix is most
700	   authoritative for the mapping to RLOCs.  However, were all owners to
701	   sign all such mappings, ITRs would need to know which owner is
702	   authorized to modify which mapping, creating a problem of O(N^2)
703	   complexity.

705	   We can reduce this problem substantially by investing some trust in a
706	   small number of entities that are allowed to sign entries.  If an
707	   authority manages EIDs much the same way a domain name registrar
708	   handles domains, then the owner of the EID would choose a database
709	   authority she or he trusts, and ITRs must trust each such authority
710	   in order to map the EIDs listed by that authority to RLOCs.  This
711	   reduces the amount of management complexity on the ETR to retaining
712	   knowledge of O(#authorities), but does require that each authority
713	   establish procedures for authenticating the owner of an EID.  Those
714	   procedures needn't be the same.

716	   There are two classic methods to ensure integrity of data:

718	   o  secure transport of the data from the source to the consumer, such
719	      as Transport Layer Security (TLS) [7]; and
720	   o  object level security.

722	   These methods are not mutually exclusive, although one can argue
723	   about the need for the former, given the latter.

725	   In the case of TLS, when it is properly implemented, the objects
726	   being transported cannot easily be modified by interlopers or so-
727	   called men in the middle.  When data objects are distributed to
728	   multiple servers, each of those servers must be trusted.  As we have
729	   seen above, we could have quite a large number of servers, thus
730	   providing an attacker a large number of targets.  We conclude that
731	   some form of object level security is required.

733	   Object level security involves an authority signing an object in a
734	   way that can easily be verified by a consumer, in this case a router.
735	   In this case, we would want the mapping table and any incremental
736	   update to be signed by the originator of the update.  This implies
737	   that we cannot simply make use of a tool like CVS [10].  Instead, the
738	   originator will want to generate diffs, sign them, and make them
739	   available either directly or through some sort of content
740	   distribution or peer to peer network.

742	5.4.1.  Use of Public Key Infrastructures (PKIs)

744	   X.509 provides a certificate hierarchy that has scaled to the size of
745	   the Internet.  The system is most manageable when there are few
746	   certificates to manage.  The model proposed in this memo makes use of
747	   one current certificate per database authority.  The three pieces of
748	   information necessary to verify a signature, therefore, are as
749	   follows:

751	   o  the certificate of the database authority, which can be provided
752	      along with the database;
753	   o  the certificate authority's certificate; and
754	   o  a table of database names and distinguished names (DNs) that are
755	      allowed to update them.

757	   The latter two pieces of information must be very well known and must
758	   be configured on each ITR.  It is expected that both would change
759	   very rarely, and it would not be unreasonable for such updates to
760	   occur as part of a normal OS release process.
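
	   The third piece of information, the table of database names and
	   authorized DNs, amounts to a simple lookup.  The sketch below shows
	   only that authorization step; signature and certificate chain
	   verification are out of its scope.  The table contents, the DN
	   strings, and the function name are hypothetical.

```python
from typing import Dict, Set

# Hypothetical configured table: database name -> DNs allowed to update it.
AUTHORIZED_SIGNERS: Dict[str, Set[str]] = {
    "nerd.arin.net": {"CN=NERD Authority,O=Example Registry"},
}


def signer_is_authorized(dbname: str, signer_dn: str) -> bool:
    """True if signer_dn is listed as allowed to update dbname.

    The caller is assumed to have already verified the signature and
    the certificate chain; this only checks the configured table."""
    return signer_dn in AUTHORIZED_SIGNERS.get(dbname, set())
```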

762	   The tools for both signing and verifying are readily available.
763	   OpenSSL [19] provides tools and libraries for both signing and
764	   verifying.  Other tools commonly exist.

766	   Use of PKIs is not without implementation complexity, operational
767	   complexity, or risk.  The following risks and mitigations are
768	   identified with NERD's use of PKIs:

770	   A NERD database authority's private key is exposed:

772	      In this case an attacker could sign a false database update,
773	      either redirecting traffic, or otherwise causing havoc.  In this
774	      case, the NERD database administrator must revoke its existing key
775	      and issue a new one.  The certificate is added to a certificate
776	      revocation list (CRL), which may be distributed with both this and
777	      other databases, as well as through other channels.  Because this
778	      event is expected to be rare, and the number of database
779	      authorities is expected to be small, a CRL will be small.  When a
780	      router receives a revocation, it checks it against its existing
781	      databases, and attempts to update the one that is revoked.  This
782	      implies that prior to issuing the revocation, the database
783	      authority MUST sign an update with the new key.  Routers SHOULD
784	      discard updates they have already received that were signed after
785	      the revocation was generated.  If a router cannot confirm
786	      whether the authority's certificate was revoked before or after a
787	      particular update, it MUST retrieve a fresh copy of the
788	      database with a valid signature.
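
	      The router-side decision described above might be sketched as
	      follows.  This is an illustration under stated assumptions: the
	      names are hypothetical, signing and revocation times are taken
	      to be comparable integers, and a missing or identical timestamp
	      is treated as "cannot confirm".

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"        # update predates the revocation
    DISCARD = "discard"  # update was signed after the revocation
    REFETCH = "refetch"  # order unknown: retrieve a fresh signed copy


def action_for_update(update_signed_at: Optional[int],
                      revoked_at: Optional[int]) -> Action:
    """Decide what to do with an already received update once the
    signing certificate is known to have been revoked."""
    if update_signed_at is None or revoked_at is None:
        return Action.REFETCH
    if update_signed_at > revoked_at:
        return Action.DISCARD
    if update_signed_at < revoked_at:
        return Action.KEEP
    return Action.REFETCH  # equal timestamps: ordering cannot be confirmed
```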

790	   The private key associated with the CA that signed the Authority's
791	   certificate is compromised:

793	      In this case, it becomes possible for an attacker to masquerade as
794	      the database authority.  To ameliorate damage, the database
795	      authority SHOULD revoke its certificate and get a new certificate
796	      issued from a CA that is not compromised.  Once it has done so,
797	      the previous procedure is followed.  The compromised certificate
798	      can be removed during the normal operating system upgrade cycle.

800	   An algorithm used in either the certificate or the signature is
801	   cracked:

803	      This is a catastrophic failure and the above forms of attack
804	      become possible.  The only mitigation is to make use of a new
805	      algorithm.  In theory this should be possible, but in practice has
806	      proven very difficult.  For this reason, additional work is
807	      recommended to make alternative algorithms available.

809	   The Database Authority loses its key or disappears:

811	      In this case nobody can update the existing database.  There are
812	      few programmatic mitigations.  If the database authority places
813	      its private keys and suitable amounts of information in escrow,
814	      then under agreed-upon circumstances, such as no updates for three
815	      days, the escrow agent would release the information to a party
816	      capable of generating a database update.

818	5.4.2.  Other Risks

820	   Because this specification does not require secure transport, if an
821	   attacker prevents updates to an ITR for the purposes of having that
822	   ITR continue to use a compromised ETR, the ITR could continue to use
823	   an old version of the database without realizing a new version has
824	   been made available.  If one is worried about such an attack, a
825	   secure channel such as SSL, with a secure chain back to the
826	   database authority, should be used.  It is possible that after some
827	   operational experience, later versions of this format will contain
828	   additional semantics to address this attack.

830	   As discussed above, a substantial risk would be a cold start scenario.
831	   If an attacker found a bug in a common operating system that allowed
832	   it to erase an ITR's database, and was able to disseminate that bug,
833	   the collective ability of ITRs to retrieve new copies of the database
834	   could be taxed by collective demand.  The remedy to this is for
835	   devices to share copies of the database with their neighbors, thus
836	   making each potential requestor a potential service.

838	6.  Why not use XML?

840	   Many objects these days are distributed as either XML pages or
841	   something derived from XML [16], such as SOAP [17],[18].  Use of such
842	   well known standards allows for high level tools and library reuse.
843	   Why not, then, use these standards in this case?  There are two
844	   answers to this question.  First, the obvious concern is that XML is
845	   not known for efficiency of data transport.  Being based in text,
846	   each octet of an IPv4 address is expanded to as many as three octets,
847	   plus either an attribute and quotes or element tags and end tags.
848	   Let us presume for the moment a very simple schema that might cause
849	   a record to be represented as follows:

851	       <r e="10.1.1.0" m="24">
852	         <l w="10" p="15">
853	           <v4>
854	           192.168.1.1
855	           </v4>
856	         </l>
857	         <l w="5" p="15">
858	           <v4>
859	           192.168.1.2
860	           </v4>
861	         </l>
862	       </r>

864	   With white space removed the uncompressed XML represents 120 bytes
865	   versus 20 bytes for the record specified in Section 3.1, representing
866	   a five fold expansion.  That brings our 920MB database to 4.6GB.

868	   The other concern about XML is that version 1.0 of the specification
869	   is silent on the order of sibling elements.  Specifications other
870	   than the base specification state that order is significant.  Order
871	   is significant to LISP and NERD because once an update is applied to
872	   the database it should be possible to verify the signature of the
873	   entire database.  Prior to applying the signature the XML generator
874	   would need to ensure the order of information.  That same sort would
875	   be required of the router.  This seems to add unnecessary fragility
876	   to a critical system without much benefit.  While there may indeed be
877	   uses of an XML representation of the database, these uses are likely
878	   to be outside of a router.

880	7.  Other Distribution Mechanisms

882	   We now consider various different mechanisms.  The problem of
883	   distributing changes in various databases is as old as databases.
884	   The author is aware of two obvious approaches that have been well
885	   used in the past.  One approach would be the wide distribution of CVS
886	   repositories.  However, for reasons mentioned in the previous
887	   section, CVS is insufficient to the task.

889	   The other tried and true approach is the use of periodic updates in
890	   the form of messages.  Good old NNTP [12] itself provides two
891	   separate mechanisms (one push and another pull) to provide a coherent
892	   update process.  This was in fact used to update molecular biology
893	   databases [13] in the early 1990s.  Netnews offers a way to determine
894	   whether articles with specified Article-Ids have been received.  In
895	   the case where the mapping file source of authority wishes to
896	   transmit updates, it can sign a change file and then post it into the
897	   network.  Routers merely need to keep a record of article ids that
898	   they have received.  Initially this is probably overkill, but it may
899	   not be so later in this process.  Some consideration should be given
900	   to a mechanism known to widely distribute vast amounts of data, as
901	   instantaneously as either the sender or the receiver wishes.

903	   To attain an additional level of hierarchy in the distribution
904	   network, service providers could retrieve information to their own
905	   local servers, and configure their routers with the host portion of
906	   the above URI.

908	   Another possibility would be for providers to establish an agreement
909	   on a small set of anycast addresses for use for this purpose.  There
910	   are limitations to the use of anycast, particularly with TCP.  In the
911	   midst of a routing flap, an anycast address can become all but
912	   unusable.  Careful study of such a use as well as appropriate use of
913	   HTTP redirects is expected.

915	7.1.  What About DNS as a retrieval model?

917	   It has been proposed that a query/response mechanism be used for this
918	   information, and that specifically the domain name system (DNS) [15]
919	   be used.  The previous models do not preclude the DNS.  DNS has the
920	   advantage that the administrative lines are well drawn, and that the
921	   ID/RLOC mapping is likely to appear very close to these boundaries.
922	   DNS also has the added benefit that an entire distribution
923	   infrastructure already exists.  There are, however, some problems
924	   that could impact end hosts when intermediate routers make queries,
925	   some of which were first pointed out in [14]:

927	   o  Any query mechanism offers an opportunity for a resource attack if
928	      an attacker can force the ITR to query for information.  In this
929	      case, all that would be necessary would be for a "botnet" (a group
930	      of computers that have been compromised and used as vehicles to
931	      attack others) to ping or otherwise contact via some normal
932	      service hosts that sit behind the ETR.  If the botnet hosts
933	      themselves are behind ETRs, the victim's ITR will need to query
934	      for each and every one of them, thus becoming part of a classic
935	      reflector attack.
936	   o  Packets will be delayed at the very least, and probably dropped in
937	      the process of a mapping query.  This could be at the beginning of
938	      a communication, but it will be impossible for a router to
939	      conclude with certainty that this is the case.
940	   o  The DNS has a backoff algorithm that presumes that applications
941	      are making queries prior to the beginning of a communication.
942	      This is appropriate for end hosts who know in fact when a
943	      communication begins.  An end user may not enjoy a router waiting
944	      seconds for a retry.
945	   o  While the administrative lines may appear to be correct, the
946	      location of name servers may not be.  If name servers sit within
947	      PI address space, thus requiring LISP to reach, a circular
948	      dependency is created.  This is precisely where many enterprise
949	      name servers sit.  The LISP experiment should not predicate its
950	      success on relocation of such name servers.

952	   Nevertheless, DNS may be able to play a role in providing the
953	   enterprise control over the mapping of its EIDs to RLOCs.  Posit a
954	   new DNS record "EID2RLOC".  This record is used by the authority to
955	   collect and aggregate mapping information so that it may be
956	   distributed through one of the other mechanisms.  As an example:

958	      $ORIGIN 0.10.PI-SPACE.
959	       128   EID2RLOC   mask 23 priority 10 weight 5 172.16.5.60
960	             EID2RLOC   mask 23 priority 15 weight 5 192.168.1.5

962	   In the above figure network 10.0.128/23 would be delegated to some
963	   end system, say EXAMPLE.COM.  They would manage the above zone
964	   information.  This would allow a DNS mechanism to work, but it would
965	   also allow someone to aggregate the information and distribute a
966	   table.

968	7.1.1.  Perhaps use a hybrid model?

970	   It would be possible to use both a prepopulated database such as NERD
971	   and a query mechanism (perhaps DNS) to determine an EID/RLOC mapping.
972	   The general idea would be to receive a subset of the mappings, say,
973	   by taking only the NERD for certain regions.  This alleviates the
974	   need to drop packets for some subset of destinations under the
975	   assumption that one's business is localized to a particular region.
976	   If one did not have a local entry for a particular EID one would then
977	   make a query.
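
	   A minimal sketch of such a hybrid lookup follows.  The table
	   contents, the function name, and the query callback are
	   hypothetical, and choosing the longest matching prefix among local
	   entries is our assumption rather than something specified here.

```python
import ipaddress
from typing import Callable, Dict, List, Optional

Network = ipaddress.IPv4Network


def lookup_rlocs(eid: str,
                 local_table: Dict[Network, List[str]],
                 query: Callable[[str], Optional[List[str]]]
                 ) -> Optional[List[str]]:
    """Consult the prepopulated table first; query only on a miss."""
    addr = ipaddress.ip_address(eid)
    matches = [net for net in local_table if addr in net]
    if matches:
        # Longest-prefix match among local entries (assumption).
        best = max(matches, key=lambda net: net.prefixlen)
        return local_table[best]
    return query(eid)  # fall back to the query mechanism (perhaps DNS)


# Regional subset held locally; everything else goes to the resolver.
table = {
    ipaddress.ip_network("192.0.2.0/24"): ["198.51.100.1"],
    ipaddress.ip_network("192.0.2.128/25"): ["198.51.100.2"],
}
print(lookup_rlocs("192.0.2.200", table, lambda eid: None))
```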

979	   One improvement on simply using DNS to query live would be to
980	   periodically walk the entire network in search of EID2RLOC records
981	   and cache them to non-volatile storage.  This has two benefits.

983	   First, it prevents resource attacks.  Care has to be given to how
984	   records are cached in memory, to avoid an attacker causing a
985	   performance degradation by attempting to exceed memory limits
986	   through a random source attack.

988	   Second, and as important as resisting attacks, having a complete or
989	   near complete copy of the database provides for a faster recovery
990	   time when a router goes out of service, for whatever reason.  Absent
991	   such a mechanism, devices would need to repopulate their local caches
992	   through the help of another system, leading to additional system
993	   fragility.

995	7.2.  Use of BGP

997	   Border Gateway Protocol (BGP) [8] is currently used to distribute
998	   inter-domain routing throughout the Internet.  Why not, then, use BGP
999	   to distribute the mapping table?  A simple answer is that the objects
1000	   BGP best handles are routes.  While it may be possible to transmit
1001	   EID/RLOC mappings instead (because they look an awful lot like
1002	   routes) the rate of updates of EID/RLOC mappings is specifically
1003	   intended to be considerably less than routes, and would probably
1004	   require additional dampening mechanisms to ensure that this is so.

1006	   In addition, the ownership of the mapping does not flow from service
1007	   providers but rather from end users of the identifiers.  It should
1008	   not be possible for anyone to filter the mapping, other than perhaps
1009	   ITRs for local policy purposes.  The current limited security model
1010	   for BGP does not fit the general requirements of how the mapping is
1011	   to be processed.

1013	   Furthermore, as BGP is currently the lifeblood of the Internet, its
1014	   use for anything other than routing should be strongly scrutinized.

1016	   This is not to say that BGP has no role to play whatsoever.  It may
1017	   well be possible for routers to exchange database version numbers and
1018	   perhaps base distribution URIs as extensions or capabilities.  This
1019	   would allow routers to serve their copy of the database to their
1020	   neighbors, easing the load off the rest of the server infrastructure.
1021	   How this would be done is future work.

1023	8.  Deployment Issues

1025	   While LISP and NERD are intended as experiments at this point, it is
1026	   already obvious one must give serious consideration to circular
1027	   dependencies with regard to the protocols used and the elements
1028	   within them.

1030	8.1.  HTTP

1032	   In Section 7.1 we have already seen how DNS can have circular
1033	   dependencies.  Inasmuch as HTTP depends on DNS, either due to the
1034	   authority section of a URI, or due to the configured base
1035	   distribution URI, these same concerns apply.  In addition, any HTTP
1036	   server that itself makes use of provider independent addresses would
1037	   be a poor choice to distribute the database for these exact same
1038	   reasons.

1040	   One issue with using HTTP is that it is possible that a middlebox of
1041	   some form, such as a cache, may intercept and process requests.  In
1042	   some cases this might be a good thing.  For instance, if a cache
1043	   correctly returns a database, some amount of bandwidth is conserved.
1044	   On the other hand, if the cache itself fails to function properly for
1045	   whatever reason, end to end connectivity could be impaired.  For
1046	   example, if the cache itself depended on the mapping being in place
1047	   and functional, a cold start scenario might leave the cache
1048	   functioning improperly, in turn providing routers no means to update
1049	   their databases.  Some care must be given to avoid such
1050	   circumstances.

1052	9.  Open Questions

1054	   Do we need to discuss reachability in more detail?  This was clearly
1055	   an issue at the IST-RING workshop.  There are two key issues.  First,
1056	   what is the appropriate architectural separation between the data
1057	   plane and the control plane?  Second, is there some specific way in
1058	   which NERD impacts the data plane?

1060	   Should the database contain its name?  It is probably sufficient to
1061	   merely reference the database by name.

1063	   Should the signature portion be separated from the actual database?
1064	   By specifying the signature we hope to reduce interoperability issues
1065	   and encourage proper security from the get go.  On the other hand,
1066	   since the object is opaque it is not clear how much interoperability
1067	   we are actually encouraging.

1069	   Should we specify a (perhaps compressed) tarball that treads a middle
1070	   ground for the last question, where each update tarball contains both
1071	   a signature for the update and for the entire database, once the
1072	   update is applied?

1074	   Should we compress?  In some initial testing of databases with 1, 5,
1075	   and 10 million IPv4 EIDs and a random distribution of IPv4 RLOCs, the
1076	   current format in this document compresses down by a factor of
1077	   between 35% and 36%, using the Burrows-Wheeler block-sorting text
1078	   compression algorithm (bzip2).  The NERD used random EIDs with mask
1079	   lengths varying from 19-29, with probability weighted toward the
1080	   smaller masks.  This only very roughly reflects reality.  A better
1081	   test would be to start with the existing prefixes found in the DFZ.

1083	10.  Conclusions

1085	   This memo has specified a database format, an update format, a URI
1086	   convention, an update method, and a validation method for EID/RLOC
1087	   mappings.  We have shown that beyond the predictions of 10^7
1088	   locators, the aggregate database size would be at most 10.8GB.  We
1089	   have considered the number of servers needed to distribute that
1090	   information and we have demonstrated the limitations of a simple
1091	   content distribution network and other well known mechanisms.  The
1092	   effort required to retrieve a database change amounts to between 2
1093	   and 20 seconds of processing time per hour at today's gigabit speeds.
1094	   We conclude that there is no need for an off box query mechanism
1095	   today, and that there are distinct disadvantages for having such a
1096	   mechanism in the control plane.

1098	   Beyond this we have examined alternatives that allow for hybrid
1099	   models that do use query mechanisms, should our operating assumptions
1100	   prove overly optimistic.  Use of NERD today does not foreclose use of
1101	   such models in the future, and in fact both models can happily co-
1102	   exist.

1104	   We leave to future work how the list of databases is distributed, how
1105	   BGP can play a role in distributing knowledge of the databases, and
1106	   how DNS can play a role in aggregating information into these
1107	   databases.

1109	   We also leave to future work whether HTTP is the best protocol for
1110	   the job, and whether the scheme described in this document is the
1111	   most efficient.  One could easily envision that when applied in high
1112	   delay or high loss environments, a broadcast or multicast method may
1113	   prove more effective.

1115	11.  IANA Considerations

1117	   This memo makes no requests of IANA.

1119	12.  Acknowledgments

1121	   Dino Farinacci, Patrik Faltstrom, Dave Meyer, Joel Halpern, Dave
1122	   Thaler, Mohamed Boucadair, Robin Whittle, and Max Pritikin were very
1123	   helpful with their reviews of this work.  Thanks also to the
1124	   participants of the Routing Research Group and the IST-RING workshop
1125	   held in Madrid in December of 2007 for their incisive comments.  The
1126	   astute will notice a lengthy References section.  This work stands on
1127	   the shoulders of many others' efforts.

1129	13.  References

1131	13.1.  Normative References

1133	   [1]   Farinacci, D., "Locator/ID Separation Protocol (LISP)",
1134	         draft-farinacci-lisp-03 (work in progress), August 2007.

1136	   [2]   Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
1137	         Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
1138	         HTTP/1.1", RFC 2616, June 1999.

1140	   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
1141	         Levels", BCP 14, RFC 2119, March 1997.

1143	   [4]   Kaliski, B., "PKCS #7: Cryptographic Message Syntax Version
1144	         1.5", RFC 2315, March 1998.

1146	   [5]   International Telecommunications Union, "Information technology
1147	         - Open Systems Interconnection - The Directory: Public-key and
1148	         attribute certificate frameworks", ITU-T Recommendation X.509,
1149	         ISO Standard 9594-8, March 2000.

1151	   [6]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1152	         Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
1153	         January 2005.

1155	13.2.  Informational References

1157	   [7]   Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS)
1158	         Protocol Version 1.1", RFC 4346, April 2006.

1160	   [8]   Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4
1161	         (BGP-4)", RFC 4271, January 2006.

1163	   [9]   Carpenter, B., "IETF Plenary Presentation: Routing and
1164	         Addressing: Where we are today", March 2007.

1166	   [10]  Grune, R., Baalbergen, E., Waage, M., Berliner, B., and J.
1167	         Polk, "CVS: Concurrent Versions System", November 1985.

1169	   [11]  International Telephone and Telegraph Consultative Committee,
1170	         "Information Technology - Open Systems Interconnection - The
1171	         Directory: Authentication Framework", CCITT Recommendation
1172	         X.509, November 1988.

1174	   [12]  Kantor, B. and P. Lapsley, "Network News Transfer Protocol",
1175	         RFC 977, February 1986.

1177	   [13]  Smith, R., Gottesman, Y., Hobbs, B., Lear, E., Kristofferson,
1178	         D., Benton, D., and P. Smith, "A mechanism for maintaining an
1179	         up-to-date GenBank database via Usenet", CABIOS, April 1991.

1181	   [14]  Huitema, C., "An Experiment in DNS Based IP Routing", RFC 1383,
1182	         December 1992.

1184	   [15]  Mockapetris, P., "Domain names - concepts and facilities",
1185	         STD 13, RFC 1034, November 1987.

1187	   [16]  Bray, T., Paoli, J., Sperberg-McQueen, C., and E. Maler,
1188	         "Extensible Markup Language (XML) 1.0 (2nd ed)", W3C REC-xml,
1189	         October 2000, <http://www.w3.org/TR/REC-xml>.

1191	   [17]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1192	         Nielsen, "SOAP Version 1.2 Part 1: Messaging Framework", W3C
1193	         Working Draft soap12-part1, June 2002,
1194	         <http://www.w3.org/TR/soap12-part1>.

1196	   [18]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1197	         Nielsen, "SOAP Version 1.2 Part 2: Adjuncts", W3C Working
1198	         Draft soap12-part2, June 2002,
1199	         <http://www.w3.org/TR/soap12-part2>.

1201	URIs

1203	   [19]  <http://www.openssl.org>

1205	Appendix A.  Generating and verifying the database signature with
1206	             OpenSSL

1208	   As previously mentioned, one goal of NERD was to use off-the-shelf
1209	   tools to both generate and retrieve the database.  To many, PKI is
1210	   magic.  This section is meant to provide at least some clarification
1211	   as to both the generation and verification process, complete with
1212	   command line examples.  Not included is how you get the entries
1213	   themselves.  We'll assume they exist, and that you're just trying to
1214	   sign the database.

1216	   To sign the database, you first need a database file that has the
1217	   database header described in Section 3.  The block size should be
1218	   zero, and there should be no PKCS#7 block at this point.  You also
1219	   need a certificate and its private key with which you will sign the
1220	   database.

1222	   The OpenSSL "smime" command contains all the functions we need from
1223	   this point forth.  To sign the database, issue the following command:

1225	         openssl smime -binary -sign -outform DER -signer yourcert.crt \
1226	                 -inkey yourcert.key -in database-file -out signature

1228	   -binary states that no MIME canonicalization should be performed.
1229	   -sign indicates that you are signing the file that was given as the
1230	   argument to -in.  The output format (-outform) is binary DER, and
1231	   your public certificate is provided with -signer along with your key
1232	   with -inkey.  The signature is written to the file named by -out.

1234	   The resulting file "signature" is then copied into the PKCS#7 block
1235	   in the database header, its size in bytes is recorded in the PKCS#7
1236	   block size field, and the resulting file is ready for distribution to
1237	   ITRs.

1239	   To verify a database file, first retrieve the PKCS#7 block from the
1240	   file by copying the appropriate number of bytes into another file,
1241	   say "signature".  Then zero this field, and set the block size field
1242	   to 0.  Next use the "smime" command to verify the signature as
1243	   follows:

1245	       openssl smime -binary -verify -inform DER \
1246	               -content database-file -out /dev/null -in signature

1248	   OpenSSL will report "Verification successful" if the signature is
	   correct.
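
   The header manipulation on either side of the "smime" commands can
   be sketched as follows.  This is an illustration under assumptions,
   not part of the specification: the offsets and width of the PKCS#7
   block size field are passed in as parameters rather than taken from
   Section 3, the size field is assumed to be a big-endian integer that
   precedes the block, and "zero this field" is read as removing the
   block so that the file matches what was originally signed.  The
   function names are hypothetical.

```python
def embed_signature(unsigned, size_off, size_len, block_off, signature):
    """Insert the DER signature at block_off and record its length.

    Assumes size_off + size_len <= block_off and that the size field
    is still zero, as required before signing.
    """
    if size_off + size_len > block_off:
        raise ValueError("size field must precede the PKCS#7 block")
    if int.from_bytes(unsigned[size_off:size_off + size_len], "big") != 0:
        raise ValueError("block size field must be zero before signing")
    size = len(signature).to_bytes(size_len, "big")
    sized = unsigned[:size_off] + size + unsigned[size_off + size_len:]
    return sized[:block_off] + signature + sized[block_off:]


def detach_signature(signed, size_off, size_len, block_off):
    """Return (unsigned_db, signature) from a signed database file.

    The block is removed and the size field is reset to zero, which
    should reproduce the bytes that were fed to the signing command.
    """
    if size_off + size_len > block_off:
        raise ValueError("size field must precede the PKCS#7 block")
    n = int.from_bytes(signed[size_off:size_off + size_len], "big")
    if block_off + n > len(signed):
        raise ValueError("PKCS#7 block size exceeds file length")
    signature = signed[block_off:block_off + n]
    stripped = signed[:block_off] + signed[block_off + n:]
    unsigned = (stripped[:size_off] + bytes(size_len)
                + stripped[size_off + size_len:])
    return unsigned, signature
```

   The two byte strings returned by detach_signature would be written
   to "database-file" and "signature" respectively before running the
   verification command.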

1250	   To improve verification performance, one could modify the program
1251	   so that it takes as input the database with a null signature and,
1252	   as an argument, the name of the file containing the signature.
1253	   Better yet, use a call to the appropriate library with each
1254	   block.

1256	Appendix B.  Changes

1258	   This section to be removed prior to publication.

1260	   o  03: Change dbname to a domain name, indicate that is what is in
1261	      the subject of the X.509 certificate, and list editorial changes,
1262	      update acknowledgments.
1263	   o  02: Incorporate some of Dave Thaler's comments.  Add
1264	      authentication block detail.  Modify analysis to take IPv6 into
1265	      account, along with a more realistic number of RLOCs per EID.  Add
1266	      some comments about potential risks of a cold start.  Add S/MIME
1267	      example as appendix A and take out old ToDo.  Provide some amount
1268	      of compression of IPv6 addresses by limiting their size to
1269	      significant bytes rounded to a four byte word boundary.
1270	   o  01: Massive spelling correction, URI example correction.
1271	   o  00: Initial Revision.

1273	Author's Address

1275	   Eliot Lear
1276	   Cisco Systems GmbH
1277	   Glatt-com
1278	   Glattzentrum, ZH  CH-8301
1279	   Switzerland

1281	   Phone: +41 1 878 7525
1282	   Email: lear@cisco.com

1284	Full Copyright Statement

1286	   Copyright (C) The IETF Trust (2008).

1288	   This document is subject to the rights, licenses and restrictions
1289	   contained in BCP 78, and except as set forth therein, the authors
1290	   retain all their rights.

1292	   This document and the information contained herein are provided on an
1293	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1294	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1295	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1296	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1297	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1298	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1300	Intellectual Property

1302	   The IETF takes no position regarding the validity or scope of any
1303	   Intellectual Property Rights or other rights that might be claimed to
1304	   pertain to the implementation or use of the technology described in
1305	   this document or the extent to which any license under such rights
1306	   might or might not be available; nor does it represent that it has
1307	   made any independent effort to identify any such rights.  Information
1308	   on the procedures with respect to rights in RFC documents can be
1309	   found in BCP 78 and BCP 79.

1311	   Copies of IPR disclosures made to the IETF Secretariat and any
1312	   assurances of licenses to be made available, or the result of an
1313	   attempt made to obtain a general license or permission for the use of
1314	   such proprietary rights by implementers or users of this
1315	   specification can be obtained from the IETF on-line IPR repository at
1316	   http://www.ietf.org/ipr.

1318	   The IETF invites any interested party to bring to its attention any
1319	   copyrights, patents or patent applications, or other proprietary
1320	   rights that may cover technology that may be required to implement
1321	   this standard.  Please address the information to the IETF at
1322	   ietf-ipr@ietf.org.

1324	Acknowledgment

1326	   Funding for the RFC Editor function is provided by the IETF
1327	   Administrative Support Activity (IASA).