idnits 2.17.1
draft-lear-lisp-nerd-03.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** It looks like you're using RFC 3978 boilerplate. You should update this
to the boilerplate described in the IETF Trust License Policy document
(see https://trustee.ietf.org/license-info), which is required now.
-- Found old boilerplate from RFC 3978, Section 5.1 on line 15.
-- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
line 1298.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1309.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1316.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1322.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
== There are 4 instances of lines with private range IPv4 addresses in the
document. If these are generic example addresses, they should be changed
to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
198.51.100.x or 203.0.113.x.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust Copyright Line does not match the
current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (January 23, 2008) is 5938 days in the past. Is this
intentional?
Checking references for intended status: Experimental
----------------------------------------------------------------------------
== Outdated reference: A later version (-12) exists of
draft-farinacci-lisp-03
** Obsolete normative reference: RFC 2616 (ref. '2') (Obsoleted by RFC
7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)
-- Obsolete informational reference (is this intentional?): RFC 4346 (ref.
'7') (Obsoleted by RFC 5246)
-- Obsolete informational reference (is this intentional?): RFC 977 (ref.
'12') (Obsoleted by RFC 3977)
Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 9 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group E. Lear
3 Internet-Draft Cisco Systems GmbH
4 Intended status: Experimental January 23, 2008
5 Expires: July 26, 2008
7 NERD: A Not-so-novel EID to RLOC Database
8 draft-lear-lisp-nerd-03.txt
10 Status of this Memo
12 By submitting this Internet-Draft, each author represents that any
13 applicable patent or other IPR claims of which he or she is aware
14 have been or will be disclosed, and any of which he or she becomes
15 aware will be disclosed, in accordance with Section 6 of BCP 79.
17 Internet-Drafts are working documents of the Internet Engineering
18 Task Force (IETF), its areas, and its working groups. Note that
19 other groups may also distribute working documents as Internet-
20 Drafts.
22 Internet-Drafts are draft documents valid for a maximum of six months
23 and may be updated, replaced, or obsoleted by other documents at any
24 time. It is inappropriate to use Internet-Drafts as reference
25 material or to cite them other than as "work in progress."
27 The list of current Internet-Drafts can be accessed at
28 http://www.ietf.org/ietf/1id-abstracts.txt.
30 The list of Internet-Draft Shadow Directories can be accessed at
31 http://www.ietf.org/shadow.html.
33 This Internet-Draft will expire on July 26, 2008.
35 Copyright Notice
37 Copyright (C) The IETF Trust (2008).
39 Abstract
41 LISP is a protocol to encapsulate IP packets in order to allow end
42 sites to multihome without injecting routes from one end of the
43 Internet to another. This memo specifies a database and a method to
44 transport the mapping of EIDs to RLOCs to routers in a reliable,
45 scalable, and secure manner. Our analysis concludes that transport
46 of all EID/RLOC mappings scales well to at least 10^8 entries, and
47 that use of DNS or any approach that queries for mappings has
48 substantial operational concerns.
50 Table of Contents
52 1. Introduction
53 1.1. Base Assumptions
54 1.2. What is NERD?
55 1.3. Glossary
56 2. Theory of Operation
57 2.1. Who are database authorities?
58 3. NERD Format
59 3.1. NERD Record Format
60 3.2. Database Update Format
61 4. NERD Distribution Mechanism
62 4.1. Initial Bootstrap
63 4.2. Retrieving Changes
64 5. Analysis
65 5.1. Database Size
66 5.2. Router Throughput Versus Time
67 5.3. Number of Servers Required
68 5.4. Security Considerations
69 5.4.1. Use of Public Key Infrastructures (PKIs)
70 5.4.2. Other Risks
71 6. Why not use XML?
72 7. Other Distribution Mechanisms
73 7.1. What About DNS as a retrieval model?
74 7.1.1. Perhaps use a hybrid model?
75 7.2. Use of BGP
76 8. Deployment Issues
77 8.1. HTTP
78 9. Open Questions
79 10. Conclusions
80 11. IANA Considerations
81 12. Acknowledgments
82 13. References
83 13.1. Normative References
84 13.2. Informational References
85 Appendix A. Generating and verifying the database signature
86 with OpenSSL
87 Appendix B. Changes
88 Author's Address
89 Intellectual Property and Copyright Statements
91 1. Introduction
93 Locator/ID Separation Protocol (LISP) [1] is a protocol whose primary
94 purpose is to separate an IP address used by a host and local routing
95 system from the locators advertised by BGP participants on the
96 Internet in general, and in the default free zone (DFZ) in
97 particular. It accomplishes this by establishing a mapping between
98 globally unique endpoint identifiers (EIDs) and routing locators
99 (RLOCs) within the global routing table. This reduces the amount of
100 state change that occurs on routers within the default-free zone on
101 the Internet, while enabling end sites to be multihomed.
103 In early stages of LISP (1 and 1.5) the mapping is either configured
104 into a device or it is learned via data-triggered control messages
105 between ingress tunnel routers (ITRs) and egress tunnel routers
106 (ETRs) under the assumption that during transition, EIDs will be
107 present within the global routing system, as they are today.
109 In later stages of LISP, the assumption will be that EIDs are not
110 contained within the global routing system, but that instead the
111 mapping from EIDs to RLOCs will be learned through some other means.
112 This memo addresses different approaches to the problem, and
113 specifies a Not-so-novel EID RLOC Database (NERD) and methods to both
114 receive the database and to receive updates.
116 LISP and NERD are both currently experimental protocols. The NERD
117 database is specified in such a way that the methods used to
118 distribute or retrieve it may vary over time. Multiple databases are
119 supported in order to allow for multiple data sources. An effort has
120 been made to divorce the database from access methods so that both
121 can evolve independently through experimentation and operational
122 validation.
124 1.1. Base Assumptions
126 In order to specify a mapping it is important to understand how it
127 will be used, and the nature of the data being mapped. In the case
128 of LISP, the following assumptions are pertinent:
130 o The data contained within the mapping changes only on provisioning
131 or configuration operations, and is not intended to change when a
132 link either fails or is restored. Some other mechanism such as
133 the use of LISP Reachability Bits with mapping replies handles
134 healing operations, particularly when a tail circuit within a
135 service provider's aggregate goes down. NERD can be used as a
136 verification method to ensure that whatever operational mapping
137 changes an ITR receives are authorized.
139 o While weight and priority are defined, these are not hop-by-hop
140 metrics. Hence the information contained within the mapping does
141 not change based on where one sits within the topology.
142 o The purpose of LISP being to reduce control plane overhead by
143 reducing "rate X state" complexity, updates to the mapping will be
144 relatively rare.
145 o Because LISP and NERD are designed to ease interdomain routing,
146 their use is intended within the inter-domain environment. That
147 is, LISP is best implemented at either the customer edge or
148 provider edge, and there will be on the order of as many ITRs and
149 LISP announcements as there are connections to Internet Service
150 Providers by end customers.
151 o As such, LISP and NERD cannot be the sole means to implement host
152 mobility, although they may be used in conjunction with other
153 mechanisms. For instance, it would be possible for a mobile node
154 to receive a local address that is an EID and pass that to the
155 correspondent node, who could also make use of an EID. As such
156 use of LISP in this case would be transparent, and no mapping
157 entries are changed for mobility.
158 o There is no interaction with the interior gateway protocol (IGP).
160 1.2. What is NERD?
162 NERD is a Not-so-novel EID to RLOC Database. It consists of the
163 following components:
165 1. a network database format;
166 2. a change distribution format;
167 3. a database retrieval/bootstrapping method;
168 4. a change distribution method.
170 The network database format is compressible. However, at this time
171 we specify no compression method. NERD will make use of potentially
172 several transport methods, but most notably HTTP [2]. HTTP has
173 restart and compression capabilities. It is also widely deployed.
175 There exist many methods to show differences between two versions of
176 a database or a file, UNIX's "diff" being the classic example. In
177 this case, because the data is well structured and easily keyed, we
178 can make use of a very simple format for version differences that
179 simply provides a list of EID/RLOC mappings that have changed using
180 the same record format as the database, and a list of EIDs that are
181 to be removed.
183 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
184 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
185 document are to be interpreted as described in RFC 2119 [3].
187 1.3. Glossary
189 The reader is once again referred to [1] for a general glossary of
190 terms related to LISP. The following terms are specific to this
191 memo.
193 Base Distribution URI: An Absolute-URI as defined in Section 4.3 of
194 [6] from which other references are relative. The base
195 distribution URI is used to construct a URI to an EID/RLOC mapping
196 database. If more than one NERD is known then there will be one
197 or more base distribution URIs associated with each (although each
198 such base distribution URI may have the same value).
200 EID Database Authority: The authority that will sign database files
201 and updates. It is the source of both.
203 The Authority: Shorthand for the EID Database Authority.
205 NERD: (N)ot-so-novel (E)ID to (R)LOC (D)atabase.
207 AFI: Address Family Identifier.
209 Pull Model: An architecture where clients pull only the information
210 they need at any given time, such as when a packet arrives for
211 forwarding.
213 Push Model: An architecture in which clients receive an entire
214 dataset, containing data they may or may not require, such as
215 mappings for EIDs that no host served is attempting to send to.
217 Hybrid Model: An architecture in which clients receive a subset of
218 the entire dataset and query as needed for the rest.
220 2. Theory of Operation
222 What follows is a summary of how NERDs are generated and updated.
223 Specifics can be found in Section 3. The general way in which NERD
224 works is as follows:
226 1. A NERD is generated by an authority that allocates provider
227 independent (PI) addresses (e.g., IANA or an RIR) which are used
228 by sites as EIDs. As part of this process the authority
229 generates a digest for the database and signs it with a private
230 key whose public key is part of an X.509 certificate [11]. That
231 signature along with a copy of the authority's public key is
232 included in the NERD.
233 2. The NERD is distributed to a group of well known servers.
234 3. ITRs retrieve an initial copy of the NERD via HTTP when they come
235 into service.
236 4. ITRs are preconfigured with a group of certificates whose private
237 keys are used by database authorities to sign the NERD. This
238 list of certificates should be configurable by administrators.
239 5. ITRs next verify both the validity of the public key and the
240 signed digest. If either fails validation, the ITR attempts to
241 retrieve the NERD from a different source. The process iterates
242 until either a valid database is found or the list of sources is
243 exhausted.
244 6. Once a valid NERD is retrieved, the ITR installs it into both
245 non-volatile and local memory.
246 7. At some point the authority updates the NERD and increments the
247 database version counter. At the same time it generates a list
248 of changes, which it also signs, as it does with the original
249 database.
250 8. Periodically ITRs will poll from their list of servers to
251 determine if a new version of the database exists. When a new
252 version is found, an ITR will attempt to retrieve a change file,
253 using its list of preconfigured servers.
254 9. The ITR validates a change file just as it does the original
255 database. Assuming the change file passes validation, the ITR
256 installs new entries, overwrites existing ones, and removes empty
257 entries, based on the content of the change file.
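Steps 3 through 6 above amount to a retry loop over the configured sources. A minimal sketch in Python follows. The names retrieve_valid_nerd, fetch, and is_valid are hypothetical: fetch stands in for the HTTP retrieval and is_valid for the certificate and digest checks, neither of which is implemented here.

```python
from typing import Callable, Iterable, Optional


def retrieve_valid_nerd(
    sources: Iterable[str],
    fetch: Callable[[str], bytes],
    is_valid: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first database that passes validation, or None.

    Mirrors step 5: if validation fails, try the next source until a
    valid database is found or the list of sources is exhausted.
    """
    for source in sources:
        try:
            candidate = fetch(source)
        except OSError:
            # Assumption: an unreachable server is handled like a
            # failed validation, so the next source is tried.
            continue
        if is_valid(candidate):
            return candidate
    return None
```

A caller would store the returned database in non-volatile memory (step 6) and fall back to whatever policy it has for the case where None is returned.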
259 As time goes on it is quite possible that an ITR may probe a list of
260 configured neighbors for a database or change file copy. It is
261 equally possible that neighbors might advertise to each other the
262 version number of their database. Such methods are not explored in
263 depth in this memo, but are mentioned for future consideration.
265 2.1. Who are database authorities?
267 This memo does not specify who the database authority is. That is
268 because there are several possible operational models. In each case
269 the number of database authorities is meant to be small so that ITRs
270 need only keep a small list of authorities, similar to the way a name
271 server might cache a list of root servers.
273 o A single database authority exists. In this case all entries in
274 the database are registered to a single entity, and that entity
275 distributes the database. Because the EID space is provider
276 independent address space, there is no architectural requirement
277 that address space be hierarchically distributed to anyone, as
278 there is with provider-assigned address space. Hence, there is a
279 natural affinity between the IANA function and the database
280 authority function.
281 o Each region runs a database authority. In this case, provider
282 independent address space is allocated to either regional internet
283 registries or to affiliates of such organizations of network
284 operations guilds (NOGs). The benefit of this approach is that
285 there is no single organization that controls the database. It
286 allows one database authority to back up another. One could
287 envision as many as ten database authorities in this scenario.
288 o Each country runs a database authority. This could occur should
289 countries decide to regulate this function. While limiting the
290 scope of any single database authority as the previous scenario
291 describes, this approach would introduce some overhead as the list
292 of database authorities would grow to as many as 200, and possibly
293 more if jurisdictions within countries attempted to regulate the
294 function.
296 As the number of authorities increases the amount of change on that
297 list will also increase, requiring both an update mechanism and the
298 potential need for a discovery mechanism, both of which would be the
299 subject of future work (i.e., not to be found in this memo). For
300 this reason alone, as a starting point two database authorities are
301 recommended, but their selection is left for others.
303 3. NERD Format
305 The NERD consists of a header that contains a database version and a
306 signature that is generated by ignoring the signature field and
307 setting the authentication block length to 0 (NULL). The
308 authentication block itself consists of a signature and a certificate
309 whose private key counterpart was used to generate the signature.
310 The exact format of the authentication block is TBD.
312 Records are kept sorted in numeric order with AFI plus EID as primary
313 key and mask length as secondary. This is so that after a database
314 update it should be possible to reconstruct the database to verify
315 the digest signature, which may be retrieved separately from the
316 database for verification purposes.
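The ordering rule can be illustrated with a short Python sketch. The tuple layout (AFI, EID, mask length) and the function name record_sort_key are assumptions made for illustration; AFI values 1 and 2 are the IANA Address Family Numbers for IPv4 and IPv6.

```python
import ipaddress


def record_sort_key(record):
    """Sort key: AFI plus EID as primary key, mask length as secondary."""
    afi, eid, mask_len = record
    return (afi, int(ipaddress.ip_address(eid)), mask_len)


records = [
    (1, "192.0.2.0", 25),
    (2, "2001:db8::", 48),
    (1, "192.0.2.0", 24),
    (1, "198.51.100.0", 24),
]
ordered = sorted(records, key=record_sort_key)
```

Because every holder of the database sorts the same way, two copies with the same content serialize identically, which is what allows the digest to be recomputed after an update.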
318  0                   1                   2                   3
319  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
321 | Schema Vers=1 |    DB Code    |      Database Name Size       |
322 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
323 |                       Database Version                        |
324 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
325 |                   Old Database Version or 0                   |
326 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
327 |                                                               |
328 |                         Database Name                         |
329 |                                                               |
330 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
331 |       PKCS#7 Block Size       |           Reserved            |
332 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
333 |                                                               |
334 |       PKCS#7 Block containing Certificate and Signature       |
335 |                                                               |
336 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
338 Database Header
340 The DB Code indicates 0 if what follows is an entire database or 1 if
341 what follows is an update. The database file version is incremented
342 each time the complete database is generated by the authority. In
343 the case of an update, the database file version indicates the new
344 database file version, and the old database file version is indicated
345 in the "old DB version" field. The database file version is used by
346 routers to determine whether or not they have the most current
347 database.
349 The database name is a domain name. This is the name that will
350 appear in the Subject field of the certificate used to verify the
351 database. The purpose of the database name is to allow for more than
352 one database. Such databases would be merged by the router. It is
353 important that an EID/RLOC mapping be listed in no more than one
354 database, lest inconsistencies arise. However, it may be possible to
355 transition a mapping from one database to another. During the
356 transition period, the mappings MUST be identical. When they are
357 not, the resultant behavior will be undefined.
359 The PKCS#7 [4] authentication block contains a DER encoded [5]
360 signature and associated public key.
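As an illustration of the header layout, the following Python sketch packs and parses a header. The field widths are assumptions, since only the field order is certain here: 8 bits each for the schema version and DB code, 16 bits for the name size, 32 bits for each version, and 16 bits each for the PKCS#7 block size and the reserved field. It also assumes no padding after the database name. The function names are hypothetical, and the PKCS#7 block is treated as opaque bytes because its exact format is TBD.

```python
import struct

# Assumed widths: schema (8 bits), DB code (8), name size (16),
# database version (32), old database version (32).
FIXED = struct.Struct("!BBHII")
# Assumed widths: PKCS#7 block size (16 bits), reserved (16).
AUTH = struct.Struct("!HH")


def pack_header(db_code, version, old_version, name, pkcs7=b""):
    """Build a header; db_code is 0 for a full database, 1 for an update."""
    name_bytes = name.encode("ascii")
    return (
        FIXED.pack(1, db_code, len(name_bytes), version, old_version)
        + name_bytes
        + AUTH.pack(len(pkcs7), 0)
        + pkcs7
    )


def parse_header(data):
    """Split a header into its fields and report where records begin."""
    schema, db_code, name_size, version, old_version = FIXED.unpack_from(data, 0)
    offset = FIXED.size
    name = data[offset:offset + name_size].decode("ascii")
    offset += name_size
    block_size, _reserved = AUTH.unpack_from(data, offset)
    offset += AUTH.size
    return {
        "schema": schema,
        "db_code": db_code,
        "version": version,
        "old_version": old_version,
        "name": name,
        "pkcs7": data[offset:offset + block_size],
        "records_offset": offset + block_size,
    }
```

Calling pack_header with the default empty pkcs7 argument yields a header whose authentication block length is 0, which is the form over which the signature is described as being generated.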
362 3.1. NERD Record Format
364 As distributed over the network, NERD records appear as follows:
366  0                   1                   2                   3
367  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
368 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
369 |  Num. RLOCs   | EID Mask Len  |            EID AFI            |
370 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
371 |                     End point identifier                      |
372 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
373 |  Priority 1   |   Weight 1    |             AFI 1             |
374 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
375 |                       Routing Locator 1                       |
376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
377 |  Priority 2   |   Weight 2    |             AFI 2             |
378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
379 |                       Routing Locator 2                       |
380 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
381 |  Priority 3   |   Weight 3    |             AFI 3             |
382 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
383 |                     Routing Locator 3...                      |
384 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
386 Priority N, Weight N, and AFI N are associated with Routing
387 Locator N. There will always be at least one routing locator. The
388 minimum record size for IPv4 is 16 bytes. Each additional IPv4 RLOC
389 increases the record size by 8 bytes. The purpose of this format is
390 to keep the database compact, but somewhat easily read. The meanings
391 of weight and priority are described in [1]. The format of the AFI
392 is specified by IANA as "Address Family Numbers", with the exception
393 of how IPv6 addresses are stored.
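The record sizes quoted above can be checked with a small Python sketch that serializes an IPv4 record. The split of each 32-bit header word into two 8-bit fields followed by a 16-bit AFI is an assumption consistent with the stated sizes, and the function name pack_ipv4_record is hypothetical.

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA Address Family Number for IPv4


def pack_ipv4_record(eid, mask_len, rlocs):
    """Serialize one record; rlocs is a list of (priority, weight, address)."""
    if not rlocs:
        raise ValueError("a database record carries at least one RLOC")
    out = struct.pack("!BBH", len(rlocs), mask_len, AFI_IPV4)
    out += ipaddress.IPv4Address(eid).packed
    for priority, weight, address in rlocs:
        out += struct.pack("!BBH", priority, weight, AFI_IPV4)
        out += ipaddress.IPv4Address(address).packed
    return out


one = pack_ipv4_record("192.0.2.0", 24, [(1, 100, "198.51.100.1")])
two = pack_ipv4_record(
    "192.0.2.0", 24, [(1, 50, "198.51.100.1"), (1, 50, "203.0.113.1")]
)
```

Here len(one) is 16 and len(two) is 24, matching the 16-byte minimum and the 8 bytes added per additional IPv4 RLOC.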
395 In order to reduce storage and transmission amounts for IPv6, only
396 the necessary number of bytes as specified by the prefix length are
397 kept in the record, rounded up to the next four-byte (word) boundary.
398 This is true for both EIDs and RLOCs. For instance, if the prefix
399 length is /49, seven bytes hold the prefix, and rounding up to the
400 next four-byte word boundary means that eight bytes are stored.
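The truncation rule reduces to two rounding steps, sketched below in Python. The function name stored_bytes is hypothetical.

```python
def stored_bytes(prefix_len: int) -> int:
    """Bytes kept for a prefix, padded to a four-byte word boundary."""
    whole_bytes = (prefix_len + 7) // 8   # bytes needed to hold the prefix bits
    return ((whole_bytes + 3) // 4) * 4   # round up to a multiple of four


# A /49 needs 7 bytes, which rounds up to 8; a /64 needs exactly 8.
sizes = {length: stored_bytes(length) for length in (32, 49, 64, 128)}
```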
402 3.2. Database Update Format
404 A database update contains a set of changes to an existing database.
405 Each AFI/EID/mask-length tuple may have zero or more RLOCs associated
406 with it. In the case where there are no RLOCs, the EID entry is
407 removed from the database. Records that contain EIDs and mask
408 lengths that were not previously listed are simply added. Otherwise,
409 the old record for the EID and mask length is replaced by the more
410 current information. The record format used by a database update
411 is the same as described in Section 3.1.
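The update semantics can be sketched in Python with the database held as a dictionary keyed by the AFI/EID/mask-length tuple. The in-memory representation and the name apply_update are assumptions for illustration only.

```python
def apply_update(database, update):
    """Apply a change set to the database in place and return it.

    Both arguments map (afi, eid, mask_len) to a list of RLOC entries.
    An empty list in the update removes the EID entry; any other value
    adds a new record or replaces the old one.
    """
    for key, rlocs in update.items():
        if rlocs:
            database[key] = list(rlocs)
        else:
            database.pop(key, None)
    return database


db = {
    (1, "192.0.2.0", 24): ["198.51.100.1"],
    (1, "192.0.2.128", 25): ["198.51.100.2"],
}
changes = {
    (1, "192.0.2.0", 24): ["203.0.113.1"],   # replaced
    (1, "192.0.2.128", 25): [],              # removed
    (2, "2001:db8::", 48): ["203.0.113.2"],  # added
}
apply_update(db, changes)
```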
413 4. NERD Distribution Mechanism
415 4.1. Initial Bootstrap
417 Bootstrap occurs when a router needs to retrieve the entire database.
418 It knows it needs to retrieve the entire database either because it
419 has none or because the pending update is too substantial to process,
420 as might be the case if a router has been out of service for a lengthy
421 period of time.
423 To bootstrap, the ITR appends the database name plus
424 "/current/entiredb" to a Base Distribution URI and retrieves the file
425 via HTTP. For example, if the configured URI is
426 "http://www.example.com/eiddb/", and assuming a database name of
427 "nerd.arin.net", the ITR would request
428 "http://www.example.com/eiddb/nerd.arin.net/current/entiredb".
429 Routers MUST check the signature on the database prior to installing
430 it, and MUST check that the database schema matches a schema they
431 understand. Once a router has a valid database it MUST store that
432 database in some sort of non-volatile memory (e.g., disk or flash
433 memory).
435 N.B., the host component for such URIs MUST NOT resolve to a LISP
436 EID, lest a circular dependency be created.
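A minimal Python sketch of the bootstrap URI construction follows. It applies the stated rule literally (database name, then "/current/entiredb", appended to the Base Distribution URI), which also matches the path layout used for change files in Section 4.2. The function name bootstrap_uri is hypothetical.

```python
def bootstrap_uri(base_uri: str, db_name: str) -> str:
    """Build the URI of the complete database for one database name."""
    # Strip any trailing slash so the join never produces "//".
    return f"{base_uri.rstrip('/')}/{db_name}/current/entiredb"


uri = bootstrap_uri("http://www.example.com/eiddb/", "nerd.arin.net")
```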
438 4.2. Retrieving Changes
440 In order to retrieve a set of database changes an ITR will have
441 previously retrieved the entire database. Hence it knows the current
442 version of the database it has. Its first step for retrieving
443 changes is to retrieve the current version of the database. It does
444 so by appending "current/version" to the base distribution URI and
445 retrieving the file. Its format is text and it contains the integer
446 value of the current database version.
448 Once an ITR has retrieved the current version, it compares it with
449 the version of its local copy. If they match, the router is up to
450 date and need take no further action until it next checks.
452 If the versions differ, the router next sends a request for the
453 appropriate change file by appending the database name,
454 "/current/changes/", and the textual representation of the version of
455 its local copy of the database to the base distribution URI. For
456 example, if the current version of the database is 1105503 and the
457 router's version is 1105500, and the base URI and database name are
458 the same as above, the router would request
459 "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105500".
461 The server may not have that change file, either because there are
462 too many versions between what the router has and what is current, or
463 because no such change file was generated. If the server has changes
464 from the router's version to any later version, the server SHOULD
465 issue an HTTP redirect to that change file, and the router SHOULD
466 retrieve and process it. Once it has done so, the router should then
467 repeat the process until it has brought itself up to date. It is
468 thus important for servers to expire old change files in the order in
469 which they were generated.
471 By way of convention, it is suggested that the URIs issued in
472 redirects be of the following form:
474 {base dist. URI}/{dbname}/{more-recent-version}/changes/
475 {older-version}
477 where "base dist. URI" is the base distribution URI, "dbname" is the
478 name of the database, and each version is the textual representation
479 of the integer version value.
481 For example, if the current database version was 1105503 and a router
482 made a request for
483 "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105400"
484 but there was no change file from 1105400 to 1105503, and the server
485 had a group of change files to make the router current, it would
486 issue a redirect to
487 "http://www.example.com/eiddb/nerd.arin.net/1105450/changes/1105400"
488 that the router would then process. The router would then make a
489 request for
490 "http://www.example.com/eiddb/nerd.arin.net/current/changes/1105450"
491 that the server would have.
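The repeat-until-current behavior can be modeled without any networking. In the Python sketch below, the server's stock of change files is reduced to a dictionary mapping an older version to the newer version that its change file produces; following that mapping plays the role of following redirects. The function name catch_up, the dictionary model, and the small version numbers are illustrative assumptions, and version wrap is ignored.

```python
from typing import Dict, List, Optional, Tuple


def catch_up(
    local: int, current: int, change_files: Dict[int, int]
) -> Optional[List[Tuple[int, int]]]:
    """Return the (older, newer) change files applied, in order.

    None means no usable chain of change files exists, in which case
    the router would retrieve the entire database instead.
    """
    applied = []
    while local != current:
        newer = change_files.get(local)
        if newer is None or newer <= local:
            return None
        applied.append((local, newer))
        local = newer
    return applied


# The server holds 100 -> 150 and 150 -> 203, but no direct 100 -> 203.
steps = catch_up(100, 203, {100: 150, 150: 203})
```

In this model a server that expired the 100 -> 150 file while keeping 150 -> 203 would leave a router at version 100 with no path forward, which is why expiring change files in generation order matters.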
493 While it is unlikely that database versions would wrap, as they
494 consist of 32-bit integers, should the event occur, ITRs MUST
495 attempt first to retrieve a change file when their current version
496 number is within 10,000 of 2^32 and they see a version available that
497 is less than 10,000. Barring the availability of a change file, the
498 ITR MUST still assume that the database version has wrapped and
499 retrieve a new copy.
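The wrap test can be written as a single comparison, sketched here in Python. Treating "within 10,000 of 2^32" as local_version >= 2^32 - 10,000 is an assumption about the boundary, and the names are hypothetical.

```python
WRAP_WINDOW = 10_000
MODULUS = 2**32


def version_has_wrapped(local_version: int, advertised_version: int) -> bool:
    """True when the advertised version should be read as a wrapped counter."""
    return (
        local_version >= MODULUS - WRAP_WINDOW
        and advertised_version < WRAP_WINDOW
    )
```

When this returns True, the ITR first tries the change file for its local version and, if none is available, retrieves a new copy of the database.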
501 5. Analysis
503 We will start our analysis by looking at how much data will be
504 transferred to a router during bootstrap conditions. We will then
505 look at the bandwidth required. Next we will turn our concerns to
506 servers. Finally we will ponder the effect of providing only
507 changes.
509 In the analysis below we treat the overhead of the database header as
510 insignificant (because it is). The analysis should be similar,
511 whether a single database or multiple databases are employed, as we
512 would assume that no entry would appear more than once.
514 5.1. Database Size
516 By its very nature the information to be transported is relatively
517 static and is specifically designed to be topologically insensitive.
518 That is, every ITR is intended to have the same set of RLOCs for a
519 given EID. While some processing power will be necessary to install
520 a table, the amount required should be far less than that of a
521 routing information database because the level of entropy is intended
522 to be lower.
524 For purposes of this analysis, we will assume that the world has
525 migrated to IPv6, as this increases the size of the database, which
526 would be our primary concern. However, to mitigate the size
527 increase, we have limited the size of the prefix transmitted. For
528 purposes of this analysis, we shall assume an average prefix length
529 of 64 bits.
531 Based on that assumption, Section 3.1 states that mapping information
532 for each EID/Prefix includes a group of RLOCs, each with an
533 associated priority and weight, and that a minimum record size with
534 IPv6 EIDs with at least one RLOC is 24 bytes uncompressed. Each
535 additional IPv6 RLOC costs 12 bytes (again, assuming an average
536 prefix length of 64 bits).
538 +-----------+--------+--------+---------+
539 | 10^n EIDs | 2 RLOC | 4 RLOC | 8 RLOC  |
540 +-----------+--------+--------+---------+
541 | 4         | 360 KB | 600 KB | 1.08 MB |
542 | 5         | 3.6 MB | 6.0 MB | 10.8 MB |
543 | 6         | 36 MB  | 60 MB  | 108 MB  |
544 | 7         | 360 MB | 600 MB | 1.08 GB |
545 | 8         | 3.6 GB | 6.0 GB | 10.8 GB |
546 +-----------+--------+--------+---------+
548 Database size for IPv6 routes with average prefix length = 64 bits
550 Table 1
552 Entries in the above table are derived as follows:
554 E * (24 + 12 * (R -1 ))
556 where E = number of EIDs (10^n), R = number of RLOCs per EID.
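The formula can be evaluated directly; the Python sketch below reproduces entries of Table 1. The function name database_size_bytes is hypothetical, and sizes are in decimal units (1 KB = 1,000 bytes), as the table's figures imply.

```python
def database_size_bytes(num_eids: int, rlocs_per_eid: int) -> int:
    """E * (24 + 12 * (R - 1)) for IPv6 with a 64-bit average prefix."""
    return num_eids * (24 + 12 * (rlocs_per_eid - 1))


smallest = database_size_bytes(10**4, 2)  # 360,000 bytes, i.e. 360 KB
largest = database_size_bytes(10**8, 8)   # 10,800,000,000 bytes, i.e. 10.8 GB
```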
558 Our scaling target is to accommodate 10^8 multihomed systems, which
559 is one order of magnitude greater than what is discussed in [9]. At
560 10^8 entries, a device could be expected to use between 3.6 and 10.8
561 gigabytes of RAM for the mapping. No matter the method of
562 distribution, any router that sits in the core of the Internet would
563 require near this amount of memory in order to perform the ITR
564 function. Large enterprise ETRs would be similarly strained, simply
565 due to the diversity of sites that communicate with one another.
566 The good news is that this is not our starting point, but rather our
567 scaling target, a number that we intend to reach by the year 2050.
568 Our starting point is more likely in the neighborhood of 10^4 or 10^5
569 EIDs, thus requiring between 360 KB and 10.8 MB.
571 5.2. Router Throughput Versus Time
573 +-------------------------+---------+---------+----------+--------+
574 | Table Size (10^N bytes) | 1 Mb/s  | 10 Mb/s | 100 Mb/s | 1 Gb/s |
575 +-------------------------+---------+---------+----------+--------+
576 | 6                       | 8       | 0.8     | 0.08     | 0.008  |
577 | 7                       | 80      | 8       | 0.8      | 0.08   |
578 | 8                       | 800     | 80      | 8        | 0.8    |
579 | 9                       | 8,000   | 800     | 80       | 8      |
580 | 10                      | 80,000  | 8,000   | 800      | 80     |
581 | 11                      | 800,000 | 80,000  | 8,000    | 800    |
582 +-------------------------+---------+---------+----------+--------+
584 Number of seconds to process NERD
586 Table 2
588 The length of time it takes to process the database is significant in
589 models where the device acquires the entire table. During this
590 period of time, either the router will be unable to route packets
591 using LISP or it must use some sort of query mechanism for specific
592 EIDs while it populates the rest of its table through the transfer.
593 Table 2 shows us that at our scaling target (roughly 10^10 bytes),
594 the length of time it would take for a router using 1 Mb/s of
595 bandwidth is about 80,000 seconds, or nearly a day. We can measure
596 the processing time in small numbers of hours for any transfer speed
597 greater than that. At the fastest rate shown, 1 Gb/s, it takes 8
598 seconds to process a table of 10^9 bytes and 80 seconds for 10^10 bytes.
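The arithmetic behind Table 2 is a single division. The sketch below assumes decimal units (1 Mb/s = 10^6 bits per second) and no protocol overhead; the function name is hypothetical.

```python
def processing_seconds(table_bytes: int, bits_per_second: float) -> float:
    """Seconds needed to move table_bytes over a link of the given bit rate."""
    if bits_per_second <= 0:
        raise ValueError("bit rate must be positive")
    return table_bytes * 8 / bits_per_second


if __name__ == "__main__":
    rates = (1e6, 1e7, 1e8, 1e9)  # 1 Mb/s, 10 Mb/s, 100 Mb/s, 1 Gb/s
    for n in range(6, 12):
        print(n, [processing_seconds(10 ** n, r) for r in rates])
```

A table of 10^10 bytes at 1 Mb/s works out to 80,000 seconds, matching the corresponding row of the table.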
600 5.3. Number of Servers Required
602 As easy as it may be for a router to retrieve, the aggregate
603 information may be difficult for servers to transmit, assuming the
604 information is transmitted in aggregate (we'll revisit that
605 assumption later).
607 +----------------+------------+-----------+------------+------------+
608 | # Simultaneous | 10 Servers | 100       | 1,000      | 10,000     |
609 | Requests       |            | Servers   | Servers    | Servers    |
610 +----------------+------------+-----------+------------+------------+
611 | 100            | 480        | 48        | 48         | 48         |
612 | 1,000          | 4,800      | 480       | 48         | 48         |
613 | 10,000         | 48,000     | 4,800     | 480        | 48         |
614 | 100,000        | 480,000    | 48,000    | 4,800      | 480        |
615 | 1,000,000      | 4,800,000  | 480,000   | 48,000     | 4,800      |
616 | 10,000,000     | 48,000,000 | 4,800,000 | 480,000    | 48,000     |
617 +----------------+------------+-----------+------------+------------+
619 Retrieval time per number of servers in seconds. Assumes 10^8
620 entries averaging 4 RLOCs per EID, 1 Gb/s available to each server,
621 100% efficient use of that bandwidth, and no compression.
623 Table 3
625 Entries in the above table were generated using the following method:
627 For 10^8 entries with four RLOCs per EID, the table size is 6.0 GB,
628 per our previous table. Assume 1 Gb/s transfer rates and 100%
629 utilization. Protocol overhead is ignored for this exercise. Hence
630 a single transfer X takes 48 seconds and can get no faster.
632 With this in mind, each entry is as follows:
634 max(1X,N*X/S)
636 where N=number of transfers, X = 48 seconds,
637 and S = number of servers.
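As a quick check of the method, the following sketch evaluates max(1X, N*X/S) directly. The function name and the default of 48 seconds per transfer are taken from the assumptions stated above (6.0 GB at 1 Gb/s, no overhead); nothing here is part of the NERD format itself.

```python
def retrieval_seconds(requests: int, servers: int,
                      single_transfer: float = 48.0) -> float:
    """Time for `requests` simultaneous transfers spread over `servers`.

    A transfer can never finish faster than one full transfer (X), so the
    result is bounded below by single_transfer.
    """
    if requests < 1 or servers < 1:
        raise ValueError("requests and servers must be at least 1")
    return max(single_transfer, requests * single_transfer / servers)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000):
        print(n, [retrieval_seconds(n, s) for s in (10, 100, 1_000, 10_000)])
```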
639 If we have a distribution model in which every device must retrieve the
640 mapping information upon start, Table 3 shows the length of time in
641 seconds it will take for a given number of servers to complete a
642 transfer to a given number of devices. This table says, as an
643 example, that it would take 48,000 seconds (over 13 hours) for one
644 million ITRs to simultaneously retrieve the database from one
645 thousand servers. Should a cold start scenario occur, this number
646 should be of some concern. Hence it is important to take some
647 measures both to avoid such a scenario, and to ease the load should
648 it occur. The primary defense should be for ITRs to first attempt to
649 retrieve their databases from their peers or upstream providers.
650 Secondary defenses could include data sanity checks within ITRs, with
651 agreed norms for how much the database should change in any given
652 update or over any given period of time. As we will see below,
653 dissemination of changes involves considerably less volume.
655 +----------------+-------------+---------------+----------------+
656 | % Daily Change | 100 Servers | 1,000 Servers | 10,000 Servers |
657 +----------------+-------------+---------------+----------------+
658 | 0.1%           | 200         | 20            | 2              |
659 | 0.5%           | 1,000       | 100           | 10             |
660 | 1%             | 2,000       | 200           | 20             |
661 | 5%             | 10,000      | 1,000         | 100            |
662 | 10%            | 20,000      | 2,000         | 200            |
663 +----------------+-------------+---------------+----------------+
665 Assuming 10 million routers and a database size of 6 GB, resulting
666 hourly transfer times are shown in seconds, given number of servers
667 and daily rate of change.
669 Table 4
671 This table shows us that with 10,000 servers the average transfer
672 time with 1 Gb/s links for 10,000,000 routers will be 200 seconds with
673 10% daily change spread over 24 hourly updates. For a 0.1% daily
674 change, that number is 2 seconds for a database of size 6.0 GB.
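The Table 4 figures follow from the same kind of arithmetic. The sketch below is one way to reproduce them, assuming (as the table appears to) that the daily change is split evenly over 24 hourly updates, that routers are spread evenly across servers, and that each server sends its updates one after another at 1 Gb/s. The function and parameter names are illustrative.

```python
def hourly_update_seconds(daily_change: float, servers: int,
                          routers: int = 10_000_000,
                          db_bytes: int = 6_000_000_000,
                          link_bps: float = 1e9,
                          updates_per_day: int = 24) -> float:
    """Seconds per hour a server spends sending one hourly update to its routers."""
    update_bytes = db_bytes * daily_change / updates_per_day  # size of one hourly diff
    per_router = update_bytes * 8 / link_bps                  # one transfer at line rate
    return per_router * routers / servers                     # routers served in sequence


if __name__ == "__main__":
    for pct in (0.001, 0.005, 0.01, 0.05, 0.10):
        print(pct, [hourly_update_seconds(pct, s) for s in (100, 1_000, 10_000)])
```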
676 The amount of change goes to the purpose of LISP. If its purpose is
677 to provide effective multihoming support to end customers, then we
678 might anticipate relatively random changes. If, on the other hand,
679 service providers attempt to make use of LISP to provide some form of
680 traffic engineering, we can expect the same data to change more
681 often. We can probably not conclude much in this regard without
682 additional operational experience. The one thing we can say is that
683 different applications of the LISP protocol may require new and
684 different distribution mechanisms. Such optimization is left for
685 another day.
687 5.4. Security Considerations
689 Whatever the answer to our previous question, we must consider the
690 security of the information being transported. If an attacker can
691 forge an update or tamper with the database, he can in effect
692 redirect traffic to end sites. Hence, integrity and authenticity of
693 the NERD are critical. In addition, a means is required to determine
694 whether a source is authorized to modify a given database. No data
695 privacy is required. Quite to the contrary, this information will be
696 necessary for any ITR.
698 The first question one must ask is who to trust to provide the ITR a
699 mapping. Ultimately the owner of the EID prefix is most
700 authoritative for the mapping to RLOCs. However, were all owners to
701 sign all such mappings, ITRs would need to know which owner is
702 authorized to modify which mapping, creating a problem of O(N^2)
703 complexity.
705 We can reduce this problem substantially by investing some trust in a
706 small number of entities that are allowed to sign entries. If an
707 authority manages EIDs much the same way a domain name registrar
708 handles domains, then the owner of the EID would choose a database
709 authority she or he trusts, and ITRs must trust each such authority
710 in order to map the EIDs listed by that authority to RLOCs. This
711 reduces the amount of management complexity on the ETR to retaining
712 knowledge of O(#authorities), but does require that each authority
713 establish procedures for authenticating the owner of an EID. Those
714 procedures needn't be the same.
716 There are two classic methods to ensure integrity of data:
718 o secure transport of the source of the data to the consumer, such
719 as Transport Layer Security (TLS) [7]; and
720 o provide object level security.
722 These methods are not mutually exclusive, although one can argue
723 about the need for the former, given the latter.
725 In the case of TLS, when it is properly implemented, the objects
726 being transported cannot easily be modified by interlopers or so-
727 called men in the middle. When data objects are distributed to
728 multiple servers, each of those servers must be trusted. As we have
729 seen above, we could have quite a large number of servers, thus
730 providing an attacker a large number of targets. We conclude that
731 some form of object level security is required.
733 Object level security involves an authority signing an object in a
734 way that can easily be verified by a consumer, in this case a router.
735 In this case, we would want the mapping table and any incremental
736 update to be signed by the originator of the update. This implies
737 that we cannot simply make use of a tool like CVS [10]. Instead, the
738 originator will want to generate diffs, sign them, and make them
739 available either directly or through some sort of content
740 distribution or peer to peer network.
742 5.4.1. Use of Public Key Infrastructures (PKIs)
744 X.509 provides a certificate hierarchy that has scaled to the size of
745 the Internet. The system is most manageable when there are few
746 certificates to manage. The model proposed in this memo makes use of
747 one current certificate per database authority. The three pieces of
748 information necessary to verify a signature, therefore, are as
749 follows:
751 o the certificate of the database authority, which can be provided
752 along with the database;
753 o the certificate authority's certificate; and
754 o A table of database names and distinguished names (DNs) that are
755 allowed to update them.
757 The latter two pieces of information must be very well known and must
758 be configured on each ITR. It is expected that both would change
759 very rarely, and it would not be unreasonable for such updates to
760 occur as part of a normal OS release process.
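The third item, the table of database names and the distinguished names allowed to update them, amounts to a small lookup. The following is a minimal sketch only: the database name and DN shown are invented for illustration, and exact string comparison stands in for proper DN matching, which a real implementation would need to do according to X.509 rules.

```python
from typing import Dict, Optional, Set

# Hypothetical configuration: database name -> DNs permitted to sign updates.
AUTHORIZED_SIGNERS: Dict[str, Set[str]] = {
    "nerd.example.net": {"CN=nerd.example.net,O=Example Database Authority"},
}


def signer_is_authorized(db_name: str, signer_dn: str,
                         table: Optional[Dict[str, Set[str]]] = None) -> bool:
    """Return True if signer_dn is listed as an updater for db_name.

    Simplification: DNs are compared as plain strings.
    """
    table = AUTHORIZED_SIGNERS if table is None else table
    return signer_dn in table.get(db_name, set())
```

An ITR would consult such a table only after the signature and certificate chain have themselves been verified.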
762 The tools for both signing and verifying are readily available.
763 OpenSSL [19] provides tools and libraries for both signing and
764 verifying. Other such tools are commonly available.
766 Use of PKIs is not without implementation complexity, operational
767 complexity, or risk. The following risks and mitigations are
768 identified with NERD's use of PKIs:
770 If a NERD database authority private key is exposed:
772 In this case an attacker could sign a false database update,
773 either redirecting traffic, or otherwise causing havoc. In this
774 case, the NERD database administrator must revoke its existing key
775 and issue a new one. The certificate is added to a certificate
776 revocation list (CRL), which may be distributed with both this and
777 other databases, as well as through other channels. Because this
778 event is expected to be rare, and the number of database
779 authorities is expected to be small, a CRL will be small. When a
780 router receives a revocation, it checks it against its existing
781 databases, and attempts to update the one that is revoked. This
782 implies that prior to issuing the revocation, the database
783 authority MUST sign an update with the new key. Routers SHOULD
784 discard updates they have already received that were signed after
785 the revocation was generated. If a router cannot confirm whether
786 the authority's certificate was revoked before or after a
787 particular update, it MUST retrieve a fresh copy of the
788 database with a valid signature.
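The router-side decision described here can be sketched as follows. This is only an illustration of the rule for updates signed with the revoked key: the names Action and handle_update are ours, and how a router learns the signing and revocation times is left open by this memo.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"        # update predates the revocation (or none is known)
    DISCARD = "discard"  # update was signed after the revocation was generated
    REFETCH = "refetch"  # ordering unknown: retrieve a fresh, validly signed copy


def handle_update(signed_at: Optional[datetime],
                  revoked_at: Optional[datetime]) -> Action:
    """Decide what to do with an already-received update signed by a given key."""
    if revoked_at is None:
        return Action.KEEP      # no revocation known for this key
    if signed_at is None:
        return Action.REFETCH   # cannot tell before from after
    if signed_at > revoked_at:
        return Action.DISCARD
    return Action.KEEP
```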
790 The private key associated with the CA that signed the Authority's
791 certificate is compromised:
793 In this case, it becomes possible for an attacker to masquerade as
794 the database authority. To ameliorate damage, the database
795 authority SHOULD revoke its certificate and get a new certificate
796 issued from a CA that is not compromised. Once it has done so,
797 the previous procedure is followed. The compromised certificate
798 can be removed during the normal operating system upgrade cycle.
800 An algorithm used in either the certificate or the signature is
801 cracked:
803 This is a catastrophic failure and the above forms of attack
804 become possible. The only mitigation is to make use of a new
805 algorithm. In theory this should be possible, but in practice has
806 proven very difficult. For this reason, additional work is
807 recommended to make alternative algorithms available.
809 The Database Authority loses its key or disappears:
811 In this case nobody can update the existing database. There are
812 few programmatic mitigations. If the database authority places
813 its private keys and suitable amounts of information in escrow, then
814 under agreed-upon circumstances (for example, no updates for three
815 days) the escrow agent would release the information to a party
816 capable of generating a database update.
818 5.4.2. Other Risks
820 Because this specification does not require secure transport, if an
821 attacker prevents updates to an ITR for the purposes of having that
822 ITR continue to use a compromised ETR, the ITR could continue to use
823 an old version of the database without realizing a new version has
824 been made available. If one is worried about such an attack, a
825 secure channel (such as SSL) with a secure chain back to the database
826 authority should be used. It is possible that after some operational
827 experience, later versions of this format will contain additional
828 semantics to address this attack.
830 As discussed above, a substantial risk would be a cold start scenario.
831 If an attacker found a bug in a common operating system that allowed
832 it to erase an ITR's database, and was able to disseminate that bug,
833 the collective ability of ITRs to retrieve new copies of the database
834 could be taxed by collective demand. The remedy to this is for
835 devices to share copies of the database with their neighbors, thus
836 making each potential requestor a potential service.
838 6. Why not use XML?
840 Many objects these days are distributed as either XML pages or
841 something derived from XML [16], such as SOAP [17],[18]. Use of such
842 well known standards allows for high level tools and library reuse.
843 Why not, then, use these standards in this case? There are two
844 answers to this question. First, the obvious concern is that XML is
845 not known for efficiency of data transport. Being based in text, each
846 octet of an IPv4 address is expanded to as many as three octets, plus
847 either an attribute and quotes or element tags and end tags. Let us presume
848 for the moment a very simple schema that might cause a record to be
849 represented as follows:
851
852
853
854 192.168.1.1
855
856
857
858
859 192.168.1.2
860
861
862
864 With white space removed the uncompressed XML represents 120 bytes
865 versus 20 bytes for the record specified in Section 3.1, representing
866 a five fold expansion. That brings our 920MB database to 4.6GB.
868 The other concern about XML is that version 1.0 of the specification
869 is silent on the order of sibling elements. Specifications other
870 than the base specification state that order is significant. Order
871 is significant to LISP and NERD because once an update is applied to
872 the database it should be possible to verify the signature of the
873 entire database. Prior to applying the signature the XML generator
874 would need to ensure the order of information. That same sort would
875 be required of the router. This seems to add unnecessary fragility
876 to a critical system without much benefit. While there may indeed be
877 uses of an XML representation of the database, these uses are likely
878 to be outside of a router.
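The ordering concern can be illustrated without any XML at all. In the sketch below, SHA-256 stands in for whatever digest the signature covers, and the records are opaque byte strings rather than the record format of this memo; both are assumptions made for the example. The same records in a different order produce a different digest, so both the generator and the router must agree on a canonical order before a signature over the whole database can be checked.

```python
import hashlib
from typing import Iterable, List


def digest(records: Iterable[bytes]) -> str:
    """Hash records in the order given, length-prefixing each one."""
    h = hashlib.sha256()
    for rec in records:
        h.update(len(rec).to_bytes(4, "big"))  # avoids ambiguity between record boundaries
        h.update(rec)
    return h.hexdigest()


def canonical_digest(records: List[bytes]) -> str:
    """Hash records after sorting them, so that input order no longer matters."""
    return digest(sorted(records))
```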
880 7. Other Distribution Mechanisms
882 We now consider various different mechanisms. The problem of
883 distributing changes in various databases is as old as databases.
884 The author is aware of two obvious approaches that have been well
885 used in the past. One approach would be the wide distribution of CVS
886 repositories. However, for the reasons given in Section 5.4,
887 CVS is insufficient to the task.
889 The other tried and true approach is the use of periodic updates in
890 the form of messages. Good old NNTP [12] itself provides two
891 separate mechanisms (one push and another pull) to provide a coherent
892 update process. This was in fact used to update molecular biology
893 databases [13] in the early 1990s. Netnews offers a way to determine
894 whether articles with specified Article-Ids have been received. In
895 the case where the mapping file source of authority wishes to
896 transmit updates, it can sign a change file and then post it into the
897 network. Routers merely need to keep a record of the article ids that
898 they have received. Initially this is probably overkill, but it may not be
899 so later in this process. Some consideration should be given to a
900 mechanism known to widely distribute vast amounts of data, as
901 instantaneously as either the sender or the receiver wishes.
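The bookkeeping a router would need in such a scheme is small. A minimal sketch, with a hypothetical class name and an in-memory set where a real router would presumably use persistent storage:

```python
class UpdateTracker:
    """Remembers which article ids have already been received."""

    def __init__(self) -> None:
        self._seen = set()

    def should_apply(self, article_id: str) -> bool:
        """Return True the first time an id is seen, False on any repeat."""
        if article_id in self._seen:
            return False
        self._seen.add(article_id)
        return True
```

Signature verification of the change file would still be required before an update is applied; the tracker only suppresses duplicates.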
903 To attain an additional level of hierarchy in the distribution
904 network, service providers could retrieve information to their own
905 local servers, and configure their routers with the host portion of
906 the above URI.
908 Another possibility would be for providers to establish an agreement
909 on a small set of anycast addresses for use for this purpose. There
910 are limitations to the use of anycast, particularly with TCP. In the
911 midst of a routing flap, an anycast address can become all but unusable.
912 Careful study of such a use as well as appropriate use of HTTP
913 redirects is expected.
915 7.1. What About DNS as a retrieval model?
917 It has been proposed that a query/response mechanism be used for this
918 information, and that specifically the domain name system (DNS) [15]
919 be used. The previous models do not preclude the DNS. DNS has the
920 advantage that the administrative lines are well drawn, and that the
921 ID/RLOC mapping is likely to appear very close to these boundaries.
922 DNS also has the added benefit that an entire distribution
923 infrastructure already exists. There are, however, some problems
924 that could impact end hosts when intermediate routers make queries,
925 some of which were first pointed out in [14]:
927 o Any query mechanism offers an opportunity for a resource attack if
928 an attacker can force the ITR to query for information. In this
929 case, all that would be necessary would be for a "botnet" (a group
930 of computers that have been compromised and used as vehicles to
931 attack others) to ping or otherwise contact via some normal
932 service hosts that sit behind the ETR. If the botnet hosts
933 themselves are behind ETRs, the victim's ITR will need to query
934 for each and every one of them, thus becoming part of a classic
935 reflector attack.
936 o Packets will be delayed at the very least, and probably dropped in
937 the process of a mapping query. This could be at the beginning of
938 a communication, but it will be impossible for a router to
939 conclude with certainty that this is the case.
940 o The DNS has a backoff algorithm that presumes that applications
941 are making queries prior to the beginning of a communication.
942 This is appropriate for end hosts who know in fact when a
943 communication begins. An end user may not enjoy a router waiting
944 seconds for a retry.
945 o While the administrative lines may appear to be correct, the
946 location of name servers may not be. If name servers sit within
947 PI address space, thus requiring LISP to reach, a circular
948 dependency is created. This is precisely where many enterprise
949 name servers sit. The LISP experiment should not predicate its
950 success on relocation of such name servers.
952 Nevertheless, DNS may be able to play a role in providing the
953 enterprise control over the mapping of its EIDs to RLOCs. Posit a
954 new DNS record "EID2RLOC". This record is used by the authority to
955 collect and aggregate mapping information so that it may be
956 distributed through one of the other mechanisms. As an example:
958 $ORIGIN 0.10.PI-SPACE.
959 128 EID2RLOC mask 23 priority 10 weight 5 172.16.5.60
960 EID2RLOC mask 23 priority 15 weight 5 192.168.1.5
962 In the above figure network 10.0.128/23 would be delegated to some end
963 system, say EXAMPLE.COM. They would manage the above zone
964 information. This would allow a DNS mechanism to work, but it would
965 also allow someone to aggregate the information and distribute a
966 table.
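To make the aggregation step concrete, the sketch below parses the presentation form of the posited record as it appears in the example above. EID2RLOC is not a defined DNS record type, so the field layout (the literal keywords mask, priority and weight followed by an IPv4 RLOC) is taken from the example alone, and the class and function names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Eid2Rloc:
    mask: int
    priority: int
    weight: int
    rloc: ipaddress.IPv4Address


def parse_eid2rloc(rdata: str) -> Eid2Rloc:
    """Parse e.g. 'EID2RLOC mask 23 priority 10 weight 5 172.16.5.60'."""
    tokens = rdata.split()
    if len(tokens) != 8 or tokens[0] != "EID2RLOC":
        raise ValueError("not an EID2RLOC record: %r" % rdata)
    if (tokens[1], tokens[3], tokens[5]) != ("mask", "priority", "weight"):
        raise ValueError("unexpected field names in %r" % rdata)
    return Eid2Rloc(mask=int(tokens[2]), priority=int(tokens[4]),
                    weight=int(tokens[6]),
                    rloc=ipaddress.IPv4Address(tokens[7]))
```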
968 7.1.1. Perhaps use a hybrid model?
970 It would be possible to use both a prepopulated database such as NERD
971 and a query mechanism (perhaps DNS) to determine an EID/RLOC mapping.
972 The general idea would be to receive a subset of the mappings, say,
973 by taking only the NERD for certain regions. This alleviates the
974 need to drop packets for some subset of destinations under the
975 assumption that one's business is localized to a particular region.
976 If one did not have a local entry for a particular EID one would then
977 make a query.
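The hybrid lookup can be sketched as a local longest-prefix match with a fallback. The names below are illustrative, the local table is a plain dictionary rather than any structure defined by NERD, and the query callable stands in for whatever mechanism (perhaps DNS) is used for destinations outside the locally held subset.

```python
import ipaddress
from typing import Callable, Dict, List, Optional

Network = ipaddress.IPv4Network
LocalTable = Dict[Network, List[str]]


def lookup_rlocs(dest: str, local: LocalTable,
                 query: Callable[[str], Optional[List[str]]]) -> Optional[List[str]]:
    """Return RLOCs for dest from the local table, else fall back to a query."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in local if addr in net]
    if matches:
        best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
        return local[best]
    return query(dest)  # no local entry: this is where packets may be delayed
```

The linear scan is for clarity only; a router would use a trie or similar structure for the local table.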
979 One improvement on simply using DNS to query live would be to
980 periodically walk the entire network in search of EID2RLOC records
981 and cache them to non-volatile storage. This has two benefits.
983 First, it prevents resource attacks. Care has to be given to how
984 entries are cached in memory, to avoid an attacker causing a performance
985 degradation by attempting to exceed memory limits through a random
986 source attack.
988 Second, and as important as resisting attacks, having a complete or near complete
989 copy of the database provides for a faster recovery time when a
990 router goes out of service, for whatever reason. Absent such a
991 mechanism, devices would need to repopulate their local caches
992 through the help of another system, leading to additional system
993 fragility.
995 7.2. Use of BGP
997 Border Gateway Protocol (BGP) [8] is currently used to distribute
998 inter-domain routing throughout the Internet. Why not, then, use BGP
999 to distribute the mapping table? A simple answer is that the objects
1000 BGP best handles are routes. While it may be possible to transmit
1001 EID/RLOC mappings instead (because they look an awful lot like
1002 routes) the rate of updates of EID/RLOC mappings is specifically
1003 intended to be considerably lower than that of routes, and BGP would
1004 probably require additional dampening mechanisms to ensure that this is so.
1006 In addition, the ownership of the mapping does not flow from service
1007 providers but rather from end users of the identifiers. It should
1008 not be possible for anyone to filter the mapping, other than perhaps
1009 ITRs for local policy purposes. The current limited security model
1010 for BGP does not fit the general requirements of how the mapping is
1011 to be processed.
1013 Furthermore, as BGP is currently the lifeblood of the Internet, its
1014 use for any purpose other than routing should be strongly scrutinized.
1016 This is not to say that BGP has no role to play whatsoever. It may
1017 well be possible for routers to exchange database version numbers and
1018 perhaps base distribution URIs as extensions or capabilities. This
1019 would allow routers to serve their copy of the database to their
1020 neighbors, easing the load off the rest of the server infrastructure.
1021 How this would be done is future work.
1023 8. Deployment Issues
1025 While LISP and NERD are intended as experiments at this point, it is
1026 already obvious one must give serious consideration to circular
1027 dependencies with regard to the protocols used and the elements
1028 within them.
1030 8.1. HTTP
1032 In Section 7.1 we have already seen how DNS can have circular
1033 dependencies. Inasmuch as HTTP depends on DNS, either due to the
1034 authority section of a URI, or due to the configured base
1035 distribution URI, these same concerns apply. In addition, any HTTP
1036 server that itself makes use of provider independent addresses would
1037 be a poor choice to distribute the database for these exact same
1038 reasons.
1040 One issue with using HTTP is that it is possible that a middlebox of
1041 some form, such as a cache, may intercept and process requests. In
1042 some cases this might be a good thing. For instance, if a cache
1043 correctly returns a database, some amount of bandwidth is conserved.
1044 On the other hand, if the cache itself fails to function properly for
1045 whatever reason, end to end connectivity could be impaired. For
1046 example, if the cache itself depended on the mapping being in place
1047 and functional, a cold start scenario might leave the cache
1048 functioning improperly, in turn providing routers no means to update
1049 their databases. Some care must be given to avoid such
1050 circumstances.
1052 9. Open Questions
1054 Do we need to discuss reachability in more detail? This was clearly
1055 an issue at the IST-RING workshop. There are two key issues. First,
1056 what is the appropriate architectural separation between the data
1057 plane and the control plane? Second, is there some specific way in
1058 which NERD impacts the data plane?
1060 Should the database contain its name? It is probably sufficient to
1061 merely reference the database by name.
1063 Should the signature portion be separated from the actual database?
1064 By specifying the signature we hope to reduce interoperability issues
1065 and encourage proper security from the get go. On the other hand,
1066 since the object is opaque it is not clear how much interoperability
1067 we are actually encouraging.
1069 Should we specify a (perhaps compressed) tarball that treads a middle
1070 ground for the last question, where each update tarball contains both
1071 a signature for the update and one for the entire database, once the
1072 update is applied?
1074 Should we compress? In some initial testing of databases with 1, 5,
1075 and 10 million IPv4 EIDs and a random distribution of IPv4 RLOCs, the
1076 current format in this document compresses down by a factor of
1077 between 35% and 36%, using the Burrows-Wheeler block-sorting text
1078 compression algorithm (bzip2). The NERD used random EIDs with mask
1079 lengths varying from 19-29, with probability weighted toward the
1080 smaller masks. This only very roughly reflects reality. A better
1081 test would be to start with the existing prefixes found in the DFZ.
1083 10. Conclusions
1085 This memo has specified a database format, an update format, a URI
1086 convention, an update method, and a validation method for EID/RLOC
1087 mappings. We have shown that beyond the predictions of 10^7
1088 locators, the aggregate database size would be at most 10.8GB. We
1089 have considered the number of servers needed to distribute that information
1090 and we have demonstrated the limitations of a simple content
1091 distribution network and other well known mechanisms. The effort
1092 required to retrieve a database change amounts to between 2 and 20
1093 seconds of processing time per hour at today's gigabit speeds. We
1094 conclude that there is no need for an off box query mechanism today,
1095 and that there are distinct disadvantages for having such a mechanism
1096 in the control plane.
1098 Beyond this we have examined alternatives that allow for hybrid
1099 models that do use query mechanisms, should our operating assumptions
1100 prove overly optimistic. Use of NERD today does not foreclose use of
1101 such models in the future, and in fact both models can happily co-
1102 exist.
1104 We leave to future work how the list of databases is distributed, how
1105 BGP can play a role in distributing knowledge of the databases, and
1106 how DNS can play a role in aggregating information into these
1107 databases.
1109 We also leave to future work whether HTTP is the best protocol for
1110 the job, and whether the scheme described in this document is the
1111 most efficient. One could easily envision that when applied in high
1112 delay or high loss environments, a broadcast or multicast method may
1113 prove more effective.
1115 11. IANA Considerations
1117 This memo makes no requests of IANA.
1119 12. Acknowledgments
1121 Dino Farinacci, Patrik Faltstrom, Dave Meyer, Joel Halpern, Dave
1122 Thaler, Mohamed Boucadair, Robin Whittle, and Max Pritikin were very
1123 helpful with their reviews of this work. Thanks also to the
1124 participants of the Routing Research Group and the IST-RING workshop
1125 held in Madrid in December of 2007 for their incisive comments. The
1126 astute will notice a lengthy References section. This work stands on
1127 the shoulders of many others' efforts.
1129 13. References
1131 13.1. Normative References
1133 [1] Farinacci, D., "Locator/ID Separation Protocol (LISP)",
1134 draft-farinacci-lisp-03 (work in progress), August 2007.
1136 [2] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
1137 Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
1138 HTTP/1.1", RFC 2616, June 1999.
1140 [3] Bradner, S., "Key words for use in RFCs to Indicate Requirement
1141 Levels", BCP 14, RFC 2119, March 1997.
1143 [4] Kaliski, B., "PKCS #7: Cryptographic Message Syntax Version
1144 1.5", RFC 2315, March 1998.
1146 [5] International Telecommunications Union, "Information technology
1147 - Open Systems Interconnection - The Directory: Public-key and
1148 attribute certificate frameworks", ITU-T Recommendation X.509,
1149 ISO Standard 9594-8, March 2000.
1151 [6] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1152 Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
1153 January 2005.
1155 13.2. Informational References
1157 [7] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS)
1158 Protocol Version 1.1", RFC 4346, April 2006.
1160 [8] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4
1161 (BGP-4)", RFC 4271, January 2006.
1163 [9] Carpenter, B., "IETF Plenary Presentation: Routing and
1164 Addressing: Where we are today", March 2007.
1166 [10] Grune, R., Baalbergen, E., Waage, M., Berliner, B., and J.
1167 Polk, "CVS: Concurrent Versions System", November 1985.
1169 [11] International Telephone and Telegraph
1170 Consultative Committee, "Information Technology - Open Systems
1171 Interconnection - The Directory: Authentication Framework",
1172 CCITT Recommendation X.509, November 1988.
1174 [12] Kantor, B. and P. Lapsley, "Network News Transfer Protocol",
1175 RFC 977, February 1986.
1177 [13] Smith, R., Gottesman, Y., Hobbs, B., Lear, E., Kristofferson,
1178 D., Benton, D., and P. Smith, "A mechanism for maintaining an
1179 up-to-date GenBank database via Usenet", CABIOS , April 1991.
1181 [14] Huitema, C., "An Experiment in DNS Based IP Routing", RFC 1383,
1182 December 1992.
1184 [15] Mockapetris, P., "Domain names - concepts and facilities",
1185 STD 13, RFC 1034, November 1987.
1187 [16] Bray, T., Paoli, J., Sperberg-McQueen, C., and E. Maler,
1188 "Extensible Markup Language (XML) 1.0 (2nd ed)", W3C REC-xml,
1189 October 2000.
1191 [17] Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1192 Nielsen, "SOAP Version 1.2 Part 1: Messaging Framework", W3C
1193 Working Draft soap12-part1, June 2002.
1196 [18] Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
1197 Nielsen, "SOAP Version 1.2 Part 2: Adjuncts", W3C Working
1198 Draft soap12-part2, June 2002.
1201 URIs
1203 [19]
1205 Appendix A. Generating and verifying the database signature with
1206 OpenSSL
1208 As previously mentioned, one goal of NERD was to use off-the-shelf
1209 tools to both generate and retrieve the database. To many, PKI is
1210 magic. This section is meant to provide at least some clarification
1211 as to both the generation and verification process, complete with
1212 command line examples. Not included is how you get the entries
1213 themselves. We'll assume they exist, and that you're just trying to
1214 sign the database.
1216 To sign the database, you first need a database file that
1217 has the database header described in Section 3. Block size should be
1218 zero, and there should be no PKCS#7 block at this point. You also
1219 need a certificate and its private key with which you will sign the
1220 database.
1222 The OpenSSL "smime" command contains all the functions we need from
1223 this point forth. To sign the database, issue the following command:
1225 openssl smime -binary -sign -outform DER -signer yourcert.crt \
1226 -inkey yourcert.key -in database-file -out signature
1228 -binary states that no MIME canonicalization should be performed.
1229 -sign indicates that you are signing the file that was given as the
1230 argument to -in. The output format (-outform) is binary DER, and
1231 your public certificate is provided with -signer along with your key
1232 with -inkey. The file that receives the signature is named with -out.
1234 The resulting file "signature" is then copied into the PKCS#7 block in
1235 the database header, its size in bytes is recorded in the PKCS#7
1236 block size field, and the resulting file is ready for distribution to
1237 ITRs.
1239 To verify a database file, first retrieve the PKCS#7 block from the
1240 file by copying the appropriate number of bytes into another file,
1241 say "signature". Then zero this field, and set the block size field
1242 to 0. Next use the "smime" command to verify the signature as
1243 follows:
1245 openssl smime -binary -verify -inform DER -content database-file \
1246 -out /dev/null -in signature
1248 OpenSSL will report "Verification successful" if the signature is correct.
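As a rough check of the verification step, the sketch below signs a placeholder file, verifies it, then alters the file and verifies again. It rests on assumptions not found in this document: a throwaway self-signed certificate with the hypothetical subject nerd.example.com, which is passed to -CAfile so that OpenSSL trusts it, and a placeholder file rather than a real database with its PKCS#7 block removed.

```shell
#!/bin/sh
# Sketch only: round-trip sign and verify, then detect tampering.
# Assumptions (not from the draft): self-signed throwaway certificate,
# hypothetical subject name, placeholder file contents.
set -eu
workdir=$(mktemp -d)
db="$workdir/database-file"
printf 'placeholder NERD database\n' > "$db"

openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=nerd.example.com" \
    -keyout "$workdir/yourcert.key" -out "$workdir/yourcert.crt" 2>/dev/null

openssl smime -binary -sign -outform DER \
    -signer "$workdir/yourcert.crt" -inkey "$workdir/yourcert.key" \
    -in "$db" -out "$workdir/signature"

# Helper: run the verification command from the text and rely on its
# exit status (zero on success, non-zero on failure).
verify_db() {
    openssl smime -binary -verify -inform DER -content "$db" \
        -CAfile "$workdir/yourcert.crt" \
        -out /dev/null -in "$workdir/signature" 2>/dev/null
}

if verify_db; then verify_original=ok; else verify_original=failed; fi

# Append one byte to the signed content; verification should now fail.
printf 'x' >> "$db"
if verify_db; then verify_tampered=ok; else verify_tampered=failed; fi

echo "original: $verify_original, tampered: $verify_tampered"
```

Checking the exit status rather than parsing the message text is a deliberate choice here, since scripts should not depend on the exact wording OpenSSL prints.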
1250 To improve verification performance, it may make sense to modify
1251 the program so that it takes as input the database with a null
1252 signature and as an argument the name of the file containing the
1253 signature. Better yet, use a call to the appropriate library with
1254 each block.
1256 Appendix B. Changes
1258 This section to be removed prior to publication.
1260 o 03: Change dbname to a domain name, indicate that this is what is
1261 in the subject of the X.509 certificate, list editorial changes,
1262 and update acknowledgments.
1263 o 02: Incorporate some of Dave Thaler's comments. Add
1264 authentication block detail. Modify analysis to take IPv6 into
1265 account, along with a more realistic number of RLOCs per EID. Add
1266 some comments about potential risks of a cold start. Add S/MIME
1267 example as appendix A and take out old ToDo. Provide some amount
1268 of compression of IPv6 addresses by limiting their size to
1269 significant bytes rounded to a four byte word boundary.
1270 o 01: Massive spelling correction, URI example correction.
1271 o 00: Initial Revision.
1273 Author's Address
1275 Eliot Lear
1276 Cisco Systems GmbH
1277 Glatt-com
1278 Glattzentrum, ZH CH-8301
1279 Switzerland
1281 Phone: +41 1 878 7525
1282 Email: lear@cisco.com
1284 Full Copyright Statement
1286 Copyright (C) The IETF Trust (2008).
1288 This document is subject to the rights, licenses and restrictions
1289 contained in BCP 78, and except as set forth therein, the authors
1290 retain all their rights.
1292 This document and the information contained herein are provided on an
1293 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1294 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1295 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1296 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1297 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1298 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1300 Intellectual Property
1302 The IETF takes no position regarding the validity or scope of any
1303 Intellectual Property Rights or other rights that might be claimed to
1304 pertain to the implementation or use of the technology described in
1305 this document or the extent to which any license under such rights
1306 might or might not be available; nor does it represent that it has
1307 made any independent effort to identify any such rights. Information
1308 on the procedures with respect to rights in RFC documents can be
1309 found in BCP 78 and BCP 79.
1311 Copies of IPR disclosures made to the IETF Secretariat and any
1312 assurances of licenses to be made available, or the result of an
1313 attempt made to obtain a general license or permission for the use of
1314 such proprietary rights by implementers or users of this
1315 specification can be obtained from the IETF on-line IPR repository at
1316 http://www.ietf.org/ipr.
1318 The IETF invites any interested party to bring to its attention any
1319 copyrights, patents or patent applications, or other proprietary
1320 rights that may cover technology that may be required to implement
1321 this standard. Please address the information to the IETF at
1322 ietf-ipr@ietf.org.
1324 Acknowledgment
1326 Funding for the RFC Editor function is provided by the IETF
1327 Administrative Support Activity (IASA).