Network Working Group                                            E. Lear
Internet-Draft                                        Cisco Systems GmbH
Intended status: Experimental                         September 19, 2007
Expires: March 22, 2008


               NERD: A Not-so-novel EID to RLOC Database
                      draft-lear-lisp-nerd-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 22, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   LISP is a protocol to encapsulate IP packets in order to allow end
   sites to multihome without injecting routes from one end of the
   Internet to another. This memo specifies a database and a method to
   transport the mapping of EIDs to RLOCs to routers in a reliable,
   scalable, and secure manner. Our analysis concludes that transport
   of all EID/RLOC mappings scales well to at least 10^8 entries, and
   that use of DNS or any approach that queries for mappings raises
   substantial operational concerns.

Table of Contents

   1.  Introduction
     1.1.  Base Assumptions
     1.2.  What is NERD?
     1.3.  Glossary
   2.  Theory of Operation
     2.1.  Who are database authorities?
   3.  NERD Format
     3.1.  NERD Record Format
     3.2.  Database Update Format
   4.  NERD Distribution Mechanism
     4.1.  Initial Bootstrap
     4.2.  Retrieving Changes
   5.  Analysis
     5.1.  Database Size
     5.2.  Router Throughput Versus Time
     5.3.  Number of Servers Required
     5.4.  Security Considerations
       5.4.1.  Use of Public Key Infrastructures (PKIs)
       5.4.2.  Other Risks
   6.  Why not use XML?
   7.  Other Distribution Mechanisms
     7.1.  What About DNS as a retrieval model?
       7.1.1.  Perhaps use a hybrid model?
     7.2.  Use of BGP
   8.  Deployment Issues
     8.1.  HTTP
   9.  Conclusions
   10. IANA Considerations
   11. Acknowledgments
   12. References
     12.1.  Normative References
     12.2.  Informational References
   Appendix A.  Generating and verifying the database signature
                with OpenSSL
   Appendix B.  Changes
   Appendix C.  Open Questions
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Locator/ID Separation Protocol (LISP) [1] is a protocol whose primary
   purpose is to separate an IP address used by a host and local routing
   system from the locators advertised by BGP participants on the
   Internet in general, and in the default-free zone (DFZ) in
   particular.
   It accomplishes this by establishing a mapping between
   globally unique endpoint identifiers (EIDs) and routing locators
   (RLOCs) within the global routing table. This reduces the amount of
   state change that occurs on routers within the default-free zone on
   the Internet, while enabling end sites to be multihomed.

   In early stages of LISP (1 and 1.5) the mapping is either configured
   into a device or it is learned via data-triggered control messages
   between ingress tunnel routers (ITRs) and egress tunnel routers
   (ETRs) under the assumption that during transition, EIDs will be
   present within the global routing system, as they are today.

   In later stages of LISP, the assumption will be that EIDs are not
   contained within the global routing system, but that instead the
   mapping from EIDs to RLOCs will be learned through some other means.
   This memo addresses different approaches to the problem, and
   specifies a Not-so-novel EID to RLOC Database (NERD) and methods
   both to receive the database and to receive updates.

   LISP and NERD are both currently in experimental stages. The NERD
   database is specified in such a way that the methods used to
   distribute or retrieve it may vary over time. Multiple databases are
   supported in order to allow for multiple data sources. An effort has
   been made to divorce the database from access methods so that both
   can evolve independently through experimentation and operational
   validation.

1.1.  Base Assumptions

   In order to specify a mapping it is important to understand how it
   will be used, and the nature of the data being mapped. In the case
   of LISP, the following assumptions are pertinent:

   o  The data contained within the mapping changes only on provisioning
      or configuration operations, and is not intended to change when a
      link either fails or is restored.
      Some other mechanism (via LISP
      or other) handles healing operations, particularly when a tail
      circuit within a service provider's aggregate goes down.
   o  While weight and priority are defined, these are not hop-by-hop
      metrics. Hence the information contained within the mapping does
      not change based on where one sits within the topology.
   o  The purpose of LISP being to reduce control plane overhead by
      reducing "rate X state" complexity, updates to the mapping will be
      relatively rare.
   o  Because LISP and NERD are designed to ease interdomain routing,
      their use is intended within the inter-domain environment. That
      is, LISP is best implemented at either the customer edge or
      provider edge, and there will be on the order of as many ITRs and
      LISP announcements as there are connections to Internet Service
      Providers by end customers.
   o  As such, LISP and NERD cannot be the sole means to implement host
      mobility, although they may be used in conjunction with other
      mechanisms. For instance, it would be possible for a mobile node
      to receive a local address that is an EID and pass that to the
      correspondent node, who could also make use of an EID. As such,
      use of LISP in this case would be transparent, and no mapping
      entries are changed for mobility.
   o  As such, there is no interaction with the interior gateway
      protocol (IGP).

1.2.  What is NERD?

   NERD is a Not-so-novel EID to RLOC Database. It consists of the
   following components:

   1.  a network database format;
   2.  a change distribution format;
   3.  a database retrieval/bootstrapping method;
   4.  a change distribution method.

   The network database format is compressible. However, at this time
   we specify no compression method. NERD will make use of potentially
   several transport methods, but most notably HTTP [2]. HTTP has
   restart and compression capabilities. It is also widely deployed.

   There exist many methods to show differences between two versions of
   a database or a file, UNIX's "diff" being the classic example. In
   this case, because the data is well structured and easily keyed, we
   can make use of a very simple format for version differences that
   simply provides a list of EID/RLOC mappings that have changed using
   the same record format as the database, and a list of EIDs that are
   to be removed.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [3].

1.3.  Glossary

   The reader is once again referred to [1] for a general glossary of
   terms related to LISP. The following terms are specific to this
   memo.

   Base Distribution URI:  An Absolute-URI as defined in Section 4.3 of
      [6] from which other references are relative. The base
      distribution URI is used to construct a URI to an EID/RLOC mapping
      database. If more than one NERD is known then there will be one
      or more base distribution URIs associated with each (although each
      such base distribution URI may have the same value).

   EID Database Authority:  The authority that will sign database files
      and updates. It is the source of both.

   The Authority:  Shorthand for the EID Database Authority.

   NERD:  (N)ot-so-novel (E)ID to (R)LOC (D)atabase.

   AFI:  Address Family Identifier.

   Pull Model:  An architecture where clients pull only the information
      they need at any given time, such as when a packet arrives for
      forwarding.

   Push Model:  An architecture in which clients receive an entire
      dataset, containing data they may or may not require, such as
      mappings for EIDs that no host they serve is attempting to send
      to.

   Hybrid Model:  An architecture in which clients receive a subset of
      the entire dataset and query as needed for the rest.

2.  Theory of Operation

   What follows is a summary of how NERDs are generated and updated.
   Specifics can be found in Section 3. The general way in which NERD
   works is as follows:

   1.  A NERD is generated by an authority that allocates provider
       independent (PI) addresses (e.g., IANA or an RIR) which are used
       by sites as EIDs. As part of this process the authority
       generates a digest for the database and signs it with a private
       key whose public key is part of an X.509 certificate [12]. That
       signature along with a copy of the authority's public key is
       included in the NERD.
   2.  The NERD is distributed to a group of well known servers.
   3.  ITRs retrieve an initial copy of the NERD via HTTP when they come
       into service.
   4.  ITRs are preconfigured with a group of certificates whose private
       keys are used by database authorities to sign the NERD. This
       list of certificates should be configurable by administrators.
   5.  ITRs next verify both the validity of the public key and the
       signed digest. If either fails validation, the ITR attempts to
       retrieve the NERD from a different source. The process iterates
       until either a valid database is found or the list of sources is
       exhausted.
   6.  Once a valid NERD is retrieved, the ITR installs it into both
       non-volatile and local memory.
   7.  At some point the authority updates the NERD and increments the
       database version counter. At the same time it generates a list
       of changes, which it also signs, as it does with the original
       database.
   8.  Periodically ITRs will poll from their list of servers to
       determine if a new version of the database exists. When a new
       version is found, an ITR will attempt to retrieve a change file,
       using its list of preconfigured servers.
   9.  The ITR validates a change file just as it does the original
       database.
       Assuming the change file passes validation, the ITR
       installs new entries, overwrites existing ones, and removes empty
       entries, based on the content of the change file.

   As time goes on it is quite possible that an ITR may probe a list of
   configured neighbors for a database or change file copy. It is
   equally possible that neighbors might advertise to each other the
   version number of their database. Such methods are not explored in
   depth in this memo, but are mentioned for future consideration.

2.1.  Who are database authorities?

   This memo does not specify who the database authority is. That is
   because there are several possible operational models. In each case
   the number of database authorities is meant to be small so that ITRs
   need only keep a small list of authorities, similar to the way a name
   server might cache a list of root servers.

   o  A single database authority exists. In this case all entries in
      the database are registered to a single entity, and that entity
      distributes the database. Because the EID space is provider
      independent address space, there is no architectural requirement
      that address space be hierarchically distributed to anyone, as
      there is with provider-assigned address space. Hence, there is a
      natural affinity between the IANA function and the database
      authority function.
   o  Each region runs a database authority. In this case, provider
      independent address space is allocated to either regional internet
      registries or to affiliates of such organizations of network
      operations guilds (NOGs). The benefit of this approach is that
      there is no single organization that controls the database. It
      allows one database authority to back up another. One could
      envision as many as ten database authorities in this scenario.
   o  Each country runs a database authority. This could occur should
      countries decide to regulate this function.
      While limiting the
      scope of any single database authority as the previous scenario
      describes, this approach would introduce some overhead as the list
      of database authorities would grow to as many as 200, and possibly
      more if jurisdictions within countries attempted to regulate the
      function.

   As the number of authorities increases the amount of change on that
   list will also increase, requiring both an update mechanism and the
   potential need for a discovery mechanism, both of which would be the
   subject of future work (i.e., not to be found in this memo). For
   this reason alone, as a starting point two database authorities are
   recommended, but their selection is left for others.

3.  NERD Format

   The NERD consists of a header that contains a database version and a
   signature that is generated by ignoring the signature field and
   setting the authentication block length to 0 (NULL). The
   authentication block itself consists of a signature and a certificate
   whose private key counterpart was used to generate the signature.
   The exact format of the authentication block is TBD.

   Records are kept sorted in numeric order with AFI plus EID as primary
   key and mask length as secondary. This is so that after a database
   update it should be possible to reconstruct the database to verify
   the digest signature, which may be retrieved separately from the
   database for verification purposes.
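
   The sort order described above can be sketched as follows. This is
   an illustration only, not part of the format: it assumes EIDs are
   written as textual prefixes and compared as full-width integers, and
   the names sort_key and canonical_order are hypothetical. The AFI
   values 1 (IPv4) and 2 (IPv6) are the IANA Address Family Numbers.

```python
import ipaddress

AFI_IPV4 = 1  # IANA Address Family Numbers
AFI_IPV6 = 2


def sort_key(prefix):
    """AFI plus EID as the primary key, mask length as the secondary key."""
    net = ipaddress.ip_network(prefix)
    afi = AFI_IPV4 if net.version == 4 else AFI_IPV6
    # Assumption: the EID is compared as the integer value of the address.
    return (afi, int(net.network_address), net.prefixlen)


def canonical_order(prefixes):
    """Return EID prefixes in the order the database would keep them."""
    return sorted(prefixes, key=sort_key)


eids = ["2001:db8::/32", "198.51.100.0/24", "192.0.2.0/25", "192.0.2.0/24"]
print(canonical_order(eids))
# ['192.0.2.0/24', '192.0.2.0/25', '198.51.100.0/24', '2001:db8::/32']
```

   Because the ordering depends only on the records themselves, two
   routers that apply the same updates in a different order still end
   up with the same byte sequence to check against the signed digest.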

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Schema Vers=1 |    DB Code    |      Database Name Size       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Database Version                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Old Database Version or 0                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                         Database Name                         |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       PKCS#7 Block Size       |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |       PKCS#7 Block containing Certificate and Signature       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                            Database Header

   The DB Code is 0 if what follows is an entire database or 1 if
   what follows is an update. The database file version is incremented
   each time the complete database is generated by the authority. In
   the case of an update, the database file version indicates the new
   database file version, and the old database file version is indicated
   in the "Old Database Version" field. The database file version is
   used by routers to determine whether or not they have the most
   current database.

   The database name is a Uniform Resource Name (URN) [7] of the
   following form:

   dburn = "urn:lisp:3.0:" dbname
   dbname = 1*(URN Chars) ;; URN Chars is defined in RFC 2141.

   The purpose of the database name is to allow for more than one
   database. Such databases would be merged by the router. It is
   important that an EID/RLOC mapping be listed in no more than one
   database, lest inconsistencies arise. However, it may be possible to
   transition a mapping from one database to another. During the
   transition period, the mappings MUST be identical.
   When they are
   not, the resultant behavior will be undefined.

   The PKCS#7 [4] authentication block contains a DER encoded [5]
   signature and associated public key.

3.1.  NERD Record Format

   As distributed over the network, NERD records appear as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Num. RLOCs   | EID Mask Len  |            EID AFI            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     End point identifier                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Priority 1   |   Weight 1    |             AFI 1             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Routing Locator 1                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Priority 2   |   Weight 2    |             AFI 2             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Routing Locator 2                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Priority 3   |   Weight 3    |             AFI 3             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Routing Locator 3...                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Priority N, Weight N, and AFI N are associated with Routing
   Locator N. There will always be at least one routing locator. The
   minimum record size for IPv4 is 16 bytes. Each additional IPv4 RLOC
   increases the record size by 8 bytes. The purpose of this format is
   to keep the database compact, but somewhat easily read. The meanings
   of weight and priority are described in [1]. The format of the AFI
   is specified by IANA as "Address Family Numbers", with the exception
   of how IPv6 addresses are stored.

   In order to reduce storage and transmission amounts for IPv6, only
   the necessary number of bytes as specified by the prefix length are
   kept in the record, rounded up to the next four-byte (word) boundary.
   This is true for both EIDs and RLOCs. For instance, if the prefix
   length is /49, seven bytes are needed to hold the prefix, and
   rounding up to the four-byte word boundary means that eight bytes
   are stored.

3.2.  Database Update Format

   A database update contains a set of changes to an existing database.
   Each AFI/EID/mask-length tuple may have zero or more RLOCs associated
   with it. In the case where there are no RLOCs, the EID entry is
   removed from the database. Records that contain EIDs and mask
   lengths that were not previously listed are simply added. Otherwise,
   the old record for the EID and mask length is replaced by the more
   current information. The record format used by a database update
   is the same as described in Section 3.1.

4.  NERD Distribution Mechanism

4.1.  Initial Bootstrap

   Bootstrap occurs when a router needs to retrieve the entire database.
   It knows it needs to retrieve the entire database either because it
   has none or because the pending update is too substantial to
   process, as might be the case if a router has been out of service
   for a lengthy period of time.

   To bootstrap, the router appends the database name plus
   "/current/entiredb" to a Base Distribution URI and retrieves the
   file via HTTP. For example, if the configured URI is
   "http://www.example.com/eiddb/", and assuming a database name of
   "arin", the router would request
   "http://www.example.com/eiddb/arin/current/entiredb". Routers MUST
   check the signature on the database prior to installing it, and MUST
   check that the database schema matches a schema they understand.
   Once a router has a valid database it MUST store that database in
   some sort of non-volatile memory (e.g., disk or flash memory).

   N.B., the host component for such URIs MUST NOT resolve to a LISP
   EID, lest a circular dependency be created.

4.2.  Retrieving Changes

   In order to retrieve a set of database changes a router will have
   previously retrieved the entire database. Hence it knows the current
   version of the database it has. Its first step for retrieving
   changes is to retrieve the current version of the database. It does
   so by appending "current/version" to the base distribution URI and
   retrieving the file. Its format is text and it contains the integer
   value of the current database version.

   Once a router has retrieved the current version it compares it with
   the version of its local copy. If there is no difference, then the
   router is up to date and need take no further actions until it next
   checks.

   If the versions differ, the router next sends a request for the
   appropriate change file by appending the database name,
   "/current/changes/", and the textual representation of the version
   of its local copy of the database to the base distribution URI. For
   example, if the current version of the database is 1105503 and the
   router's version is 1105500, and the base URI and database name are
   the same as above, the router would request
   "http://www.example.com/eiddb/arin/current/changes/1105500".

   The server may not have that change file, either because there are
   too many versions between what the router has and what is current, or
   because no such change file was generated. If the server has changes
   from the router's version to any later version, the server SHOULD
   issue an HTTP redirect to that change file, and the router SHOULD
   retrieve and process it. Once it has done so, the router should then
   repeat the process until it has brought itself up to date. It is
   thus important for servers to expire old change files in the order in
   which they were generated.

   By way of convention, it is suggested that the URIs issued in
   redirects be of the following form:

      {base dist. URI}/{dbname}/{more-recent-version}/changes/
      {older-version}

   where "base dist. URI" is the base distribution URI, "dbname" is the
   name of the database, and each version is the textual representation
   of the integer version value.

   For example, if the current database version was 1105503 and a router
   made a request for
   "http://www.example.com/eiddb/arin/current/changes/1105400" but there
   was no change file from 1105400 to 1105503, and the server had a
   group of change files to make the router current, it would issue a
   redirect to
   "http://www.example.com/eiddb/arin/1105450/changes/1105400" that the
   router would then process. The router would then make a request for
   "http://www.example.com/eiddb/arin/current/changes/1105450" that the
   server would have.

   While it is unlikely that database versions would wrap, as they
   consist of 32-bit integers, should the event occur, ITRs MUST
   attempt first to retrieve a change file when their current version
   number is within 10,000 of 2^32 and they see a version available that
   is less than 10,000. Barring the availability of a change file, the
   ITR MUST still assume that the database version has wrapped and
   retrieve a new copy.

5.  Analysis

   We will start our analysis by looking at how much data will be
   transferred to a router during bootstrap conditions. We will then
   look at the bandwidth required. Next we will turn our concerns to
   servers. Finally we will ponder the effect of providing only
   changes.

   In the analysis below we treat the overhead of the database header as
   insignificant (because it is). The analysis should be similar,
   whether a single database or multiple databases are employed, as we
   would assume that no entry would appear more than once.

5.1.  Database Size

   By its very nature the information to be transported is relatively
   static and is specifically designed to be topologically insensitive.
   That is, every ITR is intended to have the same set of RLOCs for a
   given EID. While some processing power will be necessary to install
   a table, the amount required should be far less than that of a
   routing information database because the level of entropy is intended
   to be lower.

   For purposes of this analysis, we will assume that the world has
   migrated to IPv6, as this increases the size of the database, which
   would be our primary concern. However, to mitigate the size
   increase, we have limited the size of the prefix transmitted. For
   purposes of this analysis, we shall assume an average prefix length
   of 64 bits.

   Based on that assumption, Section 3.1 states that mapping information
   for each EID/Prefix includes a group of RLOCs, each with an
   associated priority and weight, and that a minimum record size with
   IPv6 EIDs with at least one RLOC is 24 bytes uncompressed. Each
   additional IPv6 RLOC costs 12 bytes (again, assuming an average
   prefix length of 64 bits).

   +-----------+--------+--------+---------+
   | 10^n EIDs | 2 RLOC | 4 RLOC | 8 RLOC  |
   +-----------+--------+--------+---------+
   | 4         | 360 KB | 600 KB | 1.08 MB |
   | 5         | 3.6 MB | 6.0 MB | 10.8 MB |
   | 6         | 36 MB  | 60 MB  | 108 MB  |
   | 7         | 360 MB | 600 MB | 1.08 GB |
   | 8         | 3.6 GB | 6.0 GB | 10.8 GB |
   +-----------+--------+--------+---------+

   Database size for IPv6 routes with average prefix length = 64 bits

                                Table 1

   Entries in the above table are derived as follows:

      E * (24 + 12 * (R - 1))

   where E = number of EIDs (10^n), R = number of RLOCs per EID.

   Our scaling target is to accommodate 10^8 multihomed systems, which
   is one order of magnitude greater than what is discussed in [10]. At
   10^8 entries, a device could be expected to use between 3.6 and 10.8
   gigabytes of RAM for the mapping.
   No matter the method of distribution, any router that sits in the
   core of the Internet would require near this amount of memory in
   order to perform the ITR function. Large enterprise ETRs would be
   similarly strained, simply due to the diversity of sites that
   communicate with one another. The good news is that this is not our
   starting point, but rather our scaling target, a number that we
   intend to reach by the year 2050. Our starting point is more likely
   in the neighborhood of 10^4 or 10^5 EIDs, thus requiring between
   360 KB and 10.8 MB.

5.2.  Router Throughput Versus Time

   +-------------------------+---------+---------+----------+--------+
   | Table Size (10^N bytes) | 1 Mb/s  | 10 Mb/s | 100 Mb/s | 1 Gb/s |
   +-------------------------+---------+---------+----------+--------+
   | 6                       | 8       | 0.8     | 0.08     | 0.008  |
   | 7                       | 80      | 8       | 0.8      | 0.08   |
   | 8                       | 800     | 80      | 8        | 0.8    |
   | 9                       | 8,000   | 800     | 80       | 8      |
   | 10                      | 80,000  | 8,000   | 800      | 80     |
   | 11                      | 800,000 | 80,000  | 8,000    | 800    |
   +-------------------------+---------+---------+----------+--------+

                   Number of seconds to process NERD

                                Table 2

   The length of time it takes to process the database is significant
   in models where the device acquires the entire table. During this
   period of time, either the router will be unable to route packets
   using LISP or it must use some sort of query mechanism for specific
   EIDs as it populates the rest of its table through the transfer.
   Table 2 shows us that at our scaling target, a table on the order of
   10^10 bytes, the length of time it would take for a router using
   1 Mb/s of bandwidth is about 80,000 seconds (roughly 22 hours). We
   can measure the processing time in small numbers of hours or less
   for any transfer speed greater than that. The fastest processing
   time shows us as taking 8 seconds to process an entire table of 10^9
   bytes and 80 seconds for 10^10 bytes.

5.3.  Number of Servers Required

   As easy as it may be for a router to retrieve, the aggregate
   information may be difficult for servers to transmit, assuming the
   information is transmitted in aggregate (we'll revisit that
   assumption later).

   +----------------+------------+-----------+------------+------------+
   | # Simultaneous | 10 Servers | 100       | 1,000      | 10,000     |
   | Requests       |            | Servers   | Servers    | Servers    |
   +----------------+------------+-----------+------------+------------+
   | 100            | 480        | 48        | 48         | 48         |
   | 1,000          | 4,800      | 480       | 48         | 48         |
   | 10,000         | 48,000     | 4,800     | 480        | 48         |
   | 100,000        | 480,000    | 48,000    | 4,800      | 480        |
   | 1,000,000      | 4,800,000  | 480,000   | 48,000     | 4,800      |
   | 10,000,000     | 48,000,000 | 4,800,000 | 480,000    | 48,000     |
   +----------------+------------+-----------+------------+------------+

   Retrieval time per number of servers in seconds. Assumes an average
   of 10^8 entries with 4 RLOCs per EID, that each server has access to
   1 Gb/s with 100% efficient use of that bandwidth, and no
   compression.

                                Table 3

   Entries in the above table were generated using the following
   method:

   For 10^8 entries with four RLOCs per EID, the table size is 6.0 GB,
   per Table 1. Assume 1 Gb/s transfer rates and 100% utilization.
   Protocol overhead is ignored for this exercise. Hence a single
   transfer X takes 48 seconds and can get no faster.

   With this in mind, each entry is as follows:

      max(1X, N*X/S)

   where N = number of transfers, X = 48 seconds, and S = number of
   servers.

   If we have a distribution model in which every device must retrieve
   the mapping information upon start, Table 3 shows the length of time
   in seconds it will take for a given number of servers to complete a
   transfer to a given number of devices.
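   The Table 3 method can be written out as a short calculation. The
   following Python sketch is an illustration under the same
   assumptions stated above (6.0 GB database, 1 Gb/s per server, 100%
   utilization, no protocol overhead); the function and parameter
   names are hypothetical and not part of this specification.

```python
def retrieval_seconds(num_transfers, num_servers,
                      db_bytes=6_000_000_000, link_bits_per_sec=1_000_000_000):
    """Time for all transfers to finish: max(1X, N*X/S).

    X is the time for one full transfer on one server link; a single
    transfer can never complete faster than X, however many servers exist.
    """
    single_transfer = db_bytes * 8 / link_bits_per_sec  # X, 48 seconds by default
    return max(single_transfer, num_transfers * single_transfer / num_servers)


if __name__ == "__main__":
    for requests in (100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000):
        row = [retrieval_seconds(requests, s) for s in (10, 100, 1_000, 10_000)]
        print(requests, row)
```

   The max() term is what keeps the upper-right entries of Table 3 at
   48 seconds: adding servers beyond the number of requests does not
   shorten an individual transfer.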
   Table 3 says, as an example, that it would take 48,000 seconds (over
   13 hours) for one million ITRs to simultaneously retrieve the
   database from one thousand servers. Should a cold start scenario
   occur, this number should be of some concern. Hence it is important
   to take some measures both to avoid such a scenario and to ease the
   load should it occur. The primary defense should be for ITRs to
   first attempt to retrieve their databases from their peers or
   upstream providers. Secondary defenses could include data sanity
   checks within ITRs, with agreed norms for how much the database
   should change in any given update or over any given period of time.
   As we will see below, the dissemination of changes involves
   considerably less volume.

     +----------------+-------------+---------------+----------------+
     | % Daily Change | 100 Servers | 1,000 Servers | 10,000 Servers |
     +----------------+-------------+---------------+----------------+
     | 0.1%           | 200         | 20            | 2              |
     | 0.5%           | 1,000       | 100           | 10             |
     | 1%             | 2,000       | 200           | 20             |
     | 5%             | 10,000      | 1,000         | 100            |
     | 10%            | 20,000      | 2,000         | 200            |
     +----------------+-------------+---------------+----------------+

   Assuming 10 million routers and a database size of 6 GB, resulting
   hourly transfer times are shown in seconds, given number of servers
   and daily rate of change.

                                Table 4

   This table shows us that with 10,000 servers the average transfer
   time with 1 Gb/s links for 10,000,000 routers will be 200 seconds
   with 10% daily change spread over 24 hourly updates. For a 0.1%
   daily change, that number is 2 seconds for a database of size
   6.0 GB.

   The amount of change goes to the purpose of LISP. If its purpose is
   to provide effective multihoming support to end customers, then we
   might anticipate relatively random changes.
   If, on the other hand, service providers attempt to make use of LISP
   to provide some form of traffic engineering, we can expect the same
   data to change more often. We probably cannot conclude much in this
   regard without additional operational experience. The one thing we
   can say is that different applications of the LISP protocol may
   require new and different distribution mechanisms. Such optimization
   is left for another day.

5.4.  Security Considerations

   Whatever the answer to our previous question, we must consider the
   security of the information being transported. If an attacker can
   forge an update or tamper with the database, the attacker can in
   effect redirect traffic to end sites. Hence, integrity and
   authenticity of the NERD are critical. In addition, a means is
   required to determine whether a source is authorized to modify a
   given database. No data privacy is required. Quite to the contrary,
   this information will be necessary for any ITR.

   The first question one must ask is who to trust to provide the ITR a
   mapping. Ultimately the owner of the EID prefix is most
   authoritative for the mapping to RLOCs. However, were all owners to
   sign all such mappings, ITRs would need to know which owner is
   authorized to modify which mapping, creating a problem of O(N^2)
   complexity.

   We can reduce this problem substantially by investing some trust in
   a small number of entities that are allowed to sign entries. If an
   authority manages EIDs much the same way a domain name registrar
   handles domains, then the owner of the EID would choose a database
   authority she or he trusts, and ITRs must trust each such authority
   in order to map the EIDs listed by that authority to RLOCs. This
   reduces the amount of management complexity on the ITR to retaining
   knowledge of O(#authorities), but does require that each authority
   establish procedures for authenticating the owner of an EID.
   Those procedures need not be the same.

   There are two classic methods to ensure integrity of data:

   o  secure the transport of the data from its source to the consumer,
      such as with Transport Layer Security (TLS) [8]; and
   o  provide object level security.

   These methods are not mutually exclusive, although one can argue
   about the need for the former, given the latter.

   In the case of TLS, when it is properly implemented, the objects
   being transported cannot easily be modified by interlopers or
   so-called men in the middle. When data objects are distributed to
   multiple servers, each of those servers must be trusted. As we have
   seen above, we could have quite a large number of servers, thus
   providing an attacker a large number of targets. We conclude that
   some form of object level security is required.

   Object level security involves an authority signing an object in a
   way that can easily be verified by a consumer, in this case a
   router. Here we would want the mapping table and any incremental
   update to be signed by the originator of the update. This implies
   that we cannot simply make use of a tool like CVS [11]. Instead, the
   originator will want to generate diffs, sign them, and make them
   available either directly or through some sort of content
   distribution or peer to peer network.

5.4.1.  Use of Public Key Infrastructures (PKIs)

   X.509 provides a certificate hierarchy that has scaled to the size
   of the Internet. The system is particularly manageable when there
   are fewer certificates to manage. The model proposed in this memo
   makes use of one current certificate per database authority.
   The three pieces of information necessary to verify a signature,
   therefore, are as follows:

   o  the certificate of the database authority, which can be provided
      along with the database;
   o  the certificate authority's certificate; and
   o  a table of database names and distinguished names (DNs) that are
      allowed to update them.

   The latter two pieces of information must be very well known and
   must be configured on each ITR. It is expected that both would
   change very rarely, and it would not be unreasonable for such
   updates to occur as part of a normal OS release process.

   The tools for both signing and verifying are readily available.
   OpenSSL [20] provides tools and libraries for both signing and
   verifying. Other tools are commonly available.

   Use of PKIs is not without implementation and operational
   complexity or risk. The following risks and mitigations are
   identified with NERD's use of PKIs:

   If a NERD database authority private key is exposed:

      In this case an attacker could sign a false database update,
      either redirecting traffic or otherwise causing havoc. The NERD
      database administrator must then revoke its existing key and
      issue a new one. The certificate is added to a certificate
      revocation list (CRL), which may be distributed with both this
      and other databases, as well as through other channels. Because
      this event is expected to be rare, and the number of database
      authorities is expected to be small, a CRL will be small. When a
      router receives a revocation, it checks it against its existing
      databases, and attempts to update the one that is revoked. This
      implies that prior to issuing the revocation, the database
      authority MUST sign an update with the new key. Routers SHOULD
      discard updates they have already received that were signed after
      the revocation was generated.
      If a router cannot confirm whether the authority's certificate
      was revoked before or after a particular update, it MUST retrieve
      a fresh copy of the database with a valid signature.

   The private key associated with the CA that signed the Authority's
   certificate is compromised:

      In this case, it becomes possible for an attacker to masquerade
      as the database authority. To ameliorate damage, the database
      authority SHOULD revoke its certificate and get a new certificate
      issued from a CA that is not compromised. Once it has done so,
      the previous procedure is followed. The compromised certificate
      can be removed during the normal operating system upgrade cycle.

   An algorithm used in either the certificate or the signature is
   cracked:

      This is a catastrophic failure and the above forms of attack
      become possible. The only mitigation is to make use of a new
      algorithm. In theory this should be possible, but in practice it
      has proven very difficult. For this reason, additional work is
      recommended to make alternative algorithms available.

   The Database Authority loses its key or disappears:

      In this case nobody can update the existing database. There are
      few programmatic mitigations. If the database authority places
      its private keys and a suitable amount of information in escrow,
      then under agreed-upon circumstances, such as no updates for
      three days, the escrow agent would release the information to a
      party capable of generating a database update.

5.4.2.  Other Risks

   Because this specification does not require secure transport, if an
   attacker prevents updates to an ITR for the purposes of having that
   ITR continue to use a compromised ETR, the ITR could continue to use
   an old version of the database without realizing a new version has
   been made available.
   If one is worried about such an attack, a secure channel such as
   SSL, with a secure chain back to the database authority, should be
   used. It is possible that after some operational experience, later
   versions of this format will contain additional semantics to address
   this attack.

   As discussed above, a substantial risk would be a cold start
   scenario. If an attacker found a bug in a common operating system
   that allowed it to erase an ITR's database, and was able to
   disseminate that bug, the collective ability of ITRs to retrieve new
   copies of the database could be taxed by collective demand. The
   remedy to this is for devices to share copies of the database with
   their neighbors, thus making each potential requestor a potential
   service.

6.  Why not use XML?

   Many objects these days are distributed as either XML pages or
   something derived from XML [17], such as SOAP [18], [19]. Use of
   such well known standards allows for high level tools and library
   reuse. Why not, then, use these standards in this case? There are
   two answers to this question. First, the obvious concern is that XML
   is not known for efficiency of data transport. Because XML is
   text-based, each octet of an IPv4 address expands to as many as
   three octets, plus either an attribute and quotes or element tags
   and end tags. Let us presume for the moment a very simple schema
   that might cause a record to be represented as follows:

      [XML element markup lost in extraction; the example record
      carried the two addresses 192.168.1.1 and 192.168.1.2]

   With white space removed the uncompressed XML represents 120 bytes
   versus 20 bytes for the record specified in Section 3.1,
   representing a five fold expansion. That brings our 920MB database
   to 4.6GB.

   The other concern about XML is that version 1.0 of the specification
   is silent on the order of sibling elements. Specifications other
   than the base specification state that order is significant.
   Order is significant to LISP and NERD because once an update is
   applied to the database it should be possible to verify the
   signature of the entire database. Prior to applying the signature
   the XML generator would need to ensure the order of information.
   That same sort would be required of the router. This seems to add
   unnecessary fragility to a critical system without much benefit.
   While there may indeed be uses of an XML representation of the
   database, these uses are likely to be outside of a router.

7.  Other Distribution Mechanisms

   We now consider various other mechanisms. The problem of
   distributing changes in various databases is as old as databases.
   The author is aware of two obvious approaches that have been well
   used in the past. One approach would be the wide distribution of CVS
   repositories. However, for the reasons mentioned in Section 5.4, CVS
   is insufficient to the task.

   The other tried and true approach is the use of periodic updates in
   the form of messages. Good old NNTP [13] itself provides two
   separate mechanisms (one push and another pull) to provide a
   coherent update process. This was in fact used to update molecular
   biology databases [14] in the early 1990s. Netnews offers a way to
   determine whether articles with specified Article-Ids have been
   received. In the case where the mapping file source of authority
   wishes to transmit updates, it can sign a change file and then post
   it into the network. Routers merely need to keep a record of the
   article ids that they have received. Initially this is probably
   overkill, but it may not be so later in this process. Some
   consideration should be given to a mechanism known to widely
   distribute vast amounts of data, as instantaneously as either the
   sender or the receiver wishes.
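   The bookkeeping a router would need under such a scheme is small.
   The Python sketch below is a hypothetical illustration only: the
   class and method names are our own, signature verification of the
   change file is assumed to happen elsewhere, and nothing here is
   defined by NNTP or by this memo.

```python
class UpdateLog:
    """Remembers which article ids a router has already received."""

    def __init__(self):
        self._seen = set()

    def is_new(self, article_id):
        """Record article_id; return True only the first time it is offered."""
        if article_id in self._seen:
            return False
        self._seen.add(article_id)
        return True


if __name__ == "__main__":
    log = UpdateLog()
    # The same (made-up) id offered twice: only the first would be applied.
    for article_id in ("<update-1@example>", "<update-1@example>"):
        print(article_id, "apply" if log.is_new(article_id) else "skip")
```

   A real router would also need to persist this record across
   restarts, which the sketch does not attempt.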
   To attain an additional level of hierarchy in the distribution
   network, service providers could retrieve information to their own
   local servers, and configure their routers with the host portion of
   the above URI.

   Another possibility would be for providers to establish an agreement
   on a small set of anycast addresses for use for this purpose. There
   are limitations to the use of anycast, particularly with TCP. In the
   midst of a routing flap an anycast address can become all but
   unusable. Careful study of such a use as well as appropriate use of
   HTTP redirects is expected.

7.1.  What About DNS as a Retrieval Model?

   It has been proposed that a query/response mechanism be used for
   this information, and that specifically the domain name system (DNS)
   [16] be used. The previous models do not preclude the DNS. DNS has
   the advantage that the administrative lines are well drawn, and that
   the ID/RLOC mapping is likely to appear very close to these
   boundaries. DNS also has the added benefit that an entire
   distribution infrastructure already exists. There are, however, some
   problems that could impact end hosts when intermediate routers make
   queries, some of which were first pointed out in [15]:

   o  Any query mechanism offers an opportunity for a resource attack
      if an attacker can force the ITR to query for information. In
      this case, all that would be necessary would be for a "botnet" (a
      group of computers that have been compromised and used as
      vehicles to attack others) to ping or otherwise contact via some
      normal service hosts that sit behind the ETR. If the botnet hosts
      themselves are behind ETRs, the victim's ITR will need to query
      for each and every one of them, thus becoming part of a classic
      reflector attack.
   o  Packets will be delayed at the very least, and probably dropped
      in the process of a mapping query.
      This could be at the beginning of a communication, but it will be
      impossible for a router to conclude with certainty that this is
      the case.
   o  The DNS has a backoff algorithm that presumes that applications
      are making queries prior to the beginning of a communication.
      This is appropriate for end hosts who know in fact when a
      communication begins. An end user may not enjoy a router waiting
      seconds for a retry.
   o  While the administrative lines may appear to be correct, the
      location of name servers may not be. If name servers sit within
      PI address space, thus requiring LISP to reach, a circular
      dependency is created. This is precisely where many enterprise
      name servers sit. The LISP experiment should not predicate its
      success on relocation of such name servers.

   Nevertheless, DNS may be able to play a role in providing the
   enterprise control over the mapping of its EIDs to RLOCs. Posit a
   new DNS record "EID2RLOC". This record is used by the authority to
   collect and aggregate mapping information so that it may be
   distributed through one of the other mechanisms. As an example:

      $ORIGIN 0.10.PI-SPACE.
      128  EID2RLOC  mask 23 priority 10 weight 5 172.16.5.60
           EID2RLOC  mask 23 priority 15 weight 5 192.168.1.5

   In the above figure network 10.0.128/23 would be delegated to some
   end system, say EXAMPLE.COM. They would manage the above zone
   information. This would allow a DNS mechanism to work, but it would
   also allow someone to aggregate the information and distribute a
   table.

7.1.1.  Perhaps Use a Hybrid Model?

   It would be possible to use both a prepopulated database such as
   NERD and a query mechanism (perhaps DNS) to determine an EID/RLOC
   mapping. The general idea would be to receive a subset of the
   mappings, say, by taking only the NERD for certain regions.
   This alleviates the need to drop packets for some subset of
   destinations under the assumption that one's business is localized
   to a particular region. If one did not have a local entry for a
   particular EID one would then make a query.

   One improvement on simply using DNS to query live would be to
   periodically walk the entire network in search of EID2RLOC records
   and cache them to non-volatile storage. This has two benefits.

   First, it prevents resource attacks. Care has to be given to how
   entries are cached, to avoid an attacker causing a performance
   degradation by attempting to exceed memory limits through a random
   source attack.

   Second, and as important as resisting attacks, having a complete or
   near complete copy of the database provides for a faster recovery
   time when a router goes out of service, for whatever reason. Absent
   such a mechanism, devices would need to repopulate their local
   caches through the help of another system, leading to additional
   system fragility.

7.2.  Use of BGP

   Border Gateway Protocol (BGP) [9] is currently used to distribute
   inter-domain routing throughout the Internet. Why not, then, use BGP
   to distribute the mapping table? A simple answer is that the objects
   BGP best handles are routes. While it may be possible to transmit
   EID/RLOC mappings instead (because they look an awful lot like
   routes), the rate of updates of EID/RLOC mappings is specifically
   intended to be considerably less than that of routes, and would
   probably require additional dampening mechanisms to ensure that this
   is so.

   In addition, the ownership of the mapping does not flow from service
   providers but rather from end users of the identifiers. It should
   not be possible for anyone to filter the mapping, other than perhaps
   ITRs for local policy purposes.
   The current limited security model for BGP does not fit the general
   requirements of how the mapping is to be processed.

   Furthermore, as BGP is currently the lifeblood of the Internet, its
   use for any purpose other than routing should be strongly
   scrutinized.

   This is not to say that BGP has no role to play whatsoever. It may
   well be possible for routers to exchange database version numbers
   and perhaps base distribution URIs as extensions or capabilities.
   This would allow routers to serve their copy of the database to
   their neighbors, easing the load off the rest of the server
   infrastructure. How this would be done is future work.

8.  Deployment Issues

   While LISP and NERD are intended as experiments at this point, it is
   already obvious one must give serious consideration to circular
   dependencies with regard to the protocols used and the elements
   within them.

8.1.  HTTP

   In Section 7.1 we have already seen how DNS can have circular
   dependencies. Inasmuch as HTTP depends on DNS, either due to the
   authority section of a URI, or due to the configured base
   distribution URI, these same concerns apply. In addition, any HTTP
   server that itself makes use of provider independent addresses would
   be a poor choice to distribute the database for these exact same
   reasons.

   One issue with using HTTP is that it is possible that a middlebox of
   some form, such as a cache, may intercept and process requests. In
   some cases this might be a good thing. For instance, if a cache
   correctly returns a database, some amount of bandwidth is conserved.
   On the other hand, if the cache itself fails to function properly
   for whatever reason, end to end connectivity could be impaired.
   For example, if the cache itself depended on the mapping being in
   place and functional, a cold start scenario might leave the cache
   functioning improperly, in turn providing routers no means to update
   their databases. Some care must be given to avoid such
   circumstances.

9.  Conclusions

   This memo has specified a database format, an update format, a URI
   convention, an update method, and a validation method for EID/RLOC
   mappings. We have shown that, even beyond the predictions of 10^7
   locators, the aggregate database size would be at most 10.8 GB. We
   have considered the number of servers needed to distribute that
   information and we have demonstrated the limitations of a simple
   content distribution network and other well known mechanisms. The
   effort required to retrieve a database change amounts to between 2
   and 20 seconds of processing time per hour at today's gigabit
   speeds. We conclude that there is no need for an off box query
   mechanism today, and that there are distinct disadvantages to having
   such a mechanism in the control plane.

   Beyond this we have examined alternatives that allow for hybrid
   models that do use query mechanisms, should our operating
   assumptions prove overly optimistic. Use of NERD today does not
   foreclose use of such models in the future, and in fact both models
   can happily coexist.

   We leave to future work how the list of databases is distributed,
   how BGP can play a role in distributing knowledge of the databases,
   and how DNS can play a role in aggregating information into these
   databases.

   We also leave to future work whether HTTP is the best protocol for
   the job, and whether the scheme described in this document is the
   most efficient. One could easily envision that when applied in high
   delay or high loss environments, a broadcast or multicast method may
   prove more effective.

10.  IANA Considerations

   This memo makes no requests of IANA.

11.  Acknowledgments

   Dino Farinacci, Patrik Faltstrom, Dave Meyer, Joel Halpern, Dave
   Thaler, Mohamed Boucadair, and Max Pritikin were very helpful with
   their reviews of this document. The astute will notice a lengthy
   References section. This work stands on the shoulders of many
   others' efforts.

12.  References

12.1.  Normative References

   [1]   Farinacci, D., "Locator/ID Separation Protocol (LISP)",
         draft-farinacci-lisp-03 (work in progress), August 2007.

   [2]   Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter,
         L., Leach, P., and T. Berners-Lee, "Hypertext Transfer
         Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [3]   Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

   [4]   Kaliski, B., "PKCS #7: Cryptographic Message Syntax Version
         1.5", RFC 2315, March 1998.

   [5]   International Telecommunications Union, "Information
         technology - Open Systems Interconnection - The Directory:
         Public-key and attribute certificate frameworks", ITU-T
         Recommendation X.509, ISO Standard 9594-8, March 2000.

   [6]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
         Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
         January 2005.

   [7]   Moats, R., "URN Syntax", RFC 2141, May 1997.

12.2.  Informational References

   [8]   Dierks, T. and E. Rescorla, "The Transport Layer Security
         (TLS) Protocol Version 1.1", RFC 4346, April 2006.

   [9]   Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol
         4 (BGP-4)", RFC 4271, January 2006.

   [10]  Carpenter, B., "IETF Plenary Presentation: Routing and
         Addressing: Where we are today", March 2007.

   [11]  Grune, R., Baalbergen, E., Waage, M., Berliner, B., and J.
         Polk, "CVS: Concurrent Versions System", November 1985.
   [12]  International Telephone and Telegraph Consultative Committee,
         "Information Technology - Open Systems Interconnection - The
         Directory: Authentication Framework", CCITT Recommendation
         X.509, November 1988.

   [13]  Kantor, B. and P. Lapsley, "Network News Transfer Protocol",
         RFC 977, February 1986.

   [14]  Smith, R., Gottesman, Y., Hobbs, B., Lear, E., Kristofferson,
         D., Benton, D., and P. Smith, "A mechanism for maintaining an
         up-to-date GenBank database via Usenet", CABIOS, April 1991.

   [15]  Huitema, C., "An Experiment in DNS Based IP Routing",
         RFC 1383, December 1992.

   [16]  Mockapetris, P., "Domain names - concepts and facilities",
         STD 13, RFC 1034, November 1987.

   [17]  Bray, T., Paoli, J., Sperberg-McQueen, C., and E. Maler,
         "Extensible Markup Language (XML) 1.0 (2nd ed)", W3C REC-xml,
         October 2000.

   [18]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
         Nielsen, "SOAP Version 1.2 Part 1: Messaging Framework", W3C
         Working Draft soap12-part1, June 2002.

   [19]  Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J., and H.
         Nielsen, "SOAP Version 1.2 Part 2: Adjuncts", W3C Working
         Draft soap12-part2, June 2002.

URIs

   [20]

Appendix A.  Generating and Verifying the Database Signature with
             OpenSSL

   As previously mentioned, one goal of NERD was to use off-the-shelf
   tools to both generate and retrieve the database. To many, PKI is
   magic. This section is meant to provide at least some clarification
   as to both the generation and verification process, complete with
   command line examples. Not included is how you get the entries
   themselves. We'll assume they exist, and that you're just trying to
   sign the database.

   To sign the database, you first need a database file that has the
   database header described in Section 3.
   Block size should be zero, and there should be no PKCS#7 block at
   this point. You also need a certificate and its private key with
   which you will sign the database.

   The OpenSSL "smime" command contains all the functions we need from
   this point forth. To sign the database, issue the following command:

      openssl smime -binary -sign -outform DER -signer yourcert.crt \
              -inkey yourcert.key -in database-file -out signature

   -binary states that no MIME canonicalization should be performed.
   -sign indicates that you are signing the file that was given as the
   argument to -in. The output format (-outform) is binary DER, and
   your public certificate is provided with -signer along with your key
   with -inkey. The file that receives the signature is specified with
   -out.

   The resulting file "signature" is then copied into the PKCS#7 block
   in the database header, its size in bytes is recorded in the PKCS#7
   block size field, and the resulting file is ready for distribution
   to ITRs.

   To verify a database file, first retrieve the PKCS#7 block from the
   file by copying the appropriate number of bytes into another file,
   say "signature". Then zero this field, and set the block size field
   to 0. Next use the "smime" command to verify the signature as
   follows:

      openssl smime -binary -verify -inform DER -content database-file \
              -out /dev/null -in signature

   OpenSSL will return "Verification OK" if the signature is correct.

   To improve verification performance, one could modify the program so
   that it takes as input the database with a null signature and as an
   argument the name of the file containing the signature. Better yet,
   use a call to the appropriate library with each block.

Appendix B.  Changes

   This section to be removed prior to publication.

   o  02: Incorporate some of Dave Thaler's comments. Add
      authentication block detail.
      Modify analysis to take IPv6 into account, along with a more
      realistic number of RLOCs per EID. Add some comments about
      potential risks of a cold start. Add S/MIME example as Appendix A
      and take out old ToDo. Provide some amount of compression of IPv6
      addresses by limiting their size to significant bytes rounded to
      a four byte word boundary.
   o  01: Massive spelling correction, URI example correction.
   o  00: Initial Revision.

Appendix C.  Open Questions

   This section to be removed prior to publication.

   o  Should the database contain its name? It is probably sufficient
      to merely reference the database by name.
   o  Should the signature portion be separated from the actual
      database? By specifying the signature we hope to reduce
      interoperability issues and encourage proper security from the
      get go. On the other hand, since the object is opaque it is not
      clear how much interoperability we are actually encouraging.
   o  Should we specify a (perhaps compressed) tarball that treads a
      middle ground for the last question, where each update tarball
      contains both a signature for the update and for the entire
      database, once the update is applied?
   o  Should we compress? In some initial testing of databases with 1,
      5, and 10 million IPv4 EIDs and a random distribution of IPv4
      RLOCs, the current format in this document compresses down by a
      factor of between 35% and 36%, using the Burrows-Wheeler block
      sorting text compression algorithm (bzip2). The NERD used random
      EIDs with mask lengths varying from 19 to 29, with probability
      weighted toward the smaller masks. This only very roughly
      reflects reality. A better test would be to start with the
      existing prefixes found in the DFZ.

Author's Address

   Eliot Lear
   Cisco Systems GmbH
   Glatt-com
   Glattzentrum, ZH  CH-8301
   Switzerland

   Phone: +41 1 878 7525
   Email: lear@cisco.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).