Network Working Group                                            S. Brim
Internet-Draft                                                N. Chiappa
Intended status: Experimental                               D. Farinacci
Expires: March 10, 2008                                        V. Fuller
                                                                D. Lewis
                                                                D. Meyer
                                                       September 7, 2007


   LISP-CONS: A Content distribution Overlay Network Service for LISP
                      draft-meyer-lisp-cons-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 10, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   The Content distribution Overlay Network Service for LISP (LISP-CONS)
   is a protocol for distributing identifier-to-locator mappings for the
   Locator/ID Separation Protocol (LISP).  LISP-CONS is not a routing
   protocol.  LISP-CONS is designed to scale by using a hierarchical
   content distribution system comprised of Tunnel Routers, Content
   Access Resources, and Content Distribution Resources.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
   3.  Definition of Terms
     3.1.  LISP-CONS Name Spaces
     3.2.  LISP-CONS Network Elements
     3.3.  Relationship Between LISP-CONS Network Elements
   4.  Overview of Operation
   5.  The LISP-CONS Protocol
     5.1.  Building the LISP-CONS Database
     5.2.  Querying the LISP-CONS Database
     5.3.  Maintaining the LISP-CONS Database
       5.3.1.  An EID-Prefix Is Administratively Removed From The
               Infrastructure
       5.3.2.  A CAR's Connectivity Changes
       5.3.3.  A CAR Becomes Unreachable
       5.3.4.  A CDR Becomes Unreachable
   6.  LISP-CONS Message Types
     6.1.  Open Message
     6.2.  Push-Add and Push-Delete
     6.3.  Map-Request Message
     6.4.  Map-Reply Message
     6.5.  No-Map Message
   7.  Operational Considerations
   8.  LISP-CONS and Locator Reachability
   9.  LISP-CONS and Mobility
   10. Open Issues
   11. Acknowledgments
   12. Security Considerations
     12.1.  Apparent LISP-CONS Vulnerabilities
     12.2.  Survey of LISP-CONS Security Mechanisms
   13. IANA Considerations
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   The Content distribution Overlay Network Service for LISP, or
   LISP-CONS, is a control-plane protocol for distributing
   identifier-to-locator mappings for the Locator/ID Separation
   Protocol (LISP) [LISP-03].  The properties of such a "locator/id
   split" have been discussed in depth in various venues dating back to
   [CHIAPPA] and [RFC1498], and as such will not be reviewed here.
   Rather, the reader is referred to the above references for an
   outline of the various benefits that may be realized by separating
   the functionality of IP addresses into separate Endpoint Identifier
   and Routing Locator name spaces.

   LISP-CONS operates on a distributed Endpoint Identifier-to-Routing
   Locator (EID-to-RLOC) database.  This database is distributed among
   the authoritative Replying Content Access Resources (Answering-CAR).
   An Answering-CAR (aCAR) advertises "reachability" for its EID-to-RLOC
   mappings through a hierarchical network of Content Distribution
   Resources (CDRs) (but importantly, not the mapping itself), and
   responds to mapping requests from the system.  A CAR may also request
   mappings from the system (this is a Querying-CAR, or qCAR).  Ingress
   Tunnel Routers (ITRs) connect to one or more qCARs to query the
   system for EID-to-RLOC bindings; the qCAR then queries the system on
   behalf of the ITR.  These queries follow the overlay network to the
   authoritative aCAR, which responds with the mapping.  This response
   may then be cached by the 'local' CAR.  Finally, note that neither a
   qCAR nor an aCAR needs to hold the entire EID-to-RLOC database.
   Rather, the EID-to-RLOC translations are explicitly pulled by the
   ITRs, each of which queries one or more of its connected qCARs.

   Note that LISP-CONS is not designed for the "fast-mobility" case.
   That is, it is envisioned that the mappings distributed by LISP-CONS
   are reasonably static.  LISP-CONS is also not designed to carry
   Locator Reachability status information; see [LISP-03] for details on
   how LISP determines locator reachability.

   LISP-CONS seeks to control the "state * rate" scaling properties of
   the mapping service by first observing that the host mapping state is
   likely to be quite large (some estimates put the size of this
   database on the order of 10^10 hosts).
   As a result, even with aggressive aggregation, the "rate" of change
   of the mapping database must be kept small.  LISP-CONS manages the
   rate problem by distributing highly aggregated information about the
   location of the EID-to-RLOC mappings (which are assumed to change at
   low frequency) over a peering network.  The peering network is
   comprised of ITRs, CARs and CDRs.

   In summary, LISP-CONS is a hybrid "push/pull" protocol in which
   information about the existence of a particular mapping is "pushed"
   at the higher levels of the aggregation hierarchy, while the actual
   EID-to-RLOC mappings are "pulled" from the network elements at the
   lowest level of the hierarchy.  In particular, LISP-CONS carries
   mapping requests and replies to and from the lowest level of the
   hierarchy where the EID-to-RLOC mappings reside.

   While this draft focuses on a router-based solution, there is no
   architectural reason that LISP-CONS functionality could not be
   implemented in other devices (i.e., hosts).  However, in keeping with
   the architectural direction taken by the LISP data-plane proposal
   [LISP-03], LISP-CONS is based on the theory that building the
   solution into the network should facilitate incremental deployment of
   the technology on the Internet.  In order to minimize the required
   investment in deployment of new hardware, it is assumed that much, if
   not all, of the initial implementation will be in routers.  Finally,
   while the detailed protocol specification and examples in this
   document assume IP version 4 (IPv4), there is nothing in the design
   that precludes the use of the same techniques and mechanisms for
   IPv6.

   The remainder of this document is organized as follows: Section 3
   provides the set of definitions that are used in this document, and
   Section 4 provides an overview of LISP-CONS operation.
   Section 5 describes the LISP-CONS protocol, and Section 6 provides
   details of the LISP-CONS message types.  Section 7 outlines
   operational considerations, Section 8 discusses locator reachability,
   and Section 9 considers the interaction of LISP-CONS with mobile
   nodes.  Section 12 outlines security considerations for LISP-CONS.

   Finally, this proposal (as well as the LISP data-plane proposal) was
   stimulated by the problem statement effort at the IAB Routing and
   Addressing Workshop (RAWS) [I-D.iab-raws-report], which took place in
   Amsterdam in October 2006.

3.  Definition of Terms

   The LISP-CONS protocol operates on two name spaces and is comprised
   of four network elements.  This section provides high-level
   definitions of the LISP-CONS name spaces, network elements, and
   message types.

3.1.  LISP-CONS Name Spaces

   Endpoint ID (EID):  A 32- or 128-bit value used in the source and
      destination fields of the first (innermost) LISP header of a
      packet.  A packet that is emitted by a system contains EIDs in its
      headers, and LISP headers are prepended only when the packet
      reaches an Ingress Tunnel Router (ITR) on the data path to the
      destination EID.

      In LISP-CONS, EID-prefixes MUST be assigned in a hierarchical
      manner (in power-of-two or larger chunks) such that they can be
      aggregated either by Content Access Resources or Content
      Distribution Resources (see below).  In addition, a site may have
      site-local structure in how EIDs are topologically organized
      (subnetting) for routing within the site; this structure is not
      visible to the global routing system.

   EID-Prefix Aggregate:  A set of EID-prefixes said to be aggregatable
      in the [RFC4632] sense.  That is, an EID-Prefix aggregate is
      defined to be a single contiguous power-of-two EID-prefix block.
      Such a block is characterized by a prefix and a mask.

   Routing Locator (RLOC):  The IP address of an egress tunnel router
      (ETR).  It is the output of an EID-to-RLOC mapping lookup.  An EID
      maps to one or more RLOCs.  Typically, RLOCs are numbered from
      topologically-aggregatable blocks that are assigned to a site at
      each point at which it attaches to the global Internet; where the
      topology is defined by the connectivity of provider networks,
      RLOCs can be thought of as Provider Aggregatable (PA) addresses.

   EID-to-RLOC Mapping:  A binding between an EID and the RLOC-set
      that can be used to reach the EID.  We use the term "mapping" in
      this document to refer to an EID-to-RLOC mapping.

3.2.  LISP-CONS Network Elements

   LISP-CONS consists of the four network element types described below.
   Peering connections between these element types use RLOCs so that the
   underlying routing system can keep the LISP-CONS peering connections
   up (i.e., to avoid circular dependencies on the mapping system).
   Each peering connection is required to be configured with a
   keyed-hash message authentication code (HMAC) key.  A connection
   MUST NOT be established without the TCP HMAC option included.

   Content Distribution Resource (CDR):  A CDR provides aggregation of
      EID-prefix lists, propagation of EID-prefix lists to parent CDRs,
      and routing of mapping requests to and from CARs.

      There may be several levels of aggregation of CDRs.  CDRs do not
      themselves carry EID-prefix-to-RLOC mappings.  CDRs are arranged
      in a hierarchical manner in order to enable aggressive aggregation
      of EID-prefixes.

   Content Access Resource (CAR):  A CAR fills one or both of the
      following roles:

      Answering-CAR (aCAR):  A CAR that is the source of authority for
         one or more EID-prefix-to-RLOC mappings with which it has been
         administratively configured, and that responds to Map-Requests
         for these EID-to-RLOC mappings.
         Each aCAR provides to parent CDRs a list of prefixes that it is
         responsible for, but not the mappings themselves.

         In particular, aCARs peer with CDRs to propagate aggregated
         information about how to find a particular EID-to-RLOC mapping
         upward (but importantly, not the mapping itself).  However,
         aCARs do not peer with other CARs.  The primary difference
         between an aCAR and a CDR is that a CAR maintains two
         databases: an EID-to-RLOC mapping database and an EID-prefix
         database.  A CDR maintains only an EID-prefix database.

      Querying-CAR (qCAR):  A CAR that generates Map-Request messages on
         behalf of one or more of its ITR peers (see below).  Note that
         a qCAR has peering connections with ITRs, whereas an aCAR does
         not have to.  Finally, both functionalities (qCAR and aCAR) MAY
         be co-located in the same device.  In particular, a qCAR MUST
         also be an aCAR, while an aCAR need not be a qCAR.

   Egress Tunnel Router (ETR):  A router that accepts an IP packet where
      the destination address in the "outer" IP header is one of its own
      RLOCs.  The router strips the "outer" header and forwards the
      packet based on the next IP header found.  In general, an ETR
      receives LISP-encapsulated IP packets from the Internet on one
      side and sends decapsulated IP packets to site end-systems on the
      other side.

   Ingress Tunnel Router (ITR):  A router that accepts an IP packet
      with a single IP header (more precisely, an IP packet that does
      not contain a LISP header).  The router treats this "inner" IP
      destination address as an EID and performs an EID-to-RLOC mapping
      lookup.  The router then prepends an "outer" IP header with one of
      its globally-routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC may be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping
      closest to the destination EID.  In general, an ITR receives IP
      packets from site end-systems on one side and sends
      LISP-encapsulated IP packets toward the Internet on the other
      side.  ITRs may also have TCP connections to qCARs in order to
      send mapping requests and receive replies (noting that a qCAR, an
      aCAR, and an ITR may be co-located).

3.3.  Relationship Between LISP-CONS Network Elements

   Each LISP-CONS device is known by a single identifier, which is used
   for peering by all peers and in path-vector (PV) lists.  This
   identifier MAY be an IP address.  An implementation SHOULD use a
   loopback address for this purpose.  Note that this address MUST be
   routable by the core routing system.

   LISP-CONS network elements peer with each other in one of three
   peering relationships: parent, child, or sibling.  The relationship
   is carried in the LISP-CONS OPEN message (Section 6.1).  The
   permitted peering relationships are as follows:

   o  ITRs exist at the lowest (unnumbered) level in the peering
      hierarchy, and peer only with one or more CARs.  An ITR MUST NOT
      peer with another ITR or with a CDR.

   o  CARs exist at level 0 in the peering hierarchy, and peer only with
      parent CDRs or with a child ITR.  A CAR MUST NOT peer with another
      CAR; this rule allows the aCARs to aggregate EID-prefixes as low
      in the hierarchy as possible.  Note that this rule also means that
      mapping requests and replies are routed over the peering topology,
      not directly between the CARs.

   o  CDRs exist at level 1 (and above) and aggregate EID-prefixes
      learned from their aCAR peerings.  When two CDRs start their
      peering connection, if one is a parent, the other MUST be a child.
      Otherwise, they both MUST be siblings.

   o  If any of these checks fails, the peering connection MUST NOT be
      established.

4.  Overview of Operation

   LISP-CONS constructs a multi-level content distribution overlay which
   achieves scalability by imposing a strict aggregation hierarchy on
   the participating elements.  The LISP-CONS hierarchy consists of ITRs
   at the bottom of the hierarchy, CARs at level 0, and CDRs at levels 1
   and above; this is depicted in Figure 1.  Each level of the hierarchy
   is a strict tree.  That is, there are no transit loops in the
   hierarchy; redundancy is achieved by meshing CDR connectivity within
   a single level of the hierarchy, and the LISP-CONS protocol assures
   that message flow is loop-free.

   In LISP-CONS, the EID-to-RLOC mappings are held in the aCARs, while
   the CDRs maintain information about how to find the aCAR holding a
   particular EID-to-RLOC mapping.  That is, the Push-Add and
   Push-Delete messages (Section 6.2) only contain EID-prefixes (i.e.,
   Locator-sets are not included in these messages and are not stored in
   the CDRs).

   In general, LISP-CONS uses network element redundancy to avoid
   mapping database inconsistencies that may arise in those cases in
   which a CAR or CDR crashes.  Similarly, connectivity outages are
   avoided by configuring a redundant underlying topology.

               +----------------+
               | CDR ------ CDR |
               +--|----------|--+
                /              \
              /                  \
   +----------------+      +----------------+
   | CDR ------ CDR |      | CDR ------ CDR | (CDR-mesh at level 2)
   +--|----------|--+      +--|----------|--+
      |          |            |          |
      |          |            |          |
  +---|----------|----+   +---|----------|---+
  |  CDR ------ CDR   |   |  CDR ------ CDR  |
  |   |          |    |   |   |          |   | (CDR-Mesh at level 1)
  |   |          |    |   |   |          |   |
  |  CDR ------ CDR   |   |  CDR ------ CDR  |
  +---|----------|----+   +---|----------|---+
      |          |            |          |
      |          |            |          |
      |          |            |          |
     CAR        CAR          CAR        CAR
    /   \      /   \        /   \      /   \
   /     \    /     \      /     \    /     \
 ITR    ITR  ITR    ITR   ITR    ITR  ITR    ITR

                     Figure 1: LISP-CONS Hierarchy

   Figure 2 depicts the details of the first three levels of hierarchy.

   Note that there are no horizontal TCP connections between the ITRs or
   between the CARs.  Note that qCARs (abbreviated "Req-CAR") peer with
   the ITRs, while the aCARs may not.  The CDRs at level 1 are meshed so
   that the two aCARs can aggregate to the same mesh level.

   Note that to avoid request and reply black-holes, all CDRs that are
   responsible for a segment of the address space must be siblings
   (i.e., at the same level).

              CDR --- CDR     (level-1)
               |\     /|
               | \   / |
               |  \ /  |
               |   X   |
               |  / \  |
               | /   \ |
               |/     \|
              CAR     CAR     (level-0)
               |\     /|
               | \   / |
               |  \ /  |
               |   X   |
               |  / \  |
               | /   \ |
               |/     \|
              ITR     ITR

                  Figure 2: LISP-CONS Hierarchy Detail

   LISP-CONS operates as follows: An aCAR receives EID-to-RLOC mappings
   by administrative configuration.  The aCARs aggregate these
   EID-prefixes into power-of-two less specific EID-prefixes, and "push"
   the aggregated EID-prefixes to their (parent) CDRs in Push-Add
   messages (see Section 6.2).  CDRs then flood the Push-Add messages to
   their sibling CDRs.  Note that the Push messages contain EID-prefix
   reachability information, not locator sets.
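
   As a rough illustration of the push step described above, the
   following Python sketch covers an aCAR's configured EID-prefixes
   with a less specific aggregate and builds the prefix-only list that
   would be carried upward in Push-Add messages.  The function name,
   the dictionary layout, and the administratively chosen aggregate
   length are assumptions made for this example; they are not part of
   the LISP-CONS specification.

```python
import ipaddress


def build_push_add(mappings, aggregate_len):
    """Return the EID-prefixes an aCAR would push to its parent CDRs.

    mappings:      dict of EID-prefix string -> list of RLOC strings
                   (stands in for the aCAR's EID-to-RLOC database).
    aggregate_len: hypothetical, administratively chosen mask length.
    """
    aggregates = set()
    for prefix in mappings:
        net = ipaddress.ip_network(prefix)
        if net.prefixlen > aggregate_len:
            # Strictly longer component: cover it with the less
            # specific power-of-two block.
            aggregates.add(net.supernet(new_prefix=aggregate_len))
        else:
            # Nothing to aggregate; advertise the prefix as configured.
            aggregates.add(net)
    # Only EID-prefixes are pushed; the RLOC-sets stay on the aCAR.
    return sorted(str(a) for a in aggregates)


database = {
    "10.1.1.0/24": ["192.0.2.1"],
    "10.1.2.0/24": ["192.0.2.1", "198.51.100.7"],
    "10.2.0.0/16": ["203.0.113.9"],
}
print(build_push_add(database, 16))
```

   Here the two /24 prefixes collapse into a single /16 aggregate,
   while the /16 that is already at the aggregate length is advertised
   unchanged.  The locator sets never appear in the result, mirroring
   the rule that Push messages carry reachability only.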

   If a CDR is a child, it then pushes the aggregate for the EID-prefix
   (i.e., the aggregate that "covers" the EID-prefix) to its parent
   CDRs.  This CDR MUST also originate the default EID-prefix 0.0.0.0/0
   or 0::0/0 (this allows Requests and Replies to flow up and down the
   aggregation hierarchy).  This default is contained within the level
   of the sibling mesh.  Note that aggregates MUST only be generated
   when the components of the aggregate are all longer prefixes than the
   aggregate (and importantly, NOT equal in length).  For example, a CDR
   MUST NOT generate an aggregate such as A.B.0.0/16 if it has not heard
   an A.B.*.0/24 from either a child or sibling peer.

   When an ITR needs a mapping, it sends a Map-Request message to its
   directly connected qCARs.  If any of those CARs has cached the
   requested mapping, the result is immediately returned to the ITR.
   Otherwise, the Map-Request message is routed through the CDR
   hierarchy to the aCAR which holds the mapping.  That CAR then returns
   the mapping in a Map-Reply message (which is routed over the peering
   topology) to the qCAR, which then forwards it on to the requesting
   ITR.

   Finally, note that this type of advertisement hierarchy allows EID
   lookups to have lower Round Trip Times (RTTs) when the EID-prefix is
   "close" (in the EID allocation hierarchy) to the site's attached CAR.
   However, for scalability reasons, a request may have to travel extra
   hops to get an EID-prefix that can only be obtained by going up the
   tree (and in the worst case, by going to the top of the hierarchy and
   down to the aCAR that holds the mapping).

5.  The LISP-CONS Protocol

   This section describes the LISP-CONS protocol in detail, starting
   with how LISP-CONS builds a distributed mapping database, how an ITR
   queries the database, and how the database is maintained.

   LISP-CONS operates on three different data structures:

   EID-to-RLOC Database:  The EID-to-RLOC mapping database, which is
      administratively configured and held in the aCARs.

   Mapping Cache:  The Mapping Cache (hereafter cache) holds the results
      of Map-Requests and is stored in the ITRs and qCARs.

   EID-Prefix Table:  The EID-Prefix table is used to route Map-Requests
      and Map-Replies in the overlay network.  It is stored only by
      CDRs, and associates an EID-prefix with a 64-bit sequence number,
      a path-vector, and a priority and weight (to facilitate later
      aggregation, if possible).

5.1.  Building the LISP-CONS Database

   When an aCAR is configured with an EID-to-RLOC mapping, it checks to
   see if it can aggregate the just-learned EID-prefix with any of the
   other EID-prefixes it has been configured with.  The CAR then sends
   ("pushes") the EID-prefix (or an aggregate, if possible) to its
   parent CDR in a Push-Add message.

   An aCAR generates an aggregate when it has at least one more-specific
   prefix that matches the aggregate.  A prefix is a more-specific
   prefix of an aggregate when its high-order bits and the high-order
   bits of the aggregate are the same.  The number of bits tested is the
   mask-length of the aggregate.

   When a more-specific prefix is added to the EID-prefix table, the
   corresponding aggregate is sent in a Push-Add message from a child
   peer to a parent peer in a different level.

   Push-Add messages contain an EID-prefix, an Originator Address, a
   64-bit sequence number, and a PV that records the path the message
   took in the CDR level (Section 6.2).  Note that the Originator
   Address is an EID used to route a Reply back to the requesting ITR.
   The PV list will always contain Locators.
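
   The more-specific test described earlier in this section (compare
   the high-order bits, with the number of bits tested given by the
   mask-length of the aggregate) can be sketched in Python as shown
   below.  This is an illustrative sketch only; the function name is
   hypothetical, and the requirement that the candidate be strictly
   longer than the aggregate reflects the aggregate-generation rule
   stated in Section 4.

```python
import ipaddress


def is_more_specific(prefix, aggregate):
    """True if `prefix` is strictly longer than, and covered by, `aggregate`."""
    p = ipaddress.ip_network(prefix)
    a = ipaddress.ip_network(aggregate)
    if p.version != a.version or p.prefixlen <= a.prefixlen:
        return False
    # Keep only the high-order bits; the number of bits compared is
    # the mask-length of the aggregate.
    shift = p.max_prefixlen - a.prefixlen
    return (int(p.network_address) >> shift) == (int(a.network_address) >> shift)


print(is_more_specific("10.1.5.0/24", "10.1.0.0/16"))
print(is_more_specific("10.2.5.0/24", "10.1.0.0/16"))
print(is_more_specific("10.1.0.0/16", "10.1.0.0/16"))
```

   The first call matches, the second differs in the tested bits, and
   the third is rejected because a prefix of equal length is not a
   more-specific component of the aggregate.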

   When a CDR receives a Push-Add message, it first checks to see if the
   sequence number for the EID-prefix is numerically larger than what it
   has stored for the EID-prefix.  If it is not, the message is dropped.
   Otherwise, the CDR next checks for its own address in the PV.  If it
   exists, the message is discarded.  Otherwise, the CDR stores the
   EID-prefix and the associated PV.  Note that the CDR can store all
   distinct PVs or just the shortest-path ones.  If the CDR has one or
   more parent peerings configured (i.e., the CDR is a child), it will
   aggregate this EID-prefix with other EID-prefixes into a coarser
   EID-prefix.  The CDR does not need to advertise anything to
   lower-level CDRs because child peers will auto-generate a default
   EID-prefix into their level simply due to having a child-parent
   peering relationship.

   When a CDR sends a Push-Add message to a parent, the stored PV is not
   propagated to the parent in the aggregated EID-prefix; rather, the
   message includes a one-element PV which contains the address of the
   CDR originating the "aggregated push".  It also includes a new
   sequence number, indicating that this is a different EID-prefix than
   the ones it has stored.

   Finally, if a CDR is a child, it pushes an "EID-default" to its
   siblings.  This Push message has EID-prefix 0.0.0.0/0 or 0::0/0 and a
   PV containing the address of the CDR that is sourcing the default.

5.2.  Querying the LISP-CONS Database

   Map-Requests are routed along the LISP-CONS multi-level topology from
   the requesting ITR to the aCAR holding the requested mapping.  The
   Map-Request message includes a PV which records the route traversed
   by the Map-Request message.  This PV is used to control request
   routing and for debugging purposes.

   When an ITR wants to query the LISP-CONS database for a mapping, it
   prepares a Map-Request message, which is sent to one of its directly
   connected qCAR(s).
   The Map-Request message is routed over the peering topology to the
   aCAR that holds the mapping.  If, however, the qCAR has cached the
   mapping (perhaps from a previous request), it returns the mapping
   immediately.

   When a qCAR receives a Map-Request from an ITR, it MAY respond
   immediately if it has cached the requested mapping.  Otherwise, it
   MUST forward the Map-Request message to its parent CDRs.  This CAR is
   identified by the Originator address in the Map-Request message
   (Section 6.3).  The Originator address allows a replying CDR to
   forward a No-Map message (Section 6.5) back to the qCAR.  This case
   arises when the source site is LISP-enabled (i.e., there is an ITR
   deployed), but the destination site has not deployed LISP yet, so
   there is no ETR.

   When a Map-Request arrives at a CDR, the CDR first scans the PV for
   its own address.  If its address is present, it drops the packet.  If
   its address is not present, it consults its EID-prefix table for the
   longest match "next-hop" towards the aCAR holding the mapping for the
   prefix.  If a next-hop is found, the CDR appends its address to the
   PV, and forwards the Request to the next-hop.

   When a Map-Request arrives at a CDR which cannot route it, a
   LISP-CONS No-Map message (Section 6.5) MUST be sent back to the qCAR.
   This No-Map message is a signal that indicates that there is no
   mapping for the requested EID in the system, and is immediately
   communicated to the ITR.

   When a Map-Request message arrives at an aCAR, it first queries its
   mapping database for the EID contained in the Map-Request message.
   If the mapping is found, it constructs a Map-Reply message
   (Section 6.4) containing the EID, the corresponding RLOC-set, and a
   PV containing its address appended to the reverse of the received PV.
   The CAR then sends the Map-Reply message over the peering topology to
   the qCAR (i.e., to the Originating CAR EID-Prefix in the Map-Request
   message).

   If no mapping is found, the aCAR sends a Map-Reply with the requested
   EID and a Locator count of 0 back to the qCAR.  This creates a
   negative cache entry in the requesting ITR.

   In LISP-CONS, the PVs for Map-Request and Map-Reply messages are
   preserved across the hierarchy, while the PV lists carried in
   Push-Add and Push-Delete messages are not.  As a result, LISP-CONS
   also has cross-level loop suppression.

5.3.  Maintaining the LISP-CONS Database

   While LISP-CONS is not a routing protocol (and as such when peering
   connections go down EID-prefix entries are not immediately withdrawn
   from the local EID-prefix table), it does use a link-state-like
   sequence number scheme to detect changes in topology.  Similarly,
   LISP-CONS uses a path vector scheme to detect and suppress message
   looping.  There are four database maintenance cases to consider:

   o  An EID-Prefix Is Administratively Removed From The Infrastructure
      (Section 5.3.1)

   o  A CAR's Connectivity Changes (Section 5.3.2)

   o  A CAR Becomes Unreachable (Section 5.3.3)

   o  A CDR Becomes Unreachable (Section 5.3.4)

   Each case is considered below.

5.3.1.  An EID-Prefix Is Administratively Removed From The
        Infrastructure

   EID-prefix mappings are removed from the LISP-CONS infrastructure by
   administrative configuration at the aCAR that was configured with the
   mapping.  The CAR queries its EID-prefix database for the mapping.
   If no match for the EID-prefix exists, no further action is taken.

   When all the more-specific prefixes that match the aggregate are
   removed from the EID-prefix table, the aggregate is sent in a
   Push-Delete message from a child peer to a parent peer in a different
   level.
The Push-Delete message behaves exactly like the Push-Add 594 message, except that it removes the corresponding state along its 595 path(s). 597 When a Push-Delete message arrives at a CDR, the CDR checks for its 598 own address in the PV. If it exists, the message is discarded. 599 Otherwise, the CDR queries its EID-prefix database for the EID-prefix 600 in the received Push-Delete message. If it finds a matching entry, 601 it removes the entry from its database, appends its address to the 602 PV, and forwards the message to its siblings. 604 If the CDR is a child, it checks to see if the EID-prefix in the 605 Push-Delete message is the last in an aggregate it had previously 606 pushed to its parent CDR. If not, no further action is taken. 607 Otherwise, the CDR computes a new aggregate (minus the prefix from 608 the Push-Delete), sends a Push-Delete for the old aggregate to its 609 parent, and sends a Push-Add with the new aggregate to its parent 610 CDR. 612 5.3.2. A CAR's Connectivity Changes 614 Changes in CAR connectivity are signaled by changes in the sequence 615 numbers in Push-Add messages. For example, in Figure 3, consider 616 the case in which the D<->B TCP connection breaks. In this case, D 617 sends a Push-Add with EID-Prefix EID/(n-1), sequence number S+1, and 618 path vector [D] (denoted push(EID/(n-1), S+1, [D])) to C. C 619 aggregates the pieces of EID and forwards push(EID/n,S+1,[C,D]) to B. 620 Now, before the failure, B had an entry in its EID-prefix table for 621 EID/n with sequence number S and PV [D]. Since B sees a new push 622 message originated by D with sequence number S+1, it knows the 623 previous entry (EID/n,S,[D]) is no longer valid. 625 Similarly, A will see push messages with both [C,D] and [B,C,D] and 626 with sequence number S+1, so it knows the existing entries ([B,D] and 627 [C,B,D], with sequence number S) are both obsolete.
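The sequence-number comparison described in Section 5.3.2 can be illustrated with a minimal, non-normative Python sketch. The table layout, the function name apply_push, and the handling of equal sequence numbers (keeping several paths side by side) are assumptions made for illustration only; this document does not define them, and sequence-number wraparound is ignored.

```python
def apply_push(table, prefix, originator, seq, pv):
    """Apply a received Push-Add to a local EID-prefix table.

    table maps (prefix, originator) -> {"seq": int, "paths": set of PV tuples}.
    Returns True if the push was accepted, False if it was stale.
    ASSUMPTION: entries are keyed by prefix and originator; the draft does
    not specify the table layout.
    """
    key = (prefix, originator)
    current = table.get(key)
    if current is None or seq > current["seq"]:
        # A newer sequence number makes every older entry obsolete.
        table[key] = {"seq": seq, "paths": {tuple(pv)}}
        return True
    if seq == current["seq"]:
        # ASSUMPTION: same sequence number, another path to the originator.
        current["paths"].add(tuple(pv))
        return True
    return False  # older sequence number: ignore the push


# B's view in the Figure 3 example, using S = 7 as an arbitrary value.
table = {}
apply_push(table, "EID/n", "D", 7, ["D"])       # before the failure
apply_push(table, "EID/n", "D", 8, ["C", "D"])  # after D<->B breaks
print(table[("EID/n", "D")])
```

After the second call, the entry learned with sequence number S and PV [D] has been replaced by the entry with sequence number S+1 and PV [C,D], matching the behavior described for B.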
629 A 630 / \ 631 ^ / \ ^ 632 | / \ | 633 push(EID/n,S,[B,C,D]) | / \ | push(EID/n,S,[C,D]) 634 / CDR \ 635 push(EID/n,S,[A,C,D]) | / mesh \ | push(EID/n,S,[A,B,C,D]) 636 \|/ / \ \|/ (will be discarded) 637 / \ 638 / \ 639 B-----------------------C 640 \ push(EID/n,S,[C,D]) / 641 ^ \ <--------- / ^ 642 | \ / | 643 push(EID/n,S,[D]) | \ / | push(EID/n,S,[D]) 644 | \ / | 645 \ / 646 \ / 647 D (CAR) Configure EID/(n-1), RLOC-set 648 | 649 | 650 | 651 | 652 | 653 F (ETR) 655 Figure 3: Sequence Number Processing 657 5.3.3. A CAR Becomes Unreachable 659 If the TCP connection between a CAR and its peer CDR drops, a timer 660 associated with the EID-prefix received from the CAR in the Push-Add 661 message is started. The timer, called the CAR-CDR-TCP-TIMER, is set 662 to a default value of 60 minutes. 664 If the TCP connection comes back up before the timer expires, the 665 timer is stopped and no further action is taken. 667 If the timer expires, the CDR builds a Push-Delete message for each 668 EID-prefix it received from the aCAR, and sends the Push-Delete to 669 its siblings. The Push-Delete message contains the EID-prefix to be 670 removed, a sequence number, and a PV containing only the CDR's address. 672 If the CDR also has a parent peering, it checks to see if any of the 673 EID-prefixes it received from a child peering were the last more 674 specific prefix in an aggregate it previously pushed to a parent CDR. 675 If not, no further action is taken. If so, it sends a Push-Delete 676 for the aggregate to its parent(s). In either case, the CDR deletes 677 the entries received from the failed CAR from its EID-prefix table. 679 5.3.4. A CDR Becomes Unreachable 681 There are three cases to consider here: A sibling CDR peering goes 682 down, a parent peering goes down, and a child peering goes down. 683 Each is considered below. 685 5.3.4.1.
A Sibling CDR Becomes Unreachable 687 When the TCP connection drops between a CDR and a sibling CDR, a 688 timer associated with the EID-prefixes received from the sibling CDR 689 in the Push-Add message is started. This timer, called CDR-SIBLING- 690 TCP-TIMER, defaults to TBD. 692 If the TCP connection comes back up before the timer expires, the 693 timer is stopped and no further action is taken. 695 If the timer expires, the CDR builds a Push-Delete message for each 696 EID-prefix it received from the CDR, and sends the Push-Delete to its 697 siblings. The Push-Delete message contains the EID-prefix to be 698 removed, a sequence number, and PV containing only the CDR's address. 700 If the CDR also has a parent peering, it checks to see if any of the 701 EID-prefixes it received from the failed CDR were the last more 702 specific prefix in an aggregate it previously pushed to a parent CDR. 703 If not, no further action is taken. If so, it sends a Push-Delete 704 for the aggregate to its parent(s). In either case, the CDR deletes 705 the entries from its EID-prefix table. 707 5.3.4.2. A Parent CDR Becomes Unreachable 709 When the TCP connection drops between a CDR and a parent CDR, the 710 child starts a timer (the CDR-CDR-TCP-TIMER) associated with the 711 parent CDR. 713 If the TCP connection comes back up before the timer expires, the 714 timer is stopped and no further action is taken. 716 If the timer expires, the CDR deletes the EID-prefix entry, and 717 builds a Push-Delete message for the default EID prefix and sends it 718 to its siblings. The Push-Delete message contains the EID-prefix 719 0.0.0.0/0 or 0::0/0, a sequence number, and PV containing only the 720 CDR's address. 722 5.3.4.3. A Child CDR Becomes Unreachable 724 Since nothing is ever "pushed down", no action needs to be taken when 725 a child CDR becomes unreachable. See Section 5.3.4.2 for the actions 726 a child CDR takes when a parent becomes unreachable. 728 6. 
LISP-CONS Message Types 730 LISP messages are sent over either UDP or TCP sockets using well- 731 known IANA-assigned port number 4342. 733 In all message formats, IPv4 and IPv6 addresses can be mixed or matched. 734 So a payload of IPv6 addresses can be sent over a TCP connection (or 735 be UDP encapsulated) that runs over IPv4 and vice-versa. You can 736 also mix EID-to-RLOC mappings. That is, an IPv6 EID-prefix can have 737 a set of IPv4 or IPv6 Locator addresses associated with it and vice- 738 versa. Originator addresses and Path Vector lists can be mixed 739 as well. 741 A TCP connection is established by two LISP-CONS peers by having the 742 higher IP address side of the connection do a passive open and the 743 lower IP address side do an active open. This is done to prevent two 744 connection attempts from colliding. This is similar to the procedures 745 in [RFC3618]. 747 6.1. Open Message 749 This is the first message sent when a TCP connection is established 750 between ITR-to-qCAR, CAR-to-CDR, or CDR-to-CDR peering relationships. 751 The main purpose of the Open message is to determine the peering 752 relationship and level number between the two LISP neighbors. Other 753 purposes are capability negotiation and sending keep-alives. 755 Open messages MUST be sent over a TCP connection, and a LISP-CONS 756 peer MUST NOT accept any LISP packet type before an Open message is 757 received. 759 Open Message format 761 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 762 / | Type |P|C|S| Rsvd | Level | Checksum | 763 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 764 LISP | TLV Encodings | 765 \ | | 766 \ | | 767 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 769 Figure 4 771 Type: 4 773 P: When set to 1, the local LISP peer is acting as a parent for the 774 peering connection. When acting as a parent connection, the 775 system is performing CDR functionality.
It should not accept any 776 peering connections other than from a child peer. 778 C: When set to 1, the local LISP peer is acting as a child for the 779 peering connection. When acting as a child connection, the system 780 is performing CDR functionality and pushes the default EID-prefix 781 0.0.0.0/0 or 0::/0 into the CDR mesh it is part of. It should not 782 accept any peering connections other than from a parent peer. 784 S: When set to 1, the local LISP peer is acting as a sibling for the 785 peering connection. When acting as a sibling connection, the 786 system is performing CDR functionality. The two peers are in the 787 same CDR mesh-level. It should not accept any peering connections 788 other than from a sibling peer. 790 When all of the P, C, and S bits are cleared, the system is acting as 791 a CAR. A CAR peering relationship with a CDR is like a CDR-child 792 to CDR-parent peering relationship, with the exception that the CAR 793 doesn't push any default EID-prefixes because CARs do not create 794 meshes. They simply push aggregate EID-prefixes from mapping 795 entries. 797 Note also that at most one of the P, C, and S bits can be 798 set. If two or more are set, the other side SHOULD NOT accept the 799 connection. 801 Rsvd: Set to 0 on transmission and ignored on receipt. 803 Level: The level number used for peering. When a sibling peering 804 relationship is configured, the level numbers must be the same. 805 When there is a child-to-parent peering relationship, the parent's 806 level number MUST be greater than the child's announced level 807 number. The CARs are at level 0, and the next level (upwards) 808 could be any level greater than 0. 810 Checksum: A complement of the 1-complements sum of the LISP packet. 811 The checksum is always required for an Open message.
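The Checksum field can be illustrated with a small, non-normative Python sketch. It assumes the conventional Internet checksum procedure (16-bit big-endian words, end-around carry, odd-length input padded with a zero byte, Checksum field set to 0 while computing); this document only says "a complement of the 1-complements sum of the LISP packet", so those details and the name lisp_checksum are assumptions.

```python
def lisp_checksum(packet: bytes) -> int:
    """Return the 16-bit complement of the one's-complement sum of packet.

    ASSUMPTION: the caller has zeroed the Checksum field before calling.
    """
    if len(packet) % 2:
        packet += b"\x00"          # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(packet), 2):
        total += (packet[i] << 8) | packet[i + 1]
    while total >> 16:             # fold carries back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Arbitrary 4-byte header with Type = 4 and a zeroed Checksum field.
header = bytearray(b"\x04\x00\x00\x00")
header[2:4] = lisp_checksum(bytes(header)).to_bytes(2, "big")
# A receiver can verify by recomputing over the whole message: result is 0.
assert lisp_checksum(bytes(header)) == 0
```

The same routine would apply to the Checksum fields of the other message types if they use the same procedure.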
813 TLV Encodings: If the LISP Open message is greater than 4 bytes in 814 length, then enclosed are Type-Length-Value encodings in the 815 format of: 817 TLV Encodings 819 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 820 | Type | TLV Length | 821 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 822 | Value | 823 | . | 824 | . | 825 | . | 826 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 828 Figure 5 830 Type: 2-byte Type value for a TLV in an Open message. 832 TLV Length: 2-byte length value, in bytes, of the entire TLV 833 including the Type and TLV Length fields. A value less than 4 is 834 illegal. 836 Value: The data for the defined Type value. The format is 837 Type-specific and can be defined and documented in other 838 specifications. 840 6.2. Push-Add and Push-Delete 842 A Push-Add message is sent by an aCAR to its parent CDR(s), from a 843 sibling CDR to another sibling CDR, and from a child CDR to a parent 844 CDR. Push messages are always sent up and horizontally 845 in the LISP-CONS hierarchical topology and never sent down. 847 A Push-Delete message is sent by an aCAR to its parent CDR(s), from a 848 sibling CDR to another sibling CDR, and from a child CDR to a parent 849 CDR. Push messages are always sent up and horizontally 850 in the LISP-CONS hierarchical topology and never sent down. A Push- 851 Delete message is used to undo what a Push-Add installed. 853 Push Message Format 855 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 856 | Type | Reserved | Checksum | 857 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 858 | EID mask-len | EID-AFI | Reserved | 859 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 860 | Sequence Number ... | 861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 862 | ...
Sequence Number | 863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 | EID-prefix ... | 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 866 | Path Vector List | 867 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 869 Figure 6 871 Type: 5 is a Push-Add and 6 is a Push-Delete. 873 Reserved: Set to 0 on transmission and ignored on receipt. 875 Checksum: A complement of the 1-complements sum of the LISP packet. 876 The checksum is always required for a Push message. 878 EID mask-len: Mask length for EID prefix. 880 EID-AFI: Address family of the EID-prefix. 882 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 883 address-family. 885 Path Vector List: A list of CDRs that have accepted, stored, and 886 forwarded this message. The format is: 888 Path Vector List 890 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 891 | AFI | Locator Router-ID Address ... | 892 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 894 Figure 7 896 AFI: Address family of entry in Path Vector List. 898 Locator Router-ID Address: 4 bytes if an IPv4 address-family, 16 899 bytes if an IPv6 address-family. Note that the first entry in the 900 Path Vector List is the Originator of the Push message. 902 6.3. Map-Request Message 904 A Map-Request message is used to retrieve an EID-to-RLOC mapping 905 based on a requested EID or EID-prefix in this Request message. This 906 message can be sent over a TCP connection or be UDP encapsulated. 908 Map-Requests are originated by ITRs at LISP sites to retrieve a 909 mapping they do not have cached. A qCAR will reformat the Map-Request 910 and forward it upward along the LISP-CONS hierarchical tree topology. 911 The authoritative text for the format of this message is found in 912 [LISP-03].
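As a non-normative illustration of how a CDR might process a Map-Request it receives (the Path Vector loop check followed by a longest-match lookup, as described in Section 5.2), consider the following Python sketch. The function name forward_map_request, the returned action strings, and the representation of the EID-prefix table as a dictionary from prefix to next-hop are assumptions for illustration; peer names such as "A" and "C" are placeholders for CDR addresses.

```python
import ipaddress


def forward_map_request(my_addr, pv, eid, eid_prefix_table):
    """Decide what a CDR does with a Map-Request.

    Returns (action, next_hop, new_pv) where action is "drop",
    "no-map", or "forward" (hypothetical labels).
    """
    if my_addr in pv:
        # Our address is already in the PV: the request has looped.
        return ("drop", None, pv)

    eid_ip = ipaddress.ip_address(eid)
    best = None
    for prefix, next_hop in eid_prefix_table.items():
        net = ipaddress.ip_network(prefix)
        if net.version == eid_ip.version and eid_ip in net:
            # Keep the longest (most specific) matching prefix.
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hop)

    if best is None:
        # Cannot route the request: a No-Map goes back to the qCAR.
        return ("no-map", None, pv)

    # Append our address to the PV and forward towards the aCAR.
    return ("forward", best[1], pv + [my_addr])


table = {"10.0.0.0/8": "B", "10.1.0.0/16": "C"}
print(forward_map_request("A", ["X"], "10.1.2.3", table))
```

In this example the /16 entry wins over the /8 entry, so the request is forwarded to next-hop "C" with the CDR's own address appended to the PV.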
914 Map-Request Message Format 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 | Type | Locator Reach Bits | Checksum | 918 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 919 | Nonce ... | 920 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 921 | ... Nonce | Record count |A| Reserved | 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 923 | ITR-AFI | CAR-AFI | 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 925 | Originating ITR RLOC Address | 926 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 927 | Originating CAR EID-Prefix | 928 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 929 Rec -> | EID mask-len | EID-AFI | EID-prefix ... | 930 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 931 | Path Vector List | 932 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 934 Figure 8 936 Type: 2 938 Locator Reach Bits: These bits are set by an ITR to indicate to an 939 ETR the reachability of the Locators in the source site. Each 940 RLOC in a Map-Reply is assigned an ordinal value from 0 to n-1 941 (when there are n RLOCs in a mapping entry). The Locator Reach 942 Bits are numbered from 0 to n-1, starting from the rightmost bit of 943 the 12-bit field. When a bit is set to 1, the ITR is indicating 944 to the ETR that the RLOC associated with the bit ordinal is reachable. 945 See [LISP-03] for details on how an ITR can determine that other site 946 ITRs are reachable. 948 Checksum: A complement of the 1-complements sum of the LISP packet. 949 The checksum is always required for Map-Requests sent over TCP 950 connections. For UDP encapsulated Map-Requests, either this 951 checksum or the UDP checksum field can be used, but not 952 both: one of them must be non-zero and the other set to 0. 954 Nonce: A 6-byte random value created by the sender of the Map- 955 Request.
957 Record count: The number of records in this request message. A 958 record comprises what is labeled 'Rec' above and occurs a 959 number of times equal to Record count. 961 A: The authoritative bit. When a request has this bit 962 set, any intermediate LISP peer that has the mapping cached will 963 not return the mapping but will allow the request to travel to the 964 authoritative aCAR, that is, the one with the configured mapping. 965 This is necessary so an ITR or qCAR attached to an ITR can get the 966 most up-to-date information about a locator-set that may have 967 changed. 969 ITR-AFI: Address family of the "Originating ITR RLOC Address" field. 971 CAR-AFI: Address family of the "Originating CAR EID-prefix" field. 973 Originating ITR RLOC Address: For TCP-based Map-Requests, the qCAR 974 that peers with the ITR will fill in this address. This address 975 is the same address the CAR uses to peer with the ITR. This 976 address is copied by the replying CAR to the Map-Reply, so a 977 requesting CAR knows which ITR made the request. 979 Originating CAR EID-Prefix: For TCP-based Map-Requests, the qCAR 980 fills in this prefix so a Reply can be routed back to the 981 requesting CAR over the CONS topology. This prefix can be any 982 prefix the CAR is aggregating up to a parent CDR. 984 EID mask-len: Mask length for EID prefix. 986 EID-AFI: Address family of EID-prefix according to [RFC2434]. 988 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 989 address-family. 991 Path Vector List: Contains a list of CDRs this Request has 992 traversed. Each CDR appends its address to the message and 993 recalculates the checksum. The format is the same as the format 994 of the Push Message. If the length of the packet is greater than 995 the length needed to include the EID-records, then the PV list is 996 present. Otherwise, there is no PV list. 998 6.4. Map-Reply Message 1000 A Map-Reply message is used to return an EID-to-RLOC mapping.
This 1001 message can be sent over a TCP connection or may be UDP encapsulated. 1002 When the message is data triggered, it is sent over UDP. See 1003 [LISP-03] for details. When the message is sent in response to a 1004 received Map-Request over TCP, the reply is returned over TCP 1005 according to this LISP-CONS specification. 1007 A Map-Reply is originated by an aCAR when a request is received for 1008 an EID or EID-prefix for which it has an authoritative mapping. 1010 That is, a mapping for a site that has been allocated the EID-prefix 1011 and that has informed the CAR of the ETR Locator addresses. 1013 The authoritative text for the format of this message can be found in 1014 [LISP-03]. 1016 Map-Reply Message Format 1018 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1019 | Type | Locator Reach Bits | Checksum | 1020 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1021 | Nonce ... | 1022 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1023 | ...
Nonce | Record count | Reserved | 1024 +----> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1025 | | Record TTL | 1026 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1027 | | Locator count | EID mask-len |A| Reserved | 1028 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1029 R | ITR-AFI | EID-AFI | 1030 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1031 c | Originating ITR RLOC Address | 1032 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1033 r | EID-prefix | 1034 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1035 | /| Priority | Weight | Unused | Loc-AFI | 1036 | Loc +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1037 | \| Locator | 1038 +---> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1039 | Path Vector List | 1040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1042 Figure 9 1044 Type: 3 1046 Locator Reach Bits: These bits are set by an ITR to indicate to an 1047 ETR the reachability of the Locators in the source site. Each 1048 RLOC in a Map-Reply is assigned an ordinal value from 0 to n-1 1049 (when there are n RLOCs in a mapping entry). The Locator Reach 1050 Bits are numbered from 0 to n-1, starting from the rightmost bit of 1051 the 12-bit field. When a bit is set to 1, the ITR is indicating 1052 to the ETR that the RLOC associated with the bit ordinal is reachable. 1053 See [LISP-03] for details on how an ITR can determine that other site 1054 ITRs are reachable. 1056 Checksum: A complement of the 1-complements sum of the LISP packet. 1057 The checksum can be set to 0 and not computed on receipt when 1058 encapsulated in UDP. In this case, the UDP checksum is required 1059 to be computed. The LISP checksum is required to be computed and 1060 checked when this message is sent over a TCP connection.
1062 Nonce: A 6-byte value which was either in a Map-Request message that 1063 invoked this reply or in a data triggered LISP encapsulated packet 1064 (i.e., a Data message). 1066 Record count: The number of records in the message. The record 1067 comprises what is labeled 'Record' above. 1069 Record TTL: The time in minutes the recipient of the Map-Reply will 1070 store the mapping. If the TTL is 0, the entry should be removed 1071 from the cache immediately. If the value is 0xffffffff, the 1072 recipient can decide locally how long to store the mapping. 1074 Locator count: The number of Locator entries. A locator entry 1075 comprises what is labeled above as 'Loc'. 1077 EID mask-len: Mask length for EID prefix. 1079 A: The Authoritative bit. When this is set, the reply is from an aCAR 1080 where the mapping is configured. If any LISP-CONS peer is 1081 replying on behalf of a CAR because it has cached a mapping (to 1082 reduce lookup latency), the A bit should be set to 0. 1084 ITR-AFI: Address family of the "Originating ITR RLOC Address" field. 1086 EID-AFI: Address family of EID-prefix according to [RFC2434]. 1088 Originating ITR RLOC Address: For TCP-based Map-Replies, the aCAR 1089 copies the "Originating ITR RLOC Address" from the Map-Request to 1090 this field. This lets the qCAR know which ITR sent the 1091 request. 1093 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1094 address-family. 1096 Priority, Weight, Unused, Loc-AFI, Locator: See [LISP-03] for 1097 details. 1099 Locator: The Locator used to reach the EID-prefixes in the record. 1101 Path Vector List: Contains a list of CDRs this Request has 1102 traversed. Each CDR appends its address to the message and 1103 recalculates the checksum. The format is the same as the format 1104 of the Push Message. If the length of the packet is greater than 1105 the length needed to include the EID-records, then the PV list is 1106 present.
Otherwise, there is no PV list. 1108 6.5. No-Map Message 1110 No-Map Message Format 1112 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1113 | Type | Code | ASCII Length | Checksum | 1114 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1115 | No-Map TTL | 1116 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1117 | Message in ASCII... | 1118 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1119 | Invoking Map-Request Message | 1120 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1122 Figure 10 1124 Type: 7 1126 Code: 1, when a qCAR or CDR needs to forward a Request to a next-hop 1127 it has previously computed but the TCP connection is not up (i.e., 1128 Topology not Reachable). 1130 Code: 2, when an aCAR receives a Map-Request forwarded from a CDR 1131 and there is no mapping for the EID-prefix (i.e., Not Found). 1133 Code: 3, when a Request or a Reply was found to loop when 1134 manipulating the Path Vector list. 1136 ASCII length: The number of bytes (including the terminating null 1137 character) of the optional appended ASCII message. When set to 1138 0, there is no ASCII string present in the message. 1140 Checksum: A complement of the 1-complements sum of the LISP packet. 1141 The checksum is always required for a No-Map message. 1143 No-Map TTL: The time in minutes the recipient of the No-Map message 1144 MAY cache the mapping with an empty Locator-set. If the TTL is 0, 1145 there should be no caching of this state. If the value is 1146 0xffffffff, the recipient MAY decide locally how long to store the 1147 mapping. 1149 Message in ASCII: An ASCII-encoded string terminated by a null 1150 byte. The string can be configured on a CAR and displayed on the 1151 qCAR. 1153 The last part of the No-Map contains the entire Map-Request 1154 message which invoked the No-Map message. 1156 7.
Operational Considerations 1158 TBD: However, mention that there will be less policy than in BGP. 1159 That is, information cannot be altered: a CAR cannot add or 1160 remove locators, path-vectors can't be made to look longer, etc. 1162 Future revisions of this document will have a more thorough 1163 description of deployment scenarios, once we get some implementation 1164 and pilot deployment experience. 1166 8. LISP-CONS and Locator Reachability 1168 It is important to note that LISP-CONS is designed as a mapping 1169 database that defines EID-to-RLOC mappings, where the RLOCs are IP 1170 addresses of ETRs. It does not indicate whether the ETRs, or the paths to 1171 the ETRs, are up. 1173 In general, LISP determines reachability through either ICMP No-Map 1174 messages or LISP data-plane Locator Reach bits that are transmitted 1175 in LISP Data messages [LISP-03]. 1177 The design principle underlying LISP-CONS is to keep the mapping 1178 database service scalable. As such, the design discourages high 1179 frequency changes in mappings. 1181 9. LISP-CONS and Mobility 1183 The mapping database does not convey Foreign Agent locator addresses. 1184 This can be achieved in the data plane but will be documented in 1185 another Internet Draft. 1187 10. Open Issues 1189 o Do we need a Close Message (the dual of Open)? Otherwise EID- 1190 prefixes may not get removed until a timeout. 1192 o No mapping exists in the ITR: You have a configuration option to 1193 either 1) drop the packet, or 2) do LISP 1.5 where the packet is 1194 routed on another topology. The other option is to allow the ITR 1195 to get a push of 0.0.0.0/0 or 0::0/0 from its peering CARs (or have 1196 it configured in the ITR). 1198 o Security Section: We need to finish the evaluation of 1199 vulnerabilities. Map vulnerabilities against security mechanisms. 1200 At first blush, the real outstanding question remaining (as you 1201 note in your notes above) is transitive message security (a la dns- 1202 sec).
1204 o Security Model: Is the implied transitive trust sufficient? 1206 11. Acknowledgments 1208 Many of the ideas described in this document developed during 1209 detailed discussions with Eliot Lear, Mark Handley, and Dave Oran. 1210 Robin Whittle also made several insightful comments on earlier 1211 versions of this document. 1213 12. Security Considerations 1215 LISP-CONS is a straightforward protocol to secure. Its combination 1216 of simplicity, explicit peering, and explicit configuration provides 1217 for a well understood set of relationships between elements. Its 1218 security mechanisms are comprised of existing technologies in wide 1219 operational use today. 1221 As a hybrid push-pull protocol, LISP-CONS shares some of the security 1222 characteristics of pull (DNS) and push (BGP) protocols. Securing 1223 LISP-CONS is much simpler than either of those examples, however. 1224 Compared to DNS, the fact that messages traverse an explicit hierarchy 1225 of TCP connections, and the message make-up itself, make LISP-CONS 1226 less susceptible to denial of service and amplification attacks. 1227 Compared to BGP, LISP-CONS CDRs are not topologically bound, allowing 1228 them to be put in locations away from the vulnerable AS border 1229 (unlike eBGP speakers). 1231 12.1. Apparent LISP-CONS Vulnerabilities 1233 This section briefly lists the apparent vulnerabilities of LISP- 1234 CONS. 1236 Mapping Integrity: Can you insert bogus mappings to black-hole 1237 (create a DoS) or intercept LISP data-plane packets? 1239 CAR Availability: Can you DoS the aCAR(s) holding the mappings for a 1240 particular ETR? Without access to its 1-2 available CAR(s) an ITR 1241 has no ability to connect to the rest of the Internet. 1243 ITR Mapping/Resources: Can you force an ITR to drop legitimate 1244 mapping requests by flooding it with random destinations that it 1245 will have to query for? Seems like a problem with any pull based 1246 system (DNS has this problem).
Is this an ITR implementation 1247 issue, or is there a way we can assist ITR implementers here in 1248 the LISP-CONS spec? 1250 Path Vector Exploits for Reconnaissance: Can you learn about the 1251 LISP topology by sending legitimate mapping request messages and 1252 then observing the path-vector information? Is this information 1253 useful in attacking or subverting peer relationships? This is not a data 1254 plane but a control plane service, so this vulnerability seems unique 1255 to LISP-CONS. ITRs cannot do this, since they don't have access 1256 to the PVs (the PVs aren't sent along to the ITRs). Note that 1257 LISP has a similar data-plane reconnaissance issue. 1259 Scaling of CAR/CDR Resources: Can you flood the system with requests 1260 or replies due to the limited capacity of the control plane? TCP 1261 prevents anycasting to add capacity, so one of the issues has to 1262 be how we scale if we need to. 1264 12.2. Survey of LISP-CONS Security Mechanisms 1266 Use of Device Loopbacks: From levels 0 to 1 (or n) in the topology, 1267 these loopbacks should come from known infrastructure subnets (as 1268 do, say, BGP peers) that should allow for some isolation via Access 1269 Control Lists (ACLs) and anti-spoofing mechanisms. 1271 Explicit Peering: The devices themselves can both prioritize 1272 incoming packets as well as potentially do key checks in hardware 1273 to protect the control plane. 1275 Use of TCP to Connect Loopbacks: This makes it difficult for third 1276 parties to inject packets. 1278 Use of HMAC Protected TCP Connections: HMAC is used to verify 1279 message integrity and authenticity, making it nearly impossible 1280 for third party devices to either insert or modify messages. 1282 Message Sequence Numbers and Nonce Values in Messages: This allows 1283 devices to verify that the mapping-reply packet was in 1284 response to the mapping-request that they sent.
1286 Path Vectors: Path Vectors prevent arbitrary messages from 1287 traversing the topology, and raise the bar for spoofing/invalid 1288 Push-Delete messages. 1290 13. IANA Considerations 1292 This document creates no new requirements on IANA namespaces 1293 [RFC2434]. 1295 14. References 1297 14.1. Normative References 1299 [RFC1498] Saltzer, J., "On the Naming and Binding of Network 1300 Destinations", RFC 1498, August 1993. 1302 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1303 Requirement Levels", BCP 14, RFC 2119, March 1997. 1305 [RFC3618] Fenner, B. and D. Meyer, "Multicast Source Discovery 1306 Protocol (MSDP)", RFC 3618, October 2003. 1308 [RFC4632] Fuller, V. and T. Li, "Classless Inter-domain Routing 1309 (CIDR): The Internet Address Assignment and Aggregation 1310 Plan", BCP 122, RFC 4632, August 2006. 1312 [LISP-03] Farinacci, D., Fuller, V., Oran, D., and D. Meyer, 1313 "Locator/ID Separation Protocol (LISP)", 1314 draft-farinacci-lisp-03 (work in progress), August 2007. 1316 14.2. Informative References 1318 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1319 IANA Considerations Section in RFCs", BCP 26, RFC 2434, 1320 October 1998. 1322 [I-D.iab-raws-report] 1323 Meyer, D., "Report from the IAB Workshop on Routing and 1324 Addressing", draft-iab-raws-report-02 (work in progress), 1325 April 2007. 1327 [CHIAPPA] Chiappa, J., "Endpoints and Endpoint names: A Proposed 1328 Enhancement to the Internet Architecture", Internet 1329 Draft, http://www.chiappa.net/~jnc/tech/endpoints.txt, 1330 1999. 1332 Authors' Addresses 1334 Scott Brim 1336 Email: sbrim@cisco.com 1338 Noel Chiappa 1340 Email: jnc@mercury.lcs.mit.edu 1342 Dino Farinacci 1344 Email: dino@cisco.com 1346 Vince Fuller 1348 Email: vaf@cisco.com 1350 Darrel Lewis 1352 Email: darlewis@cisco.com 1354 David Meyer 1356 Email: dmm@cisco.com 1358 Full Copyright Statement 1360 Copyright (C) The IETF Trust (2007).
1362 This document is subject to the rights, licenses and restrictions 1363 contained in BCP 78, and except as set forth therein, the authors 1364 retain all their rights. 1366 This document and the information contained herein are provided on an 1367 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 1368 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 1369 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 1370 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 1371 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1372 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1374 Intellectual Property 1376 The IETF takes no position regarding the validity or scope of any 1377 Intellectual Property Rights or other rights that might be claimed to 1378 pertain to the implementation or use of the technology described in 1379 this document or the extent to which any license under such rights 1380 might or might not be available; nor does it represent that it has 1381 made any independent effort to identify any such rights. Information 1382 on the procedures with respect to rights in RFC documents can be 1383 found in BCP 78 and BCP 79. 1385 Copies of IPR disclosures made to the IETF Secretariat and any 1386 assurances of licenses to be made available, or the result of an 1387 attempt made to obtain a general license or permission for the use of 1388 such proprietary rights by implementers or users of this 1389 specification can be obtained from the IETF on-line IPR repository at 1390 http://www.ietf.org/ipr. 1392 The IETF invites any interested party to bring to its attention any 1393 copyrights, patents or patent applications, or other proprietary 1394 rights that may cover technology that may be required to implement 1395 this standard. Please address the information to the IETF at 1396 ietf-ipr@ietf.org. 
1398 Acknowledgment 1400 Funding for the RFC Editor function is provided by the IETF 1401 Administrative Support Activity (IASA).