Network Working Group                                          D. Saucez
Internet-Draft                                                     INRIA
Intended status: Experimental                             O. Bonaventure
Expires: January 2, 2013                                       UCLouvain
                                                              L. Iannone
                                                       Telecom ParisTech
                                                             C. Filsfils
                                                           Cisco Systems
                                                            July 1, 2012

                       LISP ITR Graceful Restart
                 draft-saucez-lisp-itr-graceful-00.txt

Abstract

   The Locator/ID Separation Protocol (LISP) is a map-and-encap
   mechanism that enables communication between hosts identified by
   their Endpoint IDentifiers (EIDs) over the Internet, where EIDs are
   not routable.  To do so, packets toward EIDs are encapsulated in
   packets with routing locators (RLOCs) to form dynamic tunnels.  An
   Ingress Tunnel Router (ITR) that encapsulates EID packets determines
   tunnel endpoints via mappings that associate EIDs with RLOCs.  Before
   encapsulating a packet, the ITR queries the mapping system to obtain
   the mapping associated with the EID of the packet it must
   encapsulate.  This mapping is cached by the ITR in its local
   EID-to-RLOC cache for any subsequent encapsulation for the same EID.
   LISP is scalable because the EID-to-RLOC cache of an ITR, which is
   initially empty, is populated progressively according to the traffic
   going through the ITR.  However, after an ITR is restarted, e.g., for
   maintenance reasons, its cache is empty, which means that all packets
   that are re-routed to the freshly restarted ITR will cause cache
   misses and a potentially high loss rate.  In this draft, we present
   mechanisms to reduce the negative impact on traffic caused by the
   restart of an ITR in a LISP network.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 2, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of terms
     2.1.  LISP Definition of Terms
   3.  Problem Statement
   4.  ITR Graceful Restart
   5.  Security Considerations
   6.  Conclusion
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Locator/ID Separation Protocol (LISP) [I-D.ietf-lisp] relies on
   two principles.  First, Endpoint Identifiers (EIDs) are allocated to
   hosts while Routing Locators (RLOCs) are allocated to LISP Ingress
   Tunnel Routers (ITRs) and Egress Tunnel Routers (ETRs).  The EIDs are
   not directly routable on the global Internet; only the RLOCs are.
   Second, LISP relies on mapping and encapsulation.  Hosts are located
   in sites and are served by ITRs and ETRs.  When host A.1 in site A
   needs to send a packet to host B.2 in site B, its packet is
   intercepted by the ITR that serves its site.  This ITR queries a
   mapping system to find the RLOC of the ETR that serves EID B.2.  Once
   the RLOC of the ETR serving B's site is known, the ITR encapsulates
   the packet using the encapsulation defined in [I-D.ietf-lisp] so that
   it can reach B's ETR.  B's ETR decapsulates the packet and forwards
   it to host B.2.

   Packets from a LISP site are routed to their closest ITR by means of
   the routing system (e.g., IGP).  When an ITR has just booted (either
   because it has just been added to the network or because it has been
   restarted due to maintenance), a large portion of the traffic can
   potentially be routed to the freshly started ITR.  However, in this
   case, its EID-to-RLOC cache is empty.  While with traditional routing
   such a massive redirection has a minor impact on the traffic (except
   for path stretch and latency), in LISP it can cause a high miss rate
   (i.e., no EID-to-RLOC Cache entry matching the destination EID) and
   hence packet loss.  Such a miss storm includes a burst of
   Map-Requests that may overload the mapping system.

   This memo aims at starting a discussion about ITR graceful (re)start
   in LISP networks.
   In this memo, we discuss the problem of ITR (re)start with the
   associated risk of miss storm and discuss EID-to-RLOC cache
   synchronization to provide ITR graceful restart without overwhelming
   the mapping system or causing packet loss.

2.  Definition of terms

   This section defines the main elements and terms used throughout this
   document.  The terms introduced by this document are defined below,
   while Section 2.1 provides the definitions related to the LISP
   architecture in order to ease the reading of this document.

   o  Cache Miss Storm: a period during which the cache miss rate
      observed at an ITR is significantly higher than the rate observed
      at the steady state on that ITR.

   o  Synchronization Set: the set of ITRs that are potentially on the
      path of the same traffic and whose EID-to-RLOC caches should
      therefore be synchronized in order to avoid Cache Miss Storms.

   o  ITR restart: generic term indicating an ITR that has just
      completed the bootstrap phase and is resuming normal operation.
      It can be either an ITR that has been added to the network (hence,
      actually at its first boot as part of the specific network) or an
      ITR that is actually rebooting for reasons such as maintenance or
      a temporary outage.

2.1.  LISP Definition of Terms

   LISP operates on two name spaces and introduces several new network
   elements.  This section provides high-level definitions of the LISP
   name spaces and network elements and as such, it MUST NOT be
   considered as an authoritative source.  The reference to the
   authoritative document for each term is included in every term
   description.

   o  Ingress Tunnel Router (ITR) [I-D.ietf-lisp]: An ITR is a router
      that resides in a LISP site.
      Packets sent by sources inside of the LISP site to destinations
      outside of the site are candidates for encapsulation by the ITR.
      The ITR treats the IP destination address as an EID and performs
      an EID-to-RLOC mapping lookup.  The router then prepends an
      "outer" IP header with one of its globally-routable RLOCs in the
      source address field and the result of the mapping lookup in the
      destination address field.  Note that this destination RLOC MAY be
      an intermediate, proxy device that has better knowledge of the
      EID-to-RLOC mapping closer to the destination EID.  In general, an
      ITR receives IP packets from site end-systems on one side and
      sends LISP-encapsulated IP packets toward the Internet on the
      other side.  Specifically, when a service provider prepends a LISP
      header for Traffic Engineering purposes, the router that does this
      is also regarded as an ITR.  The outer RLOC the ISP ITR uses can
      be based on the outer destination address (the originating ITR's
      supplied RLOC) or the inner destination address (the originating
      host's supplied EID).

   o  Egress Tunnel Router (ETR) [I-D.ietf-lisp]: An ETR is a router
      that accepts an IP packet where the destination address in the
      "outer" IP header is one of its own RLOCs.  The router strips the
      "outer" header and forwards the packet based on the next IP header
      found.  In general, an ETR receives LISP-encapsulated IP packets
      from the Internet on one side and sends decapsulated IP packets to
      site end-systems on the other side.  ETR functionality does not
      have to be limited to a router device.  A server host can be the
      endpoint of a LISP tunnel as well.

   o  Routing Locator (RLOC) [I-D.ietf-lisp]: A RLOC is an IPv4
      [RFC0791] or IPv6 [RFC2460] address of an egress tunnel router
      (ETR).  A RLOC is the output of an EID-to-RLOC mapping lookup.  An
      EID maps to one or more RLOCs.
      Typically, RLOCs are numbered from topologically-aggregatable
      blocks that are assigned to a site at each point to which it
      attaches to the global Internet; where the topology is defined by
      the connectivity of provider networks, RLOCs can be thought of as
      PA addresses.  Multiple RLOCs can be assigned to the same ETR
      device or to multiple ETR devices at a site.

   o  Endpoint ID (EID) [I-D.ietf-lisp]: An EID is a 32-bit (for IPv4)
      or 128-bit (for IPv6) value used in the source and destination
      address fields of the first (most inner) LISP header of a packet.
      The host obtains a destination EID the same way it obtains a
      destination address today, for example through a Domain Name
      System (DNS) [RFC1034] lookup or Session Initiation Protocol (SIP)
      [RFC3261] exchange.  The source EID is obtained via existing
      mechanisms used to set a host's "local" IP address.  An EID used
      on the public Internet must have the same properties as any other
      IP address used in that manner; this means, among other things,
      that it must be globally unique.  An EID is allocated to a host
      from an EID-prefix block associated with the site where the host
      is located.  An EID can be used by a host to refer to other hosts.
      EIDs MUST NOT be used as LISP RLOCs.  Note that EID blocks MAY be
      assigned in a hierarchical manner, independent of the network
      topology, to facilitate scaling of the mapping database.  In
      addition, an EID block assigned to a site may have site-local
      structure (subnetting) for routing within the site; this structure
      is not visible to the global routing system.  In theory, the bit
      string that represents an EID for one device can represent an RLOC
      for a different device.  As the architecture is realized, if a
      given bit string is both an RLOC and an EID, it must refer to the
      same entity in both cases.
      When used in discussions with other Locator/ID separation
      proposals, a LISP EID will be called a "LEID".  Throughout this
      document, any reference to "EID" refers to an LEID.

   o  EID-to-RLOC Cache [I-D.ietf-lisp]: The EID-to-RLOC cache is a
      short-lived, on-demand table in an ITR that stores, tracks, and is
      responsible for timing-out and otherwise validating EID-to-RLOC
      mappings.  This cache is distinct from the full "database" of
      EID-to-RLOC mappings; it is dynamic, local to the ITR(s), and
      relatively small, while the database is distributed, relatively
      static, and much more global in scope.

   o  EID-to-RLOC Database [I-D.ietf-lisp]: The EID-to-RLOC database is
      a global distributed database that contains all known EID-prefix
      to RLOC mappings.  Each potential ETR typically contains a small
      piece of the database: the EID-to-RLOC mappings for the EID
      prefixes "behind" the router.  These map to one of the router's
      own, globally-visible, IP addresses.  The same database mapping
      entries MUST be configured on all ETRs for a given site.  In a
      steady state the EID-prefixes for the site and the locator-set for
      each EID-prefix MUST be the same on all ETRs.  Procedures to
      enforce and/or verify this are outside the scope of this document.
      Note that there MAY be transient conditions when the EID-prefix
      for the site and locator-set for each EID-prefix may not be the
      same on all ETRs.  This has no negative implications since a
      partial set of locators can be used.

3.  Problem Statement

   LISP is a map-and-encap mechanism where an ITR dynamically learns a
   mapping when it receives a packet for a destination EID toward which
   it has not encapsulated traffic before.  When such a packet is
   received, a cache miss occurs and the ITR sends a Map-Request to the
   mapping system to retrieve the mapping that corresponds to the
   destination that caused the cache miss.
   The ITR then caches the mapping for any subsequent packet toward the
   same destination.  LISP [I-D.ietf-lisp] does not specify how a packet
   that causes a cache miss must be handled.  However, to the best of
   our knowledge, current implementations drop packets that cause a
   cache miss.  The impact of a miss is thus two-fold.  On the one hand,
   misses imply packet losses and hence performance issues.  On the
   other hand, due to the consequent Map-Requests, cache misses put load
   on the mapping system.

   When an ITR restarts, its EID-to-RLOC cache is initially empty and
   grows progressively with the traffic.  However, because mappings have
   a limited lifetime, the EID-to-RLOC cache size converges to a stable
   value and misses are always expected to be observed.  As shown in
   [Networking12], at the steady state, networks experience a rather
   stable, and limited, miss rate.  However, when an ITR is restarted,
   e.g., for a maintenance operation, a cache miss storm can be
   observed.  A cache miss storm is a phenomenon during which the miss
   rate is significantly higher than the miss rate normally observed in
   the network.  A miss storm has two severe side effects: first, it
   abruptly increases the load on the mapping system, and second, many
   packets are dropped, which causes performance issues.  When an ITR is
   restarted, two cache miss storms can actually be observed.  The first
   occurs when the ITR is stopped (or fails) and the second when the ITR
   is again available for encapsulation.  The first miss storm is due to
   the fact that all the traffic is suddenly redirected to the other
   ITRs in the network, which might not have the mappings for all the
   EIDs of ongoing communications.  The second miss storm can be
   observed when the ITR is restarted, because it might have to
   encapsulate all the traffic redirected to it.
   Indeed, when the ITR is freshly restarted, its cache is empty,
   meaning that every packet will cause a miss at that particular time.

   Cache misses are normal in a LISP network.  However, these misses
   normally happen only when the first packet of the first flow toward
   an EID is received by an ITR, which has no significant impact on the
   traffic at steady state in the network.  On the contrary, when an ITR
   restarts, cache misses happen on ongoing, potentially high
   throughput, flows for which a high loss rate is not acceptable.  For
   this particular reason, techniques must be applied to avoid miss
   storms upon ITR restarts.

   In this memo, we open the discussion on techniques that can be used
   to avoid cache miss storms in the case of a planned ITR restart.  In
   other words, we discuss how to achieve ITR graceful restart.

4.  ITR Graceful Restart

   The addition of an ITR causes traffic to be redirected to the freshly
   started ITR and hence risks causing a miss storm.  Indeed, the cache
   of an ITR is empty when it starts, so every packet received
   potentially causes a miss.  We can identify three techniques to
   protect the network from miss storms when an ITR is added (or
   restarted) in the network.  All the ITRs that are potentially used by
   the same node in the network are grouped in synchronization sets.

   o  Non-volatile mapping storage: when an ITR has to be stopped, its
      EID-to-RLOC Cache is stored on a non-volatile medium (e.g., a hard
      drive) so that when it is restarted, it can load an EID-to-RLOC
      cache equivalent to the cache it had before it restarted.

   o  ITR deflection: when a miss occurs at an ITR while it is starting
      up, the ITR deflects the packet that caused the miss to an ITR in
      its synchronization set and, in parallel, sends a Map-Request for
      the EID that caused the miss.

   o  ITR cache synchronization: upon startup, the ITR synchronizes its
      cache with the other ITRs in its synchronization set.  The ITR is
      marked as available only after the cache is synchronized.

   Non-volatile storage offers the advantage of being transparent to the
   network and is suited to short unavailability periods (e.g., the ITR
   reboots after an upgrade).  However, this technique is not suited to
   long unavailability periods, where most of the entries might be
   outdated and new prefixes unknown, or to the case where an ITR is
   added to the network for the first time.  This technique is thus
   recommended only for networks with low mapping cache dynamics.

   Traffic deflection to other ITRs upon misses raises several issues.
   On the one hand, the ITR that is restarting must determine the ITR to
   which the packet must be deflected.  On the other hand, packets must
   be marked as deflected in order to avoid loops.  In addition, the ITR
   must determine its graceful restart period such that it stops
   deflecting traffic once at steady state.  The deflection from one ITR
   to another can be done directly in LISP: the ITR that has just
   started LISP-encapsulates the packet and forwards it to another ITR.
   This last ITR must then also run the ETR functionality to decapsulate
   the packet.

   ITR cache synchronization is the technique best suited to graceful
   restart.  When the ITR starts, it sends requests to an ITR in its
   synchronization set (or its MR) to obtain the full cache.  When the
   synchronization is finished, the ITR advertises itself as an ITR in
   the network, so that it receives traffic to encapsulate only once its
   cache is synchronized.

5.  Security Considerations

   Security considerations have to be written according to the technique
   finally chosen for ITR graceful restart.
   However, as a general security recommendation, we can say that the
   mappings must be authenticated in order to avoid relay attacks or
   denial of service.  In any case, ITR graceful restart should not
   introduce any new threat to the core LISP mechanism.

6.  Conclusion

   In this memo, we highlighted the implications of the addition or the
   restart of an ITR in a LISP network.  When an ITR is added to a LISP
   network, its EID-to-RLOC Cache is initially empty.  Therefore, when
   ongoing flows are routed to the freshly started ITR, their packets
   can cause a miss storm, which results in packet drops and mapping
   system overload.  To tackle this issue, we propose three different
   techniques to reduce the impact of a planned ITR restart.

7.  Acknowledgments

   The authors would like to acknowledge Dino Farinacci, Vince Fuller,
   Darrel Lewis, Fabio Maino, and Simon van der Linden.

8.  References

8.1.  Normative References

   [I-D.ietf-lisp]
              Farinacci, D., Fuller, V., Meyer, D., and D. Lewis,
              "Locator/ID Separation Protocol (LISP)",
              draft-ietf-lisp-22 (work in progress), February 2012.

8.2.  Informative References

   [Networking12]
              Saucez, D., Kim, J., Iannone, L., Bonaventure, O., and C.
              Filsfils, "A local Approach to Fast Failure Recovery of
              LISP Ingress Tunnel Routers", The 11th International
              Conference on Networking (Networking'12), May 2012.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.
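
Appendix A.  Illustrative Sketch of ITR Cache Synchronization

   The ITR cache synchronization technique discussed in Section 4 can be
   sketched as follows.  This is a minimal, non-normative illustration
   and not a proposed implementation: the names Mapping, ITR, and
   synchronize_from, the explicit "now" argument, and the rule that the
   copy with the later expiry wins are all assumptions made for the
   example.  This memo does not define a synchronization message format
   or a conflict rule.  The sketch only shows the ordering that matters:
   the restarted ITR copies the unexpired mappings of its
   synchronization set first and is marked as available afterwards.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical EID-prefix to RLOC-set mapping with an absolute expiry."""
    eid_prefix: object   # ipaddress.IPv4Network or IPv6Network
    rlocs: tuple         # RLOC addresses as strings
    expires_at: float    # seconds, on the same clock as the `now` arguments


class ITR:
    """Toy model of an ITR: an EID-to-RLOC cache plus an availability flag."""

    def __init__(self, name):
        self.name = name
        self.cache = {}          # eid_prefix -> Mapping
        self.available = False   # assumption: not yet advertised to the site
        self.misses = 0          # number of lookups that found no mapping

    def install(self, mapping):
        self.cache[mapping.eid_prefix] = mapping

    def lookup(self, eid, now):
        """Longest-prefix match over unexpired entries; count a miss if none."""
        addr = ipaddress.ip_address(eid)
        best = None
        for prefix, mapping in self.cache.items():
            if mapping.expires_at > now and addr in prefix:
                if best is None or prefix.prefixlen > best.eid_prefix.prefixlen:
                    best = mapping
        if best is None:
            self.misses += 1
        return best

    def synchronize_from(self, peers, now):
        """Copy unexpired mappings from the synchronization set, then go live."""
        for peer in peers:
            for mapping in peer.cache.values():
                if mapping.expires_at <= now:
                    continue  # never import stale entries
                current = self.cache.get(mapping.eid_prefix)
                if current is None or current.expires_at < mapping.expires_at:
                    self.cache[mapping.eid_prefix] = mapping
        # Only now would the ITR advertise itself to attract traffic.
        self.available = True
```

   In this model, a freshly created ITR misses on every lookup, whereas
   after synchronize_from() it misses only on destinations that no
   member of its synchronization set had cached or whose mappings had
   already expired.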

Authors' Addresses

   Damien Saucez
   INRIA
   2004 route des Lucioles BP 93
   Sophia Antipolis Cedex,   06902
   France

   Email: damien.saucez@inria.fr

   Olivier Bonaventure
   UCLouvain
   Universite catholique de Louvain, Place Sainte Barbe 2
   Louvain-la-Neuve,   1348
   Belgium

   Email: olivier.bonaventure@uclouvain.be
   URI:   http://inl.info.ucl.ac.be

   Luigi Iannone
   Telecom ParisTech
   23, Avenue d'Italie
   75013 Paris
   France

   Email: luigi.iannone@telecom-paristech.fr

   Clarence Filsfils
   Cisco Systems
   Brussels,   1000
   Belgium

   Email: cf@cisco.com