Global Routing Operations                                      K. Sriram
Internet-Draft                                             D. Montgomery
Intended status: Informational                                   US NIST
Expires: November 6, 2016                                   D. McPherson
                                                            E. Osterweil
                                                          Verisign, Inc.
                                                              B. Dickson
                                                             May 5, 2016

        Problem Definition and Classification of BGP Route Leaks
            draft-ietf-grow-route-leak-problem-definition-06

Abstract

   A systemic vulnerability of the Border Gateway Protocol routing
   system, known as "route leaks", has received significant attention in
   recent years.  Frequent incidents that result in significant
   disruptions to Internet routing are labeled "route leaks", but to
   date a common definition of the term has been lacking.
   This document provides a working definition of route leaks, keeping
   in mind the real occurrences that have received significant
   attention.  Further, this document attempts to enumerate (though not
   exhaustively) different types of route leaks based on observed events
   on the Internet.  The aim is to provide a taxonomy that covers
   several forms of route leaks that have been observed and are of
   concern to the Internet user community as well as the network
   operator community.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 6, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Working Definition of Route Leaks
   3.  Classification of Route Leaks Based on Documented Events
     3.1.  Type 1: Hairpin Turn with Full Prefix
     3.2.  Type 2: Lateral ISP-ISP-ISP Leak
     3.3.  Type 3: Leak of Transit-Provider Prefixes to Peer
     3.4.  Type 4: Leak of Peer Prefixes to Transit Provider
     3.5.  Type 5: Prefix Re-Origination with Data Path to
           Legitimate Origin
     3.6.  Type 6: Accidental Leak of Internal Prefixes and More
           Specific Prefixes
   4.  Additional Comments about the Classification
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  Informative References
   Authors' Addresses

1.  Introduction

   Frequent incidents [Huston2012][Cowie2013][Toonk2015-A][Toonk2015-B]
   [Cowie2010][Madory][Zmijewski][Paseka][LRL][Khare] that result in
   significant disruptions to Internet routing are commonly called
   "route leaks".  Examination of the details of some of these incidents
   reveals that they vary in their form and technical details.  In order
   to pursue solutions to "the route leak problem" it is important to
   first provide a clear, technical definition of the problem and
   enumerate its most common forms.  Section 2 provides a working
   definition of route leaks, keeping in view many recent incidents that
   have received significant attention.  Section 3 attempts to enumerate
   (though not exhaustively) different types of route leaks based on
   observed events on the Internet.
   Further, Section 3 provides a taxonomy that covers several forms of
   route leaks that have been observed and are of concern to the
   Internet user community as well as the network operator community.
   This document builds on and extends earlier work in the IETF
   [draft-dickson-sidr-route-leak-def]
   [draft-dickson-sidr-route-leak-reqts].

2.  Working Definition of Route Leaks

   A proposed working definition of a route leak is as follows:

   A "route leak" is the propagation of routing announcement(s) beyond
   their intended scope.  That is, an AS's announcement of a learned BGP
   route to another AS is in violation of the intended policies of the
   receiver, the sender and/or one of the ASes along the preceding AS
   path.  The intended scope is usually defined by a set of local
   redistribution/filtering policies distributed among the ASes
   involved.  Often, these intended policies are defined in terms of the
   pair-wise peering business relationship between ASes (e.g., customer,
   transit provider, peer).  (For literature related to AS relationships
   and routing policies, see [Gao] [Luckie] [Gill].  For measurements of
   valley-free violations in Internet routing, see [Anwar] [Giotsas]
   [Wijchers].)

   The result of a route leak can be redirection of traffic through an
   unintended path, which may enable eavesdropping or traffic analysis,
   and may or may not result in an overload or black-hole.  Route leaks
   can be accidental or malicious, but most often arise from accidental
   misconfigurations.

   The above definition is not intended to be all-encompassing.  Our aim
   here is to have a working definition that fits enough observed
   incidents so that the IETF community has a basis for developing
   solutions for route leak detection and mitigation.

3.  Classification of Route Leaks Based on Documented Events

   As illustrated in Figure 1, a common form of route leak occurs when a
   multi-homed customer AS (such as AS3 in Figure 1) learns a prefix
   update from one transit provider (ISP1) and leaks the update to
   another transit provider (ISP2) in violation of intended routing
   policies.  Further, the second transit provider does not detect the
   leak and propagates the leaked update to its customers, peers, and
   transit ISPs.

                                    /\              /\
                                     \ route-leak(P)/
                                      \ propagated /
                                       \          /
              +------------+    peer    +------------+
        ______| ISP1 (AS1) |----------->|  ISP2 (AS2)|---------->
       /       ------------+  prefix(P) +------------+ route-leak(P)
      | prefix |          \   update      /\        \  propagated
       \  (P)  /           \              /          \
        -------   prefix(P) \            /            \
                   update    \          /              \
                              \        /route-leak(P)  \/
                              \/      /
                           +---------------+
                           | customer(AS3) |
                           +---------------+

        Figure 1: Illustration of the basic notion of a route leak.

   This document proposes the following taxonomy to cover several types
   of observed route leaks, while acknowledging that the list is not
   meant to be exhaustive.  In what follows, the AS that announces a
   route that is in violation of the intended policies is referred to as
   the "offending AS".

3.1.  Type 1: Hairpin Turn with Full Prefix

   Description: A multi-homed AS learns a route from one upstream ISP
   and simply propagates it to another upstream ISP (the turn
   essentially resembling a hairpin).  Neither the prefix nor the AS
   path in the update is altered.  This is similar to a straightforward
   path-poisoning attack [Kapela-Pilosov], but with full prefix.  It
   should be noted that leaks of this type are often accidental (i.e.
   not malicious).  The update basically makes a hairpin turn at the
   offending multi-homed AS.  The leak often succeeds (i.e. the leaked
   update is accepted and propagated) because the second ISP prefers a
   customer announcement over a peer announcement of the same prefix.
   Data packets would reach the legitimate destination albeit via the
   offending AS, unless they are dropped at the offending AS due to its
   inability to handle the resulting large volumes of traffic.

   o  Example incidents: Examples of Type 1 route-leak incidents are (1)
      the Dodo-Telstra incident in March 2012 [Huston2012], (2) the
      VolumeDrive-Atrato incident in September 2014 [Madory], and (3)
      the massive Telekom Malaysia route leak of about 179,000 prefixes,
      which Level3 in turn accepted and propagated [Toonk2015-B].

3.2.  Type 2: Lateral ISP-ISP-ISP Leak

   Description: The term "lateral" here is synonymous with "non-transit"
   or "peer-to-peer".  This type of route leak typically occurs when,
   for example, three sequential ISP peers (e.g. ISP-A, ISP-B, and
   ISP-C) are involved, and ISP-B receives a route from ISP-A and in
   turn leaks it to ISP-C.  The typical routing policy between laterally
   (i.e. non-transit) peering ISPs is that they should only propagate to
   each other their respective customer prefixes.

   o  Example incidents: In [Mauch-nanog][Mauch], route leaks of this
      type are reported by monitoring updates in the global BGP system
      and finding three or more very large ISP ASNs in a sequence in a
      BGP update's AS path.  [Mauch] observes that its detection
      algorithm detects these anomalies, and potentially route leaks,
      because very large ISPs do not in general buy transit services
      from each other.  However, it also notes that there are exceptions
      when one very large ISP does indeed buy transit from another very
      large ISP, and accordingly exceptions are made in its detection
      algorithm for known cases.

3.3.  Type 3: Leak of Transit-Provider Prefixes to Peer

   Description: This type of route leak occurs when an offending AS
   leaks routes learned from its transit provider to a lateral (i.e.
   non-transit) peer.

   o  Example incidents: The incidents reported in [Mauch] include
      Type 3 leaks.

3.4.  Type 4: Leak of Peer Prefixes to Transit Provider

   Description: This type of route leak occurs when an offending AS
   leaks routes learned from a lateral (i.e. non-transit) peer to its
   (the AS's) own transit provider.  These leaked routes typically
   originate from the customer cone of the lateral peer.

   o  Example incidents: Examples of Type 4 route-leak incidents are (1)
      the Axcelx-Hibernia route leak of Amazon Web Services (AWS)
      prefixes causing disruption of AWS and a variety of services that
      run on AWS [Kephart], (2) the Hathway-Airtel route leak of 336
      Google prefixes causing widespread interruption of Google services
      in Europe and Asia [Toonk2015-A], and (3) the Moratel-PCCW route
      leak of Google prefixes causing Google's services to go offline
      [Paseka].  In addition, some of the example incidents cited for
      Type 1 route leaks above also include Type 4 route leaks.  For
      instance, in the Dodo-Telstra incident [Huston2012], the leaked
      routes from Dodo to Telstra included routes that Dodo learned from
      its transit providers as well as lateral peers.

3.5.  Type 5: Prefix Re-Origination with Data Path to Legitimate Origin

   Description: A multi-homed AS learns a route from one upstream ISP
   and announces the prefix to another upstream ISP as if the AS itself
   were originating it (i.e. strips the received AS path, and
   re-originates the prefix).  This can be called re-origination or
   mis-origination.  However, somehow a reverse path to the legitimate
   origination AS may be present and data packets reach the legitimate
   destination albeit via the offending AS.
   (Note: The presence of a reverse path here is not attributable to the
   use of a path-poisoning trick by the offending AS.)  But sometimes
   the reverse path may not be present, and data packets destined for
   the leaked prefix may be simply discarded at the offending AS.

   o  Example incidents: Examples of Type 5 route leaks include (1) the
      China Telecom incident in April 2010 [Hiran][Cowie2010][Labovitz],
      (2) the Belarusian GlobalOneBel route leak incidents in
      February-March 2013 and May 2013 [Cowie2013], (3) the Icelandic
      Opin Kerfi-Simmin route leak incidents in July-August 2013
      [Cowie2013], and (4) the Indosat route leak incident in April 2014
      [Zmijewski].  The reverse paths (i.e. data paths from the
      offending AS to the legitimate destinations) were present in
      incidents #1, #2 and #3 cited above, but not in incident #4.  In
      incident #4, the misrouted data packets were dropped at Indosat's
      AS.

3.6.  Type 6: Accidental Leak of Internal Prefixes and More Specific
      Prefixes

   Description: An offending AS simply leaks its internal prefixes to
   one or more of its transit-provider ASes and/or ISP peers.  The
   leaked internal prefixes are often more specific prefixes subsumed by
   an already announced less specific prefix.  The more specific
   prefixes were not intended to be routed in eBGP.  Further, the AS
   receiving those leaks fails to filter them.  Typically, these leaked
   announcements are due to some transient failures within the AS; they
   are short-lived and usually withdrawn quickly following the
   announcements.  However, these more specific prefixes may momentarily
   cause the routes to be preferred over other aggregate (i.e. less
   specific) route announcements, thus redirecting traffic from its
   normal best path.

   o  Example incidents: Leaks of internal routes occur frequently (e.g.
      multiple times in a week), and the number of prefixes leaked
      ranges from hundreds to thousands per incident.  One highly
      conspicuous and widely disruptive leak of internal routes happened
      in August 2014 when AS701 and AS705 leaked about 22,000 more
      specifics of already announced aggregates [Huston2014][Toonk2014].

4.  Additional Comments about the Classification

   It is worth noting that Types 1 through 4 are similar in that a route
   is leaked in violation of policy in each case, but what varies is the
   context of the leaked-route source AS and destination AS roles.

   A Type 5 route leak (i.e. prefix mis-origination with data path to
   legitimate origin) can also happen in conjunction with the AS
   relationship contexts in Types 2, 3, and 4.  While these
   possibilities are acknowledged, simply enumerating more types to
   consider all such special cases does not add value as far as solution
   development for route leaks is concerned.  Hence, the special cases
   mentioned here are not included in enumerating route leak types.

5.  Security Considerations

   No security considerations apply since this is a problem definition
   document.

6.  IANA Considerations

   This document does not require an action from IANA.

7.  Acknowledgements

   The authors wish to thank Jared Mauch, Jeff Haas, Warren Kumari,
   Amogh Dhamdhere, Jakob Heitz, Geoff Huston, Randy Bush, Job Snijders,
   Ruediger Volk, Andrei Robachevsky, Charles van Niman, Chris Morrow,
   and Sandy Murphy for comments, suggestions, and critique.  The
   authors are also thankful to Padma Krishnaswamy, Oliver Borchert, and
   Okhee Kim for their comments and review.

8.  Informative References

   [Anwar]    Anwar, R., Niaz, H., Choffnes, D., Cunha, I., Gill, P.,
              and N. Katz-Bassett, "Investigating Interdomain Routing
              Policies in the Wild", ACM Internet Measurement
              Conference (IMC), October 2015.

   [Cowie2010]
              Cowie, J., "China's 18 Minute Mystery", Dyn
              Research/Renesys Blog, November 2010.

   [Cowie2013]
              Cowie, J., "The New Threat: Targeted Internet Traffic
              Misdirection", Dyn Research/Renesys Blog, November 2013.

   [draft-dickson-sidr-route-leak-def]
              Dickson, B., "Route Leaks -- Definitions", IETF Internet
              Draft (expired), October 2012.

   [draft-dickson-sidr-route-leak-reqts]
              Dickson, B., "Route Leaks -- Requirements for Detection
              and Prevention thereof", IETF Internet Draft (expired),
              March 2012.

   [Gao]      Gao, L. and J. Rexford, "Stable Internet routing without
              global coordination", IEEE/ACM Transactions on
              Networking, December 2001.

   [Gill]     Gill, P., Schapira, M., and S. Goldberg, "A Survey of
              Interdomain Routing Policies", ACM SIGCOMM Computer
              Communication Review, January 2014.

   [Giotsas]  Giotsas, V. and S. Zhou, "Valley-free violation in
              Internet routing - Analysis based on BGP Community data",
              IEEE ICC 2012, June 2012.

   [Hiran]    Hiran, R., Carlsson, N., and P. Gill, "Characterizing
              Large-scale Routing Anomalies: A Case Study of the China
              Telecom Incident", PAM 2013, March 2013.

   [Huston2012]
              Huston, G., "Leaking Routes", March 2012.

   [Huston2014]
              Huston, G., "What's so special about 512?", September
              2014.

   [Kapela-Pilosov]
              Pilosov, A. and T. Kapela, "Stealing the Internet: An
              Internet-Scale Man in the Middle Attack", DEFCON-16 Las
              Vegas, NV, USA, August 2008.

   [Kephart]  Kephart, N., "Route Leak Causes Amazon and AWS Outage",
              ThousandEyes Blog, June 2015.

   [Khare]    Khare, V., Ju, Q., and B. Zhang, "Concurrent Prefix
              Hijacks: Occurrence and Impacts", IMC 2012, Boston, MA,
              November 2012.

   [Labovitz]
              Labovitz, C., "Additional Discussion of the April China
              BGP Hijack Incident", Arbor Networks IT Security Blog,
              November 2010.

   [LRL]      Khare, V., Ju, Q., and B. Zhang, "Large Route Leaks",
              Project web page, 2012.

   [Luckie]   Luckie, M., Huffaker, B., Dhamdhere, A., Giotsas, V., and
              kc. claffy, "AS Relationships, Customer Cones, and
              Validation", IMC 2013, October 2013.

   [Madory]   Madory, D., "Why Far-Flung Parts of the Internet Broke
              Today", Dyn Research/Renesys Blog, September 2014.

   [Mauch]    Mauch, J., "BGP Routing Leak Detection System", Project
              web page, 2014.

   [Mauch-nanog]
              Mauch, J., "Detecting Routing Leaks by Counting",
              NANOG-41 Albuquerque, NM, USA, October 2007.

   [Paseka]   Paseka, T., "Why Google Went Offline Today and a Bit about
              How the Internet Works", CloudFlare Blog, November 2012.

   [Toonk2014]
              Toonk, A., "What caused today's Internet hiccup", August
              2014.

   [Toonk2015-A]
              Toonk, A., "What caused the Google service interruption",
              March 2015.

   [Toonk2015-B]
              Toonk, A., "Massive route leak causes Internet slowdown",
              June 2015.

   [Wijchers]
              Wijchers, B. and B. Overeinder, "Quantitative Analysis of
              BGP Route Leaks", RIPE-69, November 2014.

   [Zmijewski]
              Zmijewski, E., "Indonesia Hijacks the World", Dyn
              Research/Renesys Blog, April 2014.

Authors' Addresses

   Kotikalapudi Sriram
   US NIST

   Email: ksriram@nist.gov

   Doug Montgomery
   US NIST

   Email: dougm@nist.gov

   Danny McPherson
   Verisign, Inc.

   Email: dmcpherson@verisign.com

   Eric Osterweil
   Verisign, Inc.

   Email: eosterweil@verisign.com

   Brian Dickson

   Email: brian.peter.dickson@gmail.com
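
   As an informal illustration (not part of the taxonomy itself), the
   AS relationship contexts that distinguish Types 1 through 4 in
   Section 3 can be sketched as a small lookup.  The names Rel,
   LEAK_TYPES, and classify_leak are hypothetical and introduced here
   only for illustration.  The sketch assumes that the role of each
   neighbor AS (customer, lateral peer, or transit provider) is already
   known, and it assumes the conventional export policy under which
   routes learned from customers may be announced to any neighbor and
   any route may be announced to customers.  It does not check the
   additional Type 1 condition that the prefix and AS path are
   unaltered, and it does not cover Types 5 and 6, which depend on
   re-origination and prefix specificity rather than on neighbor roles.

```python
from enum import Enum
from typing import Optional


class Rel(Enum):
    """Role of a neighbor AS, as seen from the AS being examined."""
    CUSTOMER = "customer"
    PEER = "peer"          # lateral (non-transit) peer
    PROVIDER = "provider"  # transit provider


# (learned_from, announced_to) -> leak type, following Sections 3.1-3.4.
# Pairs that are absent are treated as policy-compliant, which is an
# assumption of this sketch (customer-learned routes go to anyone, and
# anything may be sent to customers).
LEAK_TYPES = {
    (Rel.PROVIDER, Rel.PROVIDER): 1,  # hairpin turn
    (Rel.PEER, Rel.PEER): 2,          # lateral ISP-ISP-ISP leak
    (Rel.PROVIDER, Rel.PEER): 3,      # transit-provider routes to peer
    (Rel.PEER, Rel.PROVIDER): 4,      # peer routes to transit provider
}


def classify_leak(learned_from: Rel, announced_to: Rel) -> Optional[int]:
    """Return the leak type (1-4), or None if no leak is indicated."""
    return LEAK_TYPES.get((learned_from, announced_to))


if __name__ == "__main__":
    # AS3 in Figure 1: learned from ISP1 (provider), sent to ISP2 (provider).
    print(classify_leak(Rel.PROVIDER, Rel.PROVIDER))
    # A customer route announced to a transit provider is normal export.
    print(classify_leak(Rel.CUSTOMER, Rel.PROVIDER))
```

   A table keyed on the (learned-from, announced-to) pair mirrors the
   observation in Section 4 that Types 1 through 4 differ only in the
   roles of the source and destination ASes of the leaked route.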