GROW Working Group                                           N. Hilliard
Internet-Draft                                                      INEX
Intended status: Informational                               E. Jasinska
Expires: December 10, 2015                                    BigWave IT
                                                               R. Raszuk
                                                           Mirantis Inc.
                                                               N. Bakker
                                                Akamai Technologies B.V.
                                                            June 8, 2015


              Internet Exchange BGP Route Server Operations
            draft-ietf-grow-ix-bgp-route-server-operations-05

Abstract

   The popularity of Internet exchange points (IXPs) brings new
   challenges to interconnecting networks.  While bilateral eBGP
   sessions between exchange participants were historically the most
   common means of exchanging reachability information over an IXP, the
   overhead associated with this interconnection method causes serious
   operational and administrative scaling problems for IXP
   participants.

   Multilateral interconnection using Internet route servers can
   dramatically reduce the administrative and operational overhead
   associated with connecting to IXPs; in some cases, route servers are
   used by IXP participants as their preferred means of exchanging
   routing information.

   This document describes operational considerations for multilateral
   interconnections at IXPs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 10, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
   2.  Bilateral BGP Sessions
   3.  Multilateral Interconnection
   4.  Operational Considerations for Route Server Installations
     4.1.  Path Hiding
     4.2.  Route Server Scaling
       4.2.1.  Tackling Scaling Issues
         4.2.1.1.  View Merging and Decomposition
         4.2.1.2.  Destination Splitting
         4.2.1.3.  NEXT_HOP Resolution
     4.3.  Prefix Leakage Mitigation
     4.4.  Route Server Redundancy
     4.5.  AS_PATH Consistency Check
     4.6.  Export Routing Policies
       4.6.1.  BGP Communities
       4.6.2.  Internet Routing Registries
       4.6.3.  Client-accessible Databases
     4.7.  Layer 2 Reachability Problems
     4.8.  BGP NEXT_HOP Hijacking
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Internet exchange points (IXPs) provide IP data interconnection
   facilities for their participants, using data link layer protocols
   such as Ethernet.  The Border Gateway Protocol (BGP) [RFC4271] is
   normally used to facilitate the exchange of network reachability
   information over these media.

   As bilateral interconnection between IXP participants incurs
   operational and administrative overhead, BGP route servers
   [I-D.ietf-idr-ix-bgp-route-server] are often deployed by IXP
   operators to provide a simple and convenient means of
   interconnecting IXP participants with each other.  A route server
   redistributes BGP routes received from its BGP clients to other
   clients according to a pre-specified policy; it can be viewed as an
   eBGP analogue of an iBGP [RFC4456] route reflector.

   Route servers at IXPs require careful management, and it is
   important for route server operators to understand thoroughly both
   how they work and what their limitations are.  In this document, we
   discuss several issues of operational relevance to route server
   operators and provide recommendations to help route server operators
   provision a reliable interconnection service.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

   The phrase "BGP route" in this document should be interpreted as the
   term "Route" described in [RFC4271].

2.  Bilateral BGP Sessions

   Bilateral interconnection is a method of interconnecting routers
   using individual BGP sessions between each pair of participant
   routers on an IXP, in order to exchange reachability information.
   If an IXP participant wishes to implement an open interconnection
   policy (i.e., a policy of interconnecting with as many other IXP
   participants as possible), it is necessary for the participant to
   liaise with each of its intended interconnection partners.
   Interconnection can then be implemented bilaterally by configuring a
   BGP session on both participants' routers to exchange network
   reachability information.  If each exchange participant
   interconnects with each other participant, a full mesh of BGP
   sessions is needed, as shown in Figure 1.

            ___      ___
           /   \    /   \
        ..| AS1 |..| AS2 |..
        :  \___/____\___/  :
        :    | \    / |    :
        :    |  \  /  |    :
        : IXP|   \/   |    :
        :    |   /\   |    :
        :    |  /  \  |    :
        :   _|_/____\_|_   :
        :  /   \    /   \  :
        ..| AS3 |..| AS4 |..
           \___/    \___/

            Figure 1: Full-Mesh Interconnection at an IXP

   Figure 1 depicts an IXP platform with four connected routers,
   administered by four separate exchange participants, each with a
   locally unique autonomous system number: AS1, AS2, AS3 and AS4.  The
   lines between the routers depict BGP sessions; the dotted edge
   represents the IXP border.  Each of these four participants wishes
   to exchange traffic with all other participants; this is
   accomplished by configuring a full mesh of BGP sessions on each
   router connected to the exchange, resulting in 6 BGP sessions across
   the IXP fabric.

   The number of BGP sessions at an exchange has an upper bound of
   n*(n-1)/2, where n is the number of routers at the exchange.  As
   many exchanges have large numbers of participating networks, the
   administrative and operational overhead required to implement an
   open interconnection policy scales quadratically.
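   As a non-normative illustration of this scaling difference (the
   sketch below is not part of any referenced specification), the
   following Python fragment compares the full-mesh session count of
   Figure 1 with the single-session-per-client model discussed in
   Section 3:

```python
def bilateral_sessions(n):
    # Full mesh: one BGP session per pair of routers, i.e. n*(n-1)/2.
    return n * (n - 1) // 2

def multilateral_sessions(n):
    # Route server model: each router needs one session to the route server.
    return n

# The four-router IXP of Figure 1 needs 6 bilateral sessions; a
# 400-router IXP would need 79800, versus 400 via a route server.
print(bilateral_sessions(4))       # 6
print(bilateral_sessions(400))     # 79800
print(multilateral_sessions(400))  # 400
```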
   New participants to an IXP require significant initial resourcing in
   order to gain value from their IXP connection, while existing
   exchange participants need to commit ongoing resources in order to
   benefit from interconnecting with these new participants.

3.  Multilateral Interconnection

   Multilateral interconnection is implemented using a route server
   configured to distribute BGP routes among client routers.  The route
   server preserves the BGP NEXT_HOP attribute of all received BGP
   routes and passes them on with unchanged NEXT_HOP to its route
   server clients according to its configured routing policy, as
   described in [I-D.ietf-idr-ix-bgp-route-server].  Using this method
   of exchanging BGP routes, an IXP participant router can receive an
   aggregated list of BGP routes from all other route server clients
   over a single BGP session to the route server, instead of depending
   on BGP sessions with each other router at the exchange.  This
   reduces the overall number of BGP sessions at an Internet exchange
   from n*(n-1)/2 to n, where n is the number of routers at the
   exchange.

   Although a route server uses BGP to exchange reachability
   information with each of its clients, it does not forward traffic
   itself and is therefore not a router.

   In practical terms, this allows dense interconnection between IXP
   participants with low administrative overhead and significantly
   simpler and smaller router configurations.  In particular, new IXP
   participants benefit from immediate and extensive interconnection,
   while existing route server participants receive reachability
   information from these new participants without necessarily having
   to modify their own configurations.

            ___      ___
           /   \    /   \
        ..| AS1 |..| AS2 |..
        :  \___/    \___/  :
        :      \    /      :
        :       \  /       :
        :        \__/      :
        : IXP    /  \      :
        :       | RS |     :
        :        \____/    :
        :        /  \      :
        :       /    \     :
        :    __/      \__  :
        :  /   \    /   \  :
        ..| AS3 |..| AS4 |..
           \___/    \___/

        Figure 2: IXP-based Interconnection with Route Server

   As illustrated in Figure 2, each router on the IXP fabric requires
   only a single BGP session to the route server, from which it can
   receive reachability information for all other routers on the IXP
   which also connect to the route server.

4.  Operational Considerations for Route Server Installations

4.1.  Path Hiding

   "Path hiding" is a term used in [I-D.ietf-idr-ix-bgp-route-server]
   to describe the process whereby a route server may mask individual
   paths by applying conflicting routing policies to its Loc-RIB.  When
   this happens, route server clients receive incomplete information
   from the route server about network reachability.

   There are several approaches which may be used to mitigate the
   effect of path hiding; these are described in
   [I-D.ietf-idr-ix-bgp-route-server].  However, the only method which
   does not require explicit support from the route server client is
   for the route server itself to maintain an individual Loc-RIB for
   each client which is the subject of conflicting routing policies.

4.2.  Route Server Scaling

   While the deployment of multiple Loc-RIBs on the route server
   presents a simple way to avoid the path hiding problem noted in
   Section 4.1, this approach requires significantly more computing
   resources on the route server than where a single Loc-RIB is
   deployed for all clients.
   As the [RFC4271] BGP decision process must be applied to all
   Loc-RIBs deployed on the route server, both the CPU and memory
   requirements on the host computer scale approximately according to
   O(P * N), where P is the total number of unique paths received by
   the route server and N is the number of route server clients which
   require a unique Loc-RIB.  As this is a super-linear scaling
   relationship, large route servers may benefit from deploying
   per-client Loc-RIBs only where they are required.

   Regardless of whether any Loc-RIB optimisation technique is
   implemented, the route server's theoretical upper-bound network
   bandwidth requirements will scale according to O(P_tot * N), where
   P_tot is the total number of unique paths received by the route
   server and N is the total number of route server clients.  In the
   case where P_avg (the arithmetic mean number of unique paths
   received per route server client) remains roughly constant even as
   the number of connected clients increases, the total number of
   prefixes will equal the average number of prefixes multiplied by the
   number of clients; symbolically, P_tot = P_avg * N.  If we assume
   the worst case, in which each prefix is associated with a different
   set of BGP path attributes and so must be transmitted individually,
   the network bandwidth scaling function can be rewritten as
   O((P_avg * N) * N), i.e. O(N^2).  This quadratic upper bound on the
   network traffic requirements indicates that the route server model
   may not scale well for larger numbers of clients.

   In practice, most prefixes will be associated with a limited number
   of BGP path attribute sets, allowing more efficient transmission of
   BGP routes from the route server than the theoretical analysis
   suggests.
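   The worst-case bound above can be made concrete with a small
   non-normative calculation (the figures below are hypothetical):
   with P_avg held constant, doubling the number of clients quadruples
   the number of BGP routes the route server must transmit.

```python
def worst_case_routes_sent(p_avg, n_clients):
    # P_tot = P_avg * N unique paths; in the worst case each carries a
    # distinct attribute set and must be sent individually to each of
    # the N clients: (P_avg * N) * N, i.e. O(N^2) for constant P_avg.
    p_tot = p_avg * n_clients
    return p_tot * n_clients

small = worst_case_routes_sent(1000, 100)  # 10_000_000 routes sent
large = worst_case_routes_sent(1000, 200)  # 40_000_000 routes sent
assert large == 4 * small                  # doubling N quadruples traffic
```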
   In the analysis above, P_tot will increase monotonically according
   to the number of clients, but it has an upper limit: the size of the
   full default-free routing table of the network in which the IXP is
   located.  Observations from production route servers have shown that
   most route server clients generally avoid using custom routing
   policies, and consequently the route server may not need to deploy
   per-client Loc-RIBs.  These practical bounds reduce the theoretical
   worst-case scaling scenario to the point where route server
   deployments are manageable even on larger IXPs.

4.2.1.  Tackling Scaling Issues

   The problem of scaling route servers still presents serious
   practical challenges and requires careful attention.  Scaling
   analysis indicates problems in three key areas: the route processor
   CPU overhead associated with BGP decision process calculations, the
   memory required to handle many different BGP path entries, and the
   network bandwidth required to distribute these BGP routes from the
   route server to each route server client.

4.2.1.1.  View Merging and Decomposition

   View merging and decomposition, outlined in [RS-ARCH], describes a
   method of optimising memory and CPU requirements where multiple
   route server clients are subject to exactly the same routing
   policies.  In this situation, multiple Loc-RIB views can be merged
   into a single view.

   There are several variations of this approach.  If the route server
   operator has prior knowledge of the interconnection relationships
   between route server clients, then the operator may configure
   separate Loc-RIBs only for route server clients with unique routing
   policies.
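   One much-simplified, non-normative way to realise this grouping is
   to reduce each client's policy to a canonical, comparable key;
   clients whose policies compare equal can then share one Loc-RIB.
   The policy representation below (a set of rule strings) is purely
   illustrative; real implementations are considerably more involved.

```python
from collections import defaultdict

def merge_views(client_policies):
    """Group clients with identical routing policies so that each
    group can share a single Loc-RIB (view merging)."""
    views = defaultdict(list)
    for client, rules in client_policies.items():
        views[frozenset(rules)].append(client)  # order-independent key
    return sorted(views.values())

# AS1 and AS2 apply no custom policy and can share one view; AS3
# needs its own Loc-RIB because its policy differs.
groups = merge_views({"AS1": [], "AS2": [], "AS3": ["reject-from AS1"]})
```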
   As this approach requires prior knowledge of interconnection
   relationships, the route server operator must depend on each client
   sharing their interconnection policies, either in an internal
   provisioning database controlled by the operator or in an external
   data store such as an Internet Routing Registry database.

   Conversely, the route server implementation itself may perform
   internal view decomposition by creating virtual Loc-RIBs based on a
   single in-memory master Loc-RIB, with delta differences for each
   prefix subject to different routing policies.  This allows a more
   fine-grained and flexible approach to the problem of Loc-RIB
   scaling, at the expense of a more complex in-memory Loc-RIB
   structure.

   Whatever method of view merging and decomposition is chosen on a
   route server, pathological edge cases can be constructed which will
   scale no better than fully non-optimised per-client Loc-RIBs.
   However, as most route server clients connect to a route server in
   order to reduce overhead, rather than to implement complex
   per-client routing policies, such edge cases tend not to arise in
   practice.

4.2.1.2.  Destination Splitting

   Destination splitting, also described in [RS-ARCH], is a method by
   which route server clients connect to multiple route servers and
   send non-overlapping sets of prefixes to each route server.  As each
   route server computes the best path for its own set of prefixes, the
   quadratic scaling requirement operates on multiple smaller sets of
   prefixes.  This reduces the overall computational and memory
   requirements for managing multiple Loc-RIBs and performing the
   best-path calculation on each.

   In practice, the route server operator would need all route server
   clients to send a full set of BGP routes to each route server.
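   As a hypothetical sketch of how prefixes might be partitioned under
   destination splitting (the hash-based rule below is purely
   illustrative; any deterministic, non-overlapping assignment would
   do):

```python
import zlib

def split_destinations(prefixes, n_route_servers):
    """Assign each prefix to exactly one route server, so that the
    best-path calculation operates on smaller prefix sets."""
    partitions = [[] for _ in range(n_route_servers)]
    for prefix in prefixes:
        # Stable hash so the assignment is reproducible across runs.
        idx = zlib.crc32(prefix.encode()) % n_route_servers
        partitions[idx].append(prefix)
    return partitions

parts = split_destinations(
    ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"], 2)
# Every prefix is announced to exactly one of the two route servers.
assert sum(len(p) for p in parts) == 3
```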
   The route server operator could then selectively filter these
   prefixes for each route server, using either BGP Outbound Route
   Filtering [RFC5291] or inbound prefix filters configured on the
   client BGP sessions.

4.2.1.3.  NEXT_HOP Resolution

   As route servers are usually deployed at IXPs where all connected
   routers are on the same layer 2 broadcast domain, recursive
   resolution of the NEXT_HOP attribute is generally not required and
   can be replaced by a simple check to ensure that the NEXT_HOP value
   of each received BGP route is a network address within the IXP
   LAN's IP address range.

4.3.  Prefix Leakage Mitigation

   Prefix leakage occurs when a BGP client unintentionally distributes
   BGP routes to one or more neighboring BGP routers.  Prefix leakage
   of this form to a route server can cause serious connectivity
   problems at an IXP if each route server client is configured to
   accept all BGP routes from the route server.  Due to the potential
   for collateral damage caused by BGP route leakage, it is therefore
   RECOMMENDED that route server operators deploy prefix leakage
   mitigation measures in order to prevent unintentional prefix
   announcements, or else to limit the scale of any such leak.
   Although not foolproof, per-client inbound prefix limits can
   restrict the damage caused by prefix leakage in many cases.
   Per-client inbound prefix filtering on the route server is a more
   deterministic and usually more reliable means of preventing prefix
   leakage, but it requires more administrative resources to maintain
   properly.

   If a route server operator implements per-client inbound prefix
   filtering, then it is RECOMMENDED that the operator also build in
   mechanisms to automatically compare the Adj-RIB-In received from
   each client with the inbound prefix lists configured for those
   clients.
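   Such a comparison might be sketched as follows (non-normative
   Python; the prefix strings and names are hypothetical).  It reports
   prefixes a client announces but has not registered, and registered
   prefixes that are never announced:

```python
def compare_rib_to_filter(adj_rib_in, prefix_list):
    """Compare a client's announced prefixes (Adj-RIB-In) with the
    inbound prefix list configured for that client."""
    announced, allowed = set(adj_rib_in), set(prefix_list)
    return {
        "filtered": sorted(announced - allowed),  # announced, not registered
        "unused": sorted(allowed - announced),    # registered, not announced
    }

report = compare_rib_to_filter(
    adj_rib_in=["192.0.2.0/24", "198.51.100.0/24"],
    prefix_list=["192.0.2.0/24", "203.0.113.0/24"],
)
# 198.51.100.0/24 would be dropped by the inbound filter, while
# 203.0.113.0/24 is registered but never announced.
```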
   Naturally, it is the responsibility of each route server client to
   ensure that their stated prefix list is compatible with what they
   announce to an IXP route server.  However, many network operators do
   not carefully manage their published routing policies, and it is not
   uncommon to see significant variation between the two sets of
   prefixes.  Route server operator visibility into this discrepancy
   can provide significant advantages to both the operator and the
   client.

4.4.  Route Server Redundancy

   As the purpose of an IXP route server implementation is to provide a
   reliable reachability brokerage service, it is RECOMMENDED that
   exchange operators who implement route server systems provision
   multiple route servers on each shared Layer 2 domain.  There is no
   requirement to use the same BGP implementation or operating system
   for each route server on the IXP fabric; however, it is RECOMMENDED
   that where an operator provisions more than a single route server on
   the same shared Layer 2 domain, each route server implementation be
   configured equivalently, and in such a manner that the path
   reachability information from each system is identical.

4.5.  AS_PATH Consistency Check

   [RFC4271] requires that every BGP speaker which advertises a BGP
   route to another external BGP speaker prepend its own AS number as
   the last element of the AS_PATH sequence.  Therefore, the leftmost
   AS in an AS_PATH attribute should be equal to the autonomous system
   number of the BGP speaker which sent the BGP route.

   As [I-D.ietf-idr-ix-bgp-route-server] suggests that route servers
   should not modify the AS_PATH attribute, a consistency check on the
   AS_PATH of a BGP route received by a route server client would
   normally fail.  It is therefore RECOMMENDED that route server
   clients disable the AS_PATH consistency check towards the route
   server.

4.6.  Export Routing Policies

   Policy filtering is commonly implemented on route servers to provide
   prefix distribution control mechanisms for route server clients.  A
   route server "export" policy is a policy which affects prefixes sent
   from the route server to a route server client.  Several different
   strategies are commonly used for implementing route server export
   policies.

4.6.1.  BGP Communities

   Prefixes sent to the route server are tagged with specific standard
   [RFC1997] or extended [RFC4360] BGP community attributes, based on
   pre-defined values agreed between the operator and all clients.
   Based on these community tags, BGP routes may be propagated to all
   other clients, a subset of clients, or no clients.  This mechanism
   allows route server clients to instruct the route server to
   implement per-client export routing policies.

   As both standard and extended BGP community values are currently
   restricted to 6 octets or fewer, it is not possible for both the
   global and local administrator fields of a BGP community to fit a
   4-octet autonomous system number.  Bearing this in mind, the route
   server operator SHOULD take care to ensure that the pre-defined BGP
   community values mechanism used on their route server is compatible
   with [RFC4893] 4-octet ASNs.

4.6.2.  Internet Routing Registries

   Internet Routing Registry databases (IRRDBs) may be used by route
   server operators to construct per-client routing policies.  The
   Routing Policy Specification Language (RPSL) [RFC2622] provides a
   comprehensive grammar for describing interconnection relationships,
   and several toolsets exist which can be used to translate RPSL
   policy descriptions into route server configurations.

4.6.3.  Client-accessible Databases

   Should the route server operator not wish to use either BGP
   community tags or the public IRRDBs for implementing client export
   policies, they may implement their own routing policy database
   system for managing their clients' requirements.  A database of this
   form SHOULD allow a route server client operator to update their
   routing policy and provide a mechanism for the client to specify
   whether they wish to exchange all their prefixes with any other
   route server client.  Optionally, the implementation may allow a
   client to specify unique routing policies for individual prefixes
   over which they have routing policy control.

4.7.  Layer 2 Reachability Problems

   Layer 2 reachability problems on an IXP can cause serious
   operational problems for IXP participants which depend on route
   servers for interconnection.  Ethernet switch forwarding bugs have
   occasionally been observed to cause non-transitive reachability.
   For example, given a route server and two IXP participants, A and B,
   if the two participants can each reach the route server but cannot
   reach each other, then traffic between the participants may be
   dropped until such time as the layer 2 forwarding problem is
   resolved.  This situation does not tend to occur in bilateral
   interconnection arrangements, as the routing control path between
   the two hosts is usually (but not always, due to IXP inter-switch
   connectivity load balancing algorithms) the same as the data path
   between them.

   Problems of this form can be partially mitigated by using [RFC5881]
   Bidirectional Forwarding Detection (BFD).  However, as this is a
   bilateral protocol configured between routers, and as there is
   currently no protocol to automatically configure BFD sessions
   between route server clients, BFD does not currently provide an
   optimal means of handling the problem.
   Even if automatic BFD session configuration were possible, practical
   problems would remain.  If two IXP route server clients were
   configured to run BFD between each other and the protocol detected a
   non-transitive loss of reachability between them, each of those
   routers would internally mark the other's prefixes as unreachable
   via the BGP path announced by the route server.  As the route server
   only propagates a single best path to each client, this could cause
   either sub-optimal routing or complete connectivity loss if there
   were no alternative paths learned from other BGP sessions.

4.8.  BGP NEXT_HOP Hijacking

   Section 5.1.3(2) of [RFC4271] allows eBGP speakers to change the
   NEXT_HOP address of a received BGP route to a different Internet
   address on the same subnet.  This is the mechanism which allows
   route servers to operate on a shared layer 2 IXP network.  However,
   the mechanism can be abused by route server clients to redirect
   traffic for their prefixes to other IXP participant routers.

                ____
               /    \
              | AS99 |
               \____/
               /    \
              /      \
           __/        \__
          /   \      /   \
        ..| AS1 |..| AS2 |..
        :  \___/    \___/  :
        :      \    /      :
        :       \  /       :
        :        \__/      :
        : IXP    /  \      :
        :       | RS |     :
        :        \____/    :
        :                  :
        ....................

        Figure 3: BGP NEXT_HOP Hijacking using a Route Server

   For example, in Figure 3, if AS1 and AS2 both announce BGP routes
   for AS99 to the route server, AS1 could set the NEXT_HOP address for
   AS99's routes to be the address of AS2's router, thereby diverting
   traffic for AS99 via AS2.  This may override the routing policies of
   AS99 and AS2.

   Worse still, if the route server operator does not use inbound
   prefix filtering, AS1 could announce any arbitrary prefix to the
   route server with a NEXT_HOP address of any other IXP participant.
   This could be used as a denial-of-service mechanism against either
   the users of the address space being announced, by illicitly
   diverting their traffic, or the other IXP participant, by
   overloading their network with traffic which would not normally be
   sent there.

   This problem is not specific to route servers; NEXT_HOP hijacking
   can also be performed using bilateral BGP sessions.  However, the
   potential damage is amplified by route servers, because a single BGP
   session can be used to affect many networks simultaneously.

   Because route server clients cannot easily implement next-hop policy
   checks against route server BGP sessions, route server operators
   SHOULD check that the BGP NEXT_HOP attribute of each BGP route
   received from a route server client matches the interface address of
   that client.  If the route server receives a BGP route where these
   addresses differ, and the announcing route server client is in a
   different autonomous system to the route server client which uses
   the next-hop address, the BGP route SHOULD be dropped.

   Permitting next-hop rewriting for the same autonomous system allows
   an organisation with multiple connections into an IXP, configured
   with different IP addresses, to direct traffic off the IXP
   infrastructure through any of their connections, for traffic
   engineering or other purposes.

5.  Security Considerations

   On route server installations which do not employ path hiding
   mitigation techniques, the path hiding problem outlined in
   Section 4.1 could be used by an IXP participant to prevent the route
   server from sending any BGP routes for a particular prefix to other
   route server clients, even if there were a valid path to that
   destination via another route server client.
   If the route server operator does not implement prefix leakage
   mitigation as described in Section 4.3, it is trivial for route
   server clients to mount denial-of-service attacks against arbitrary
   Internet networks by leaking BGP routes to a route server.

   Route server installations SHOULD be secured against BGP NEXT_HOP
   hijacking, as described in Section 4.8.

6.  IANA Considerations

   There are no IANA considerations.

7.  Acknowledgments

   The authors would like to thank Chris Hall, Ryan Bickhart, Steven
   Bakker and Eduardo Ascenco Reis for their valuable input.

8.  References

8.1.  Normative References

   [I-D.ietf-idr-ix-bgp-route-server]
              Jasinska, E., Hilliard, N., Raszuk, R., and N. Bakker,
              "Internet Exchange Route Server", draft-ietf-idr-ix-bgp-
              route-server-06 (work in progress), December 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2.  Informative References

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2622]  Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens,
              D., Meyer, D., Bates, T., Karrenberg, D., and M.
              Terpstra, "Routing Policy Specification Language (RPSL)",
              RFC 2622, June 1999.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, April 2006.

   [RFC4893]  Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
              Number Space", RFC 4893, May 2007.

   [RFC5291]  Chen, E. and Y. Rekhter, "Outbound Route Filtering
              Capability for BGP-4", RFC 5291, August 2008.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June
              2010.

   [RS-ARCH]  Govindan, R., Alaettinoglu, C., Varadhan, K., and D.
              Estrin, "A Route Server Architecture for Inter-Domain
              Routing", 1995.

Authors' Addresses

   Nick Hilliard
   INEX
   4027 Kingswood Road
   Dublin 24
   IE

   Email: nick@inex.ie

   Elisa Jasinska
   BigWave IT
   ul. Skawinska 27/7
   Krakow, MP 31-066
   Poland

   Email: elisa@bigwaveit.org

   Robert Raszuk
   Mirantis Inc.
   615 National Ave. #100
   Mt View, CA  94043
   USA

   Email: robert@raszuk.net

   Niels Bakker
   Akamai Technologies B.V.
   Kingsfordweg 151
   Amsterdam  1043 GR
   NL

   Email: nbakker@akamai.com