BESS Working Group                                              J. Drake
Internet-Draft                                                 A. Farrel
Intended status: Standards Track                                E. Rosen
Expires: November 3, 2018                               Juniper Networks
                                                                K. Patel
                                                            Arrcus, Inc.
                                                                L. Jalil
                                                                 Verizon
                                                             May 2, 2018

   Gateway Auto-Discovery and Route Advertisement for Segment Routing
                     Enabled Domain Interconnection
                 draft-ietf-bess-datacenter-gateway-01

Abstract

   Data centers have become critical components of the infrastructure
   used by network operators to provide services to their customers.
   Data centers are attached to the Internet or a backbone network by
   gateway routers. One data center typically has more than one gateway
   for commercial, load balancing, and resiliency reasons.

   Segment routing is a popular protocol mechanism for operating within
   a data center, and also for steering traffic that flows between two
   data center sites. For one data center site to load balance the
   traffic it sends to another data center site, it needs to know the
   complete set of gateway routers at the remote data center, the points
   of connection from those gateways to the backbone network, and the
   connectivity across the backbone network.

   Segment routing may also be operated in other domains, such as access
   networks. Those domains also need to be connected across backbone
   networks through gateways.

   This document defines a mechanism using the BGP Tunnel Encapsulation
   attribute to allow each gateway router to advertise the routes to the
   prefixes in the segment routing domains to which it provides access,
   and also to advertise on behalf of each other gateway to the same
   segment routing domain.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 3, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  SR Domain Gateway Auto-Discovery
   3.  Relationship to BGP Link State and Egress Peer Engineering
   4.  Advertising an SR Domain Route Externally
   5.  Encapsulation
   6.  IANA Considerations
   7.  Security Considerations
   8.  Manageability Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Data centers (DCs) have become critical components of the
   infrastructure used by network operators to provide services to their
   customers. DCs are attached to the Internet or a backbone network by
   gateway routers (GWs). One DC typically has more than one GW for
   various reasons including commercial preferences, load balancing, and
   resiliency against connection or device failure.

   Segment routing (SR) [I-D.ietf-spring-segment-routing] is a popular
   protocol mechanism for operating within a DC, and also for steering
   traffic that flows between two DC sites. In order for an ingress DC
   that uses SR to load balance the flows it sends to an egress DC, it
   needs to know the complete set of entry nodes (i.e., GWs) for that
   egress DC from the backbone network connecting the two DCs. Note
   that it is assumed that the connected set of DCs and the backbone
   network connecting them are part of the same SR BGP Link State (LS)
   instance ([RFC7752] and [I-D.ietf-idr-bgpls-segment-routing-epe]) so
   that traffic engineering using SR may be used for these flows.

   Segment routing may also be operated in other domains, such as access
   networks. Those domains also need to be connected across backbone
   networks through gateways.

   Suppose that there are two gateways, GW1 and GW2, as shown in
   Figure 1, for a given egress segment routing domain, and that each
   advertises a route to prefix X, which is located within the egress
   segment routing domain, with each setting itself as next hop. One
   might think that the GWs for X could be inferred from the routes'
   next hop fields, but typically it is not the case that both routes
   get distributed across the backbone: rather only the best route, as
   selected by BGP, is distributed.
   This precludes load balancing flows
   across both GWs.

    -----------------                    ---------------------
   | Ingress         |                  | Egress     ------   |
   | SR Domain       |                  | SR Domain |Prefix|  |
   |                 |                  |           |   X  |  |
   |                 |                  |            ------   |
   |       --        |                  |   ---          ---  |
   |      |GW|       |                  |  |GW1|        |GW2| |
    -------++--------                    ----+-----------+-+--
           | \                               |          /  |
           |  \                              |         /   |
           |  -+-------------        --------+--------+--  |
           | ||PE|       ----|      |----  |PE|     |PE| | |
           | | --       |ASBR+------+ASBR|  --       --  | |
           | |           ----|      |----                | |
           | |               |      |                    | |
           | |           ----|      |----                | |
           | | AS1      |ASBR+------+ASBR|         AS2   | |
           | |           ----|      |----                | |
           |  ---------------        --------------------  |
         --+-----------------------------------------------+--
        | |PE|                                           |PE| |
        |  --                     AS3                     --  |
        |                                                     |
         -----------------------------------------------------

         Figure 1: Example Segment Routing Domain Interconnection

   The obvious solution to this problem is to use the BGP feature that
   allows the advertisement of multiple paths in BGP (known as
   Add-Paths) [RFC7911] to ensure that all routes to X get advertised by
   BGP. However, even if this is done, the identity of the GWs will be
   lost as soon as the routes get distributed through an Autonomous
   System Border Router (ASBR) that will set itself to be the next hop.
   And if there are multiple Autonomous Systems (ASes) in the backbone,
   not only will the next hop change several times, but the Add-Paths
   technique will experience scaling issues. This all means that the
   Add-Paths approach is limited to SR domains connected over a single
   AS.

   This document defines a solution that overcomes this limitation and
   works equally well with a backbone constructed from one or more ASes.
   The solution uses the Tunnel Encapsulation attribute
   [I-D.ietf-idr-tunnel-encaps] as follows:

      We define a new tunnel type, "SR tunnel".
      When the GWs to a given
      SR domain advertise a route to a prefix X within the SR domain,
      they will each include a Tunnel Encapsulation attribute with
      multiple tunnel instances each of type "SR tunnel", one for each
      GW, and each containing a Remote Endpoint sub-TLV with that GW's
      address.

   In other words, each route advertised by any GW identifies all of the
   GWs to the same SR domain (see Section 2 for a discussion of how GWs
   discover each other). Therefore, even if only one of the routes is
   distributed to other ASes, it will not matter how many times the next
   hop changes, as the Tunnel Encapsulation attribute (and its Remote
   Endpoint sub-TLVs) will remain unchanged.

   To put this in the context of Figure 1, GW1 and GW2 discover each
   other as gateways for the egress SR domain. Both GW1 and GW2
   advertise themselves as having routes to prefix X. Furthermore, GW1
   includes a Tunnel Encapsulation attribute with a tunnel instance of
   type "SR tunnel" for itself and another for GW2. Similarly, GW2
   includes a Tunnel Encapsulation attribute with a tunnel instance for
   itself and another for GW1. The gateway in the ingress SR domain can
   now see all possible paths to the egress SR domain regardless of
   which route advertisement is propagated to it, and it can choose one
   or balance traffic flows as it sees fit.

   The protocol extensions defined in this document are put into the
   broader context of SR domain interconnection by
   [I-D.farrel-spring-sr-domain-interconnect]. That document shows how
   other existing protocol elements may be combined with the extensions
   defined in this document to provide a full system.

2.  SR Domain Gateway Auto-Discovery

   To allow a given SR domain's GWs to auto-discover each other and to
   coordinate their operations, the following procedures are
   implemented:

   o  Each GW is configured with an identifier for the SR domain that is
      common across all GWs to the domain (i.e., the same identifier is
      used by all GWs to the same SR domain) and unique across all SR
      domains that are connected (i.e., across all GWs to all SR domains
      that are interconnected).

   o  A route target ([RFC4360]) is attached to each GW's auto-discovery
      route and has its value set to the SR domain identifier.

   o  Each GW constructs an import filtering rule to import any route
      that carries a route target with the same SR domain identifier
      that the GW itself uses. This means that only these GWs will
      import those routes and that all GWs to the same SR domain will
      import each other's routes and will learn (auto-discover) the
      current set of active GWs for the SR domain.

   The auto-discovery route that each GW advertises consists of the
   following:

   o  An IPv4 or IPv6 NLRI containing one of the GW's loopback addresses
      (that is, with AFI/SAFI that is one of 1/1, 2/1, 1/4, or 2/4).

   o  A Tunnel Encapsulation attribute containing the GW's encapsulation
      information, which at a minimum consists of an SR tunnel TLV (type
      to be allocated by IANA) with a Remote Endpoint sub-TLV as
      specified in [I-D.ietf-idr-tunnel-encaps].

   To avoid the side effect of applying the Tunnel Encapsulation
   attribute to any packet that is addressed to the GW itself, the GW
   SHOULD use a different loopback address for the two cases.

   As described in Section 1, each GW will include in its Tunnel
   Encapsulation attribute an SR tunnel instance for each GW that is
   active for the SR domain (including itself), and will include this
   attribute in every route that it advertises externally to the SR
   domain.
   As the current set of active
   GWs changes (due to the addition of a new GW or the failure/removal
   of an existing GW), each externally advertised route will be
   re-advertised with the set of SR tunnel instances reflecting the
   current set of active GWs.

   If a gateway becomes disconnected from the backbone network, or if
   the SR domain operator decides to terminate the gateway's activity,
   it withdraws the advertisements described above. This means that
   remote gateways at other sites will stop seeing advertisements from
   this gateway. It also means that other local gateways at this site
   will "unlearn" the removed gateway and stop including an SR tunnel
   instance for the removed gateway in the Tunnel Encapsulation
   attribute of their advertisements.

3.  Relationship to BGP Link State and Egress Peer Engineering

   When a remote GW receives a route to a prefix X, it can use the SR
   tunnel instances within the contained Tunnel Encapsulation attribute
   to identify the GWs through which X can be reached. It uses this
   information to compute SR TE paths across the backbone network,
   looking at the information advertised to it in SR BGP Link State
   (BGP-LS) [I-D.ietf-idr-bgp-ls-segment-routing-ext] and correlated
   using the SR domain identity. SR Egress Peer Engineering (EPE)
   [I-D.ietf-idr-bgpls-segment-routing-epe] can be used to supplement
   the information advertised in the BGP-LS.

4.  Advertising an SR Domain Route Externally

   When a packet destined for prefix X is sent on an SR TE path to a GW
   for the SR domain containing X, it needs to carry the receiving GW's
   label for X such that this label rises to the top of the stack before
   the GW completes its processing of the packet. To achieve this, we
   place a prefix-SID sub-TLV for X in each SR tunnel instance in the
   Tunnel Encapsulation attribute in the externally advertised route for
   X.
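
   As a non-normative illustration, the auto-discovery filtering of
   Section 2 and the per-GW SR tunnel instances described above can be
   sketched in a few lines of Python. This is a minimal sketch under
   stated assumptions, not an encoding of the protocol: routes and
   tunnel instances are modelled as plain data rather than BGP wire
   formats, the names AutoDiscoveryRoute, SrTunnelInstance,
   discover_gateways, and build_tunnel_encaps are hypothetical, the
   addresses, domain identifiers, and SID value are example values, and
   the same prefix-SID value for X is assumed in every tunnel instance.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AutoDiscoveryRoute:
    """Hypothetical model of a GW auto-discovery route (Section 2)."""
    loopback: str      # NLRI: one of the advertising GW's loopback addresses
    route_target: str  # route target whose value is the SR domain identifier


@dataclass(frozen=True)
class SrTunnelInstance:
    """Hypothetical model of one tunnel instance of type "SR tunnel"."""
    remote_endpoint: str  # Remote Endpoint sub-TLV: the GW's address
    prefix_sid: int       # prefix-SID sub-TLV for the advertised prefix


def discover_gateways(domain_id: str,
                      routes: Iterable[AutoDiscoveryRoute]) -> List[str]:
    """Import only routes whose route target equals our SR domain identifier.

    Returns the sorted, de-duplicated addresses of the GWs learned this way.
    """
    return sorted({r.loopback for r in routes if r.route_target == domain_id})


def build_tunnel_encaps(active_gws: Iterable[str],
                        prefix_sid: int) -> List[SrTunnelInstance]:
    """Build one SR tunnel instance per active GW for an external route.

    Assumption: the same prefix-SID value is placed in every instance.
    """
    return [SrTunnelInstance(gw, prefix_sid) for gw in active_gws]


if __name__ == "__main__":
    # Example values only: documentation addresses and made-up identifiers.
    seen = [
        AutoDiscoveryRoute("192.0.2.1", "sr-domain-A"),     # GW1
        AutoDiscoveryRoute("192.0.2.2", "sr-domain-A"),     # GW2
        AutoDiscoveryRoute("198.51.100.7", "sr-domain-B"),  # another domain
    ]
    gateways = discover_gateways("sr-domain-A", seen)
    for tunnel in build_tunnel_encaps(gateways, prefix_sid=16001):
        print(tunnel)
```

   In this sketch every GW derives the same list from the same imported
   routes, so whichever externally advertised route survives BGP best
   path selection still names the full set of GWs.
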

   Alternatively, if the GWs for a given SR domain are configured to
   allow remote GWs to perform SR TE through that SR domain for a prefix
   X, then each GW computes an SR TE path through that SR domain to X
   from each of the currently active GWs, and places each in an MPLS
   label stack sub-TLV [I-D.ietf-idr-tunnel-encaps] in the SR tunnel
   instance for that GW.

5.  Encapsulation

   If the GWs for a given SR domain are configured to allow remote GWs
   to send them a packet in that SR domain's native encapsulation, then
   each GW will also include multiple instances of a tunnel TLV for that
   native encapsulation in externally advertised routes: one for each GW
   and each containing a Remote Endpoint sub-TLV with that GW's address.
   A remote GW may then encapsulate a packet according to the rules
   defined via the sub-TLVs included in each of the tunnel TLV
   instances.

6.  IANA Considerations

   IANA maintains a registry called "BGP parameters" with a sub-registry
   called "BGP Tunnel Encapsulation Tunnel Types." The registration
   policy for this registry is First-Come First-Served.

   IANA is requested to assign a codepoint from this sub-registry for
   "SR Tunnel". The next available value may be used and reference
   should be made to this document.

   [[Note: This text is likely to be replaced with a specific code point
   value once FCFS allocation has been made.]]

7.  Security Considerations

   From a protocol point of view, the mechanisms described in this
   document can leverage the security mechanisms already defined for
   BGP. Further discussion of security considerations for BGP may be
   found in the BGP specification itself [RFC4271] and in the security
   analysis for BGP [RFC4272]. The TCP Authentication Option (TCP-AO),
   which replaces the TCP MD5 signature option originally used to
   protect BGP sessions, is specified in [RFC5925], while [RFC6952]
   includes an analysis of BGP keying and authentication issues.

   The mechanisms described in this document involve sharing routing or
   reachability information between domains: that may mean disclosing
   information that is normally contained within a domain. So it needs
   to be understood that normal security paradigms based on the
   boundaries of domains are weakened. Discussion of these issues with
   respect to VPNs can be found in [RFC4364] while [RFC7926] describes
   many of the issues associated with the exchange of topology or TE
   information between domains.

   Particular exposures resulting from this work include:

   o  Gateways to a domain will know about all other gateways to the
      same domain. This feature applies within a domain and so is not a
      substantial exposure, but it does mean that if the BGP exchanges
      within a domain can be snooped or if a gateway can be subverted,
      then an attacker may learn the full set of gateways to a domain.
      This facilitates more effective attacks on that domain.

   o  The existence of multiple gateways to a domain becomes more
      visible across the backbone and even into remote domains. This
      means that an attacker is able to prepare a more comprehensive
      attack than would be possible when only the locally attached
      backbone network (e.g., the AS that hosts the domain) can see all
      of the gateways to a site.

   o  A node in a domain that does not have external BGP peering (i.e.,
      is not really a domain gateway and cannot speak BGP into the
      backbone network) may be able to get itself advertised as a
      gateway by letting other genuine gateways discover it (by speaking
      BGP to them within the domain) and so may get those genuine
      gateways to advertise it as a gateway into the backbone network.

   o  If it is possible to modify a BGP message within the backbone, it
      may be possible to spoof the existence of a gateway.
      This could
      cause traffic to be attracted to a specific node and might result
      in black-holing of traffic.

   All of the issues in the list above could cause disruption to domain
   interconnection, but are not new protocol vulnerabilities so much as
   new exposures of information that could be protected against using
   existing protocol mechanisms. Furthermore, it is a general
   observation that if these attacks are possible then it is highly
   likely that far more significant attacks can be made on the routing
   system. It should be noted that BGP peerings are not discovered, but
   always arise from explicit configuration.

8.  Manageability Considerations

   The principal configuration item added by this solution is the
   allocation of an SR domain identifier. The same identifier must be
   assigned to every GW to the same domain, and each domain must have a
   different identifier. This requires coordination, probably through a
   central management agent.

   TBD

9.  Acknowledgements

   Thanks to Bruno Rijsman for review comments, and to Robert Raszuk for
   useful discussions.

10.  References

10.1.  Normative References

   [I-D.ietf-idr-bgpls-segment-routing-epe]
              Previdi, S., Filsfils, C., Patel, K., Ray, S., and J.
              Dong, "BGP-LS extensions for Segment Routing BGP Egress
              Peer Engineering", draft-ietf-idr-bgpls-segment-routing-
              epe-15 (work in progress), March 2018.

   [I-D.ietf-idr-tunnel-encaps]
              Rosen, E., Patel, K., and G. Velde, "The BGP Tunnel
              Encapsulation Attribute", draft-ietf-idr-tunnel-encaps-09
              (work in progress), February 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y.
              Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

10.2.  Informative References

   [I-D.farrel-spring-sr-domain-interconnect]
              Farrel, A. and J. Drake, "Interconnection of Segment
              Routing Domains - Problem Statement and Solution
              Landscape", draft-farrel-spring-sr-domain-interconnect-03
              (work in progress), January 2018.

   [I-D.ietf-idr-bgp-ls-segment-routing-ext]
              Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H.,
              and M. Chen, "BGP Link-State extensions for Segment
              Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-06
              (work in progress), April 2018.

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
              Litkowski, S., and R. Shakir, "Segment Routing
              Architecture", draft-ietf-spring-segment-routing-15 (work
              in progress), January 2018.

   [RFC4272]  Murphy, S., "BGP Security Vulnerabilities Analysis",
              RFC 4272, DOI 10.17487/RFC4272, January 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC6952]  Jethanandani, M., Patel, K., and L. Zheng, "Analysis of
              BGP, LDP, PCEP, and MSDP Issues According to the Keying
              and Authentication for Routing Protocols (KARP) Design
              Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013.

   [RFC7911]  Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", RFC 7911,
              DOI 10.17487/RFC7911, July 2016.

   [RFC7926]  Farrel, A., Ed., Drake, J., Bitar, N., Swallow, G.,
              Ceccarelli, D., and X. Zhang, "Problem Statement and
              Architecture for Information Exchange between
              Interconnected Traffic-Engineered Networks", BCP 206,
              RFC 7926, DOI 10.17487/RFC7926, July 2016.

Authors' Addresses

   John Drake
   Juniper Networks

   Email: jdrake@juniper.net


   Adrian Farrel
   Juniper Networks

   Email: afarrel@juniper.net


   Eric Rosen
   Juniper Networks

   Email: erosen@juniper.net


   Keyur Patel
   Arrcus, Inc.

   Email: keyur@arrcus.com


   Luay Jalil
   Verizon

   Email: luay.jalil@verizon.com