BESS WorkGroup                                                S. Mohanty
Internet-Draft                                            A. Sreekantiah
Intended status: Informational                                    D. Rao
Expires: March 14, 2018                                    Cisco Systems
                                                                K. Patel
                                                             Arrcus, Inc
                                                      September 10, 2017


                   BGP Multipath in Inter-AS Option-B
                 draft-mohanty-bess-mutipath-interas-01

Abstract

   By default, the Border Gateway Protocol (BGP) installs only the
   best-path in the IP Routing Table.  BGP multi-path is a well-known
   feature that enables installation of multiple paths in the IP
   Routing Table in order to load balance forwarded traffic.  For a
   path to be eligible as a multi-path, certain criteria need to be
   fulfilled.  Inter-AS VPNs are commonly deployed to span
   organizations across Service Provider boundaries.  In this draft, we
   describe an issue relating to multi-path load balancing that can
   arise in an Option B Inter-AS deployment.  With the help of a
   representative topology, we illustrate the problem and then present
   two simple schemes as solutions.  We also note, as a matter of
   independent interest, that the same underlying issue applies to
   deployments that employ next-hop-self behavior (implicit or
   explicit) downstream and the multi-path feature upstream.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   This Internet-Draft will expire on March 14, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Topology notation
   4.  Problem Description
   5.  BGP ADDpath with the non-unique RD case
   6.  BGP Labeled unicast with Add-Path
   7.  BGP Multi-path Inter-As Solution 1
   8.  BGP Multi-path Inter-As Solution 2
   9.  Protocol Considerations
   10. Operational Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   By default, BGP [RFC4271] advertises only the best-path to a peer
   and installs only the best-path in the IP Routing Table (RIB) and
   thereby in the Forwarding Information Base (FIB).  BGP multi-path is
   a feature where more than one received BGP route, rather than only
   the one corresponding to the BGP best-path, is installed in the IP
   Routing Table and the Forwarding Information Base.  This offers the
   benefits of load balancing, efficient network-wide utilization of
   system resources, and higher throughput for traffic flows than would
   otherwise be available.  It also has the added benefit of providing
   redundancy in case one of the BGP paths is withdrawn due to a link
   going down or some other event.  Vendors often provide a
   configurable knob which dictates how many paths to a given
   destination can be installed in the forwarding table.

   BGP multi-path is widely deployed in practice and, when augmented
   with the Demilitarized Zone Link Bandwidth (DMZ LB)
   [I-D.ietf-idr-link-bandwidth], can be used to provide unequal cost
   load balancing under user control.

   The BGP best-path algorithm proceeds through a well-known and
   deterministic selection mechanism in determining the best-path.
   Typically, a path is deemed eligible as a multi-path if it ties with
   the best-path up to and including the step of the BGP best-path
   algorithm [RFC4271] that compares the IGP cost (metric) to the BGP
   next-hop.  In addition, two paths which match on all criteria up to
   the IGP metric but have the same next-hop IP address cannot both be
   considered as multi-paths.  This holds regardless of EBGP or IBGP
   rules.  In this draft we point out an issue, arising out of the
   above restrictions, that limits the benefits of multi-path
   deployments when the BGP path is propagated across Inter-AS Option B
   [RFC4364] Autonomous System Boundary Routers (ASBRs).
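   The two eligibility restrictions described above can be sketched in
   a few lines of Python.  This is only an illustration under stated
   assumptions, not the behavior of any particular implementation: the
   names Path, select_multipaths, and max_paths are hypothetical, the
   next-hop names NH1 and NH2 are made up, and every candidate is
   assumed to tie with the best-path on all steps that precede the
   IGP-metric comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str    # BGP next-hop (address or router name)
    igp_metric: int  # IGP cost to reach the next-hop


def select_multipaths(best, candidates, max_paths):
    """Return the paths to install, best-path first.

    Assumption: every candidate already ties with `best` on all
    best-path steps that precede the IGP-metric comparison.
    """
    installed = [best]
    used_next_hops = {best.next_hop}
    for path in candidates:
        if len(installed) >= max_paths:
            break  # configured limit on installed paths reached
        if path.igp_metric != best.igp_metric:
            continue  # no tie with the best-path at the IGP-metric step
        if path.next_hop in used_next_hops:
            continue  # same next-hop as an already installed path
        installed.append(path)
        used_next_hops.add(path.next_hop)
    return installed


# Hypothetical inputs: the first candidate repeats the best-path's
# next-hop and the last one has a different IGP metric.
best = Path("NH1", 10)
others = [Path("NH1", 10), Path("NH2", 10), Path("NH2", 20)]
installed = select_multipaths(best, others, max_paths=4)
```

   In this sketch the next-hop alone acts as the key that
   distinguishes installed paths, which is the restriction the rest of
   this document is concerned with.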
                                  \      /
                                   \    /
         |----PE1----|              |  |
         |           |              |  |
   CE1---|           RR-------ASBR1------ASBR2------PE3
         |----PE2----|              |  |
                                    |  |
               AS 100                      AS 200

                         Inter-AS Option B.

                              Figure 1

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Topology notation

   In Figure 1 above, we consider a typical Inter-AS Option B topology,
   with ASBR1 peering with ASBR2 over the inter-AS EBGP link.  A VPN,
   denoted vpn, has a presence in both Autonomous Systems on all the PE
   routers shown; i.e., a Virtual Routing and Forwarding (VRF) table
   associated with the VPN vpn exists at each of the Provider Edge (PE)
   routers shown.  A dual-homed CE, CE1, peers with both PE1 and PE2 in
   the context of vrf VRF1.

   Denote the Route-Distinguisher (RD) of the vrf VRF1 configured in PE1
   by RD1.  Denote the Route-Distinguisher of the vrf VRF2 configured in
   PE2 by RD2.  Assume that CE1 advertises an IPv4 prefix p.  At ASBR1,
   the received VPN routes will be RD1:p and RD2:p, with next-hops PE1
   and PE2 respectively and with VPN (service) labels L1 and L2
   respectively.

4.  Problem Description

   As per EBGP rules, the advertising ASBR, ASBR1, resets the next-hop
   to itself.  This causes the two routes RD1:p and RD2:p to be
   advertised to the receiving AS, AS 200, with the mandatory next-hop
   attribute pointing to ASBR1.

   Let the labels that ASBR1 swaps in for RD1:p and RD2:p again be
   denoted L1 and L2 respectively.  If ASBR2 does not reset the
   next-hop (the usual behavior), then the two paths will be received
   at PE3 with the same next-hop, i.e., ASBR1.  If ASBR2 does reset the
   next-hop, then the two paths will be received at PE3 with the
   next-hop set to ASBR2.
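   The next-hop handling just described can be traced with a small
   Python sketch.  It is a simplified model, not an implementation:
   the names VpnRoute and received_at_pe3 are hypothetical, and labels
   are kept as the symbolic names L1 and L2 used in this document, so
   actual label swapping at the ASBRs is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VpnRoute:
    rd: str        # Route-Distinguisher
    prefix: str    # IPv4 prefix
    next_hop: str  # BGP next-hop
    label: str     # symbolic label name; swapping is not modeled


def received_at_pe3(routes, asbr2_resets_next_hop):
    """Model the next-hop rewrites on the way from ASBR1 to PE3."""
    # ASBR1 always sets the next-hop to itself when advertising over EBGP.
    via_asbr1 = [replace(r, next_hop="ASBR1") for r in routes]
    if asbr2_resets_next_hop:
        # ASBR2 optionally sets the next-hop to itself towards PE3.
        return [replace(r, next_hop="ASBR2") for r in via_asbr1]
    return via_asbr1


# The two VPN routes as received at ASBR1.
at_asbr1 = [
    VpnRoute("RD1", "p", "PE1", "L1"),
    VpnRoute("RD2", "p", "PE2", "L2"),
]

# Distinct next-hops seen at ASBR1 and, in both cases, at PE3.
next_hops_at_asbr1 = {r.next_hop for r in at_asbr1}
next_hops_unchanged = {r.next_hop for r in received_at_pe3(at_asbr1, False)}
next_hops_reset = {r.next_hop for r in received_at_pe3(at_asbr1, True)}
```

   The set of distinct next-hops has two members at ASBR1 but only one
   member at PE3, whether or not ASBR2 resets the next-hop.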
   In either case above, the two paths received at PE3 have the same
   next-hop, even though the labels are different.  As explained
   earlier, if two received BGP paths have the same next-hop, then they
   cannot both be eligible as multi-paths at the same time.  This means
   that at PE3, only one of the routes will be installed in the
   forwarding table.

   In Figure 1 above, even though the advertising AS (AS 100) has path
   redundancy, this is not visible to AS 200, and therefore traffic
   from AS 200 is not load balanced across PE1 and PE2 at ASBR1.  Note
   that this is different from the classic same-RD problem which one
   often encounters in the Route-Reflector context.

5.  BGP ADDpath with the non-unique RD case

   The above scenario is described in the context of the unique-RD case.
   Now consider the case when non-unique RDs are configured for the vpn
   VRF at PE1 and PE2, and BGP Add-Path [RFC7911] is used to propagate
   the paths to AS 200 via RR, ASBR1 and ASBR2.  In this case, ASBR1
   resets the next-hop to itself in both of the add-paths, with the
   result that the two add-paths cannot be installed as primary and
   backup in the FIB at PE3 in AS 200.

6.  BGP Labeled unicast with Add-Path

   A similar situation exists for non-VPN labeled traffic.  Figure 2
   shows a simple EBGP topology, in which R1 is in AS 1, R2 and R3 are
   in AS 2, R4 is in AS 3, and R5 is in AS 4.  A labeled unicast
   [RFC3107] prefix, p, is being advertised from R1 to R5.  Add-Path is
   configured at R4 and R5 and the capability is negotiated.  Both R2
   and R3 will set the next-hop to themselves.  When R4 receives the
   prefix p from R2 and R3, the situation is similar to the add-path
   scenario for the VPN case described in the earlier section.  As a
   result, only one of the paths will be advertised to R5.
         |===== R2 =====|
         |              |
   R1----|              R4---------- R5
         |              |
         |===== R3 =====|
   -AS1 -| - -  AS2  - -|-AS3-|----AS4

                         Inter-AS Option B.

                              Figure 2

7.  BGP Multi-path Inter-As Solution 1

   The first solution is to establish the uniqueness of a path by
   considering the tuple (next-hop, label) rather than the next-hop
   alone.  For the two paths received at PE3 this translates to
   (ASBR1, L1) and (ASBR1, L2), and therefore they can be
   distinguished.  However, many existing deployments today consider
   only the next-hop as the key.  Therefore this solution requires a
   software upgrade in existing deployments.  A separate requirement is
   that the dependency on the label should have no effect on the
   hashing or on the weights assigned to the paths in the FIB.

8.  BGP Multi-path Inter-As Solution 2

   The second solution is to inject, at ASBR1, two loopback IP
   addresses into the IBGP of the receiving AS.  These correspond to
   the configured IP addresses or loopbacks of PE1 and PE2 that appear
   in the next-hop attribute of the VPN routes RD1:p and RD2:p.  These
   loopback addresses need to be injected into the IGP of the receiving
   AS.  ASBR2 also needs to be configured with a static route pointing
   to ASBR1 for this purpose.  Alternatively, ASBR1 can redistribute
   these loopbacks into EBGP.  This is also equivalent to doing
   next-hop-self.  This solution does not require any software upgrade.
   However, it requires the implementation to support policy, and it
   may have security implications since routes need to be leaked from
   one AS to the other.

9.  Protocol Considerations

   No protocol changes are necessary.

10.  Operational Considerations

   Either of the two methods above can be adopted.  These solutions are
   also applicable to EVPN [RFC7432].

11.  Security Considerations

   This document raises no new security issues for L3VPN.

12.  Acknowledgements

   The authors would like to thank Yuri Tsier for his feedback and
   useful discussions.

13.  References

13.1.  Normative References

   [I-D.ietf-idr-extcomm-iana]
              Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", draft-ietf-idr-extcomm-iana-02
              (work in progress), December 2013.

   [I-D.ietf-idr-link-bandwidth]
              Mohapatra, P. and R. Fernando, "BGP Link Bandwidth
              Extended Community", draft-ietf-idr-link-bandwidth-06
              (work in progress), January 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

13.2.  Informative References

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC6624]  Kompella, K., Kothari, B., and R. Cherukuri, "Layer 2
              Virtual Private Networks Using BGP for Auto-Discovery and
              Signaling", RFC 6624, DOI 10.17487/RFC6624, May 2012.

   [RFC7911]  Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", RFC 7911,
              DOI 10.17487/RFC7911, July 2016.

Authors' Addresses

   Satya Ranjan Mohanty
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA 95134
   USA

   Email: satyamoh@cisco.com


   Arjun Sreekantiah
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA 95134
   USA

   Email: asreekan@cisco.com


   Dhananjaya Rao
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA 95134
   USA

   Email: dhrao@cisco.com


   Keyur Patel
   Arrcus, Inc

   Email: keyur@arrcus.com