idnits 2.17.1

draft-marques-l3vpn-ibgp-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009. (See
     https://trustee.ietf.org/license-info/)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC4364]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 148: '...E-received route MUST be advertised to...'
     RFC 2119 keyword, line 240: '...XT_HOP attribute SHOULD NOT be include...'
     RFC 2119 keyword, line 274: '...customer network SHOULD use internal o...'
     RFC 2119 keyword, line 279: '... routes MAY be advertised to both in...'
     RFC 2119 keyword, line 305: '... MUST check that the autonomous-s...'
     (1 more instance...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 3, 2010) is 5161 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                            Marques
3    Internet-Draft                                                    Raszuk
4    Intended status: Experimental                                      Patel
5    Expires: September 4, 2010                                 Cisco Systems
6                                                                      Kumaki
7                                                                    Yamagata
8                                                            KDDI Corporation
9                                                               March 3, 2010

11                        Internal BGP as PE-CE protocol
12                         draft-marques-l3vpn-ibgp-02

14   Abstract

16      This document defines protocol extensions and procedures for BGP
17      PE-CE router interaction in BGP/MPLS IP VPN [RFC4364] networks. These
18      have the objective of making the usage of the BGP/MPLS IP VPN
19      transparent to the customer network, as far as routing information is
20      concerned.

22   Status of this Memo

24      This Internet-Draft is submitted to IETF in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF), its areas, and its working groups. Note that
29      other groups may also distribute working documents as Internet-
30      Drafts.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time. It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      The list of current Internet-Drafts can be accessed at
38      http://www.ietf.org/ietf/1id-abstracts.txt.

40      The list of Internet-Draft Shadow Directories can be accessed at
41      http://www.ietf.org/shadow.html.
43      This Internet-Draft will expire on September 4, 2010.

45   Copyright Notice

47      Copyright (c) 2010 IETF Trust and the persons identified as the
48      document authors. All rights reserved.

50      This document is subject to BCP 78 and the IETF Trust's Legal
51      Provisions Relating to IETF Documents
52      (http://trustee.ietf.org/license-info) in effect on the date of
53      publication of this document. Please review these documents
54      carefully, as they describe your rights and restrictions with respect
55      to this document. Code Components extracted from this document must
56      include Simplified BSD License text as described in Section 4.e of
57      the Trust Legal Provisions and are provided without warranty as
58      described in the BSD License.

60   Table of Contents

62      1.  Introduction
63      2.  IP VPN network as a Route Server
64      3.  Path attributes
65      4.  Carrying internal BGP routes
66      5.  Next-hop handling
67      6.  Exchanging routes between different VPN customer networks
68      7.  Contributors
69      8.  Security considerations
70      9.  IANA considerations
71      10. Normative References
72      Authors' Addresses

74   1.  Introduction

76      In current deployments, when BGP is used as the PE-CE routing
77      protocol, these peering sessions are typically configured as an
78      external peering between the VPN provider AS and the customer network
79      AS. At each External BGP boundary, Path Attributes [RFC4271] are
80      modified as per standard BGP rules.
        This includes prepending the
81      AS_PATH attribute with the autonomous system of the originating
82      customer CE and the autonomous system(s) of the provider edge
83      router(s).

85      In order for such routes not to be rejected by AS_PATH loop
86      detection, a PE router advertising a route received from a remote PE
87      often remaps the customer network autonomous-system number to its
88      own. Otherwise, the customer network can use different autonomous-
89      system numbers at different sites or configure their CE routers to
90      accept routes containing their own AS number.

92      While this technique works well in situations where there are no BGP
93      routing exchanges between the client network and other networks, it
94      does have drawbacks for customer networks that use BGP internally for
95      purposes other than interaction between CE and PE routers.

97      In order to make the usage of BGP/MPLS VPN services as transparent as
98      possible to any external interaction, it is desirable to define a
99      mechanism by which PE-CE routers can exchange BGP routes by means
100     other than external BGP.

102     One can consider a BGP/MPLS VPN as a provider-managed backbone
103     service interconnecting several customer-managed sites. While this
104     model is not universal, it does constitute a good starting point.

106     Independently of the presence of VPN service, networks which use a
107     hierarchical design are typically modeled such that the top-level
108     core or backbone participates in a full iBGP mesh which distributes
109     routing information between sites via BGP route reflection [RFC4456]
110     or confederations [RFC5065]. This will be our service model
111     definition.

113  2.  IP VPN network as a Route Server

115     In a typical backbone/area hierarchical design, routers that attach
116     an area (or site) to the core use BGP route reflection (or
117     confederations) to distribute routes between the top-level core iBGP
118     mesh and the local area iBGP cluster.
120     To provide equivalent functionality in a network using a provider
121     provisioned backbone, one can consider the VPN network as the
122     equivalent of an Internal BGP Route Server which multiplexes
123     information from _N_ VPN attachment points.

125     A route learned by any of the PEs in the IP VPN network is available
126     to all other PEs that import the Route Target used to identify the
127     customer network. This is conceptually equivalent to a centralized
128     route server.

130     In a PE router, PE-received routes are not advertised back to other
131     PEs. It is this split horizon technique that prevents routing loops
132     in an IP VPN environment. This is also consistent with the behavior
133     of a top level mesh of RRs.

135     In order to complete the Route Server model, it is necessary to be
136     able to transparently carry the Internal BGP PATH attributes of
137     customer network routes through the BGP/MPLS VPN core. This is
138     achieved by using a new BGP path attribute described below that
139     allows the customer network attributes to be saved and restored at
140     the BGP/MPLS VPN boundaries.

142     When a route is advertised from PE to CE, if it is advertised as an
143     iBGP route, the CE will not advertise it further unless it is itself
144     configured as a Route Reflector (or has an external BGP session).
145     This is a consequence of the default BGP behavior of not advertising
146     iBGP routes back to iBGP peers. This behavior is not modified.

148     On a BGP/MPLS VPN PE, a CE-received route MUST be advertised to other
149     VPN PEs that import the Route Targets which are associated with the
150     route. This is independent of whether the CE route has been received
151     as an external or internal route. However, a CE-received route is
152     not readvertised back to other CEs unless Route Reflection is
153     explicitly configured. This is the equivalent of disabling client-to-
154     client reflection in BGP RR implementations.
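The advertisement rules of this section can be summarized in a small decision function. The following Python sketch is illustrative only: the names `PeerKind`, `Route`, `should_advertise`, and `reflect_to_local_ces` are hypothetical and do not come from this document, and Route Target import filtering at the receiving PE is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from enum import Enum


class PeerKind(Enum):
    CE = "ce"  # customer edge router attached to a local VRF
    PE = "pe"  # remote provider edge router in the VPN core


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: PeerKind  # kind of peer the PE learned the route from


def should_advertise(route: Route, to_peer: PeerKind,
                     reflect_to_local_ces: bool = False) -> bool:
    """Decide whether a PE advertises `route` to a peer of kind `to_peer`.

    Assumption: Route Target matching is handled by the receiving PE and
    is not modeled here.
    """
    if route.learned_from is PeerKind.PE:
        # Split horizon: PE-received routes never go back to other PEs,
        # but they are advertised towards the attached CEs.
        return to_peer is PeerKind.CE
    # CE-received route (internal or external makes no difference).
    if to_peer is PeerKind.PE:
        return True
    # CE to CE only when route reflection is explicitly configured.
    return reflect_to_local_ces
```

With the default `reflect_to_local_ces=False`, the function mirrors the "client-to-client reflection disabled" behavior described above.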
156     When reflection is configured on the PE router, with local CE routers
157     as clients, there is no need to internally mesh multiple CEs that may
158     exist in the site.

160     This Route Server model can also be used to support a confederation
161     style abstraction to CE devices. We choose not to describe in detail
162     the procedures for that mode of operation at this point.
163     Confederations are considered to be less common than route reflection
164     in enterprise environments.

166  3.  Path attributes

168        --> push path attributes --> vrf-export --> 2547
169     VRF route                                   PE-PE route
170                                                 advertisement
171        <-- pop path attributes  <-- vrf-import <--

173     The diagram above shows the BGP path attribute stack processing in
174     relation to existing 2547 route processing procedures. BGP path
175     attributes received from a customer network are pushed into the
176     stack, before adding the Export Route Targets to the BGP path
177     attributes. Conversely, the stack is popped after the Import Target
178     processing step that identifies the VRF table in which a PE-received
179     route is accepted.

181     When a PE-received route is imported into a VRF, its IGP metric, as
182     far as BGP path selection is concerned, should be the metric to the
183     remote PE address, expressed in terms of the service provider metric
184     domain.

186     For the purposes of VRF route selection performed at the PE, between
187     routes received from local CEs and remote PEs, VPN network IGP
188     metrics should always be considered higher (thus least preferred)
189     than local site metrics.

191     When backdoor links are present, this would tend to direct the
192     traffic between two sites through the backdoor link for BGP routes
193     originated by a remote site. However, BGP already has policy
194     mechanisms to address this type of situation, such as the LOCAL_PREF
195     attribute.
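The interplay between the metric rule and LOCAL_PREF described above can be illustrated with a simplified comparison. This Python sketch is not a full BGP decision process: it assumes only three tie-breakers (LOCAL_PREF, whether the path crosses the VPN network, and IGP metric) and omits AS_PATH length, ORIGIN, MED, and the remaining steps. The names `Candidate`, `preference_key`, and `best_path` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    local_pref: int   # higher is preferred
    via_vpn: bool     # True if the route was received from a remote PE
    igp_metric: int   # metric within its own domain (site or provider)


def preference_key(c: Candidate) -> Tuple[int, bool, int]:
    # Lower tuple sorts first:
    #  1. higher LOCAL_PREF wins (policy overrides metrics),
    #  2. local-site paths beat VPN paths regardless of metric value,
    #  3. within the same class, the lower IGP metric wins.
    return (-c.local_pref, c.via_vpn, c.igp_metric)


def best_path(candidates: Iterable[Candidate]) -> Candidate:
    return min(candidates, key=preference_key)
```

For example, a backdoor path with a site metric of 500 is chosen over a VPN path with a provider metric of 10 when both carry the same LOCAL_PREF; raising LOCAL_PREF on the VPN path reverses the choice.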
197     When a given CE is connected to more than one PE, it will not
198     advertise the route that it receives from a PE to another PE unless
199     configured as a route reflector, due to the standard BGP route
200     advertisement rules.

202     When a CE reflects a PE-received route to another PE, the fact that
203     the original attributes of a route are preserved across the VPN
204     network prevents the formation of routing loops due to mutual
205     redistribution between the two networks.

207  4.  Carrying internal BGP routes

209     In order to carry the original BGP attributes of a route received
210     from a CE, this document defines a new BGP path attribute:

212        ATTR_SET (type code 128)

214           ATTR_SET is an optional transitive attribute that carries a set
215           of BGP path attributes. An attribute set (ATTR_SET) can
216           include any BGP attribute that can occur in a BGP UPDATE
217           message, except the MP_REACH and MP_UNREACH attributes.

219     This attribute is used by a PE router to store the original set of
220     BGP attributes it receives from a CE. When a PE router advertises a
221     PE-received route to a CE, it will use the path attributes carried in
222     the ATTR_SET attribute.

224     In other words, the BGP Path Attributes are "pushed" into this
225     stack-like attribute when the route is received by the VPN network
226     and "popped" when the route is advertised in the PE to CE direction.

228     Using this mechanism isolates the customer network from the
229     attributes used in the VPN network and vice versa. Attributes such
230     as the route reflection cluster list attribute are segregated such
231     that customer network cluster identifiers won't be considered by the
232     VPN network route reflectors and vice versa.

234     The autonomous system number present in the ATTR_SET attribute is
235     designed to prevent a route originating in the iBGP of a given
236     autonomous system from being leaked into a different autonomous
237     system without proper AS_PATH manipulation.
        It should contain the autonomous system number of the
238     customer network that originates the given set of attributes.

240     The NEXT_HOP attribute SHOULD NOT be included in an ATTR_SET.

242  5.  Next-hop handling

244     When BGP/MPLS VPNs are not in use, the NEXT_HOP attribute in iBGP
245     routes carries the address of the border router advertising the route
246     into the domain.

248     An important component of BGP route selection is the IGP distance to
249     the NEXT_HOP of the route.

251     When a BGP/MPLS VPN service is used to provide interconnection
252     between different sites, since the VPN network runs a different IGP
253     domain, metrics between the VPN and customer networks are not
254     comparable.

256     However, the most important component of a metric is the inter-area
257     metric, which is known to the VPN network. The intra-area metric is
258     typically negligible.

260     The use of route reflection, for instance, requires metrics to be
261     configured so that inter-cluster/area metrics are always greater than
262     intra-cluster metrics.

264     The approach taken by this document is to rewrite the NEXT_HOP
265     attribute at the PE-CE boundary. PE routers take into account the
266     PE-PE IGP distance calculated by the VPN network IGP, when selecting
267     between routes advertised from different PEs.

269     An advantage of the proposed method is that the customer network can
270     run independent IGPs at each site.

272  6.  Exchanging routes between different VPN customer networks

274     A given VPN customer network SHOULD use internal or external BGP
275     sessions consistently for peering sessions where the same autonomous
276     system is used.

278     In scenarios such as what is commonly referred to as an "extranet"
279     VPN, routes MAY be advertised to both internal and external VPN
280     attachments, belonging to different autonomous systems.
282          +-----+                 +-----+
283          | PE1 |-----------------| PE2 |
284          +-----+                 +-----+
285          /     \                    |
286      +-----+  +-----+            +-----+
287      | CE1 |  | CE2 |            | CE3 |
288      +-----+  +-----+            +-----+
289        AS 1     AS 2               AS 1

291     Consider the example given above where (PE1, CE1) and (PE2, CE3)
292     sessions are iBGP. In RFC2547 VPNs, a route received from CE1 above
293     may be distributed to the VRFs corresponding to the attachment points
294     for CEs 2 and 3.

296     The desired result in such a scenario is to present the internal
297     peer (CE3) with a BGP advertisement that contains the same BGP Path
298     Attributes received from CE1 and to the external peer (CE2) a BGP
299     advertisement that would correspond to a situation where AS 1 and 2
300     have an external BGP session between them.

302     In order to achieve this goal the following set of rules applies:

304        When advertising an iBGP originated route to iBGP, a PE router
305        MUST check that the autonomous-system contained in the ATTR_SET
306        attribute matches the autonomous system of the CE to which the
307        route is being advertised.

309        In case the autonomous-systems do match, the route is advertised
310        with the attributes contained in the ATTR_SET attribute.
311        Otherwise, in the case of an autonomous-system mismatch, the set
312        of attributes to be advertised to the CE in question shall be
313        constructed as follows:

315        1.  The path attributes are set to the attributes contained in the
316            ATTR_SET attribute.

318        2.  Internal BGP specific attributes are discarded (LOCAL_PREF,
319            ORIGINATOR_ID, CLUSTER_LIST, etc).

321        3.  The autonomous-system contained in the ATTR_SET attribute is
322            prepended to the AS_PATH following the rules that would apply
323            to an external BGP peering between the source and destination
324            ASes.

326        4.  Internal BGP specific attributes corresponding to the
327            configuration of the destination AS (LOCAL_PREF) are added.
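The match/mismatch rule and steps 1 to 4 above can be sketched as follows. This Python fragment is a minimal illustration under several assumptions that are not part of this document: path attributes are modeled as a plain dictionary, AS_PATH is a flat list of AS numbers (segment types are ignored), and `dest_local_pref` stands in for whatever LOCAL_PREF policy the destination AS configures. The names `IBGP_ONLY` and `attrs_for_ibgp_ce` are hypothetical; `ORIGINATOR_ID` is the attribute name used by RFC 4456.

```python
from typing import Any, Dict

# Attributes that only have meaning inside one AS (the text says "etc",
# so this tuple is not claimed to be exhaustive).
IBGP_ONLY = ("LOCAL_PREF", "ORIGINATOR_ID", "CLUSTER_LIST")


def attrs_for_ibgp_ce(attr_set_asn: int,
                      attr_set: Dict[str, Any],
                      ce_asn: int,
                      dest_local_pref: int = 100) -> Dict[str, Any]:
    """Build the attributes a PE sends to an iBGP CE for an iBGP-originated route."""
    if attr_set_asn == ce_asn:
        # Same AS: advertise exactly what was saved in ATTR_SET.
        return dict(attr_set)

    # Step 1: start from the attributes carried in ATTR_SET.
    attrs = dict(attr_set)
    # Step 2: discard attributes specific to the originating AS's iBGP.
    for name in IBGP_ONLY:
        attrs.pop(name, None)
    # Step 3: prepend the originating AS, as an eBGP peering would.
    attrs["AS_PATH"] = [attr_set_asn] + list(attrs.get("AS_PATH", []))
    # Step 4: add iBGP attributes according to the destination AS policy.
    attrs["LOCAL_PREF"] = dest_local_pref
    return attrs
```

For the topology above, a route saved with AS 1 in its ATTR_SET is returned unchanged for CE3 (AS 1), while a CE in a different AS reached over iBGP would see AS 1 at the front of the AS_PATH and a locally assigned LOCAL_PREF.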
329        When advertising an iBGP originated route to eBGP, a PE router
330        shall apply steps 1 to 3 defined above and subsequently prepend
331        its own autonomous-system number to the AS_PATH attribute (i.e.
332        both the originator and VPN network AS numbers are prepended).

334        When advertising an eBGP originated route to iBGP, a PE router
335        MUST prepend its own AS number to the AS_PATH before adding
336        iBGP-only path attributes (LOCAL_PREF).

338     In all cases where an iBGP originated route is processed, attributes
339     present on the VPN route other than the NEXT_HOP attribute are
340     ignored, both from the point of view of route selection in the VRF
341     Adj-RIB-in and route advertisement to a CE router.

343  7.  Contributors

344  8.  Security considerations

346     It is worthwhile to consider the security implications of this
347     proposal from two independent perspectives: the IP VPN provider and
348     the IP VPN customer.

350     From an IP VPN provider perspective, this mechanism will assure
351     separation between the BGP path attributes advertised by the customer
352     CE router and the BGP attributes used within the provider network,
353     thus potentially improving security.

355     Although this behavior is largely implementation dependent, currently
356     it is possible for a CE device to inject BGP attributes (extended
357     communities, for example) that have semantics on the IP VPN provider
358     network, unless explicitly disabled by configuration in the PE.

360     With the rules specified for the ATTR_SET path attribute, any
361     attribute that has been received from a CE is pushed into the stack
362     before the route is advertised out to other PEs.

364     From the perspective of the VPN customer network, it is our opinion
365     that there is no change to the security profile of PE-CE interaction.
366     While having an iBGP session allows the PE to specify additional
367     attributes not allowed on an eBGP session (e.g.
        local-pref), this
368     does not significantly change the fact that the VPN customer must
369     trust its service provider to provide it correct routing information.

371  9.  IANA considerations

373     This document defines a new BGP path attribute which is part of a
374     registry space managed by IANA. We request that IANA update its
375     registry with the value specified above (128) for the ATTR_SET path
376     attribute.

378  10. Normative References

380     [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
381                Protocol 4 (BGP-4)", RFC 4271, January 2006.

383     [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
384                Networks (VPNs)", RFC 4364, February 2006.

386     [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
387                Reflection: An Alternative to Full Mesh Internal BGP
388                (IBGP)", RFC 4456, April 2006.

390     [RFC5065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
391                System Confederations for BGP", RFC 5065, August 2007.

393  Authors' Addresses

395     Pedro Marques
396     Cisco Systems
397     170 W. Tasman Dr
398     San Jose, CA 95134
399     US

401     Email: roque@cisco.com

403     Robert Raszuk
404     Cisco Systems
405     170 W. Tasman Dr
406     San Jose, CA 95134
407     US

409     Email: raszuk@cisco.com

411     Keyur Patel
412     Cisco Systems
413     170 W. Tasman Dr
414     San Jose, CA 95134
415     US

417     Email: keyupate@cisco.com

419     Kenji Kumaki
420     KDDI Corporation
421     Garden Air Tower
422     Iidabashi
423     Chiyoda-ku, Tokyo 102-8460
424     JAPAN

426     Email: ke-kumaki@kddi.com

428     Tomohiro Yamagata
429     KDDI Corporation
430     Garden Air Tower
431     Iidabashi
432     Chiyoda-ku, Tokyo 102-8460
433     JAPAN

435     Email: to-yamagata@kddi.com