idnits 2.17.1 draft-ietf-lsvr-bgp-spf-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 30, 2018) is 2155 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2328' is mentioned on line 616, but not defined == Missing Reference: 'RFC5286' is mentioned on line 650, but not defined == Missing Reference: 'RFC4456' is mentioned on line 620, but not defined == Missing Reference: 'RFC4915' is mentioned on line 645, but not defined == Missing Reference: 'RFC5549' is mentioned on line 655, but not defined ** Obsolete undefined reference: RFC 5549 (Obsoleted by RFC 8950) == Missing Reference: 'RFC4790' is mentioned on line 640, but not defined == Missing Reference: 'RFC5880' is mentioned on line 660, but not defined == Missing Reference: 'RFC4760' is mentioned on line 635, but not defined == Missing Reference: 'RFC4750' is mentioned on line 630, but not defined == Missing Reference: 'RFC4724' is mentioned on line 625, but not defined == Outdated reference: A later version (-19) exists of draft-ietf-idr-bgpls-segment-routing-epe-14 ** Obsolete normative reference: RFC 7752 (Obsoleted by RFC 9552) ** Downref: Normative reference to an Informational RFC: RFC 7938 Summary: 3 errors (**), 0 flaws (~~), 13 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group K. Patel 3 Internet-Draft Arrcus, Inc. 4 Intended status: Standards Track A. Lindem 5 Expires: December 1, 2018 Cisco Systems 6 S. Zandi 7 Linkedin 8 W. Henderickx 9 Nokia 10 May 30, 2018 12 Shortest Path Routing Extensions for BGP Protocol 13 draft-ietf-lsvr-bgp-spf-00.txt 15 Abstract 17 Many Massively Scaled Data Centers (MSDCs) have converged on 18 simplified layer 3 routing. 
Furthermore, requirements for 19 operational simplicity have led many of these MSDCs to converge on 20 BGP as their single routing protocol for both their fabric routing 21 and their Data Center Interconnect (DCI) routing. This document 22 describes a solution which leverages BGP Link-State distribution and 23 the Shortest Path First algorithm similar to Interior Gateway 24 Protocols (IGPs) such as OSPF. 26 Status of This Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on December 1, 2018. 43 Copyright Notice 45 Copyright (c) 2018 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 This document may contain material from IETF Documents or IETF 59 Contributions published or made publicly available before November 60 10, 2008.
The person(s) controlling the copyright in some of this 61 material may not have granted the IETF Trust the right to allow 62 modifications of such material outside the IETF Standards Process. 63 Without obtaining an adequate license from the person(s) controlling 64 the copyright in such materials, this document may not be modified 65 outside the IETF Standards Process, and derivative works of it may 66 not be created outside the IETF Standards Process, except to format 67 it for publication as an RFC or to translate it into languages other 68 than English. 70 Table of Contents 72 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 73 1.1. BGP Shortest Path First (SPF) Motivation . . . . . . . . 4 74 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 75 2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5 76 2.1. BGP Single-Hop Peering on Network Node Connections . . . 5 77 2.2. BGP Peering Between Directly Connected Network Nodes . . 5 78 2.3. BGP Peering in Route-Reflector or Controller Topology . . 6 79 3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6 80 4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6 81 4.1. Node NLRI Usage and Modifications . . . . . . . . . . . . 7 82 4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7 83 4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 8 84 4.4. BGP-LS Attribute Sequence-Number TLV . . . . . . . . . . 8 85 5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 9 86 5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 9 87 5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 10 88 5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 10 89 5.4. IPv4/IPv6 Unicast Address Family Interaction . . . . . . 11 90 5.5. NLRI Advertisement and Convergence . . . . . . . . . . . 11 91 5.6. Error Handling . . . . . . . . . . . . . . . . . . . . . 11 92 6. IANA Considerations . . . . . . . . . . . . 
. . . . . . . . . 11 93 7. Security Considerations . . . . . . . . . . . . . . . . . . . 12 94 7.1. Acknowledgements . . . . . . . . . . . . . . . . . . . . 12 95 7.2. Contributors . . . . . . . . . . . . . . . . . . . . . . 12 97 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 98 8.1. Normative References . . . . . . . . . . . . . . . . . . 13 99 8.2. Information References . . . . . . . . . . . . . . . . . 14 100 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15 102 1. Introduction 104 Many Massively Scaled Data Centers (MSDCs) have converged on 105 simplified layer 3 routing. Furthermore, requirements for 106 operational simplicity have led many of these MSDCs to converge on 107 BGP [RFC4271] as their single routing protocol for both their fabric 108 routing and their Data Center Interconnect (DCI) routing. 109 Requirements and procedures for using BGP are described in [RFC7938]. 110 This document describes an alternative solution which leverages BGP- 111 LS [RFC7752] and the Shortest Path First algorithm similar to 112 Interior Gateway Protocols (IGPs) such as OSPF [RFC2328]. 114 [RFC4271] defines the Decision Process that is used to select routes 115 for subsequent advertisement by applying the policies in the local 116 Policy Information Base (PIB) to the routes stored in its Adj-RIBs- 117 In. The output of the Decision Process is the set of routes that are 118 announced by a BGP speaker to its peers. These selected routes are 119 stored by a BGP speaker in the speaker's Adj-RIBs-Out according to 120 policy. 122 [RFC7752] describes a mechanism by which link-state and TE 123 information can be collected from networks and shared with external 124 components using BGP. This is achieved by defining NLRI carried 125 within the BGP-LS AFI and BGP-LS SAFIs. The BGP-LS extensions defined in 126 [RFC7752] make use of the Decision Process defined in [RFC4271].
128 This document augments [RFC7752] by replacing its use of the existing 129 Decision Process. The BGP-LS-SPF and BGP-LS-SPF-VPN AFI/SAFI are 130 introduced to ensure backward compatibility. The Phase 1 and 2 131 decision functions of the Decision Process are replaced with the 132 Shortest Path First (SPF) algorithm, also known as the Dijkstra algorithm. 133 The Phase 3 decision function is also simplified since it is no 134 longer dependent on the previous phases. This solution provides the 135 benefits of both BGP and SPF-based IGPs. These include TCP-based 136 flow-control, no periodic link-state refresh, and completely 137 incremental NLRI advertisement. These advantages can reduce the 138 overhead in MSDCs where there is a high degree of Equal Cost Multi- 139 Path (ECMP) and the topology is very stable. Additionally, using an 140 SPF-based computation can support fast convergence and the 141 computation of Loop-Free Alternatives (LFAs) [RFC5286] in the event 142 of link failures. Furthermore, a BGP-based solution lends itself to 143 multiple peering models including those incorporating route- 144 reflectors [RFC4456] or controllers. 146 Support for Multiple Topology Routing (MTR) as described in [RFC4915] 147 is an area for further study dependent on deployment requirements. 149 1.1. BGP Shortest Path First (SPF) Motivation 151 Given that [RFC7938] already describes how BGP could be used as the 152 sole routing protocol in an MSDC, one might question the motivation 153 for defining an alternate BGP deployment model when a mature solution 154 exists. For both alternatives, BGP offers the operational benefits 155 of a single routing protocol. However, BGP SPF offers some unique 156 advantages above and beyond standard BGP distance-vector routing. 158 A primary advantage is that all BGP speakers in the BGP SPF routing 159 domain will have a complete view of the topology.
This will allow 160 support of ECMP, IP fast-reroute (e.g., Loop-Free Alternatives), 161 Shared Risk Link Groups (SRLGs), and other routing enhancements 162 without advertisement of additional BGP paths or other extensions. In 163 short, the advantages of an IGP such as OSPF [RFC2328] become available in 164 BGP. 166 With the simplified BGP decision process as defined in Section 5.1, 167 NLRI changes can be disseminated throughout the BGP routing domain 168 much more rapidly (equivalent to IGPs with the proper 169 implementation). 171 Another primary advantage is a potential reduction in NLRI 172 advertisement. With standard BGP distance-vector routing, a single 173 link failure may impact hundreds or thousands of prefixes and result in the 174 withdrawal or re-advertisement of the attendant NLRI. With BGP SPF, 175 only the BGP speakers corresponding to the link NLRI need withdraw 176 the corresponding BGP-LS Link NLRI. This advantage will contribute 177 to both faster convergence and better scaling. 179 With controller and route-reflector peering models, BGP SPF 180 advertisement and distributed computation require a minimal number of 181 sessions and copies of the NLRI since only the latest version of the 182 NLRI from the originator is required. Given that verification of the 183 adjacencies is done outside of BGP (see Section 2), each BGP speaker 184 will only need as many sessions and copies of the NLRI as required 185 for redundancy (e.g., one for SPF computation and another for 186 backup). Functions such as Optimized Route Reflection (ORR) are 187 supported without extension by virtue of the primary advantages. 188 Additionally, a controller could inject topology that is learned 189 outside the BGP routing domain. 191 Given that controllers are already consuming BGP-LS NLRI [RFC7752], 192 reusing it for BGP-LS SPF leverages the existing controller 193 implementations.
195 Another potential advantage of BGP SPF is that both IPv6 and IPv4 can 196 be supported in the same address family using the same topology. 197 Although not described in this version of the document, multi- 198 topology extensions can be used to support separate IPv4, IPv6, 199 unicast, and multicast topologies while sharing the same NLRI. 201 Finally, the BGP SPF topology can be used as an underlay for other 202 BGP address families (using the existing model) and realize all the 203 above advantages. A simplified peering model using IPv6 link-local 204 addresses as next-hops can be deployed similar to [RFC5549]. 206 1.2. Requirements Language 208 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 209 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 210 document are to be interpreted as described in RFC 2119 [RFC2119]. 212 2. BGP Peering Models 214 Depending on the requirements, scaling, and capabilities of the BGP 215 speakers, various peering models are supported. The only requirement 216 is that all BGP speakers in the BGP SPF routing domain receive link- 217 state NLRI on a timely basis, run an SPF calculation, and update 218 their data plane appropriately. The content of the Link NLRI is 219 described in Section 4.2. 221 2.1. BGP Single-Hop Peering on Network Node Connections 223 The simplest peering model is the one described in section 5.2.1 of 224 [RFC7938]. In this model, EBGP single-hop sessions are established 225 over direct point-to-point links interconnecting the network nodes. 226 For the purposes of BGP SPF, Link NLRI is only advertised if a 227 single-hop BGP session has been established and the Link-State/SPF 228 address family capability has been exchanged [RFC4790] on the 229 corresponding session. If the session goes down, the NLRI will be 230 withdrawn. 232 2.2.
BGP Peering Between Directly Connected Network Nodes 234 In this model, BGP speakers peer with all directly connected network 235 nodes but the sessions may be multi-hop and the direct connection 236 discovery and liveness detection for those connections are 237 independent of the BGP protocol. How this is accomplished is outside 238 the scope of this document. Consequently, there will be a single 239 session even if there are multiple direct connections between BGP 240 speakers. For the purposes of BGP SPF, Link NLRI is advertised as 241 long as a BGP session has been established, the Link-State/SPF 242 address family capability has been exchanged [RFC4790] and the 243 corresponding link is up and considered operational. 245 2.3. BGP Peering in Route-Reflector or Controller Topology 247 In this model, BGP speakers peer solely with one or more Route 248 Reflectors [RFC4456] or controllers. As in the previous model, 249 direct connection discovery and liveness detection for those 250 connections are done outside the BGP protocol. More specifically, 251 liveness detection is done using the BFD protocol described in 252 [RFC5880]. For the purposes of BGP SPF, Link NLRI is advertised as 253 long as the corresponding link is up and considered operational. 255 3. BGP-LS Shortest Path Routing (SPF) SAFI 257 In order to replace the Phase 1 and 2 decision functions of the 258 existing Decision Process with an SPF-based Decision Process and 259 streamline the Phase 3 decision functions in a backward compatible 260 manner, this draft introduces two AFI/SAFIs for BGP LS SPF 261 operation. The BGP-LS-SPF (AFI 16388 / SAFI TBD1) and BGP-LS-SPF-VPN 262 (AFI 16388 / SAFI TBD2) [RFC4790] are allocated by IANA as specified 263 in Section 6. A BGP speaker wanting to run BGP LS SPF operation 264 must exchange the AFI/SAFI using the Multiprotocol Extensions Capability 265 Code, as defined in [RFC4760]. 267 4.
Extensions to BGP-LS 269 [RFC7752] describes a mechanism by which link-state and TE 270 information can be collected from networks and shared with external 271 components using the BGP protocol. It contains two parts: definition of 272 a new BGP NLRI that describes links, nodes, and prefixes comprising 273 IGP link-state information and definition of a new BGP path attribute 274 (BGP-LS attribute) that carries link, node, and prefix properties and 275 attributes, such as the link and prefix metric or auxiliary Router- 276 IDs of nodes, etc. 278 The BGP protocol will be used in the Protocol-ID field specified in 279 table 1 of [I-D.ietf-idr-bgpls-segment-routing-epe]. The local and 280 remote node descriptors for all NLRI will be the BGP Router-ID (TLV 281 516) and either the AS Number (TLV 512) [RFC7752] or the BGP 282 Confederation Member (TLV 517) 283 [I-D.ietf-idr-bgpls-segment-routing-epe]. However, if the BGP 284 Router-ID is known to be unique within the BGP Routing domain, it can 285 be used as the sole descriptor. 287 4.1. Node NLRI Usage and Modifications 289 The SPF capability is a new Node Attribute TLV that will be added to 290 those defined in table 7 of [RFC7752]. The new attribute TLV will 291 only be applicable when BGP is specified in the Node NLRI Protocol ID 292 field. The TBD TLV type will be defined by IANA. The new Node 293 Attribute TLV will contain a single octet SPF algorithm field: 295 0 1 2 3 296 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 297 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 298 | Type | Length | 299 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 300 | SPF Algorithm | 301 +-+-+-+-+-+-+-+-+ 303 The SPF Algorithm may take the following values: 305 1 - Normal SPF 306 2 - Strict SPF 308 When computing the SPF for a given BGP routing domain, only BGP nodes 309 advertising the SPF capability attribute will be included in the 310 Shortest Path Tree (SPT). 312 4.2.
Link NLRI Usage 314 The criteria for advertisement of Link NLRI are discussed in 315 Section 2. 317 Link NLRI is advertised with local and remote node descriptors as 318 described above and unique link identifiers dependent on the 319 addressing. For IPv4 links, the link's local IPv4 (TLV 259) and 320 remote IPv4 (TLV 260) addresses will be used. For IPv6 links, the 321 local IPv6 (TLV 261) and remote IPv6 (TLV 262) addresses will be 322 used. For unnumbered links, the link local/remote identifiers (TLV 323 258) will be used. For links having both IPv4 and IPv6 324 addresses, both sets of descriptors may be included in the same Link 325 NLRI. The link identifiers are described in table 5 of [RFC7752]. 327 The link IGP metric attribute TLV (TLV 1095) as well as any others 328 required for non-SPF purposes SHOULD be advertised. Algorithms such 329 as setting the metric inversely to the link speed as done in the OSPF 330 MIB [RFC4750] may be supported. However, this is beyond the scope of 331 this document. 333 4.3. Prefix NLRI Usage 335 Prefix NLRI is advertised with a local descriptor as described above 336 and the prefix and length used as the descriptors (TLV 265) as 337 described in [RFC7752]. The prefix metric attribute TLV (TLV 1155) 338 as well as any others required for non-SPF purposes SHOULD be 339 advertised. For loopback prefixes, the metric should be 0. For non- 340 loopback, the setting of the metric is beyond the scope of this 341 document. 343 4.4. BGP-LS Attribute Sequence-Number TLV 345 A new BGP-LS Attribute TLV to BGP-LS NLRI types is defined to ensure that 346 the most recent version of a given NLRI is used in the SPF 347 computation. The TBD TLV type will be defined by IANA. The new BGP- 348 LS Attribute TLV will contain an 8 octet sequence number. The usage 349 of the Sequence Number TLV is described in Section 5.1.
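As a rough illustration of the 8-octet sequence number carried in this TLV, the following Python sketch packs and unpacks it, assuming the 16-bit Type and 16-bit Length header used by BGP-LS TLVs in [RFC7752]. The type code 0xFFFF is a hypothetical placeholder, since the real value is TBD and will be assigned by IANA. The function names are likewise illustrative and do not come from any existing implementation.

```python
import struct

# Hypothetical placeholder: the draft leaves the TLV type as "TBD" (IANA).
SEQUENCE_NUMBER_TLV_TYPE = 0xFFFF
SEQUENCE_NUMBER_LENGTH = 8  # the value is an 8-octet sequence number


def encode_sequence_number_tlv(sequence_number, tlv_type=SEQUENCE_NUMBER_TLV_TYPE):
    """Pack Type (16 bits), Length (16 bits), and a 64-bit sequence number."""
    if not 0 <= sequence_number < 2 ** 64:
        raise ValueError("sequence number must fit in an unsigned 64-bit value")
    # "!" selects network (big-endian) byte order with no padding.
    return struct.pack("!HHQ", tlv_type, SEQUENCE_NUMBER_LENGTH, sequence_number)


def decode_sequence_number_tlv(data):
    """Return (tlv_type, sequence_number) from a 12-octet encoded TLV."""
    if len(data) != 4 + SEQUENCE_NUMBER_LENGTH:
        raise ValueError("unexpected TLV size")
    tlv_type, length, sequence_number = struct.unpack("!HHQ", data)
    if length != SEQUENCE_NUMBER_LENGTH:
        raise ValueError("unexpected Length field")
    return tlv_type, sequence_number
```

Because the value is packed as one unsigned 64-bit integer in network byte order, the first four value octets are the high-order 32 bits and the last four are the low-order 32 bits.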
351 0 1 2 3 352 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 353 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 354 | Type | Length | 355 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 356 | Sequence Number (High-Order 32 Bits) | 357 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 358 | Sequence Number (Low-Order 32 Bits) | 359 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 361 Sequence Number 363 The 64-bit strictly increasing sequence number is incremented for 364 every version of BGP-LS NLRI originated. BGP speakers implementing 365 this specification MUST use available mechanisms to preserve the 366 sequence number's strictly increasing property for the deployed life 367 of the BGP speaker (including cold restarts). One mechanism for 368 accomplishing this would be to use the high-order 32 bits of the 369 sequence number as a wrap/boot count that is incremented any time the 370 BGP router loses its sequence number state or the low-order 32 371 bits wrap. 373 When incrementing the sequence number for each self-originated NLRI, 374 the sequence number should be treated as an unsigned 64-bit value. 375 If the lower-order 32-bit value wraps, the higher-order 32-bit value 376 should be incremented and saved in non-volatile storage. If by some 377 chance the BGP Speaker is deployed long enough that there is a 378 possibility that the 64-bit sequence number may wrap or a BGP Speaker 379 completely loses its sequence number state (e.g., the BGP speaker 380 hardware is replaced), the phase 1 decision function (see 381 Section 5.1) rules should ensure convergence, albeit not 382 immediately. 384 5. Decision Process with SPF Algorithm 386 The Decision Process described in [RFC4271] takes place in three 387 distinct phases.
The Phase 1 decision function of the Decision 388 Process is responsible for calculating the degree of preference for 389 each route received from a Speaker's peer. The Phase 2 decision 390 function is invoked on completion of the Phase 1 decision function 391 and is responsible for choosing the best route out of all those 392 available for each distinct destination, and for installing each 393 chosen route into the Loc-RIB. The combination of the Phase 1 and 2 394 decision functions is also known as a Path vector algorithm. 396 The SPF-based Decision Process replaces the BGP Bestpath Decision 397 Process described in [RFC4271]. This process starts with selecting 398 only those Node NLRI whose SPF capability TLV matches the local 399 BGP speaker's SPF capability TLV value. Since Link-State NLRI always 400 contains the local descriptor [RFC7752], it will only be originated 401 by a single BGP speaker in the BGP routing domain. These selected 402 Node NLRI and their Link/Prefix NLRI are used to build a directed 403 graph during the SPF computation. The best paths for BGP prefixes 404 are installed as a result of the SPF process. 406 When BGP-LS-SPF NLRI is received, all that is required is to 407 determine whether it is the best-path by examining the Node-ID and 408 sequence number as described in Section 5.1. If the received best- 409 path NLRI has changed, it will be advertised to other BGP-LS-SPF 410 peers. If the attributes have changed (other than the sequence 411 number), a BGP SPF calculation will be scheduled. However, a changed 412 best-path can be advertised to other peers immediately and propagation 413 of changes can approach IGP convergence times with an appropriately 414 tuned MinRouteAdvertisementIntervalTimer. 416 The Phase 3 decision function of the Decision Process [RFC4271] is 417 also simplified since under normal SPF operation, a BGP speaker would 418 advertise the NLRI selected for the SPF to all BGP peers with the 419 BGP-LS/BGP-SPF AFI/SAFI.
Application of policy would not be 420 prevented; however, its usefulness in the best-path process would be limited as 421 the SPF relies solely on link metrics. 423 5.1. Phase-1 BGP NLRI Selection 425 The rules for NLRI selection are greatly simplified from [RFC4271]. 427 1. If the NLRI is received from the BGP speaker originating the NLRI 428 (as determined by comparing the BGP Router ID in the NLRI Node 429 identifiers with the BGP speaker Router ID), then it is preferred 430 over the same NLRI from non-originators. 432 2. If the Sequence-Number TLV is present in the BGP-LS Attribute, 433 then the NLRI with the most recent, i.e., highest sequence number 434 is selected. BGP-LS NLRI with a Sequence-Number TLV will be 435 considered more recent than NLRI without a BGP-LS Attribute or with a BGP-LS 436 Attribute that doesn't include the Sequence-Number TLV. 438 3. The final tie-breaker is the NLRI from the BGP Speaker with the 439 numerically largest BGP Router ID. 441 The modified Decision Process with SPF algorithm uses the metric from 442 Link and Prefix NLRI Attribute TLVs [RFC7752]. As a result, any 443 attributes that would influence the Decision process defined in 444 [RFC4271] like ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are 445 ignored by the SPF algorithm. Furthermore, the NEXT_HOP attribute 446 value is preserved but otherwise ignored during the SPF or best-path selection. 448 5.2. Dual Stack Support 450 The SPF-based decision process operates on Node, Link, and Prefix 451 NLRIs that support both IPv4 and IPv6 addresses. Whether to run a 452 single SPF instance or multiple SPF instances for separate AFs is a 453 local implementation matter. Normally, IPv4 next-hops are 454 calculated for IPv4 prefixes and IPv6 next-hops are calculated for 455 IPv6 prefixes. However, an interesting use-case is deployment of 456 [RFC5549] where IPv6 link-local next-hops are calculated for both 457 IPv4 and IPv6 prefixes.
As stated in Section 1, support for Multiple 458 Topology Routing (MTR) is an area for future study. 460 5.3. NEXT_HOP Manipulation 462 A BGP speaker that supports SPF extensions MAY interact with peers 463 that don't support SPF extensions. If the BGP Link-State address 464 family is advertised to a peer not supporting the SPF extensions 465 described herein, then the BGP speaker MUST conform to the NEXT_HOP 466 rules mentioned in [RFC4271] when announcing the Link-State address 467 family routes to those peers. 469 All BGP peers that support SPF extensions would locally compute the 470 NEXT_HOP values as a result of the SPF process. As a result, the 471 NEXT_HOP attribute is always ignored on receipt. However, BGP 472 speakers should set the NEXT_HOP address according to the NEXT_HOP 473 attribute rules mentioned in [RFC4271]. 475 5.4. IPv4/IPv6 Unicast Address Family Interaction 477 While the BGP-LS SPF address family and the IPv4/IPv6 unicast address 478 families install routes into the same device routing tables, they 479 will operate independently much the same as OSPF and IS-IS would 480 operate today (i.e., "Ships-in-the-Night" mode). There will be no 481 implicit route redistribution between the BGP address families. 482 However, implementation-specific redistribution mechanisms SHOULD be 483 made available with the restriction that redistribution of BGP-LS SPF 484 routes into the IPv4 address family applies only to IPv4 routes and 485 redistribution of BGP-LS SPF routes into the IPv6 address family 486 applies only to IPv6 routes. 488 Given the fact that SPF algorithms are based on the assumption that 489 all routers in the routing domain calculate precisely the same 490 SPF tree and install the same set of routes, it is RECOMMENDED that 491 BGP-LS SPF IPv4/IPv6 routes be given priority by default when 492 installed into their respective RIBs.
In common implementations, the 493 prioritization is governed by route preference or administrative 494 distance with lower being more preferred. 496 5.5. NLRI Advertisement and Convergence 498 A local failure will prevent a link from being used in the SPF 499 calculation due to the IGP bi-directional connectivity requirement. 500 Consequently, local link failures should always be given priority over 501 updates (e.g., withdrawing all routes learned on a session) in order 502 to ensure the highest priority propagation and optimal convergence. 504 Delaying the withdrawal of non-local routes is an area for further 505 study as more IGP-like mechanisms would be required to prevent usage 506 of stale NLRI. 508 5.6. Error Handling 510 When a BGP speaker receives a BGP Update containing a malformed SPF 511 Capability TLV in the Node NLRI BGP-LS Attribute [RFC7752], it MUST 512 ignore the received TLV and the Node NLRI and not pass it to other 513 BGP peers as specified in [RFC7606]. When discarding a Node NLRI 514 with a malformed TLV, a BGP speaker SHOULD log an error for further 515 analysis. 517 6. IANA Considerations 519 This document defines two AFI/SAFIs for BGP LS SPF operation and 520 requests IANA to assign the BGP-LS-SPF AFI 16388 / SAFI TBD1 and the 521 BGP-LS-SPF-VPN AFI 16388 / SAFI TBD2 as described in [RFC4750]. 523 This document also defines two attribute TLVs for BGP LS NLRI. We 524 request IANA to assign TLVs for the SPF capability and the Sequence 525 Number from the "BGP-LS Node Descriptor, Link Descriptor, Prefix 526 Descriptor, and Attribute TLVs" Registry. Additionally, IANA is 527 requested to create a new registry for "BGP-LS SPF Capability 528 Algorithms" for the value of the algorithm both in the BGP-LS Node 529 Attribute TLV and the BGP SPF Capability.
The initial assignments 530 are: 532 +-------------+-----------------------------------+ 533 | Value(s) | Assignment Policy | 534 +-------------+-----------------------------------+ 535 | 0 | Reserved (not to be assigned) | 536 | | | 537 | 1 | SPF | 538 | | | 539 | 2 | Strict SPF | 540 | | | 541 | 3-254 | Unassigned (IETF Review) | 542 | | | 543 | 255 | Reserved (not to be assigned) | 544 +-------------+-----------------------------------+ 546 BGP-LS SPF Capability Algorithms 548 7. Security Considerations 550 This extension to BGP does not change the underlying security issues 551 inherent in the existing [RFC4724] and [RFC4271]. 553 7.1. Acknowledgements 555 The authors would like to thank .... for the review and comments. 557 7.2. Contributors 559 In addition to the authors listed on the front page, the following 560 co-authors have contributed to the document. 562 Derek Yeung 563 Arrcus, Inc. 564 derek@arrcus.com 566 Gunter Van De Velde 567 Nokia 568 gunter.van_de_velde@nokia.com 570 Abhay Roy 571 Cisco Systems 572 akr@cisco.com 574 Venu Venugopal 575 Cisco Systems 576 venuv@cisco.com 578 8. References 580 8.1. Normative References 582 [I-D.ietf-idr-bgpls-segment-routing-epe] 583 Previdi, S., Filsfils, C., Patel, K., Ray, S., and J. 584 Dong, "BGP-LS extensions for Segment Routing BGP Egress 585 Peer Engineering", draft-ietf-idr-bgpls-segment-routing- 586 epe-14 (work in progress), December 2017. 588 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 589 Requirement Levels", BCP 14, RFC 2119, 590 DOI 10.17487/RFC2119, March 1997, . 593 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 594 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 595 DOI 10.17487/RFC4271, January 2006, . 598 [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. 599 Patel, "Revised Error Handling for BGP UPDATE Messages", 600 RFC 7606, DOI 10.17487/RFC7606, August 2015, 601 . 603 [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and 604 S. 
Ray, "North-Bound Distribution of Link-State and 605 Traffic Engineering (TE) Information Using BGP", RFC 7752, 606 DOI 10.17487/RFC7752, March 2016, . 609 [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of 610 BGP for Routing in Large-Scale Data Centers", RFC 7938, 611 DOI 10.17487/RFC7938, August 2016, . 614 8.2. Information References 616 [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, 617 DOI 10.17487/RFC2328, April 1998, . 620 [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route 621 Reflection: An Alternative to Full Mesh Internal BGP 622 (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, 623 . 625 [RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. 626 Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, 627 DOI 10.17487/RFC4724, January 2007, . 630 [RFC4750] Joyal, D., Ed., Galecki, P., Ed., Giacalone, S., Ed., 631 Coltun, R., and F. Baker, "OSPF Version 2 Management 632 Information Base", RFC 4750, DOI 10.17487/RFC4750, 633 December 2006, . 635 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, 636 "Multiprotocol Extensions for BGP-4", RFC 4760, 637 DOI 10.17487/RFC4760, January 2007, . 640 [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet 641 Application Protocol Collation Registry", RFC 4790, 642 DOI 10.17487/RFC4790, March 2007, . 645 [RFC4915] Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P. 646 Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF", 647 RFC 4915, DOI 10.17487/RFC4915, June 2007, 648 . 650 [RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for 651 IP Fast Reroute: Loop-Free Alternates", RFC 5286, 652 DOI 10.17487/RFC5286, September 2008, . 655 [RFC5549] Le Faucheur, F. and E. Rosen, "Advertising IPv4 Network 656 Layer Reachability Information with an IPv6 Next Hop", 657 RFC 5549, DOI 10.17487/RFC5549, May 2009, 658 . 660 [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection 661 (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010, 662 . 
664 Authors' Addresses 666 Keyur Patel 667 Arrcus, Inc. 669 Email: keyur@arrcus.com 671 Acee Lindem 672 Cisco Systems 673 301 Midenhall Way 674 Cary, NC 27513 675 USA 677 Email: acee@cisco.com 679 Shawn Zandi 680 Linkedin 681 222 2nd Street 682 San Francisco, CA 94105 683 USA 685 Email: szandi@linkedin.com 687 Wim Henderickx 688 Nokia 689 Antwerp 690 Belgium 692 Email: wim.henderickx@nokia.com