| < draft-ietf-lsvr-bgp-spf-11.txt | draft-ietf-lsvr-bgp-spf-12.txt > | |||
|---|---|---|---|---|
| Network Working Group K. Patel | Network Working Group K. Patel | |||
| Internet-Draft Arrcus, Inc. | Internet-Draft Arrcus, Inc. | |||
| Intended status: Standards Track A. Lindem | Intended status: Standards Track A. Lindem | |||
| Expires: February 4, 2021 Cisco Systems | Expires: July 30, 2021 Cisco Systems | |||
| S. Zandi | S. Zandi | |||
| W. Henderickx | W. Henderickx | |||
| Nokia | Nokia | |||
| August 3, 2020 | January 26, 2021 | |||
| Shortest Path Routing Extensions for BGP Protocol | BGP Link-State Shortest Path First (SPF) Routing | |||
| draft-ietf-lsvr-bgp-spf-11 | draft-ietf-lsvr-bgp-spf-12 | |||
| Abstract | Abstract | |||
| Many Massively Scaled Data Centers (MSDCs) have converged on | Many Massively Scaled Data Centers (MSDCs) have converged on | |||
| simplified layer 3 routing. Furthermore, requirements for | simplified layer 3 routing. Furthermore, requirements for | |||
| operational simplicity have led many of these MSDCs to converge on | operational simplicity have led many of these MSDCs to converge on | |||
| BGP as their single routing protocol for both their fabric routing | BGP as their single routing protocol for both their fabric routing | |||
| and their Data Center Interconnect (DCI) routing. This document | and their Data Center Interconnect (DCI) routing. This document | |||
| describes a solution which leverages BGP Link-State distribution and | describes extensions to BGP to use BGP Link-State distribution and | |||
| the Shortest Path First (SPF) algorithm similar to Internal Gateway | the Shortest Path First (SPF) algorithm used by Internal Gateway | |||
| Protocols (IGPs) such as OSPF. | Protocols (IGPs) such as OSPF. In doing this, it allows BGP to be | |||
| efficiently used as both the underlay protocol and the overlay | ||||
| protocol in MSDCs. | ||||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on February 4, 2021. | This Internet-Draft will expire on July 30, 2021. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. BGP Shortest Path First (SPF) Motivation . . . . . . . . 4 | 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 | 1.2. BGP Shortest Path First (SPF) Motivation . . . . . . . . 4 | |||
| 2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Document Overview . . . . . . . . . . . . . . . . . . . . 6 | |||
| 2.1. BGP Single-Hop Peering on Network Node Connections . . . 5 | 1.4. Requirements Language . . . . . . . . . . . . . . . . . . 6 | |||
| 2.2. BGP Peering Between Directly Connected Network Nodes . . 5 | 2. Base BGP Protocol Relationship . . . . . . . . . . . . . . . 6 | |||
| 2.3. BGP Peering in Route-Reflector or Controller Topology . . 6 | 3. BGP Link-State (BGP-LS) Relationship . . . . . . . . . . . . 7 | |||
| 3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6 | 4. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6 | 4.1. BGP Single-Hop Peering on Network Node Connections . . . 8 | |||
| 4.1. Node NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7 | 4.2. BGP Peering Between Directly-Connected Nodes . . . . . . 8 | |||
| 4.1.1. Node NLRI Attribute SPF Capability TLV . . . . . . . 7 | 4.3. BGP Peering in Route-Reflector or Controller Topology . . 9 | |||
| 4.1.2. BGP-LS Node NLRI Attribute SPF Status TLV . . . . . . 8 | 5. BGP Shortest Path Routing (SPF) Protocol Extensions . . . . . 9 | |||
| 4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 8 | 5.1. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . 9 | |||
| 4.2.1. BGP-LS Link NLRI Attribute Prefix-Length TLVs . . . . 9 | 5.1.1. BGP-LS-SPF NLRI TLVs . . . . . . . . . . . . . . . . 9 | |||
| 4.2.2. BGP-LS Link NLRI Attribute SPF Status TLV . . . . . . 9 | 5.1.2. BGP-LS Attribute . . . . . . . . . . . . . . . . . . 10 | |||
| 4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 10 | 5.2. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . 11 | |||
| 4.3.1. BGP-LS Prefix NLRI Attribute SPF Status TLV . . . . . 10 | 5.2.1. Node NLRI Usage . . . . . . . . . . . . . . . . . . . 11 | |||
| 4.4. BGP-LS Attribute Sequence-Number TLV . . . . . . . . . . 10 | 5.2.1.1. Node NLRI Attribute SPF Capability TLV . . . . . 11 | |||
| 5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 11 | 5.2.1.2. BGP-LS-SPF Node NLRI Attribute SPF Status TLV . . 12 | |||
| 5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 12 | 5.2.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . 13 | |||
| 5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 13 | 5.2.2.1. BGP-LS-SPF Link NLRI Attribute Prefix-Length TLVs 14 | |||
| 5.3. SPF Calculation based on BGP-LS NLRI . . . . . . . . . . 13 | 5.2.2.2. BGP-LS-SPF Link NLRI Attribute SPF Status TLV . . 14 | |||
| 5.4. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 16 | 5.2.3. IPv4/IPv6 Prefix NLRI Usage . . . . . . . . . . . . . 15 | |||
| 5.5. IPv4/IPv6 Unicast Address Family Interaction . . . . . . 16 | 5.2.3.1. BGP-LS-SPF Prefix NLRI Attribute SPF Status TLV . 16 | |||
| 5.6. NLRI Advertisement and Convergence . . . . . . . . . . . 17 | 5.2.4. BGP-LS Attribute Sequence-Number TLV . . . . . . . . 16 | |||
| 5.6.1. Link/Prefix Failure Convergence . . . . . . . . . . . 17 | 5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 17 | |||
| 5.6.2. Node Failure Convergence . . . . . . . . . . . . . . 17 | 6. Decision Process with SPF Algorithm . . . . . . . . . . . . . 18 | |||
| 5.7. Error Handling . . . . . . . . . . . . . . . . . . . . . 18 | 6.1. BGP NLRI Selection . . . . . . . . . . . . . . . . . . . 19 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | 6.1.1. BGP Self-Originated NLRI . . . . . . . . . . . . . . 20 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 18 | 6.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 20 | |||
| 8. Management Considerations . . . . . . . . . . . . . . . . . . 18 | 6.3. SPF Calculation based on BGP-LS-SPF NLRI . . . . . . . . 20 | |||
| 8.1. Configuration . . . . . . . . . . . . . . . . . . . . . . 18 | 6.4. IPv4/IPv6 Unicast Address Family Interaction . . . . . . 25 | |||
| 8.2. Operational Data . . . . . . . . . . . . . . . . . . . . 19 | 6.5. NLRI Advertisement . . . . . . . . . . . . . . . . . . . 25 | |||
| 9. Implementation Status . . . . . . . . . . . . . . . . . . . . 19 | 6.5.1. Link/Prefix Failure Convergence . . . . . . . . . . . 25 | |||
| 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 19 | 6.5.2. Node Failure Convergence . . . . . . . . . . . . . . 26 | |||
| 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 20 | 7. Error Handling . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
| 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 7.1. Processing of BGP-LS-SPF TLVs . . . . . . . . . . . . . . 26 | |||
| 12.1. Normative References . . . . . . . . . . . . . . . . . . 20 | 7.2. Processing of BGP-LS-SPF NLRIs . . . . . . . . . . . . . 27 | |||
| 12.2. Information References . . . . . . . . . . . . . . . . . 21 | 7.3. Processing of BGP-LS Attribute . . . . . . . . . . . . . 28 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 23 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 | |||
| 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30 | ||||
| 10. Management Considerations . . . . . . . . . . . . . . . . . . 31 | ||||
| 10.1. Configuration . . . . . . . . . . . . . . . . . . . . . 31 | ||||
| 10.1.1. Link Metric Configuration . . . . . . . . . . . . . 31 | ||||
| 10.1.2. backoff-config . . . . . . . . . . . . . . . . . . . 31 | ||||
| 10.2. Operational Data . . . . . . . . . . . . . . . . . . . . 31 | ||||
| 11. Implementation Status . . . . . . . . . . . . . . . . . . . . 32 | ||||
| 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32 | ||||
| 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 32 | ||||
| 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 | ||||
| 14.1. Normative References . . . . . . . . . . . . . . . . . . 33 | ||||
| 14.2. Informational References . . . . . . . . . . . . . . . . 35 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 36 | ||||
| 1. Introduction | 1. Introduction | |||
| Many Massively Scaled Data Centers (MSDCs) have converged on | Many Massively Scaled Data Centers (MSDCs) have converged on | |||
| simplified layer 3 routing. Furthermore, requirements for | simplified layer 3 routing. Furthermore, requirements for | |||
| operational simplicity have led many of these MSDCs to converge on | operational simplicity have led many of these MSDCs to converge on | |||
| BGP [RFC4271] as their single routing protocol for both their fabric | BGP [RFC4271] as their single routing protocol for both their fabric | |||
| routing and their Data Center Interconnect (DCI) routing. | routing and their Data Center Interconnect (DCI) routing [RFC7938]. | |||
| Requirements and procedures for using BGP are described in [RFC7938]. | ||||
| This document describes an alternative solution which leverages BGP- | This document describes an alternative solution which leverages BGP- | |||
| LS [RFC7752] and the Shortest Path First algorithm similar to | LS [RFC7752] and the Shortest Path First algorithm used by Internal | |||
| Internal Gateway Protocols (IGPs) such as OSPF [RFC2328]. | Gateway Protocols (IGPs) such as OSPF [RFC2328]. | |||
| [RFC4271] defines the Decision Process that is used to select routes | This document leverages both the BGP protocol [RFC4271] and the BGP- | |||
| for subsequent advertisement by applying the policies in the local | LS [RFC7752] protocols. The relationship, as well as the scope of | |||
| Policy Information Base (PIB) to the routes stored in its Adj-RIBs- | changes are described respectively in Section 2 and Section 3. The | |||
| In. The output of the Decision Process is the set of routes that are | modifications to [RFC4271] for BGP SPF described herein only apply to | |||
| announced by a BGP speaker to its peers. These selected routes are | IPv4 and IPv6 as underlay unicast Subsequent Address Families | |||
| stored by a BGP speaker in the speaker's Adj-RIBs-Out according to | Identifiers (SAFIs). Operations for any other BGP SAFIs are outside | |||
| policy. | the scope of this document. | |||
| [RFC7752] describes a mechanism by which link-state and TE | This solution avails the benefits of both BGP and SPF-based IGPs. | |||
| information can be collected from networks and shared with external | These include TCP based flow-control, no periodic link-state refresh, | |||
| components using BGP. This is achieved by defining NLRI advertised | and completely incremental NLRI advertisement. These advantages can | |||
| within the BGP-LS/BGP-LS-SPF AFI/SAFI. The BGP-LS extensions defined | reduce the overhead in MSDCs where there is a high degree of Equal | |||
| in [RFC7752] makes use of the Decision Process defined in [RFC4271]. | Cost Multi-Path (ECMPs) and the topology is very stable. | |||
| Additionally, using an SPF-based computation can support fast | ||||
| convergence and the computation of Loop-Free Alternatives (LFAs). | ||||
| The SPF LFA extensions defined in [RFC5286] can be similarly applied | ||||
| to BGP SPF calculations. However, the details are a matter of | ||||
| implementation. Furthermore, a BGP-based solution lends | ||||
| itself to multiple peering models including those incorporating | ||||
| route-reflectors [RFC4456] or controllers. | ||||
| This document augments [RFC7752] by replacing its use of the existing | 1.1. Terminology | |||
| Decision Process. Rather than reusing the BGP-LS SAFI, the BGP-LS- | ||||
| SPF SAFI is introduced to ensure backward compatibility. The Phase 1 | ||||
| and 2 decision functions of the Decision Process are replaced with | ||||
| the Shortest Path First (SPF) algorithm also known as the Dijkstra | ||||
| algorithm. The Phase 3 decision function is also simplified since it | ||||
| is no longer dependent on the previous phases. This solution avails | ||||
| the benefits of both BGP and SPF-based IGPs. These include TCP based | ||||
| flow-control, no periodic link-state refresh, and completely | ||||
| incremental NLRI advertisement. These advantages can reduce the | ||||
| overhead in MSDCs where there is a high degree of Equal Cost Multi- | ||||
| Path (ECMPs) and the topology is very stable. Additionally, using an | ||||
| SPF-based computation can support fast convergence and the | ||||
| computation of Loop-Free Alternatives (LFAs) [RFC5286] in the event | ||||
| of link failures. Furthermore, a BGP based solution lends itself to | ||||
| multiple peering models including those incorporating route- | ||||
| reflectors [RFC4456] or controllers. | ||||
| Support for Multiple Topology Routing (MTR) as described in [RFC4915] | This specification reuses terms defined in section 1.1 of [RFC4271] | |||
| is an area for further study dependent on deployment requirements. | including BGP speaker, NLRI, and Route. | |||
| 1.1. BGP Shortest Path First (SPF) Motivation | Additionally, this document introduces the following terms: | |||
| BGP SPF Routing Domain: A set of BGP routers that are under a single | ||||
| administrative domain and exchange link-state information using | ||||
| the BGP-LS-SPF SAFI and compute routes using BGP SPF as described | ||||
| herein. | ||||
| BGP-LS-SPF NLRI: This refers to BGP-LS Network Layer Reachability | ||||
| Information (NLRI) that is being advertised in the BGP-LS-SPF SAFI | ||||
| (Section 5.1) and is being used for BGP SPF route computation. | ||||
| Dijkstra Algorithm: An algorithm for computing the shortest path | ||||
| from a given node in a graph to every other node in the graph. At | ||||
| each iteration of the algorithm, there is a list of candidate | ||||
| vertices. Paths from the root to these vertices have been found, | ||||
| but not necessarily the shortest ones. However, the paths to the | ||||
| candidate vertex that is closest to the root are guaranteed to be | ||||
| shortest; this vertex is added to the shortest-path tree, removed | ||||
| from the candidate list, and its adjacent vertices are examined | ||||
| for possible addition to/modification of the candidate list. The | ||||
| algorithm then iterates again. It terminates when the candidate | ||||
| list becomes empty. [RFC2328] | ||||
| 1.2. BGP Shortest Path First (SPF) Motivation | ||||
| Given that [RFC7938] already describes how BGP could be used as the | Given that [RFC7938] already describes how BGP could be used as the | |||
| sole routing protocol in an MSDC, one might question the motivation | sole routing protocol in an MSDC, one might question the motivation | |||
| for defining an alternate BGP deployment model when a mature solution | for defining an alternate BGP deployment model when a mature solution | |||
| exists. For both alternatives, BGP offers the operational benefits | exists. For both alternatives, BGP offers the operational benefits | |||
| of a single routing protocol. However, BGP SPF offers some unique | of a single routing protocol as opposed to the combination of an IGP | |||
| advantages above and beyond standard BGP distance-vector routing. | for the underlay and BGP as an overlay. However, BGP SPF offers some | |||
| unique advantages above and beyond standard BGP distance-vector | ||||
| routing. With BGP SPF, the standard hop-by-hop peering model is | ||||
| relaxed. | ||||
| A primary advantage is that all BGP speakers in the BGP SPF routing | A primary advantage is that all BGP-LS-SPF speakers in the BGP SPF | |||
| domain will have a complete view of the topology. This will allow | routing domain will have a complete view of the topology. This will | |||
| support for ECMP, IP fast-reroute (e.g., Loop-Free Alternatives), | allow support for ECMP, IP fast-reroute (e.g., Loop-Free | |||
| Shared Risk Link Groups (SRLGs), and other routing enhancements | Alternatives), Shared Risk Link Groups (SRLGs), and other routing | |||
| without advertisement of addition BGP paths or other extensions. In | enhancements without advertisement of additional BGP paths [RFC7911] | |||
| short, the advantages of an IGP such as OSPF [RFC2328] are availed in | or other extensions. In short, the advantages of an IGP such as OSPF | |||
| BGP. | [RFC2328] are availed in BGP. | |||
| With the simplified BGP decision process as defined in Section 5.1, | With the simplified BGP decision process as defined in Section 6, | |||
| NLRI changes can be disseminated throughout the BGP routing domain | NLRI changes can be disseminated throughout the BGP routing domain | |||
| much more rapidly (equivalent to IGPs with the proper | much more rapidly (equivalent to IGPs with the proper | |||
| implementation). | implementation). The added advantage of BGP using TCP for reliable | |||
| transport leverages TCP's inherent flow-control and guaranteed in- | ||||
| order delivery. | ||||
| Another primary advantage is a potential reduction in NLRI | Another primary advantage is a potential reduction in NLRI | |||
| advertisement. With standard BGP distance-vector routing, a single | advertisement. With standard BGP distance-vector routing, a single | |||
| link failure may impact 100s or 1000s of prefixes and result in the | link failure may impact 100s or 1000s of prefixes and result in the | |||
| withdrawal or re-advertisement of the attendant NLRI. With BGP SPF, | withdrawal or re-advertisement of the attendant NLRI. With BGP SPF, | |||
| only the BGP speakers corresponding to the link NLRI need withdraw | only the BGP speakers corresponding to the link NLRI need to withdraw | |||
| the corresponding BGP-LS Link NLRI. This advantage will contribute | the corresponding BGP-LS-SPF Link NLRI. Additionally, the changed | |||
| to both faster convergence and better scaling. | NLRI will be advertised immediately as opposed to normal BGP where it | |||
| is only advertised after the best route selection. These advantages | ||||
| will afford NLRI dissemination throughout the BGP SPF routing domain | ||||
| with efficiencies similar to link-state protocols. | ||||
| With controller and route-reflector peering models, BGP SPF | With controller and route-reflector peering models, BGP SPF | |||
| advertisement and distributed computation require a minimal number of | advertisement and distributed computation require a minimal number of | |||
| sessions and copies of the NLRI since only the latest version of the | sessions and copies of the NLRI since only the latest version of the | |||
| NLRI from the originator is required. Given that verification of the | NLRI from the originator is required. Given that verification of the | |||
| adjacencies is done outside of BGP (see Section 2), each BGP speaker | adjacencies is done outside of BGP (see Section 4), each BGP speaker | |||
| will only need as many sessions and copies of the NLRI as required | will only need as many sessions and copies of the NLRI as required | |||
| for redundancy (e.g., one for the SPF computation and another for | for redundancy (see Section 4). Additionally, a controller could | |||
| backup). Functions such as Optimized Route Reflection (ORR) are | inject topology that is learned outside the BGP SPF routing domain. | |||
| supported without extension by virtue of the primary advantages. | ||||
| Additionally, a controller could inject topology that is learned | ||||
| outside the BGP routing domain. | ||||
| Given that controllers are already consuming BGP-LS NLRI [RFC7752], | Given that controllers are already consuming BGP-LS NLRI [RFC7752], | |||
| reusing for the BGP-LS SPF leverages the existing controller | this functionality can be reused for BGP-LS-SPF NLRI. | |||
| implementations. | ||||
| Another potential advantage of BGP SPF is that both IPv6 and IPv4 can | Another potential advantage of BGP SPF is that both IPv6 and IPv4 can | |||
| be supported in the same address family using the same topology. | be supported using the BGP-LS-SPF SAFI with the same BGP-LS-SPF | |||
| Although not described in this version of the document, multi- | NLRIs. In many MSDC fabrics, the IPv4 and IPv6 topologies are | |||
| topology extensions can be used to support separate IPv4, IPv6, | congruent. Although beyond the scope of this document, multi- | |||
| topology extensions could be used to support separate IPv4, IPv6, | ||||
| unicast, and multicast topologies while sharing the same NLRI. | unicast, and multicast topologies while sharing the same NLRI. | |||
| Finally, the BGP SPF topology can be used as an underlay for other | Finally, the BGP SPF topology can be used as an underlay for other | |||
| BGP address families (using the existing model) and realize all the | BGP SAFIs (using the existing model) and realize all the above | |||
| above advantages. A simplified peering model using IPv6 link-local | advantages. | |||
| addresses as next-hops can be deployed similar to [RFC5549]. | ||||
| 1.2. Requirements Language | 1.3. Document Overview | |||
| The document begins with sections defining the precise relationship | ||||
| that BGP SPF has with both the base BGP protocol [RFC4271] | ||||
| (Section 2) and the BGP Link-State (BGP-LS) extensions [RFC7752] | ||||
| (Section 3). This is required to dispel the notion that BGP SPF is | ||||
| an independent protocol. The BGP peering models, as well as | ||||
| their respective trade-offs, are then discussed in Section 4. The | ||||
| remaining sections, which make up the bulk of the document, define | ||||
| the protocol enhancements necessary to support BGP SPF. The BGP-LS | ||||
| extensions to support BGP SPF are defined in Section 5. The | ||||
| replacement of the base BGP decision process with the SPF computation | ||||
| is specified in Section 6. Finally, BGP SPF error handling is | ||||
| defined in Section 7. | ||||
| 1.4. Requirements Language | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| 2. BGP Peering Models | 2. Base BGP Protocol Relationship | |||
| Depending on the requirements, scaling, and capabilities of the BGP | With the exception of the decision process, the BGP SPF extensions | |||
| speakers, various peering models are supported. The only requirement | leverage the BGP protocol [RFC4271] without change. This includes | |||
| is that all BGP speakers in the BGP SPF routing domain receive link- | the BGP protocol Finite State Machine, BGP messages and their | |||
| state NLRI on a timely basis, run an SPF calculation, and update | encodings, processing of BGP messages, BGP attributes and path | |||
| their data plane appropriately. The content of the Link NLRI is | attributes, BGP NLRI encodings, and any error handling defined in | |||
| described in Section 4.2. | [RFC4271] and [RFC7606]. | |||
| 2.1. BGP Single-Hop Peering on Network Node Connections | Due to the changes to the decision process, there are mechanisms and | |||
| encodings that are no longer applicable. While not necessarily | ||||
| required for computation, the ORIGIN, AS_PATH, MULTI_EXIT_DISC, | ||||
| LOCAL_PREF, and NEXT_HOP path attributes are mandatory and will be | ||||
| validated. The ATOMIC_AGGREGATE and AGGREGATOR are not applicable | ||||
| within the context of BGP SPF and SHOULD NOT be advertised. However, | ||||
| if they are advertised, they will be accepted, validated, and | ||||
| propagated consistent with the BGP protocol. | ||||
| The simplest peering model is the one described in section 5.2.1 of | Section 9 of [RFC4271] defines the decision process that is used to | |||
| [RFC7938]. In this model, EBGP single-hop sessions are established | select routes for subsequent advertisement by applying the policies | |||
| over direct point-to-point links interconnecting the SPF domain | in the local Policy Information Base (PIB) to the routes stored in | |||
| nodes. For the purposes of BGP SPF, Link NLRI is only advertised if | its Adj-RIBs-In. The output of the Decision Process is the set of | |||
| a single-hop BGP session has been established and the Link-State/SPF | routes that are announced by a BGP speaker to its peers. These | |||
| address family capability has been exchanged [RFC4790] on the | selected routes are stored by a BGP speaker in the speaker's Adj- | |||
| corresponding session. If the session goes down, the corresponding | RIBs-Out according to policy. | |||
| Link NLRI will be withdrawn. Topologically, this would be equivalent | ||||
| to the peering model in [RFC7938] where there is a BGP session on | ||||
| every link in the data center switch fabric. | ||||
| 2.2. BGP Peering Between Directly Connected Network Nodes | The BGP SPF extension fundamentally changes the decision process, as | |||
| described herein, to be more like a link-state protocol (e.g., OSPF | ||||
| [RFC2328]). Specifically: | ||||
| In this model, BGP speakers peer with all directly connected network | 1. BGP advertisements are readvertised to neighbors immediately | |||
| nodes but the sessions may be multi-hop and the direct connection | without waiting or dependence on the route computation as | |||
| discovery and liveliness detection for those connections are | specified in phase 3 of the base BGP decision process. Multiple | |||
| independent of the BGP protocol. How this is accomplished is outside | peering models are supported as specified in Section 4. | |||
| the scope of this document. Consequently, there will be a single | ||||
| session even if there are multiple direct connections between BGP | ||||
| speakers. For the purposes of BGP SPF, Link NLRI is advertised as | ||||
| long as a BGP session has been established, the Link-State/SPF | ||||
| address family capability has been exchanged [RFC4790] and the | ||||
| corresponding link is considered up and operational. | ||||
| This is much like the previous peering model only peering is on a | ||||
| single loopback address and the switch fabric links can be | ||||
| unnumbered. However, there will be the same number of sessions as | ||||
| with the previous peering model unless there are parallel links | ||||
| between switches in the fabric. | ||||
| 2.3. BGP Peering in Route-Reflector or Controller Topology | 2. Determining the degree of preference for BGP routes for the SPF | |||
| calculation as described in phase 1 of the base BGP decision | ||||
| process is replaced with the mechanisms in Section 6.1. | ||||
| In this model, BGP speakers peer solely with one or more Route | 3. Phase 2 of the base BGP protocol decision process is replaced | |||
| with the Shortest Path First (SPF) algorithm, also known as the | ||||
| Dijkstra algorithm (Section 1.1). | ||||
| 3. BGP Link-State (BGP-LS) Relationship | ||||
| [RFC7752] describes a mechanism by which link-state and TE | ||||
| information can be collected from networks and shared with external | ||||
| entities using BGP. This is achieved by defining NLRI advertised | ||||
| using the BGP-LS AFI. The BGP-LS extensions defined in [RFC7752] | ||||
| make use of the decision process defined in [RFC4271]. This document | ||||
| reuses NLRI and TLVs defined in [RFC7752]. Rather than reusing the | ||||
| BGP-LS SAFI, the BGP-LS-SPF SAFI (Section 5.1) is introduced to ensure | ||||
| backward compatibility for the BGP-LS SAFI usage. | ||||
| The BGP SPF extensions reuse the Node, Link, and Prefix NLRI defined | ||||
| in [RFC7752]. The usage of the BGP-LS NLRI, metric attributes, and | ||||
| attribute extensions is described in Section 5.2.1. The usage of | ||||
| other BGP-LS attributes is not precluded and is, in fact, expected. | ||||
| However, the details are beyond the scope of this document and will | ||||
| be specified in future documents. | ||||
| Support for Multiple Topology Routing (MTR) similar to the OSPF MTR | ||||
| computation described in [RFC4915] is beyond the scope of this | ||||
| document. Consequently, the usage of the Multi-Topology TLV as | ||||
| described in section 3.2.1.5 of [RFC7752] is not specified. | ||||
| The rules for setting the NLRI next-hop path attribute for the BGP- | ||||
| LS-SPF SAFI will follow the BGP-LS SAFI as specified in section 3.4 | ||||
| of [RFC7752]. | ||||
| 4. BGP Peering Models | ||||
| Depending on the topology, scaling, capabilities of the BGP-LS-SPF | ||||
| speakers, and redundancy requirements, various peering models are | ||||
| supported. The only requirements are that all BGP SPF speakers in | ||||
| the BGP SPF routing domain exchange BGP-LS-SPF NLRI, run an SPF | ||||
| calculation, and update their routing table appropriately. | ||||
| 4.1. BGP Single-Hop Peering on Network Node Connections | ||||
| The simplest peering model is the one where EBGP single-hop sessions | ||||
| are established over direct point-to-point links interconnecting the | ||||
| nodes in the BGP SPF routing domain. Once the single-hop BGP session | ||||
| has been established and the BGP-LS-SPF AFI/SAFI capability has been | ||||
| exchanged [RFC4760] for the corresponding session, then the link is | ||||
| considered up from a BGP SPF perspective and the corresponding BGP- | ||||
| LS-SPF Link NLRI is advertised. If the session goes down, the | ||||
| corresponding Link NLRI will be withdrawn. Topologically, this would | ||||
| be equivalent to the peering model in [RFC7938] where there is a BGP | ||||
| session on every link in the data center switch fabric. The content | ||||
| of the Link NLRI is described in Section 5.2.2. | ||||
| 4.2. BGP Peering Between Directly-Connected Nodes | ||||
| In this model, BGP-LS-SPF speakers peer with all directly-connected | ||||
| nodes but the sessions may be between loopback addresses (i.e., two- | ||||
| hop sessions) and the direct connection discovery and liveliness | ||||
| detection for the interconnecting links are independent of the BGP | ||||
| protocol. For example, liveliness | ||||
| detection could be done using the BFD protocol [RFC5880]. Precisely | ||||
| how discovery and liveliness detection is accomplished is outside the | ||||
| scope of this document. Consequently, there will be a single BGP | ||||
| session even if there are multiple direct connections between BGP-LS- | ||||
| SPF speakers. BGP-LS-SPF Link NLRI is advertised as long as a BGP | ||||
| session has been established, the BGP-LS-SPF AFI/SAFI capability has | ||||
| been exchanged [RFC4760], and the link is operational as determined | ||||
| using liveliness detection mechanisms outside the scope of this | ||||
| document. This is much like the previous peering model only peering | ||||
| is between loopback addresses and the interconnecting links can be | ||||
| unnumbered. However, since there are BGP sessions between every | ||||
| directly-connected node in the BGP SPF routing domain, there is only | ||||
| a reduction in BGP sessions when there are parallel links between | ||||
| nodes. | ||||
| 4.3. BGP Peering in Route-Reflector or Controller Topology | ||||
| In this model, BGP-LS-SPF speakers peer solely with one or more Route | ||||
| Reflectors [RFC4456] or controllers. As in the previous model, | Reflectors [RFC4456] or controllers. As in the previous model, | |||
| direct connection discovery and liveliness detection for those | direct connection discovery and liveliness detection for those links | |||
| connections are done outside the BGP protocol. More specifically, | in the BGP SPF routing domain are done outside of the BGP protocol. | |||
| the Liveliness detection is done using BFD protocol described in | BGP-LS-SPF Link NLRI is advertised as long as the corresponding link | |||
| [RFC5880]. For the purposes of BGP SPF, Link NLRI is advertised as | is considered up as per the chosen liveness detection mechanism. | |||
| long as the corresponding link is up and considered operational. | ||||
| This peering model, known as sparse peering, allows for many fewer | This peering model, known as sparse peering, allows for fewer BGP | |||
| BGP sessions and, consequently, instances of the same NLRI received | sessions and, consequently, fewer instances of the same NLRI received | |||
| from multiple peers. It is discussed in greater detail in | from multiple peers. Normally, the route-reflectors or controller | |||
| BGP sessions would be on directly-connected links to avoid dependence | ||||
| on another routing protocol for session connectivity. However, | ||||
| multi-hop peering is not precluded. The number of BGP sessions is | ||||
| dependent on the redundancy requirements and the stability of the BGP | ||||
| sessions. This is discussed in greater detail in | ||||
| [I-D.ietf-lsvr-applicability]. | [I-D.ietf-lsvr-applicability]. | |||
| 3. BGP-LS Shortest Path Routing (SPF) SAFI | 5. BGP Shortest Path Routing (SPF) Protocol Extensions | |||
| In order to replace the Phase 1 and 2 decision functions of the | 5.1. BGP-LS Shortest Path Routing (SPF) SAFI | |||
| existing Decision Process with an SPF-based Decision Process and | ||||
| streamline the Phase 3 decision functions in a backward compatible | ||||
| manner, this draft introduces the BGP-LS-SPF SAFI for BGP-LS SPF | ||||
| operation. The BGP-LS-SPF (AFI 16388 / SAFI TBD1) [RFC4790] is | ||||
| allocated by IANA as specified in Section 6. A BGP speaker using | ||||
| the BGP-LS SPF extensions described herein MUST exchange the AFI/SAFI | ||||
| using Multiprotocol Extensions Capability Code [RFC4760] with other | ||||
| BGP speakers in the SPF routing domain. | ||||
| 4. Extensions to BGP-LS | In order to replace the existing BGP decision process with an SPF- | |||
| based decision process in a backward compatible manner by not | ||||
| impacting the BGP-LS SAFI, this document introduces the BGP-LS-SPF | ||||
| SAFI. The BGP-LS-SPF (AFI 16388 / SAFI 80) [RFC4760] is allocated by | ||||
| IANA as specified in Section 8. In order for two BGP-LS-SPF | ||||
| speakers to exchange BGP SPF NLRI, they MUST exchange the | ||||
| Multiprotocol Extensions Capability [RFC5492] [RFC4760] to ensure | ||||
| that they are both capable of properly processing such NLRI. This is | ||||
| done with AFI 16388 / SAFI 80 for BGP-LS-SPF advertised within the | ||||
| BGP SPF Routing Domain. The BGP-LS-SPF SAFI is used to carry IPv4 | ||||
| and IPv6 prefix information in a format facilitating an SPF-based | ||||
| decision process. | ||||
| 5.1.1. BGP-LS-SPF NLRI TLVs | ||||
| The NLRI format of BGP-LS-SPF SAFI uses exactly the same format as the | ||||
| BGP-LS AFI [RFC7752]. In other words, all the TLVs used in BGP-LS | ||||
| AFI are applicable and used for the BGP-LS-SPF SAFI. These TLVs | ||||
| within BGP-LS-SPF NLRI advertise information that describes links, | ||||
| nodes, and prefixes comprising IGP link-state information. | ||||
| In order to compare the NLRI efficiently, it is REQUIRED that all the | ||||
| TLVs within the given NLRI must be ordered in ascending order by the | ||||
| TLV type. For multiple TLVs of the same type within a single NLRI, it is | ||||
| REQUIRED that these TLVs are ordered in ascending order by the TLV | ||||
| value field. Comparison of the value fields is performed by treating | ||||
| the entire value field as a hexadecimal string. NLRIs having TLVs | ||||
| which do not follow the ordering rules MUST be considered as | ||||
| malformed and discarded with appropriate error logging. | ||||
| [RFC7752] defines certain NLRI TLVs as mandatory TLVs. These TLVs | ||||
| are considered mandatory for the BGP-LS-SPF SAFI as well. All the | ||||
| other TLVs are considered optional TLVs. | ||||
| 5.1.2. BGP-LS Attribute | ||||
| The BGP-LS attribute of the BGP-LS-SPF SAFI uses exactly the same format | ||||
| as the BGP-LS AFI [RFC7752]. In other words, all the TLVs used in | ||||
| BGP-LS attribute of the BGP-LS AFI are applicable and used for the | ||||
| BGP-LS attribute of the BGP-LS-SPF SAFI. This attribute is an | ||||
| optional, non-transitive BGP attribute that is used to carry link, | ||||
| node, and prefix properties and attributes. The BGP-LS attribute is | ||||
| a set of TLVs. | ||||
| The BGP-LS attribute may potentially grow large in size depending on | ||||
| the amount of link-state information associated with a single Link- | ||||
| State NLRI. The BGP specification [RFC4271] mandates a maximum BGP | ||||
| message size of 4096 octets. It is RECOMMENDED that an | ||||
| implementation support [RFC8654] in order to accommodate larger size | ||||
| of information within the BGP-LS Attribute. BGP-LS-SPF speakers MUST | ||||
| ensure that they limit the TLVs included in the BGP-LS Attribute to | ||||
| ensure that a BGP update message for a single Link-State NLRI does | ||||
| not cross the maximum limit for a BGP message. The determination of | ||||
| the types of TLVs to be included by the BGP-LS-SPF speaker | ||||
| originating the attribute is outside the scope of this document. | ||||
| When a BGP-LS-SPF speaker finds that it is exceeding the maximum BGP | ||||
| message size due to addition or update of some other BGP Attribute | ||||
| (e.g., AS_PATH), it MUST consider the BGP-LS Attribute to be | ||||
| malformed and the attribute discard handling of [RFC7606] applies. | ||||
| In order to compare the BGP-LS attribute efficiently, it is REQUIRED | ||||
| that all the TLVs within the given attribute must be ordered in | ||||
| ascending order by the TLV type. For multiple TLVs of the same type | ||||
| within a single attribute, it is REQUIRED that these TLVs are ordered | ||||
| in ascending order by the TLV value field. Comparison of the value | ||||
| fields is performed by treating the entire value field as a | ||||
| hexadecimal string. Attributes having TLVs which do not follow the | ||||
| ordering rules MUST NOT be considered as malformed. | ||||
| All TLVs within the BGP-LS Attribute are considered optional unless | ||||
| specified otherwise. | ||||
| 5.2. Extensions to BGP-LS | ||||
| [RFC7752] describes a mechanism by which link-state and TE | [RFC7752] describes a mechanism by which link-state and TE | |||
| information can be collected from networks and shared with external | information can be collected from IGPs and shared with external | |||
| components using BGP protocol. It describes both the definition of | components using the BGP protocol. It describes both the definition | |||
| BGP-LS NLRI that describes links, nodes, and prefixes comprising IGP | of the BGP-LS-SPF NLRI that advertise links, nodes, and prefixes | |||
| link-state information and the definition of a BGP path attribute | comprising IGP link-state information and the definition of a BGP | |||
| (BGP-LS attribute) that carries link, node, and prefix properties and | path attribute (BGP-LS attribute) that carries link, node, and prefix | |||
| attributes, such as the link and prefix metric or auxiliary Router- | properties and attributes, such as the link and prefix metric or | |||
| IDs of nodes, etc. | auxiliary Router-IDs of nodes, etc. This document extends the usage | |||
| of BGP-LS NLRI for the purpose of BGP SPF calculation via | ||||
| advertisement in the BGP-LS-SPF SAFI. | ||||
| The BGP protocol will be used in the Protocol-ID field specified in | The protocol identifier specified in the Protocol-ID field [RFC7752] | |||
| table 1 of [I-D.ietf-idr-bgpls-segment-routing-epe]. The local and | will represent the origin of the advertised NLRI. For Node NLRI and | |||
| remote node descriptors for all NLRI will be the BGP Router-ID (TLV | Link NLRI, this MUST be the direct protocol (4). Node or Link NLRI | |||
| 516) and either the AS Number (TLV 512) [RFC7752] or the BGP | with a Protocol-ID other than direct will be considered malformed. | |||
| Confederation Member (TLV 517) [RFC8402]. However, if the BGP | For Prefix NLRI, the specified Protocol-ID MUST be the origin of the | |||
| Router-ID is known to be unique within the BGP Routing domain, it can | prefix. The local and remote node descriptors for all NLRI MUST | |||
| be used as the sole descriptor. | include the BGP Identifier (TLV 516) and the AS Number (TLV 512) | |||
| [RFC7752]. The BGP Confederation Member (TLV 517) [RFC7752] is not | ||||
| applicable and SHOULD NOT be included. If TLV 517 is included, it | ||||
| will be ignored. | ||||
| 4.1. Node NLRI Usage | 5.2.1. Node NLRI Usage | |||
| The BGP Node NLRI will be advertised unconditionally by all routers | The Node NLRI MUST be advertised unconditionally by all routers in | |||
| in the BGP SPF routing domain. | the BGP SPF routing domain. | |||
| 4.1.1. Node NLRI Attribute SPF Capability TLV | 5.2.1.1. Node NLRI Attribute SPF Capability TLV | |||
| The SPF capability is a new Node Attribute TLV that will be added to | The SPF capability is an additional Node Attribute TLV. This | |||
| those defined in table 7 of [RFC7752]. The new attribute TLV will | attribute TLV MUST be included with the BGP-LS-SPF SAFI and SHOULD | |||
| only be applicable when BGP is specified in the Node NLRI Protocol ID | NOT be used for other SAFIs. The TLV type 1180 will be assigned by | |||
| field. The TBD TLV type will be defined by IANA. The new Node | IANA. The Node Attribute TLV will contain a single-octet SPF | |||
| Attribute TLV will contain a single-octet SPF algorithm as defined in | algorithm as defined in [RFC8665]. | |||
| [RFC8402]. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Type | Length | | | Type (1180) | Length - (1 Octet) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | SPF Algorithm | | | SPF Algorithm | | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| The SPF Algorithm may take the following values: | The SPF algorithm inherits the values from the IGP Algorithm Types | |||
| registry [RFC8665]. Algorithm 0 (Shortest Path Algorithm (SPF) | ||||
| 0 - Normal Shortest Path First (SPF) algorithm based on link | based on link metric) is supported and described in Section 6.3. | |||
| metric. This is the standard shortest path algorithm as | Support for other algorithm types is beyond the scope of this | |||
| computed by the IGP protocol. Consistent with the deployed | specification. | |||
| practice for link-state protocols, Algorithm 0 permits any | ||||
| node to overwrite the SPF path with a different path based on | ||||
| its local policy. | ||||
| 1 - Strict Shortest Path First (SPF) algorithm based on link | ||||
| metric. The algorithm is identical to Algorithm 0 but Algorithm | ||||
| 1 requires that all nodes along the path will honor the SPF | ||||
| routing decision. Local policy at the node claiming support for | ||||
| Algorithm 1 MUST NOT alter the SPF paths computed by Algorithm 1. | ||||
| Note that usage of Strict Shortest Path First (SPF) algorithm is | ||||
| defined in the IGP algorithm registry but usage is restricted to | ||||
| [I-D.ietf-idr-bgpls-segment-routing-epe]. Hence, its usage for BGP- | ||||
| LS SPF is out of scope. | ||||
| When computing the SPF for a given BGP routing domain, only BGP nodes | When computing the SPF for a given BGP routing domain, only BGP nodes | |||
| advertising the SPF capability attribute will be included in the | advertising the SPF capability TLV with the same SPF algorithm will be | |||
| Shortest Path Tree (SPT). | included in the Shortest Path Tree (SPT). An implementation MAY | |||
| optionally log detection of a BGP node that has either not advertised | ||||
| the SPF capability TLV or is advertising the SPF capability TLV with | ||||
| an algorithm type other than 0. | ||||
| 4.1.2. BGP-LS Node NLRI Attribute SPF Status TLV | 5.2.1.2. BGP-LS-SPF Node NLRI Attribute SPF Status TLV | |||
| A BGP-LS Attribute TLV to BGP-LS Node NLRI is defined to indicate the | A BGP-LS Attribute TLV of the BGP-LS-SPF Node NLRI is defined to | |||
| status of the node with respect to the BGP SPF calculation. This | indicate the status of the node with respect to the BGP SPF | |||
| will be used to rapidly take a node out of service or to indicate the | calculation. This will be used to rapidly take a node out of service | |||
| node is not to be used for transit (i.e., non-local) traffic. If the | (Section 6.5.2) or to indicate the node is not to be used for transit | |||
| SPF Status TLV is not included with the Node NLRI, the node is | (i.e., non-local) traffic (Section 6.3). If the SPF Status TLV is not | |||
| considered to be up and is available for transit traffic. | included with the Node NLRI, the node is considered to be up and is | |||
| available for transit traffic. The SPF status is acted upon with the | ||||
| execution of the next SPF calculation (Section 6.3). A single TLV type | ||||
| will be shared by the BGP-LS-SPF Node, Link, and Prefix NLRI. The | ||||
| TLV type 1184 will be assigned by IANA. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TBD Type | Length | | | Type (1184) | Length (1 Octet) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | SPF Status | | | SPF Status | | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| BGP Status Values: 0 - Reserved | BGP Status Values: 0 - Reserved | |||
| 1 - Node Unreachable with respect to BGP SPF | 1 - Node Unreachable with respect to BGP SPF | |||
| 2 - Node does not support transit with respect | 2 - Node does not support transit with respect | |||
| to BGP SPF | to BGP SPF | |||
| 3-254 - Undefined | 3-254 - Undefined | |||
| 255 - Reserved | 255 - Reserved | |||
| 4.2. Link NLRI Usage | If the SPF Status TLV is received and the corresponding Node NLRI has | |||
| not been received, then the SPF Status TLV is ignored and not used in | ||||
| SPF computation but is still announced to other BGP speakers. An | ||||
| implementation MAY log an error for further analysis. If a BGP | ||||
| speaker received the Node NLRI but the SPF Status TLV is not | ||||
| received, then any previously received information is considered as | ||||
| implicitly withdrawn and the update is propagated to other BGP | ||||
| speakers. A BGP speaker receiving a BGP Update containing an SPF | ||||
| Status TLV in the BGP-LS attribute [RFC7752] with a value that is | ||||
| outside the range of defined values SHOULD be processed and announced | ||||
| to other BGP speakers. However, a BGP speaker MUST NOT use the | ||||
| Status TLV in its SPF computation. An implementation MAY log this | ||||
| condition for further analysis. | ||||
| 5.2.2. Link NLRI Usage | ||||
| The criteria for advertisement of Link NLRI are discussed in | The criteria for advertisement of Link NLRI are discussed in | |||
| Section 2. | Section 4. | |||
| Link NLRI is advertised with local and remote node descriptors as | Link NLRI is advertised with unique local and remote node descriptors | |||
| described above and unique link identifiers dependent on the | dependent on the IP addressing. For IPv4 links, the link's local | |||
| addressing. For IPv4 links, the link's local IPv4 (TLV 259) and | IPv4 (TLV 259) and remote IPv4 (TLV 260) addresses will be used. For | |||
| remote IPv4 (TLV 260) addresses will be used. For IPv6 links, the | IPv6 links, the local IPv6 (TLV 261) and remote IPv6 (TLV 262) | |||
| local IPv6 (TLV 261) and remote IPv6 (TLV 262) addresses will be | addresses will be used. For unnumbered links, the link local/remote | |||
| used. For unnumbered links, the link local/remote identifiers (TLV | identifiers (TLV 258) will be used. For links having both | |||
| 258) will be used. For links having both IPv4 and IPv6 | IPv4 and IPv6 addresses, both sets of descriptors MAY be included in | |||
| addresses, both sets of descriptors may be included in the same Link | the same Link NLRI. The link identifiers are described in table 5 of | |||
| NLRI. The link identifiers are described in table 5 of [RFC7752]. | [RFC7752]. | |||
| The link IGP metric attribute TLV (TLV 1095) as well as any others | For a link to be used in the Shortest Path Tree (SPT) for a given address | |||
| required for non-SPF purposes SHOULD be advertised. The metric value | family, i.e., IPv4 or IPv6, both routers connecting the link MUST | |||
| in this TLV is variable length dependent on specific protocol usage | have an address in the same subnet for that address family. However, | |||
| (refer to section 3.3.2.4 in [RFC7752]). For simplicity, the BGP-LS | an IPv4 or IPv6 prefix associated with the link MAY be installed | |||
| SPF metric length will be 4 octets. Algorithms such as setting the | without the corresponding address on the other side of the link. | |||
| metric inversely to the link speed as done in the OSPF MIB [RFC4750] | ||||
| MAY be supported. However, this is beyond the scope of this | The link IGP metric attribute TLV (TLV 1095) MUST be advertised. If | |||
| a BGP speaker receives a Link NLRI without an IGP metric attribute | ||||
| TLV, then it SHOULD consider the received NLRI as malformed and the | ||||
| receiving BGP speaker MUST handle such malformed NLRI as 'Treat-as- | ||||
| withdraw' [RFC7606]. The BGP SPF metric length is 4 octets. Like | ||||
| OSPF [RFC2328], a cost is associated with the output side of each | ||||
| router interface. This cost is configurable by the system | ||||
| administrator. The lower the cost, the more likely the interface is | ||||
| to be used to forward data traffic. One possible default for metric | ||||
| would be to give each interface a cost of 1 making it effectively a | ||||
| hop count. Algorithms such as setting the metric inversely to the | ||||
| link speed as supported in the OSPF MIB [RFC4750] MAY be supported. | ||||
| However, this is beyond the scope of this document. Refer to | ||||
| Section 10.1.1 for operational guidance. | ||||
| The usage of other link attribute TLVs is beyond the scope of this | ||||
| document. | document. | |||
| 4.2.1. BGP-LS Link NLRI Attribute Prefix-Length TLVs | 5.2.2.1. BGP-LS-SPF Link NLRI Attribute Prefix-Length TLVs | |||
| Two BGP-LS Attribute TLVs to BGP-LS Link NLRI are defined to | Two BGP-LS Attribute TLVs of the BGP-LS-SPF Link NLRI are defined to | |||
| advertise the prefix length associated with the IPv4 and IPv6 link | advertise the prefix length associated with the IPv4 and IPv6 link | |||
| prefixes. The prefix length is used for the optional installation of | prefixes derived from the link descriptor addresses. The prefix | |||
| prefixes corresponding to Link NLRI as defined in Section 5.3. | length is used for the optional installation of prefixes | |||
| corresponding to Link NLRI as defined in Section 6.3. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TBD IPv4 or IPv6 Type | Length | | |IPv4 (1182) or IPv6 Type (1183)| Length (1 Octet) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Prefix-Length | | | Prefix-Length | | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| Prefix-length - A one-octet length restricted to 1-32 for IPv4 | Prefix-length - A one-octet length restricted to 1-32 for IPv4 | |||
| Link NLRI endpoint prefixes and 1-128 for IPv6 | Link NLRI endpoint prefixes and 1-128 for IPv6 | |||
| Link NLRI endpoint prefixes. | Link NLRI endpoint prefixes. | |||
| 4.2.2. BGP-LS Link NLRI Attribute SPF Status TLV | The Prefix-Length TLV is only relevant to Link NLRIs. The Prefix- | |||
| Length TLVs MUST be discarded as an error and not passed to other BGP | ||||
| peers as specified in [RFC7606] when received with any NLRIs other | ||||
| than Link NLRIs. An implementation MAY log an error for further | ||||
| analysis. | ||||
| A BGP-LS Attribute TLV to BGP-LS Link NLRI is defined to indicate the | The maximum prefix-length for IPv4 Prefix-Length TLV is 32 bits. A | |||
| status of the link with respect to the BGP SPF calculation. This | prefix-length field indicating a larger value than 32 bits MUST be | |||
| will be used to expedite convergence for link failures as discussed | discarded as an error and the received TLV is not passed to other BGP | |||
| in Section 5.6.1. If the SPF Status TLV is not included with the | peers as specified in [RFC7606]. The corresponding Link NLRI is | |||
| Link NLRI, the link is considered up and available. | considered as malformed and MUST be handled as 'Treat-as-withdraw'. | |||
| An implementation MAY log an error for further analysis. | ||||
| The maximum prefix-length for IPv6 Prefix-Length Type is 128 bits. A | ||||
| prefix-length field indicating a larger value than 128 bits MUST be | ||||
| discarded as an error and the received TLV is not passed to other BGP | ||||
| peers as specified in [RFC7606]. The corresponding Link NLRI is | ||||
| considered as malformed and MUST be handled as 'Treat-as-withdraw'. | ||||
| An implementation MAY log an error for further analysis. | ||||
| 5.2.2.2. BGP-LS-SPF Link NLRI Attribute SPF Status TLV | ||||
| A BGP-LS Attribute TLV of the BGP-LS-SPF Link NLRI is defined to | ||||
| indicate the status of the link with respect to the BGP SPF | ||||
| calculation. This will be used to expedite convergence for link | ||||
| failures as discussed in Section 6.5.1. If the SPF Status TLV is not | ||||
| included with the Link NLRI, the link is considered up and available. | ||||
| The SPF status is acted upon with the execution of the next SPF | ||||
| calculation (Section 6.3). A single TLV type will be shared by the | ||||
| Node, Link, and Prefix NLRI. The TLV type 1184 will be assigned by | ||||
| IANA. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TBD Type | Length | | | Type (1184) | Length (1 Octet) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | SPF Status | | | SPF Status | | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| BGP Status Values: 0 - Reserved | BGP Status Values: 0 - Reserved | |||
| 1 - Link Unreachable with respect to BGP SPF | 1 - Link Unreachable with respect to BGP SPF | |||
| 2-254 - Undefined | 2-254 - Undefined | |||
| 255 - Reserved | 255 - Reserved | |||
| 4.3. Prefix NLRI Usage | If the SPF Status TLV is received and the corresponding Link NLRI has | |||
| not been received, then the SPF Status TLV is ignored and not used in | ||||
| SPF computation but is still announced to other BGP speakers. An | ||||
| implementation MAY log an error for further analysis. If a BGP | ||||
| speaker received the Link NLRI but the SPF Status TLV is not | ||||
| received, then any previously received information is considered as | ||||
| implicitly withdrawn and the update is propagated to other BGP | ||||
| speakers. A BGP speaker receiving a BGP Update containing an SPF | ||||
| Status TLV in the BGP-LS attribute [RFC7752] with a value that is | ||||
| outside the range of defined values SHOULD be processed and announced | ||||
| to other BGP speakers. However, a BGP speaker MUST NOT use the | ||||
| Status TLV in its SPF computation. An implementation MAY log this | ||||
| information for further analysis. | ||||
| Prefix NLRI is advertised with a local node descriptor as described | 5.2.3. IPv4/IPv6 Prefix NLRI Usage | |||
| above and the prefix and length used as the descriptors (TLV 265) as | ||||
| described in [RFC7752]. The prefix metric attribute TLV (TLV 1155) | ||||
| as well as any others required for non-SPF purposes SHOULD be | ||||
| advertised. For loopback prefixes, the metric should be 0. For non- | ||||
| loopback prefixes, the setting of the metric is a local matter and | ||||
| beyond the scope of this document. | ||||
| 4.3.1. BGP-LS Prefix NLRI Attribute SPF Status TLV | IPv4/IPv6 Prefix NLRI is advertised with a Local Node Descriptor and | |||
| the prefix and length. The Prefix Descriptors field includes the IP | ||||
| Reachability Information TLV (TLV 265) as described in [RFC7752]. | ||||
| The prefix metric attribute TLV (TLV 1155) MUST be advertised. The | ||||
| IGP Route Tag TLV (TLV 1153) MAY be advertised. The usage of other | ||||
| attribute TLVs is beyond the scope of this document. For loopback | ||||
| prefixes, the metric should be 0. For non-loopback prefixes, the | ||||
| setting of the metric is a local matter and beyond the scope of this | ||||
| document. | ||||
| A BGP-LS Attribute TLV to BGP-LS Prefix NLRI is defined to indicate | 5.2.3.1. BGP-LS-SPF Prefix NLRI Attribute SPF Status TLV | |||
| the status of the prefix with respect to the BGP SPF calculation. | ||||
| This will be used to expedite convergence for prefix unreachability | A BGP-LS Attribute TLV to BGP-LS-SPF Prefix NLRI is defined to | |||
| as discussed in Section 5.6.1. If the SPF Status TLV is not included | indicate the status of the prefix with respect to the BGP SPF | |||
| with the Prefix NLRI, the prefix is considered reachable. | calculation. This will be used to expedite convergence for prefix | |||
| unreachability as discussed in Section 6.5.1. If the SPF Status TLV | ||||
| is not included with the Prefix NLRI, the prefix is considered | ||||
| reachable. A single TLV type will be shared by the Node, Link, and | ||||
| Prefix NLRI. The TLV type 1184 will be assigned by IANA. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TBD Type | Length | | | Type (1184) | Length (1 Octet) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | SPF Status | | | SPF Status | | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| BGP Status Values: 0 - Reserved | BGP Status Values: 0 - Reserved | |||
| 1 - Prefix down with respect to SPF | 1 - Prefix Unreachable with respect to SPF | |||
| 2-254 - Undefined | 2-254 - Undefined | |||
| 255 - Reserved | 255 - Reserved | |||
| 4.4. BGP-LS Attribute Sequence-Number TLV | If the SPF Status TLV is received and the corresponding Prefix NLRI | |||
| has not been received, then the SPF Status TLV is ignored and not | ||||
| used in SPF computation but is still announced to other BGP speakers. | ||||
| An implementation MAY log an error for further analysis. If a BGP | ||||
| speaker received the Prefix NLRI but the SPF Status TLV is not | ||||
| received, then any previously received information is considered as | ||||
| implicitly withdrawn and the update is propagated to other BGP | ||||
| speakers. A BGP speaker receiving a BGP Update containing an SPF | ||||
| Status TLV in the BGP-LS attribute [RFC7752] with a value that is | ||||
| outside the range of defined values SHOULD be processed and announced | ||||
| to other BGP speakers. However, a BGP speaker MUST NOT use the | ||||
| Status TLV in its SPF computation. An implementation MAY log this | ||||
| information for further analysis. | ||||
| A new BGP-LS Attribute TLV to BGP-LS NLRI types is defined to assure | 5.2.4. BGP-LS Attribute Sequence-Number TLV | |||
| the most recent version of a given NLRI is used in the SPF | ||||
| computation. The TBD TLV type will be defined by IANA. The new BGP- | A BGP-LS Attribute TLV of the BGP-LS-SPF NLRI types is defined to | |||
| LS Attribute TLV will contain an 8-octet sequence number. The usage | assure the most recent version of a given NLRI is used in the SPF | |||
| of the Sequence Number TLV is described in Section 5.1. | computation. The Sequence-Number TLV is mandatory for BGP-LS-SPF | |||
| NLRI. The TLV type 1181 has been assigned by IANA. The BGP-LS | ||||
| Attribute TLV will contain an 8-octet sequence number. The usage of | ||||
| the Sequence Number TLV is described in Section 6.1. | ||||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Type | Length | | | Type (1181) | Length (8 Octets) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Sequence Number (High-Order 32 Bits) | | | Sequence Number (High-Order 32 Bits) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Sequence Number (Low-Order 32 Bits) | | | Sequence Number (Low-Order 32 Bits) | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Sequence Number | Sequence Number | |||
| The 64-bit strictly increasing sequence number is incremented for | The 64-bit strictly-increasing sequence number MUST be incremented | |||
| every version of BGP-LS NLRI originated. BGP speakers implementing | for every self-originated version of BGP-LS-SPF NLRI. BGP speakers | |||
| this specification MUST use available mechanisms to preserve the | implementing this specification MUST use available mechanisms to | |||
| sequence number's strictly increasing property for the deployed life | preserve the sequence number's strictly increasing property for the | |||
| of the BGP speaker (including cold restarts). One mechanism for | deployed life of the BGP speaker (including cold restarts). One | |||
| accomplishing this would be to use the high-order 32 bits of the | mechanism for accomplishing this would be to use the high-order 32 | |||
| sequence number as a wrap/boot count that is incremented anytime the | bits of the sequence number as a wrap/boot count that is incremented | |||
| BGP router loses its sequence number state or the low-order 32 bits | any time the BGP router loses its sequence number state or the low- | |||
| wrap. | order 32 bits wrap. | |||
| When incrementing the sequence number for each self-originated NLRI, | When incrementing the sequence number for each self-originated NLRI, | |||
| the sequence number should be treated as an unsigned 64-bit value. | the sequence number should be treated as an unsigned 64-bit value. | |||
| If the lower-order 32-bit value wraps, the higher-order 32-bit value | If the lower-order 32-bit value wraps, the higher-order 32-bit value | |||
| should be incremented and saved in non-volatile storage. If by some | should be incremented and saved in non-volatile storage. If by some | |||
| chance the BGP Speaker is deployed long enough that there is a | chance the BGP-LS-SPF speaker is deployed long enough that there is a | |||
| possibility that the 64-bit sequence number may wrap or a BGP Speaker | possibility that the 64-bit sequence number may wrap or a BGP-LS-SPF | |||
| completely loses its sequence number state (e.g., the BGP speaker | speaker completely loses its sequence number state (e.g., the BGP | |||
| hardware is replaced or experiences a cold-start), the phase 1 | speaker hardware is replaced or experiences a cold-start), the BGP | |||
| decision function (see Section 5.1) rules will ensure convergence, | NLRI selection rules (see Section 6.1) will ensure convergence, | |||
| albeit, not immediately. | albeit not immediately. | |||
| 5. Decision Process with SPF Algorithm | The Sequence-Number TLV is mandatory for BGP-LS-SPF NLRI. If the | |||
| Sequence-Number TLV is not received then the corresponding Link NLRI | ||||
| is considered as malformed and MUST be handled as 'Treat-as- | ||||
| withdraw'. An implementation MAY log an error for further analysis. | ||||
| 5.3. NEXT_HOP Manipulation | ||||
| All BGP peers that support SPF extensions would locally compute the | ||||
| Loc-RIB Next-Hop as a result of the SPF process. Consequently, the | ||||
| Next-Hop is always ignored on receipt. The Next-Hop address MUST be | ||||
| encoded as described in [RFC4760]. BGP speakers MUST interpret the | ||||
| Next-Hop address of MP_REACH_NLRI attribute as an IPv4 address | ||||
| whenever the length of the Next-Hop address is 4 octets, and as an | ||||
| IPv6 address whenever the length of the Next-Hop address is 16 | ||||
| octets. | ||||
| [RFC4760] modifies the rules of NEXT_HOP attribute whenever the | ||||
| multiprotocol extensions for BGP-4 are enabled. BGP speakers MUST | ||||
| set the NEXT_HOP attribute according to the rules specified in | ||||
| [RFC4760] as the BGP-LS-SPF routing information is carried within the | ||||
| multiprotocol extensions for BGP-4. | ||||
| 6. Decision Process with SPF Algorithm | ||||
| The Decision Process described in [RFC4271] takes place in three | The Decision Process described in [RFC4271] takes place in three | |||
| distinct phases. The Phase 1 decision function of the Decision | distinct phases. The Phase 1 decision function of the Decision | |||
| Process is responsible for calculating the degree of preference for | Process is responsible for calculating the degree of preference for | |||
| each route received from a BGP speaker's peer. The Phase 2 decision | each route received from a BGP speaker's peer. The Phase 2 decision | |||
| function is invoked on completion of the Phase 1 decision function | function is invoked on completion of the Phase 1 decision function | |||
| and is responsible for choosing the best route out of all those | and is responsible for choosing the best route out of all those | |||
| available for each distinct destination, and for installing each | available for each distinct destination, and for installing each | |||
| chosen route into the Loc-RIB. The combination of the Phase 1 and 2 | chosen route into the Loc-RIB. The combination of the Phase 1 and 2 | |||
| decision functions is characterized as a Path Vector algorithm. | decision functions is characterized as a Path Vector algorithm. | |||
| The SPF based Decision process replaces the BGP best-path Decision | The SPF based Decision process replaces the BGP Decision process | |||
| process described in [RFC4271]. This process starts with selecting | described in [RFC4271]. This process starts with selecting only | |||
| only those Node NLRI whose SPF capability TLV matches with the local | those Node NLRI whose SPF capability TLV matches with the local BGP- | |||
| BGP speaker's SPF capability TLV value. Since Link-State NLRI always | LS-SPF speaker's SPF capability TLV value. Since Link-State NLRI | |||
| contains the local descriptor [RFC7752], it will only be originated | always contains the local node descriptor (Section 5.2.1), each NLRI is | |||
| by a single BGP speaker in the BGP routing domain. These selected | uniquely originated by a single BGP-LS-SPF speaker in the BGP SPF | |||
| Node NLRI and their Link/Prefix NLRI are used to build a directed | routing domain (the BGP node matching the NLRI's Node Descriptors). | |||
| graph during the SPF computation. The best paths for BGP prefixes | Instances of the same NLRI originated by multiple BGP speakers would | |||
| are installed as a result of the SPF process. | be indicative of a configuration error or a masquerading attack | |||
| (Section 9). These selected Node NLRI and their Link/Prefix NLRI are | ||||
| used to build a directed graph during the SPF computation as | ||||
| described below. The best routes for BGP prefixes are installed in | ||||
| the RIB as a result of the SPF process. | ||||
| When BGP-LS-SPF NLRI is received, all that is required is to | When BGP-LS-SPF NLRI is received, all that is required is to | |||
| determine whether it is the best-path by examining the Node-ID and | determine whether it is the most recent by examining the Node-ID and | |||
| sequence number as described in Section 5.1. If the received best- | sequence number as described in Section 6.1. If the received NLRI | |||
| path NLRI had changed, it will be advertised to other BGP-LS-SPF | has changed, it will be advertised to other BGP-LS-SPF peers. If the | |||
| peers. If the attributes have changed (other than the sequence | attributes have changed (other than the sequence number), a BGP SPF | |||
| number), a BGP SPF calculation will be scheduled. However, a changed | calculation will be triggered. However, a changed NLRI MAY be | |||
| NLRI MAY be advertised to other peers almost immediately and | advertised immediately to other peers and prior to any SPF | |||
| propagation of changes can approach IGP convergence times. To | calculation. Note that the BGP MinRouteAdvertisementIntervalTimer | |||
| accomplish this, the MinRouteAdvertisementIntervalTimer and | and MinASOriginationIntervalTimer [RFC4271] timers are not applicable | |||
| MinASOriginationIntervalTimer [RFC4271] are not applicable to the | to the BGP-LS-SPF SAFI. The scheduling of the SPF calculation, as | |||
| BGP-LS-SPF SAFI. Rather, SPF calculations SHOULD be triggered and | described in Section 6.3, is an implementation issue. Scheduling MAY | |||
| dampened consistent with the SPF back-off algorithm specified in | be dampened consistent with the SPF back-off algorithm specified in | |||
| [RFC8405]. | [RFC8405]. | |||
| The Phase 3 decision function of the Decision Process [RFC4271] is | The Phase 3 decision function of the Decision Process [RFC4271] is | |||
| also simplified since under normal SPF operation, a BGP speaker would | also simplified since under normal SPF operation, a BGP speaker MUST | |||
| advertise the NLRI selected for the SPF to all BGP peers with the | advertise the changed NLRIs to all BGP peers with the BGP-LS-SPF AFI/ | |||
| BGP-LS/BGP-LS-SPF AFI/SAFI. Application of policy would not be | SAFI and install the changed routes in the Global RIB. The only | |||
| prevented however its usage to best-path process would be limited as | exceptions are unchanged NLRIs or stale NLRIs, i.e., NLRI received | |||
| the SPF relies solely on link metrics. | with a less recent (numerically smaller) sequence number. | |||
| 5.1. Phase-1 BGP NLRI Selection | 6.1. BGP NLRI Selection | |||
| The rules for NLRI selection are greatly simplified from [RFC4271]. | The rules for all BGP-LS-SPF NLRIs selection for phase 1 of the BGP | |||
| decision process, section 9.1.1 [RFC4271], no longer apply. | ||||
| 1. If the NLRI is received from the BGP speaker originating the NLRI | 1. Routes originated by directly connected BGP SPF peers are | |||
| (as determined by the comparing BGP Router ID in the NLRI Node | preferred. This condition can be determined by comparing the BGP | |||
| identifiers with the BGP speaker Router ID), then it is preferred | Identifiers in the received Local Node Descriptor and OPEN | |||
| over the same NLRI from non-originators. This rule will assure | message. This rule will assure that stale NLRI is updated even | |||
| that stale NLRI is updated even if a BGP-LS router loses its | if a BGP-LS router loses its sequence number state due to a cold- | |||
| sequence number state due to a cold-start. | start. | |||
| 2. If the Sequence-Number TLV is present in the BGP-LS Attribute, | 2. The NLRI with the most recent Sequence Number TLV, i.e., highest | |||
| then the NLRI with the most recent, i.e., highest sequence number | sequence number is selected. | |||
| is selected. BGP-LS NLRI with a Sequence-Number TLV will be | ||||
| considered more recent than NLRI without a BGP-LS Attribute or a | ||||
| BGP-LS Attribute that doesn't include the Sequence-Number TLV. | ||||
| 3. The final tie-breaker is the NLRI from the BGP Speaker with the | 3. The route received from the BGP SPF speaker with the numerically | |||
| numerically largest BGP Router ID. | larger BGP Identifier is preferred. | |||
| When a BGP speaker completely loses its sequence number state, i.e., | When a BGP SPF speaker completely loses its sequence number state, | |||
| due to a cold start, or in the unlikely possibility that the | i.e., due to a cold start, or in the unlikely possibility that the | |||
| sequence number wraps, the BGP routing domain will still converge. | 64-bit sequence number wraps, the BGP routing domain will still | |||
| This is due to the fact that BGP speakers adjacent to the router will | converge. This is due to the fact that BGP speakers adjacent to the | |||
| always accept self-originated NLRI from the associated speaker as | router will always accept self-originated NLRI from the associated | |||
| more recent (rule # 1). When BGP speaker reestablishes a connection | speaker as more recent (rule # 1). When a BGP speaker reestablishes | |||
| with its peers, any existing session will be taken down and stale | a connection with its peers, any existing session will be taken down | |||
| NLRI will be replaced by the new NLRI and stale NLRI will be | and stale NLRI will be replaced. The adjacent BGP speakers will | |||
| discarded independent of whether or not BGP graceful restart is | update their NLRI advertisements, hop by hop, until the BGP routing | |||
| deployed, [RFC4724]. The adjacent BGP speakers will update their NLRI | domain has converged. | |||
| advertisements in turn until the BGP routing domain has converged. | ||||
| The modified SPF Decision Process performs an SPF calculation rooted | The modified SPF Decision Process performs an SPF calculation rooted | |||
| at the BGP speaker using the metrics from Link and Prefix NLRI | at the BGP speaker using the metrics from the Link Attribute IGP | |||
| Attribute TLVs [RFC7752]. As a result, any attributes that would | Metric TLV (1095) and the Prefix Attribute Prefix Metric TLV (1155) | |||
| influence the Decision process defined in [RFC4271] like ORIGIN, | [RFC7752]. As a result, any other BGP attributes that would | |||
| MULTI_EXIT_DISC, and LOCAL_PREF attributes are ignored by the SPF | influence the BGP decision process defined in [RFC4271] including | |||
| algorithm. Furthermore, the NEXT_HOP attribute value is preserved | ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are ignored by the | |||
| but otherwise ignored during the SPF or best-path. | SPF algorithm. Furthermore, the NEXT_HOP attribute value is | |||
| preserved but otherwise ignored during the SPF computation for BGP- | ||||
| LS-SPF NLRIs. The AS_PATH and AS4_PATH [RFC6793] attributes are | ||||
| preserved and used for loop detection [RFC4271]. They are ignored | ||||
| during the SPF computation for BGP-LS-SPF NLRIs. | ||||
| 5.2. Dual Stack Support | 6.1.1. BGP Self-Originated NLRI | |||
| Node, Link, or Prefix NLRI with Node Descriptors matching the local | ||||
| BGP speaker are considered self-originated. When self-originated | ||||
| NLRI is received and it doesn't match the local node's NLRI content | ||||
| (including sequence number), special processing is required. | ||||
| o If a self-originated NLRI is received and the sequence number is | ||||
| more recent (i.e., greater than the local node's sequence number | ||||
| for the NLRI), the NLRI sequence number will be advanced to one | ||||
| greater than the received sequence number and the NLRI will be | ||||
| readvertised to all peers. | ||||
| o If self-originated NLRI is received and the sequence number is the | ||||
| same as the local node's sequence number but the attributes | ||||
| differ, the NLRI sequence number will be advanced to one greater | ||||
| than the received sequence number and the NLRI will be | ||||
| readvertised to all peers. | ||||
| o If self-originated Link or Prefix NLRI is received and the Link or | ||||
| Prefix NLRI is no longer being advertised by the local node, the | ||||
| NLRI will be withdrawn. | ||||
| The above actions are performed immediately when the first instance | ||||
| of a newer self-originated NLRI is received. In this case, the newer | ||||
| instance is considered to be a stale instance that was advertised by | ||||
| the local node prior to a restart where the NLRI state is lost. | ||||
| However, if subsequent newer self-originated NLRI is received for the | ||||
| same Node, Link, or Prefix NLRI, the readvertisement or withdrawal is | ||||
| delayed by 5 seconds since it is likely being advertised by a | ||||
| misconfigured or rogue BGP-LS-SPF speaker (Section 9). | ||||
| 6.2. Dual Stack Support | ||||
| The SPF-based decision process operates on Node, Link, and Prefix | The SPF-based decision process operates on Node, Link, and Prefix | |||
| NLRIs that support both IPv4 and IPv6 addresses. Whether to run a | NLRIs that support both IPv4 and IPv6 addresses. Whether to run a | |||
| single SPF instance or multiple SPF instances for separate AFs is a | single SPF computation or multiple SPF computations for separate AFs | |||
| matter of a local implementation. Normally, IPv4 next-hops are | is an implementation matter. Normally, IPv4 next-hops are calculated | |||
| calculated for IPv4 prefixes and IPv6 next-hops are calculated for | for IPv4 prefixes and IPv6 next-hops are calculated for IPv6 | |||
| IPv6 prefixes. However, an interesting use-case is deployment of | prefixes. | |||
| [RFC5549] where IPv6 next-hops are calculated for both IPv4 and IPv6 | ||||
| prefixes. As stated in Section 1, support for Multiple Topology | ||||
| Routing (MTR) is an area for future study. | ||||
| 5.3. SPF Calculation based on BGP-LS NLRI | 6.3. SPF Calculation based on BGP-LS-SPF NLRI | |||
| This section details the BGP-LS SPF local routing information base | This section details the BGP-LS-SPF local routing information base | |||
| (RIB) calculation. The router will use BGP-LS Node, Link, and Prefix | (RIB) calculation. The router will use BGP-LS-SPF Node, Link, and | |||
| NLRI to populate the local RIB using the following algorithm. This | Prefix NLRI to compute routes using the following algorithm. This | |||
| calculation yields the set of intra-area routes associated with the | calculation yields the set of routes associated with the BGP-LS | |||
| BGP-LS domain. A router calculates the shortest-path tree using | domain. A router calculates the shortest-path tree using itself as | |||
| itself as the root. Variations and optimizations of the algorithm | the root. Optimizations to the BGP-LS-SPF algorithm are possible but | |||
| are valid as long as it yields the same set of routes. The algorithm | MUST yield the same set of routes. The algorithm below supports | |||
| below supports Equal Cost Multi-Path (ECMP) routes. Weighted Unequal | Equal Cost Multi-Path (ECMP) routes. Weighted Unequal Cost Multi- | |||
| Cost Multi-Path are out of scope. The organization of this section | Path routes are out of scope. The organization of this section owes | |||
| owes heavily to section 16 of [RFC2328]. | heavily to section 16 of [RFC2328]. | |||
| The following abstract data structures are defined in order to | The following abstract data structures are defined in order to | |||
| specify the algorithm. | specify the algorithm. | |||
| o Local Route Information Base (RIB) - This is abstract contains | o Local Route Information Base (LOC-RIB) - This routing table | |||
| reachability information (i.e., next hops) for all prefixes (both | contains reachability information (i.e., next hops) for all | |||
| IPv4 and IPv6) as well as the Node NLRI reachability. | prefixes (both IPv4 and IPv6) as well as BGP-LS-SPF node | |||
| Implementations may choose to implement this as separate RIBs for | reachability. Implementations may choose to implement this with | |||
| each address family and/or Node NLRI. | separate RIBs for each address family and/or Prefix versus Node | |||
| reachability. It is synonymous with the Loc-RIB specified in | ||||
| [RFC4271]. | ||||
| o Link State NLRI Database (LSNDB) - Database of BGP-LS NLRI that | o Global Routing Information Base (GLOBAL-RIB) - This is the Routing | |||
| facilitates access to all Node, Link, and Prefix NLRI as well as | Information Base (RIB) containing the current routes that are | |||
| all the Link and Prefix NLRI corresponding to a given Node NLRI. | installed in the router's forwarding plane. This is commonly | |||
| Other optimization, such as, resolving bi-directional connectivity | referred to in networking parlance as "the RIB". | |||
| associations between Link NLRI are possible but of scope of this | ||||
| document. | ||||
| o Candidate List - This is a list of candidate Node NLRI with the | o Link State NLRI Database (LSNDB) - Database of BGP-LS-SPF NLRI | |||
| lowest cost Node NLRI at the front of the list. It is typically | that facilitates access to all Node, Link, and Prefix NLRI. | |||
| implemented as a heap but other concrete data structures have also | ||||
| been used. | o Candidate List (CAN-LIST) - This is a list of candidate Node | |||
| NLRIs. The list is sorted by the cost to reach the Node NLRI with | ||||
| the Node NLRI with the lowest reachability cost at the head of the | ||||
| list. This facilitates execution of the Dijkstra algorithm | ||||
| (Section 1.1) where the shortest paths between the local node and | ||||
| other nodes in the graph are computed. The CAN-LIST is typically | ||||
| implemented as a heap but other data structures have been used. | ||||
| The algorithm is comprised of the steps below: | The algorithm is comprised of the steps below: | |||
| 1. The current local RIB is invalidated. The local RIB is rebuilt | 1. The current LOC-RIB is invalidated, and the CAN-LIST is | |||
| during the course of the SPF computation. The existing routing | initialized to empty. The LOC-RIB is rebuilt during the course | |||
| entries are preserved for comparison to determine changes that | of the SPF computation. The existing routing entries are | |||
| need to be installed in the global RIB. | preserved for comparison to determine changes that need to be | |||
| made to the GLOBAL-RIB in step 6. | ||||
| 2. The computing router's Node NLRI is installed in the local RIB | 2. The computing router's Node NLRI is updated in the LOC-RIB with a | |||
| with a cost of 0 and as the sole entry in the candidate list. | cost of 0 and the Node NLRI is also added to the CAN-LIST. The | |||
| next-hop list is set to the internal loopback next-hop. | ||||
| 3. The Node NLRI with the lowest cost is removed from the candidate | 3. The Node NLRI with the lowest cost is removed from the candidate | |||
| list for processing. If the BGP-LS Node attribute includes an | list for processing. If the BGP-LS Node attribute includes an | |||
| SPF Status TLV (Section 4.1.2) indicating the node is | SPF Status TLV (Section 5.2.1.2) indicating the node is | |||
| unreachable, the Node NLRI is ignored and the next lowest cost | unreachable, the Node NLRI is ignored and the next lowest cost | |||
| Node NLRI is selected from the candidate list. The Node | Node NLRI is selected from the candidate list. The Node | |||
| corresponding to this NLRI will be referred to as the Current | corresponding to this NLRI will be referred to as the Current- | |||
| Node. If the candidate list is empty, the SPF calculation has | Node. If the candidate list is empty, the SPF calculation has | |||
| completed and the algorithm proceeds to step 6. | completed and the algorithm proceeds to step 6. | |||
| 4. All the Prefix NLRI with the same Node Identifiers as the Current | 4. All the Prefix NLRI with the same Node Identifiers as the | |||
| Node will be considered for installation. The cost for each | Current-Node will be considered for installation. The next- | |||
| prefix is the metric advertised in the Prefix NLRI added to the | hop(s) for these Prefix NLRI are inherited from the Current-Node. | |||
| cost to reach the Current Node. | The cost for each prefix is the metric advertised in the Prefix | |||
| Attribute Prefix Metric TLV (1155) added to the cost to reach the | ||||
| Current-Node. The following will be done for each Prefix NLRI | ||||
| (referred to as the Current-Prefix): | ||||
| * If the BGP-LS Prefix attribute includes an SPF Status TLV | * If the BGP-LS Prefix attribute includes an SPF Status TLV | |||
| indicating the prefix is unreachable, the BGP-LS Prefix NLRI | indicating the prefix is unreachable, the Current-Prefix is | |||
| is considered unreachable and the next BGP-LS Prefix NLRI is | considered unreachable and the next Prefix NLRI is examined in | |||
| examined. | Step 4. | |||
| * If the prefix is in the local RIB and the cost is greater than | * If the Current-Prefix's corresponding prefix is in the LOC-RIB | |||
| the Current route's metric, the Prefix NLRI does not | and the cost is less than the Current-Prefix's metric, the | |||
| contribute to the route and is ignored. | Current-Prefix does not contribute to the route and the next | |||
| Prefix NLRI is examined in Step 4. | ||||
| * If the prefix is in the local RIB and the cost is less than | * If the Current-Prefix's corresponding prefix is not in the | |||
| the current route's metric, the Prefix is installed with the | LOC-RIB, the prefix is installed with the Current-Node's next- | |||
| Current Node's next-hops replacing the local RIB route's next- | hops installed as the LOC-RIB route's next-hops and the metric | |||
| hops and the metric being updated. | being updated. If the IGP Route Tag TLV (1153) is included in | |||
| the Current-Prefix's NLRI Attribute, the tag(s) are installed | ||||
| in the current LOC-RIB route's tag(s). | ||||
| * If the prefix is in the local RIB and the cost is same as the | * If the Current-Prefix's corresponding prefix is in the LOC-RIB | |||
| current route's metric, the Prefix is installed with the | and the cost is less than the current route's metric, the | |||
| Current Node's next-hops being merged with local RIB route's | prefix is installed with the Current-Node's next-hops | |||
| next-hops. | replacing the LOC-RIB route's next-hops and the metric being | |||
| updated and any route tags removed. If the IGP Route Tag TLV | ||||
| (1153) is included in the Current-Prefix's NLRI Attribute, the | ||||
| tag(s) are installed in the current LOC-RIB route's tag(s). | ||||
| 5. All the Link NLRI with the same Node Identifiers as the Current | * If the Current-Prefix's corresponding prefix is in the LOC-RIB | |||
| and the cost is the same as the current route's metric, the | ||||
| Current-Node's next-hops will be merged with LOC-RIB route's | ||||
| next-hops. If the IGP Route Tag TLV (1153) is included in the | ||||
| Current-Prefix's NLRI Attribute, the tag(s) are merged into | ||||
| the LOC-RIB route's current tags. | ||||
| 5. All the Link NLRI with the same Node Identifiers as the Current- | ||||
| Node will be considered for installation. Each link will be | Node will be considered for installation. Each link will be | |||
| examined and will be referred to in the following text as the | examined and will be referred to in the following text as the | |||
| Current Link. The cost of the Current Link is the advertised | Current-Link. The cost of the Current-Link is the advertised IGP | |||
| metric in the Link NLRI added to the cost to reach the Current | Metric TLV (1095) from the Link NLRI BGP-LS attribute added to | |||
| Node. | the cost to reach the Current-Node. If the Current-Node is for | |||
| the local BGP Router, the next-hop for the link will be a direct | ||||
| next-hop pointing to the corresponding local interface. For any | ||||
| other Current-Node, the next-hop(s) for the Current-Link will be | ||||
| inherited from the Current-Node. The following will be done for | ||||
| each link: | ||||
| * Optionally, the prefix(es) associated with the Current Link | A. The prefix(es) associated with the Current-Link are installed | |||
| are installed into the local RIB using the same rules as were | into the LOC-RIB using the same rules as were used for Prefix | |||
| used for Prefix NLRI in the previous steps. | NLRI in the previous steps. Optionally, in deployments where | |||
| BGP-SPF routers have limited routing table capacity, | ||||
| installation of these subnets can be suppressed. Suppression | ||||
| will have an operational impact as the IPv4/IPv6 link | ||||
| endpoint addresses will not be reachable and tools such as | ||||
| traceroute will display addresses that are not reachable. | ||||
| * If the current Node NLRI attributes include the SPF status | B. If the Current-Node NLRI attributes include the SPF status | |||
| TLV (Section 4.1.2) and the status indicates that the Node | TLV (Section 5.2.1.2) and the status indicates that the Node | |||
| doesn't support transit, the next link for the current node is | doesn't support transit, the next link for the Current-Node | |||
| processed. | is processed in Step 5. | |||
| * The Current Link's endpoint Node NLRI is accessed (i.e., the | C. If the Current-Link's NLRI attribute includes an SPF Status | |||
| Node NLRI with the same Node identifiers as the Link | TLV indicating the link is down, the BGP-LS-SPF Link NLRI is | |||
| endpoint). If it exists, it will be referred to as the | considered down and the next link for the Current-Node is | |||
| Endpoint Node NLRI and the algorithm will proceed as follows: | examined in Step 5. | |||
| + If the BGP-LS Link NLRI attribute includes an SPF Status | D. The Current-Link's Remote Node NLRI is accessed (i.e., the | |||
| TLV indicating the link is down, the BGP-LS Link NLRI is | Node NLRI with the same Node identifiers as the Current- | |||
| considered down and the next BGP-LS Link NLRI is examined. | Link's Remote Node Descriptors). If it exists, it will be | |||
| referred to as the Remote-Node and the algorithm will proceed | ||||
| as follows: | ||||
| + All the Link NLRI corresponding to the Endpoint Node NLRI will | + If the Remote-Node's NLRI attribute includes an SPF Status | |||
| be searched for a back-link NLRI pointing to the current | TLV indicating the node is unreachable, the next link for | |||
| node. Both the Node identifiers and the Link endpoint | the Current-Node is examined in Step 5. | |||
| identifiers in the Endpoint Node's Link NLRI must match for | ||||
| a match. If there is no corresponding Link NLRI | ||||
| corresponding to the Endpoint Node NLRI, the Endpoint Node | ||||
| NLRI fails the bi-directional connectivity test and is not | ||||
| processed further. | ||||
| + If the Endpoint Node NLRI is not on the candidate list, it | + All the Link NLRI corresponding to the Remote-Node will be | |||
| is inserted based on the link cost and BGP Identifier (the | searched for a Link NLRI pointing to the Current-Node. | |||
| latter being used as a tie-breaker). | Each Link NLRI is examined for Remote Node Descriptors | |||
| matching the Current-Node and Link Descriptors matching | ||||
| the Current-Link (e.g., sharing a common IPv4 or IPv6 | ||||
| subnet). If both these conditions are satisfied for one | ||||
| of the Remote-Node's links, the bi-directional | ||||
| connectivity check succeeds and the Remote-Node may be | ||||
| processed further. The Remote-Node's Link NLRI providing | ||||
| bi-directional connectivity will be referred to as the | ||||
| Remote-Link. If no Remote-Link is found, the next link | ||||
| for the Current-Node is examined in Step 5. | ||||
| + If the Endpoint Node NLRI is already on the candidate list | + If the Remote-Link NLRI attribute includes an SPF Status | |||
| with a lower cost, it need not be inserted again. | TLV indicating the link is down, the Remote-Link NLRI is | |||
| considered down and the next link for the Current-Node is | ||||
| examined in Step 5. | ||||
| + If the Endpoint Node NLRI is already on the candidate list | + If the Remote-Node is not on the CAN-LIST, it is inserted | |||
| with a higher cost, it must be removed and reinserted with | based on the cost. The Remote-Node's cost is the cost of | |||
| a lower cost. | Current-Node added to the Current-Link's IGP Metric TLV | |||
| (1095). The next-hop(s) for the Remote-Node are inherited | ||||
| from the Current-Link. | ||||
| * Return to step 3 to process the next lowest cost Node NLRI on | + If the Remote-Node NLRI is already on the CAN-LIST with a | |||
| the candidate list. | higher cost, it must be removed and reinserted with the | |||
| Remote-Node cost based on the Current-Link (as calculated | ||||
| in the previous step). The next-hop(s) for the Remote- | ||||
| Node are inherited from the Current-Link. | ||||
| 6. The local RIB is examined and changes (adds, deletes, | + If the Remote-Node NLRI is already on the CAN-LIST with | |||
| modifications) are installed into the global RIB. | the same cost, it need not be reinserted on the CAN-LIST. | |||
| However, the Current-Link's next-hop(s) must be merged | ||||
| into the current set of next-hops for the Remote-Node. | ||||
| 5.4. NEXT_HOP Manipulation | + If the Remote-Node NLRI is already on the CAN-LIST with a | |||
| lower cost, it need not be reinserted on the CAN-LIST. | ||||
| A BGP speaker that supports SPF extensions MAY interact with peers | E. Return to step 3 to process the next lowest cost Node NLRI on | |||
| that don't support SPF extensions. If the BGP-LS address family is | the CAN-LIST. | |||
| advertised to a peer not supporting the SPF extensions described | ||||
| herein, then the BGP speaker MUST conform to the NEXT_HOP rules | ||||
| specified in [RFC4271] when announcing the Link-State address family | ||||
| routes to those peers. | ||||
| All BGP peers that support SPF extensions would locally compute the | 6. The LOC-RIB is examined and changes (adds, deletes, | |||
| Loc-RIB next-hops as a result of the SPF process. Consequently, the | modifications) are installed into the GLOBAL-RIB. For each route | |||
| NEXT_HOP attribute is always ignored on receipt. However, BGP | in the LOC-RIB: | |||
| speakers SHOULD set the NEXT_HOP address according to the NEXT_HOP | ||||
| attribute rules specified in [RFC4271]. | ||||
| 5.5. IPv4/IPv6 Unicast Address Family Interaction | * If the route was added during the current BGP SPF computation, | |||
| install the route into the GLOBAL-RIB. | ||||
| While the BGP-LS SPF address family and the IPv4/IPv6 unicast address | * If the route was modified during the current BGP SPF computation | |||
| families install routes into the same device routing tables, they | (e.g., metric, tags, or next-hops), update the route in the | |||
| GLOBAL-RIB. | ||||
| * If the route was not installed during the current BGP SPF | ||||
| computation, remove the route from both the GLOBAL-RIB and the | ||||
| LOC-RIB. | ||||
| 6.4. IPv4/IPv6 Unicast Address Family Interaction | ||||
| While the BGP-LS-SPF address family and the IPv4/IPv6 unicast address | ||||
| families MAY install routes into the same device routing tables, they | ||||
| will operate independently much the same as OSPF and IS-IS would | will operate independently much the same as OSPF and IS-IS would | |||
| operate today (i.e., "Ships-in-the-Night" mode). There will be no | operate today (i.e., "Ships-in-the-Night" mode). There is no | |||
| implicit route redistribution between the BGP address families. | implicit route redistribution between the BGP address families. | |||
| However, implementation specific redistribution mechanisms SHOULD be | ||||
| made available with the restriction that redistribution of BGP-LS SPF | ||||
| routes into the IPv4 address family applies only to IPv4 routes and | ||||
| redistribution of BGP-LS SPF route into the IPv6 address family | ||||
| applies only to IPv6 routes. | ||||
| Given the fact that SPF algorithms are based on the assumption that | It is RECOMMENDED that BGP-LS-SPF IPv4/IPv6 route computation and | |||
| all routers in the routing domain calculate the precisely the same | installation be given scheduling priority by default over other BGP | |||
| SPF tree and install the same set of routes, it is RECOMMENDED that | address families as these address families are considered as underlay | |||
| BGP-LS SPF IPv4/IPv6 routes be given priority by default when | SAFIs. Similarly, it is RECOMMENDED that the route preference or | |||
| installed into their respective RIBs. In common implementations the | administrative distance give active route installation preference to | |||
| prioritization is governed by route preference or administrative | BGP-LS-SPF IPv4/IPv6 routes over BGP routes from other AFI/SAFIs. | |||
| distance with lower being more preferred. | However, this preference MAY be overridden by an operator-configured | |||
| policy. | ||||
| 5.6. NLRI Advertisement and Convergence | 6.5. NLRI Advertisement | |||
| 5.6.1. Link/Prefix Failure Convergence | 6.5.1. Link/Prefix Failure Convergence | |||
| A local failure will prevent a link from being used in the SPF | A local failure will prevent a link from being used in the SPF | |||
| calculation due to the IGP bi-directional connectivity requirement. | calculation due to the IGP bi-directional connectivity requirement. | |||
| Consequently, local link failures should always be given priority | Consequently, local link failures SHOULD always be given priority | |||
| over updates (e.g., withdrawing all routes learned on a session) in | over updates (e.g., withdrawing all routes learned on a session) in | |||
| order to ensure the highest priority propagation and optimal | order to ensure the highest priority propagation and optimal | |||
| convergence. | convergence. | |||
| An IGP such as OSPF [RFC2328] will stop using the link as soon as the | An IGP such as OSPF [RFC2328] will stop using the link as soon as the | |||
| Router-LSA for one side of the link is received. With normal BGP | Router-LSA for one side of the link is received. With a BGP | |||
| advertisement, the link would continue to be used until the last copy | advertisement, the link would continue to be used until the last copy | |||
| of the BGP-LS Link NLRI is withdrawn. In order to avoid this delay, | of the BGP-LS-SPF Link NLRI is withdrawn. In order to avoid this | |||
| the originator of the Link NLRI will advertise a more recent version | delay, the originator of the Link NLRI SHOULD advertise a more recent | |||
| of the BGP-LS Link NLRI including the SPF Status TLV Section 4.2.2 | version with an increased Sequence Number TLV for the BGP-LS-SPF Link | |||
| indicating the link is down with respect to BGP SPF. After some | NLRI including the SPF Status TLV (Section 5.2.2.2) indicating the | |||
| configurable period of time, e.g., 2-3 seconds, the BGP-LS Link NLRI | link is down with respect to BGP SPF. After some configurable period | |||
| can be withdrawn with no consequence. If the link becomes available | of time, which is implementation dependent, e.g., 2-3 seconds, the | |||
| in that period, the originator of the BGP-LS LINK NLRI will simply | BGP-LS-SPF Link NLRI can be withdrawn with no consequence. If the | |||
| advertise a more recent version of the BGP-LS Link NLRI without the | link becomes available in that period, the originator of the BGP-LS- | |||
| SPF Status TLV in the BGP-LS Link Attributes. | SPF LINK NLRI will simply advertise a more recent version of the BGP- | |||
| LS-SPF Link NLRI without the SPF Status TLV in the BGP-LS Link | ||||
| Attributes. | ||||
| Similarly, when a prefix becomes unreachable, a more recent version | Similarly, when a prefix becomes unreachable, a more recent version | |||
| of the BGP-LS Prefix NLRI will be advertised with the SPF Status TLV | of the BGP-LS-SPF Prefix NLRI will be advertised with the SPF Status | |||
| Section 4.3.1 indicating the prefix is unreachable in the BGP-LS | TLV (Section 5.2.3.1) indicating the prefix is unreachable in the | |||
| Prefix Attributes and the prefix will be considered unreachable with | BGP-LS Prefix Attributes and the prefix will be considered | |||
| respect to BGP SPF. After some configurable period of time, e.g., | unreachable with respect to BGP SPF. After some configurable period | |||
| 2-3 seconds, the BGP-LS Prefix NLRI can be withdrawn with no | of time, which is implementation dependent, e.g., 2-3 seconds, the | |||
| consequence. If the prefix becomes reachable in that period, the | BGP-LS-SPF Prefix NLRI can be withdrawn with no consequence. If the | |||
| originator of the BGP-LS Prefix NLRI will simply advertise a more | prefix becomes reachable in that period, the originator of the BGP- | |||
| recent version of the BGP-LS Prefix NLRI without the SPF Status TLV | LS-SPF Prefix NLRI will simply advertise a more recent version of the | |||
| in the BGP-LS Prefix Attributes. | BGP-LS-SPF Prefix NLRI without the SPF Status TLV in the BGP-LS | |||
| Prefix Attributes. | ||||
| 5.6.2. Node Failure Convergence | 6.5.2. Node Failure Convergence | |||
| With BGP without graceful restart [RFC4724], all the NLRI advertised | With BGP without graceful restart [RFC4724], all the NLRI advertised | |||
| by node are implicitly withdrawn when a session failure is detected. | by a node are implicitly withdrawn when a session failure is | |||
| If fast failure detection such as BFD is utilized, and the node is on | detected. If fast failure detection such as BFD is utilized, and the | |||
| the fastest converging path, the most recent versions of BGP-LS NLRI | node is on the fastest converging path, the most recent versions of | |||
| may be withdrawn while these versions are in-flight on longer paths. | BGP-LS-SPF NLRI may be withdrawn. This will result in an older | |||
| This will result the older version of the NLRI being used until the | version of the NLRI being used until the new versions arrive and, | |||
| new versions arrive and, potentially, unnecessary route flaps. | potentially, unnecessary route flaps. Therefore, BGP-LS-SPF NLRI | |||
| Therefore, BGP-LS SPF NLRI SHOULD always be retained before being | SHOULD always be retained before being implicitly withdrawn for a | |||
| implicitly withdrawn for a brief configurable interval, e.g., 2-3 | configurable implementation-dependent interval, e.g., 2-3 seconds. | |||
| seconds. This will not delay convergence since the adjacent nodes | This will not delay convergence since the adjacent nodes will detect | |||
| will detect the link failure and advertise a more recent NLRI | the link failure and advertise a more recent NLRI indicating the link | |||
| indicating the link is down with respect to BGP SPF Section 5.6.1 and | is down with respect to BGP SPF (Section 6.5.1) and the BGP SPF | |||
| the BGP-SPF calculation will failure the bi-directional connectivity | calculation will fail the bi-directional connectivity check | |||
| check. | (Section 6.3). | |||
| 5.7. Error Handling | 7. Error Handling | |||
| This section describes the Error Handling actions, as described in | ||||
| [RFC7606], that are specific to SAFI BGP-LS-SPF BGP Update message | ||||
| processing. | ||||
| 7.1. Processing of BGP-LS-SPF TLVs | ||||
| When a BGP speaker receives a BGP Update containing a malformed Node | ||||
| NLRI SPF Status TLV in the BGP-LS Attribute [RFC7752], it MUST ignore | ||||
| the received TLV and MUST NOT pass it to other BGP peers as specified | ||||
| in [RFC7606]. When discarding an associated Node NLRI with a | ||||
| malformed TLV, a BGP speaker SHOULD log an error for further | ||||
| analysis. | ||||
| When a BGP speaker receives a BGP Update containing a malformed Link | ||||
| NLRI SPF Status TLV in the BGP-LS Attribute [RFC7752], it MUST ignore | ||||
| the received TLV and MUST NOT pass it to other BGP peers as specified | ||||
| in [RFC7606]. When discarding an associated Link NLRI with a | ||||
| malformed TLV, a BGP speaker SHOULD log an error for further | ||||
| analysis. | ||||
| When a BGP speaker receives a BGP Update containing a malformed | ||||
| Prefix NLRI SPF Status TLV in the BGP-LS Attribute [RFC7752], it MUST | ||||
| ignore the received TLV and MUST NOT pass it to other BGP peers as | ||||
| specified in [RFC7606]. When discarding an associated Prefix NLRI | ||||
| with a malformed TLV, a BGP speaker SHOULD log an error for further | ||||
| analysis. | ||||
| When a BGP speaker receives a BGP Update containing a malformed SPF | When a BGP speaker receives a BGP Update containing a malformed SPF | |||
| Capability TLV in the Node NLRI BGP-LS Attribute [RFC7752], it MUST | Capability TLV in the Node NLRI BGP-LS Attribute [RFC7752], it MUST | |||
| ignore the received TLV and the Node NLRI and not pass it to other | ignore the received TLV and the Node NLRI and MUST NOT pass it to | |||
| BGP peers as specified in [RFC7606]. When discarding a Node NLRI | other BGP peers as specified in [RFC7606]. When discarding a Node | |||
| with malformed TLV, a BGP speaker SHOULD log an error for further | NLRI with a malformed TLV, a BGP speaker SHOULD log an error for | |||
| analysis. | further analysis. | |||
| 6. IANA Considerations | When a BGP speaker receives a BGP Update containing a malformed IPv4 | |||
| Prefix-Length TLV in the Link NLRI BGP-LS Attribute [RFC7752], it | ||||
| MUST ignore the received TLV and the Link NLRI and MUST NOT pass it | ||||
| to other BGP peers as specified in [RFC7606]. The corresponding Link | ||||
| NLRI is considered as malformed and MUST be handled as 'Treat-as- | ||||
| withdraw'. An implementation MAY log an error for further analysis. | ||||
| This document defines an AFI/SAFI for BGP-LS SPF operation and | When a BGP speaker receives a BGP Update containing a malformed IPv6 | |||
| requests IANA to assign the BGP-LS/BGP-LS-SPF (AFI 16388 / SAFI TBD1) | Prefix-Length TLV in the Link NLRI BGP-LS Attribute [RFC7752], it | |||
| as described in [RFC4760]. | MUST ignore the received TLV and the Link NLRI and MUST NOT pass it | |||
| to other BGP peers as specified in [RFC7606]. The corresponding Link | ||||
| NLRI is considered as malformed and MUST be handled as 'Treat-as- | ||||
| withdraw'. An implementation MAY log an error for further analysis. | ||||
| This document also defines five attribute TLVs for BGP-LS NLRI. We | 7.2. Processing of BGP-LS-SPF NLRIs | |||
| request IANA to assign types for the SPF capability TLV, Sequence | ||||
| A Link-State NLRI MUST NOT be considered as malformed or invalid | ||||
| based on the inclusion/exclusion of TLVs or contents of the TLV | ||||
| fields (i.e., semantic errors), as described in Section 5.1 and | ||||
| Section 5.1.1. | ||||
| A BGP-LS-SPF Speaker MUST perform the following syntactic validation | ||||
| of the BGP-LS-SPF NLRI to determine if it is malformed. | ||||
| 1. Does the sum of all TLVs found in the BGP MP_REACH_NLRI attribute | ||||
| correspond to the BGP MP_REACH_NLRI length? | ||||
| 2. Does the sum of all TLVs found in the BGP MP_UNREACH_NLRI | ||||
| attribute correspond to the BGP MP_UNREACH_NLRI length? | ||||
| 3. Does the sum of all TLVs found in a BGP-LS-SPF NLRI correspond to | ||||
| the Total NLRI Length field of all its Descriptors? | ||||
| 4. When an NLRI TLV is recognized, is the length of the TLV and its | ||||
| sub-TLVs valid? | ||||
| 5. Has the syntactic correctness of the NLRI fields been verified as | ||||
| per [RFC7606]? | ||||
| 6. Has the rule regarding ordering of TLVs been followed as | ||||
| described in Section 5.1.1? | ||||
| When the error determined allows for the router to skip the malformed | ||||
| NLRI(s) and continue processing of the rest of the update message | ||||
| (e.g., when the TLV ordering rule is violated), then it MUST handle | ||||
| such malformed NLRIs as 'Treat-as-withdraw'. In other cases, where | ||||
| the error in the NLRI encoding results in the inability to process | ||||
| the BGP update message (e.g., length related encoding errors), then | ||||
| the router SHOULD handle such malformed NLRIs as 'AFI/SAFI disable' | ||||
| when other AFI/SAFI besides BGP-LS are being advertised over the same | ||||
| session. Alternately, the router MUST perform 'session reset' when | ||||
| the session is only being used for BGP-LS-SPF or when its 'AFI/SAFI | ||||
| disable' action is not possible. | ||||
| 7.3. Processing of BGP-LS Attribute | ||||
| A BGP-LS Attribute MUST NOT be considered as malformed or invalid | ||||
| based on the inclusion/exclusion of TLVs or contents of the TLV | ||||
| fields (i.e., semantic errors), as described in Section 5.1 and | ||||
| Section 5.1.1. | ||||
| A BGP-LS-SPF Speaker MUST perform the following syntactic validation | ||||
| of the BGP-LS Attribute to determine if it is malformed. | ||||
| 1. Does the sum of all TLVs found in the BGP-LS-SPF Attribute | ||||
| correspond to the BGP-LS Attribute length? | ||||
| 2. Has the syntactic correctness of the Attributes (including BGP-LS | ||||
| Attribute) been verified as per [RFC7606]? | ||||
| 3. Is the length of each TLV and, when the TLV is recognized, of | ||||
| its sub-TLVs in the BGP-LS Attribute valid? | ||||
| When the detected error allows for the router to skip the malformed | ||||
| BGP-LS Attribute and continue processing of the rest of the update | ||||
| message (e.g., when the BGP-LS Attribute length and the total Path | ||||
| Attribute Length are correct but some TLV/sub-TLV length within the | ||||
| BGP-LS Attribute is invalid), then it MUST handle such malformed BGP- | ||||
| LS Attribute as 'Attribute Discard'. In other cases, when the error | ||||
| in the BGP-LS Attribute encoding results in the inability to process | ||||
| the BGP update message, then the handling is the same as described | ||||
| above for malformed NLRI. | ||||
| Note that the 'Attribute Discard' action results in the loss of all | ||||
| TLVs in the BGP-LS Attribute and not the removal of a specific | ||||
| malformed TLV. The removal of specific malformed TLVs may give a | ||||
| wrong indication to a BGP-LS-SPF speaker that the specific | ||||
| information is being deleted or is not available. | ||||
| When a BGP-LS-SPF speaker receives an update message with Link-State | ||||
| NLRI(s) in the MP_REACH_NLRI but without the BGP-LS-SPF Attribute, it | ||||
| is most likely an indication that a BGP-LS-SPF speaker preceding it | ||||
| has performed the 'Attribute Discard' fault handling. An | ||||
| implementation SHOULD preserve and propagate the Link-State NLRIs in | ||||
| such an update message so that the BGP-LS-SPF speaker can detect the | ||||
| loss of link-state information for that object and not assume its | ||||
| deletion/withdrawal. This also makes it possible for a network | ||||
| operator to trace back to the BGP-LS-SPF speaker which actually | ||||
| detected a problem with the BGP-LS Attribute. | ||||
| An implementation SHOULD log an error for further analysis for | ||||
| problems detected during syntax validation. | ||||
| When a BGP speaker receives a BGP Update containing a malformed IGP | ||||
| metric TLV in the Link NLRI BGP-LS Attribute [RFC7752], it MUST | ||||
| ignore the received TLV and the Link NLRI and MUST NOT pass it to | ||||
| other BGP peers as specified in [RFC7606]. When discarding a Link | ||||
| NLRI with a malformed TLV, a BGP speaker SHOULD log an error for | ||||
| further analysis. | ||||
| 8. IANA Considerations | ||||
| This document defines the use of SAFI (80) for BGP SPF operation | ||||
| (Section 5.1), and requests IANA to assign the value from the First | ||||
| Come First Served (FCFS) range in the Subsequent Address Family | ||||
| Identifiers (SAFI) Parameters registry. | ||||
| This document also defines five attribute TLVs of BGP-LS-SPF NLRI. | ||||
| We request IANA to assign types for the SPF capability TLV, Sequence | ||||
| Number TLV, IPv4 Link Prefix-Length TLV, IPv6 Link Prefix-Length TLV, | Number TLV, IPv4 Link Prefix-Length TLV, IPv6 Link Prefix-Length TLV, | |||
| and SPF Status TLV from the "BGP-LS Node Descriptor, Link Descriptor, | and SPF Status TLV from the "BGP-LS Node Descriptor, Link Descriptor, | |||
| Prefix Descriptor, and Attribute TLVs" Registry. | Prefix Descriptor, and Attribute TLVs" Registry. | |||
| 7. Security Considerations | +-------------------------+-----------------+--------------------+ | |||
| | Attribute TLV | Suggested Value | NLRI Applicability | | ||||
| +-------------------------+-----------------+--------------------+ | ||||
| | SPF Capability | 1180 | Node | | ||||
| | SPF Status | 1184 | Node, Link, Prefix | | ||||
| | IPv4 Link Prefix Length | 1182 | Link | | ||||
| | IPv6 Link Prefix Length | 1183 | Link | | ||||
| | Sequence Number | 1181 | Node, Link, Prefix | | ||||
| +-------------------------+-----------------+--------------------+ | ||||
| This extension to BGP does not change the underlying security issues | Table 1: NLRI Attribute TLVs | |||
| inherent in the existing [RFC4271], [RFC4724], and [RFC7752]. | ||||
| 8. Management Considerations | 9. Security Considerations | |||
| This section includes unique management considerations for the BGP-LS | This document defines a BGP SAFI, i.e., the BGP-LS-SPF SAFI. This | |||
| SPF address family. | document does not change the underlying security issues inherent in | |||
| the BGP protocol [RFC4271]. The Security Considerations discussed in | ||||
| [RFC4271] apply to the BGP SPF functionality as well. The analysis | ||||
| of the security issues for BGP mentioned in [RFC4272] and [RFC6952] | ||||
| also applies to this document. The analysis of Generic Threats to | ||||
| Routing Protocols done in [RFC4593] is also worth noting. As the | ||||
| modifications described in this document for BGP SPF apply to IPv4 | ||||
| Unicast and IPv6 Unicast as underlay SAFIs in a single BGP SPF Routing | ||||
| Domain, the BGP security solutions described in [RFC6811] and | ||||
| [RFC8205] are somewhat constricted as they are meant to apply for | ||||
| inter-domain BGP where multiple BGP Routing Domains are typically | ||||
| involved. The BGP-LS-SPF SAFI NLRI described in this document are | ||||
| typically advertised between EBGP or IBGP speakers under a single | ||||
| administrative domain. | ||||
| 8.1. Configuration | In the context of the BGP peering associated with this document, a | |||
| BGP speaker MUST NOT accept updates from a peer that is not within | ||||
| any administrative control of an operator. That is, a participating | ||||
| BGP speaker SHOULD be aware of the nature of its peering | ||||
| relationships. Such protection can be achieved by manual | ||||
| configuration of peers at the BGP speaker. | ||||
| In addition to configuration of the BGP-LS SPF address family, | In order to mitigate the risk of peering with BGP speakers | |||
| implementations SHOULD support the configuration of the | masquerading as legitimate authorized BGP speakers, it is recommended | |||
| INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY, TIME_TO_LEARN, | that the TCP Authentication Option (TCP-AO) [RFC5925] be used to | |||
| and HOLDDOWN_INTERVAL as documented in [RFC8405]. | authenticate BGP sessions. If an authorized BGP peer is compromised, | |||
| that BGP peer could advertise modified Node, Link, or Prefix NLRI | ||||
| that will result in misrouting, repeated origination of NLRI, and/or | ||||
| excessive SPF calculations. When a BGP speaker detects that its | ||||
| self-originated NLRI is being originated by another BGP speaker, an | ||||
| appropriate error should be logged so that the operator can take | ||||
| corrective action. | ||||
| 8.2. Operational Data | 10. Management Considerations | |||
| This section includes unique management considerations for the BGP- | ||||
| LS-SPF address family. | ||||
| 10.1. Configuration | ||||
| All routers in a BGP SPF Routing Domain are under a single | ||||
| administrative domain allowing for consistent configuration. | ||||
| 10.1.1. Link Metric Configuration | ||||
| Within a BGP SPF Routing Domain, the IGP metrics for all advertised | ||||
| links SHOULD be configured or defaulted consistently. For example, | ||||
| if a default metric is used for one router's links, then a similar | ||||
| metric should be used for all routers' links. Similarly, if the link | ||||
| cost is derived from using the inverse of the link bandwidth on one | ||||
| router, then this SHOULD be done for all routers and the same | ||||
| reference bandwidth should be used to derive the inversely | ||||
| proportional metric. Failure to do so will not result in correct | ||||
| routing based on link metric. | ||||
| 10.1.2. backoff-config | ||||
| In addition to configuration of the BGP-LS-SPF address family, | ||||
| implementations SHOULD support the "Shortest Path First (SPF) Back- | ||||
| Off Delay Algorithm for Link-State IGPs" [RFC8405]. If supported, | ||||
| configuration of the INITIAL_SPF_DELAY, SHORT_SPF_DELAY, | ||||
| LONG_SPF_DELAY, TIME_TO_LEARN, and HOLDDOWN_INTERVAL MUST be | ||||
| supported [RFC8405]. Section 6 of [RFC8405] recommends consistent | ||||
| configuration of these values throughout the IGP routing domain and | ||||
| this also applies to the BGP SPF Routing Domain. | ||||
| 10.2. Operational Data | ||||
| In order to troubleshoot SPF issues, implementations SHOULD support | In order to troubleshoot SPF issues, implementations SHOULD support | |||
| an SPF log including entries for previous SPF computations, Each SPF | an SPF log including entries for previous SPF computations. Each SPF | |||
| log entry would include the BGP-LS NLRI SPF triggering the SPF, SPF | log entry would include the BGP-LS-SPF NLRI triggering the SPF, | |||
| scheduled time, SPF start time, SPF end time, and SPF type if | SPF scheduled time, SPF start time, SPF end time, and SPF type if | |||
| different types of SPF are supported. Since the size of the log will | different types of SPF are supported. Since the size of the log will | |||
| be finite, implementations SHOULD also maintain counters for the | be finite, implementations SHOULD also maintain counters for the | |||
| total number of SPF computations of each type and the total number of | total number of SPF computations and the total number of SPF | |||
| SPF triggering events. Additionally, to troubleshoot SPF scheduling | triggering events. Additionally, to troubleshoot SPF scheduling and | |||
| and back-off [RFC8405], the current SPF back-off state, remaining | back-off [RFC8405], the current SPF back-off state, remaining time- | |||
| time-to-learn, remaining holddown, last trigger event time, last SPF | to-learn, remaining holddown, last trigger event time, last SPF time, | |||
| time, and next SPF time should be available. | and next SPF time should be available. | |||
| 9. Implementation Status | 11. Implementation Status | |||
| Note RFC Editor: Please remove this section and the associated | Note RFC Editor: Please remove this section and the associated | |||
| references prior to publication. | references prior to publication. | |||
| This section records the status of known implementations of the | This section records the status of known implementations of the | |||
| protocol defined by this specification at the time of posting of this | protocol defined by this specification at the time of posting of this | |||
| Internet-Draft, and is based on a proposal described in [RFC7942]. | Internet-Draft and is based on a proposal described in [RFC7942]. | |||
| The description of implementations in this section is intended to | The description of implementations in this section is intended to | |||
| assist the IETF in its decision processes in progressing drafts to | assist the IETF in its decision processes in progressing drafts to | |||
| RFCs. Please note that the listing of any individual implementation | RFCs. Please note that the listing of any individual implementation | |||
| here does not imply endorsement by the IETF. Furthermore, no effort | here does not imply endorsement by the IETF. Furthermore, no effort | |||
| has been spent to verify the information presented here that was | has been spent to verify the information presented here that was | |||
| supplied by IETF contributors. This is not intended as, and must not | supplied by IETF contributors. This is not intended as, and must not | |||
| be construed to be, a catalog of available implementations or their | be construed to be, a catalog of available implementations or their | |||
| features. Readers are advised to note that other implementations may | features. Readers are advised to note that other implementations may | |||
| exist. | exist. | |||
| According to RFC 7942, "this will allow reviewers and working groups | According to RFC 7942, "this will allow reviewers and working groups | |||
| to assign due consideration to documents that have the benefit of | to assign due consideration to documents that have the benefit of | |||
| running code, which may serve as evidence of valuable experimentation | running code, which may serve as evidence of valuable experimentation | |||
| and feedback that have made the implemented protocols more mature. | and feedback that have made the implemented protocols more mature. | |||
| It is up to the individual working groups to use this information as | It is up to the individual working groups to use this information as | |||
| they see fit". | they see fit". | |||
| The BGP-LS SPF implementatation status is documented in | The BGP-LS-SPF implementation status is documented in | |||
| [I-D.psarkar-lsvr-bgp-spf-impl]. | [I-D.psarkar-lsvr-bgp-spf-impl]. | |||
| 10. Acknowledgements | 12. Acknowledgements | |||
| The authors would like to thank Sue Hares, Jorge Rabadan, Boris | The authors would like to thank Sue Hares, Jorge Rabadan, Boris | |||
| Hassanov, Dan Frost, Matt Anderson, Fred Baker, and Lukas Krattiger | Hassanov, Dan Frost, Matt Anderson, Fred Baker, and Lukas Krattiger | |||
| for their review and comments. Thanks to Pushpasis Sarkar for | for their review and comments. Thanks to Pushpasis Sarkar for | |||
| discussions on preventing a BGP SPF Router from being used for non- | discussions on preventing a BGP SPF Router from being used for non- | |||
| local traffic (i.e., transit traffic). | local traffic (i.e., transit traffic). | |||
| The authors extend special thanks to Eric Rosen for fruitful | The authors extend special thanks to Eric Rosen for fruitful | |||
| discussions on BGP-LS SPF convergence as compared to IGPs. | discussions on BGP-LS-SPF convergence as compared to IGPs. | |||
| 11. Contributors | 13. Contributors | |||
| In addition to the authors listed on the front page, the following | In addition to the authors listed on the front page, the following | |||
| co-authors have contributed to the document. | co-authors have contributed to the document. | |||
| Derek Yeung | Derek Yeung | |||
| Arrcus, Inc. | Arrcus, Inc. | |||
| derek@arrcus.com | derek@arrcus.com | |||
| Gunter Van De Velde | Gunter Van De Velde | |||
| Nokia | Nokia | |||
| skipping to change at page 20, line 35 ¶ | skipping to change at page 33, line 25 ¶ | |||
| abhay@arrcus.com | abhay@arrcus.com | |||
| Venu Venugopal | Venu Venugopal | |||
| Cisco Systems | Cisco Systems | |||
| venuv@cisco.com | venuv@cisco.com | |||
| Chaitanya Yadlapalli | Chaitanya Yadlapalli | |||
| AT&T | AT&T | |||
| cy098d@att.com | cy098d@att.com | |||
| 12. References | 14. References | |||
| 12.1. Normative References | ||||
| [I-D.ietf-idr-bgpls-segment-routing-epe] | 14.1. Normative References | |||
| Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray, | ||||
| S., and J. Dong, "BGP-LS extensions for Segment Routing | ||||
| BGP Egress Peer Engineering", draft-ietf-idr-bgpls- | ||||
| segment-routing-epe-19 (work in progress), May 2019. | ||||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | |||
| Border Gateway Protocol 4 (BGP-4)", RFC 4271, | Border Gateway Protocol 4 (BGP-4)", RFC 4271, | |||
| DOI 10.17487/RFC4271, January 2006, | DOI 10.17487/RFC4271, January 2006, | |||
| <https://www.rfc-editor.org/info/rfc4271>. | <https://www.rfc-editor.org/info/rfc4271>. | |||
| [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", | ||||
| RFC 4272, DOI 10.17487/RFC4272, January 2006, | ||||
| <https://www.rfc-editor.org/info/rfc4272>. | ||||
| [RFC4593] Barbir, A., Murphy, S., and Y. Yang, "Generic Threats to | ||||
| Routing Protocols", RFC 4593, DOI 10.17487/RFC4593, | ||||
| October 2006, <https://www.rfc-editor.org/info/rfc4593>. | ||||
| [RFC4750] Joyal, D., Ed., Galecki, P., Ed., Giacalone, S., Ed., | ||||
| Coltun, R., and F. Baker, "OSPF Version 2 Management | ||||
| Information Base", RFC 4750, DOI 10.17487/RFC4750, | ||||
| December 2006, <https://www.rfc-editor.org/info/rfc4750>. | ||||
| [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, | ||||
| "Multiprotocol Extensions for BGP-4", RFC 4760, | ||||
| DOI 10.17487/RFC4760, January 2007, | ||||
| <https://www.rfc-editor.org/info/rfc4760>. | ||||
| [RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement | ||||
| with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February | ||||
| 2009, <https://www.rfc-editor.org/info/rfc5492>. | ||||
| [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP | ||||
| Authentication Option", RFC 5925, DOI 10.17487/RFC5925, | ||||
| June 2010, <https://www.rfc-editor.org/info/rfc5925>. | ||||
| [RFC6793] Vohra, Q. and E. Chen, "BGP Support for Four-Octet | ||||
| Autonomous System (AS) Number Space", RFC 6793, | ||||
| DOI 10.17487/RFC6793, December 2012, | ||||
| <https://www.rfc-editor.org/info/rfc6793>. | ||||
| [RFC6811] Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R. | ||||
| Austein, "BGP Prefix Origin Validation", RFC 6811, | ||||
| DOI 10.17487/RFC6811, January 2013, | ||||
| <https://www.rfc-editor.org/info/rfc6811>. | ||||
| [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. | [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. | |||
| Patel, "Revised Error Handling for BGP UPDATE Messages", | Patel, "Revised Error Handling for BGP UPDATE Messages", | |||
| RFC 7606, DOI 10.17487/RFC7606, August 2015, | RFC 7606, DOI 10.17487/RFC7606, August 2015, | |||
| <https://www.rfc-editor.org/info/rfc7606>. | <https://www.rfc-editor.org/info/rfc7606>. | |||
| [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | |||
| S. Ray, "North-Bound Distribution of Link-State and | S. Ray, "North-Bound Distribution of Link-State and | |||
| Traffic Engineering (TE) Information Using BGP", RFC 7752, | Traffic Engineering (TE) Information Using BGP", RFC 7752, | |||
| DOI 10.17487/RFC7752, March 2016, | DOI 10.17487/RFC7752, March 2016, | |||
| <https://www.rfc-editor.org/info/rfc7752>. | <https://www.rfc-editor.org/info/rfc7752>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., | [RFC8205] Lepinski, M., Ed. and K. Sriram, Ed., "BGPsec Protocol | |||
| Decraene, B., Litkowski, S., and R. Shakir, "Segment | Specification", RFC 8205, DOI 10.17487/RFC8205, September | |||
| Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, | 2017, <https://www.rfc-editor.org/info/rfc8205>. | |||
| July 2018, <https://www.rfc-editor.org/info/rfc8402>. | ||||
| [RFC8405] Decraene, B., Litkowski, S., Gredler, H., Lindem, A., | [RFC8405] Decraene, B., Litkowski, S., Gredler, H., Lindem, A., | |||
| Francois, P., and C. Bowers, "Shortest Path First (SPF) | Francois, P., and C. Bowers, "Shortest Path First (SPF) | |||
| Back-Off Delay Algorithm for Link-State IGPs", RFC 8405, | Back-Off Delay Algorithm for Link-State IGPs", RFC 8405, | |||
| DOI 10.17487/RFC8405, June 2018, | DOI 10.17487/RFC8405, June 2018, | |||
| <https://www.rfc-editor.org/info/rfc8405>. | <https://www.rfc-editor.org/info/rfc8405>. | |||
| 12.2. Information References | [RFC8654] Bush, R., Patel, K., and D. Ward, "Extended Message | |||
| Support for BGP", RFC 8654, DOI 10.17487/RFC8654, October | ||||
| 2019, <https://www.rfc-editor.org/info/rfc8654>. | ||||
| [RFC8665] Psenak, P., Ed., Previdi, S., Ed., Filsfils, C., Gredler, | ||||
| H., Shakir, R., Henderickx, W., and J. Tantsura, "OSPF | ||||
| Extensions for Segment Routing", RFC 8665, | ||||
| DOI 10.17487/RFC8665, December 2019, | ||||
| <https://www.rfc-editor.org/info/rfc8665>. | ||||
| 14.2. Informational References | ||||
| [I-D.ietf-lsvr-applicability] | [I-D.ietf-lsvr-applicability] | |||
| Patel, K., Lindem, A., Zandi, S., and G. Dawra, "Usage and | Patel, K., Lindem, A., Zandi, S., and G. Dawra, "Usage and | |||
| Applicability of Link State Vector Routing in Data | Applicability of Link State Vector Routing in Data | |||
| Centers", draft-ietf-lsvr-applicability-05 (work in | Centers", draft-ietf-lsvr-applicability-05 (work in | |||
| progress), March 2020. | progress), March 2020. | |||
| [I-D.psarkar-lsvr-bgp-spf-impl] | [I-D.psarkar-lsvr-bgp-spf-impl] | |||
| Sarkar, P., Patel, K., Pallagatti, S., and s. | Sarkar, P., Patel, K., Pallagatti, S., and s. | |||
| sajibasil@gmail.com, "BGP Shortest Path Routing Extension | sajibasil@gmail.com, "BGP Shortest Path Routing Extension | |||
| skipping to change at page 22, line 15 ¶ | skipping to change at page 35, line 43 ¶ | |||
| [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route | [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route | |||
| Reflection: An Alternative to Full Mesh Internal BGP | Reflection: An Alternative to Full Mesh Internal BGP | |||
| (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, | (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, | |||
| <https://www.rfc-editor.org/info/rfc4456>. | <https://www.rfc-editor.org/info/rfc4456>. | |||
| [RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. | [RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. | |||
| Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, | Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, | |||
| DOI 10.17487/RFC4724, January 2007, | DOI 10.17487/RFC4724, January 2007, | |||
| <https://www.rfc-editor.org/info/rfc4724>. | <https://www.rfc-editor.org/info/rfc4724>. | |||
| [RFC4750] Joyal, D., Ed., Galecki, P., Ed., Giacalone, S., Ed., | ||||
| Coltun, R., and F. Baker, "OSPF Version 2 Management | ||||
| Information Base", RFC 4750, DOI 10.17487/RFC4750, | ||||
| December 2006, <https://www.rfc-editor.org/info/rfc4750>. | ||||
| [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, | ||||
| "Multiprotocol Extensions for BGP-4", RFC 4760, | ||||
| DOI 10.17487/RFC4760, January 2007, | ||||
| <https://www.rfc-editor.org/info/rfc4760>. | ||||
| [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet | ||||
| Application Protocol Collation Registry", RFC 4790, | ||||
| DOI 10.17487/RFC4790, March 2007, | ||||
| <https://www.rfc-editor.org/info/rfc4790>. | ||||
| [RFC4915] Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P. | [RFC4915] Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P. | |||
| Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF", | Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF", | |||
| RFC 4915, DOI 10.17487/RFC4915, June 2007, | RFC 4915, DOI 10.17487/RFC4915, June 2007, | |||
| <https://www.rfc-editor.org/info/rfc4915>. | <https://www.rfc-editor.org/info/rfc4915>. | |||
| [RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for | [RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for | |||
| IP Fast Reroute: Loop-Free Alternates", RFC 5286, | IP Fast Reroute: Loop-Free Alternates", RFC 5286, | |||
| DOI 10.17487/RFC5286, September 2008, | DOI 10.17487/RFC5286, September 2008, | |||
| <https://www.rfc-editor.org/info/rfc5286>. | <https://www.rfc-editor.org/info/rfc5286>. | |||
| [RFC5549] Le Faucheur, F. and E. Rosen, "Advertising IPv4 Network | ||||
| Layer Reachability Information with an IPv6 Next Hop", | ||||
| RFC 5549, DOI 10.17487/RFC5549, May 2009, | ||||
| <https://www.rfc-editor.org/info/rfc5549>. | ||||
| [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection | [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection | |||
| (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010, | (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010, | |||
| <https://www.rfc-editor.org/info/rfc5880>. | <https://www.rfc-editor.org/info/rfc5880>. | |||
| [RFC6952] Jethanandani, M., Patel, K., and L. Zheng, "Analysis of | ||||
| BGP, LDP, PCEP, and MSDP Issues According to the Keying | ||||
| and Authentication for Routing Protocols (KARP) Design | ||||
| Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013, | ||||
| <https://www.rfc-editor.org/info/rfc6952>. | ||||
| [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | ||||
| "Advertisement of Multiple Paths in BGP", RFC 7911, | ||||
| DOI 10.17487/RFC7911, July 2016, | ||||
| <https://www.rfc-editor.org/info/rfc7911>. | ||||
| [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of | [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of | |||
| BGP for Routing in Large-Scale Data Centers", RFC 7938, | BGP for Routing in Large-Scale Data Centers", RFC 7938, | |||
| DOI 10.17487/RFC7938, August 2016, | DOI 10.17487/RFC7938, August 2016, | |||
| <https://www.rfc-editor.org/info/rfc7938>. | <https://www.rfc-editor.org/info/rfc7938>. | |||
| [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running | [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running | |||
| Code: The Implementation Status Section", BCP 205, | Code: The Implementation Status Section", BCP 205, | |||
| RFC 7942, DOI 10.17487/RFC7942, July 2016, | RFC 7942, DOI 10.17487/RFC7942, July 2016, | |||
| <https://www.rfc-editor.org/info/rfc7942>. | <https://www.rfc-editor.org/info/rfc7942>. | |||
| End of changes. 155 change blocks. | ||||
| 579 lines changed or deleted | 1195 lines changed or added | |||