| < draft-keyupate-idr-bgp-spf-01.txt | draft-keyupate-idr-bgp-spf-02.txt > | |||
|---|---|---|---|---|
| Network Working Group K. Patel | Network Working Group K. Patel | |||
| Internet-Draft Arrcus, Inc. | Internet-Draft Arrcus, Inc. | |||
| Intended status: Standards Track A. Lindem | Intended status: Standards Track A. Lindem | |||
| Expires: May 4, 2017 Cisco Systems | Expires: June 24, 2017 Cisco Systems | |||
| S. Zandi | S. Zandi | |||
| G. Van de Velde | G. Van de Velde | |||
| Nokia | Nokia | |||
| October 31, 2016 | December 21, 2016 | |||
| Shortest Path Routing Extensions for BGP Protocol | Shortest Path Routing Extensions for BGP Protocol | |||
| draft-keyupate-idr-bgp-spf-01.txt | draft-keyupate-idr-bgp-spf-02.txt | |||
| Abstract | Abstract | |||
| Many Massively Scaled Data Centers (MSDCs) have converged on | Many Massively Scaled Data Centers (MSDCs) have converged on | |||
| simplified layer 3 routing. Furthermore, requirements for | simplified layer 3 routing. Furthermore, requirements for | |||
| operational simplicity have led many of these MSDCs to converge on | operational simplicity have led many of these MSDCs to converge on | |||
| BGP as their single routing protocol for both their fabric routing | BGP as their single routing protocol for both their fabric routing | |||
| and their Data Center Interconnect (DCI) routing. This document | and their Data Center Interconnect (DCI) routing. This document | |||
| describes a solution which leverages BGP Link-State distribution and | describes a solution which leverages BGP Link-State distribution and | |||
| the Shortest Path First algorithm similar to Internal Gateway | the Shortest Path First algorithm similar to Internal Gateway | |||
| skipping to change at page 1, line 42 ¶ | skipping to change at page 1, line 42 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on May 4, 2017. | This Internet-Draft will expire on June 24, 2017. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2016 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 41 ¶ | skipping to change at page 2, line 41 ¶ | |||
| 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 | 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 | |||
| 2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5 | 2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1. BGP Single-Hop Peering on Network Node Connections . . . 5 | 2.1. BGP Single-Hop Peering on Network Node Connections . . . 5 | |||
| 2.2. BGP Peering Between Directly Connected Network Nodes . . 5 | 2.2. BGP Peering Between Directly Connected Network Nodes . . 5 | |||
| 2.3. BGP Peering in Route-Reflector or Controller Topology . . 6 | 2.3. BGP Peering in Route-Reflector or Controller Topology . . 6 | |||
| 3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6 | 3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6 | |||
| 4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6 | 4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4.1. Node NLRI Usage and Modifications . . . . . . . . . . . . 6 | 4.1. Node NLRI Usage and Modifications . . . . . . . . . . . . 6 | |||
| 4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7 | 4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 7 | 4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 7 | |||
| 4.4. BGP-LS Attribute Sequence-Number TLV . . . . . . . . . . 8 | 5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 8 | |||
| 5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 9 | 5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 8 | |||
| 5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 9 | 5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 9 | |||
| 5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 10 | 5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 9 | |||
| 5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 10 | 5.4. Error Handling . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 5.4. Error Handling . . . . . . . . . . . . . . . . . . . . . 10 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 10 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 | 7.1. Acknowledgements . . . . . . . . . . . . . . . . . . . . 10 | |||
| 7.1. Acknowledgements . . . . . . . . . . . . . . . . . . . . 11 | 7.2. Contributors . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 7.2. Contributors . . . . . . . . . . . . . . . . . . . . . . 11 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 8.1. Normative References . . . . . . . . . . . . . . . . . . 11 | |||
| 8.1. Normative References . . . . . . . . . . . . . . . . . . 12 | 8.2. Information References . . . . . . . . . . . . . . . . . 12 | |||
| 8.2. Information References . . . . . . . . . . . . . . . . . 13 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 1. Introduction | 1. Introduction | |||
| Many Massively Scaled Data Centers (MSDCs) have converged on | Many Massively Scaled Data Centers (MSDCs) have converged on | |||
| simplified layer 3 routing. Furthermore, requirements for | simplified layer 3 routing. Furthermore, requirements for | |||
| operational simplicity have led many of these MSDCs to converge on | operational simplicity have led many of these MSDCs to converge on | |||
| BGP [RFC4271] as their single routing protocol for both their fabric | BGP [RFC4271] as their single routing protocol for both their fabric | |||
| routing and their Data Center Interconnect (DCI) routing. | routing and their Data Center Interconnect (DCI) routing. | |||
| Requirements and procedures for using BGP are described in [RFC7938]. | Requirements and procedures for using BGP are described in [RFC7938]. | |||
| This document describes an alternative solution which leverages BGP- | This document describes an alternative solution which leverages BGP- | |||
| skipping to change at page 8, line 8 ¶ | skipping to change at page 8, line 5 ¶ | |||
| 4.3. Prefix NLRI Usage | 4.3. Prefix NLRI Usage | |||
| Prefix NLRI is advertised with a local descriptor as described above | Prefix NLRI is advertised with a local descriptor as described above | |||
| and the prefix and length used as the descriptors (TLV 265) as | and the prefix and length used as the descriptors (TLV 265) as | |||
| described in [RFC7752]. The prefix metric attribute TLV (TLV 1155) | described in [RFC7752]. The prefix metric attribute TLV (TLV 1155) | |||
| as well as any others required for non-SPF purposes SHOULD be | as well as any others required for non-SPF purposes SHOULD be | |||
| advertised. For loopback prefixes, the metric should be 0. For non- | advertised. For loopback prefixes, the metric should be 0. For non- | |||
| loopback, the setting of the metric is beyond the scope of this | loopback, the setting of the metric is beyond the scope of this | |||
| document. | document. | |||
| 4.4. BGP-LS Attribute Sequence-Number TLV | ||||
| A new BGP-LS Attribute TLV to BGP-LS NLRI types is defined to assure | ||||
| the most recent version of a given NLRI is used in the SPF | ||||
| computation. The TBD TLV type will be defined by IANA. The new BGP- | ||||
| LS Attribute TLV will contain an 8 octet sequence number. The usage | ||||
| of the Sequence Number TLV is described in Section 5.1. | ||||
| 0 1 2 3 | ||||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Type | Length | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Sequence Number (High-Order 32 Bits) | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Sequence Number (Low-Order 32 Bits) | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Sequence Number | ||||
| The 64-bit strictly increasing sequence number is incremented for | ||||
| every version of BGP-LS NLRI originated. BGP speakers implementing | ||||
| this specification MUST use available mechanisms to preserve the | ||||
| sequence number's strictly increasing property for the deployed life | ||||
| of the BGP speaker (including cold restarts). One mechanism for | ||||
| accomplishing this would be to use the high-order 32 bits of the | ||||
| sequence number as a wrap/boot count that is incremented anytime the | ||||
| BGP router loses its sequence number state or the low-order 32 | ||||
| bits wrap. | ||||
| When incrementing the sequence number for each self-originated NLRI, | ||||
| the sequence number should be treated as an unsigned 64-bit value. | ||||
| If the lower-order 32-bit value wraps, the higher-order 32-bit value | ||||
| should be incremented and saved in non-volatile storage. If by some | ||||
| chance the BGP Speaker is deployed long enough that there is a | ||||
| possibility that the 64-bit sequence number may wrap or a BGP Speaker | ||||
| completely loses its sequence number state (e.g., the BGP speaker | ||||
| hardware is replaced), the phase 1 decision function (see | ||||
| Section 5.1) rules should ensure convergence, albeit not | ||||
| immediately. | ||||
| 5. Decision Process with SPF Algorithm | 5. Decision Process with SPF Algorithm | |||
| The Decision Process described in [RFC4271] takes place in three | The Decision Process described in [RFC4271] takes place in three | |||
| distinct phases. The Phase 1 decision function of the Decision | distinct phases. The Phase 1 decision function of the Decision | |||
| Process is responsible for calculating the degree of preference for | Process is responsible for calculating the degree of preference for | |||
| each route received from a Speaker's peer. The Phase 2 decision | each route received from a Speaker's peer. The Phase 2 decision | |||
| function is invoked on completion of the Phase 1 decision function | function is invoked on completion of the Phase 1 decision function | |||
| and is responsible for choosing the best route out of all those | and is responsible for choosing the best route out of all those | |||
| available for each distinct destination, and for installing each | available for each distinct destination, and for installing each | |||
| chosen route into the Loc-RIB. The combination of the Phase 1 and 2 | chosen route into the Loc-RIB. The combination of the Phase 1 and 2 | |||
| decision functions is also known as a Path vector algorithm. | decision functions is also known as a Path vector algorithm. | |||
| When BGP-LS-SPF NLRI is received, all that is required is to | When BGP-LS-SPF NLRI is received, all that is required is to | |||
| determine whether it is the best-path by examining the Node-ID and | determine whether it is the best-path by examining the Node-ID as | |||
| sequence number as described in Section 5.1. If the best-path NLRI | described in Section 5.1. If the best-path NLRI has changed, it will | |||
| has changed, it will be advertised to other BGP-LS-SPF peers. If the | be advertised to other BGP-LS-SPF peers. If the attributes have | |||
| attributes have changed (other than the sequence number), a BGP SPF | changed, a BGP SPF calculation will be scheduled. However, a changed | |||
| calculation will be scheduled. However, a changed best-path can be | best-path can be advertised to other peers immediately and propagation | |||
| advertised to other peers immediately and propagation of changes can | of changes can approach IGP convergence times. | |||
| approach IGP convergence times. | ||||
| The SPF based Decision process starts with selecting only those Node | The SPF based Decision process starts with selecting only those Node | |||
| NLRI whose SPF capability TLV matches with the local BGP speaker's | NLRI whose SPF capability TLV matches with the local BGP speaker's | |||
| SPF capability TLV value. Since Link-State NLRI always contains the | SPF capability TLV value. Since Link-State NLRI always contains the | |||
| local descriptor [RFC7752], it will only be originated by a single | local descriptor [RFC7752], it will only be originated by a single | |||
| BGP speaker in the BGP routing domain. These selected Node NLRI and | BGP speaker in the BGP routing domain. These selected Node NLRI and | |||
| their Link/Prefix NLRI are used to build a directed graph during the | their Link/Prefix NLRI are used to build a directed graph during the | |||
| SPF computation. The best paths for BGP prefixes are installed as a | SPF computation. The best paths for BGP prefixes are installed as a | |||
| result of the SPF process. | result of the SPF process. | |||
| skipping to change at page 9, line 50 ¶ | skipping to change at page 8, line 49 ¶ | |||
| 5.1. Phase-1 BGP NLRI Selection | 5.1. Phase-1 BGP NLRI Selection | |||
| The rules for NLRI selection are greatly simplified from [RFC4271]. | The rules for NLRI selection are greatly simplified from [RFC4271]. | |||
| 1. If the NLRI is received from the BGP speaker originating the NLRI | 1. If the NLRI is received from the BGP speaker originating the NLRI | |||
| (as determined by comparing the BGP Router ID in the NLRI Node | (as determined by comparing the BGP Router ID in the NLRI Node | |||
| identifiers with the BGP speaker Router ID), then it is preferred | identifiers with the BGP speaker Router ID), then it is preferred | |||
| over the same NLRI from non-originators. | over the same NLRI from non-originators. | |||
| 2. If the Sequence-Number TLV is present in the BGP-LS Attribute, | 2. The final tie-breaker is the NLRI from the BGP Speaker with the | |||
| then the NLRI with the most recent, i.e., highest, sequence number | ||||
| is selected. BGP-LS NLRI with a Sequence-Number TLV will be | ||||
| considered more recent than NLRI without a BGP-LS Attribute or with | ||||
| a BGP-LS Attribute that doesn't include the Sequence-Number TLV. | ||||
| 3. The final tie-breaker is the NLRI from the BGP Speaker with the | ||||
| numerically largest BGP Router ID. | numerically largest BGP Router ID. | |||
| The modified Decision Process with SPF algorithm uses the metric from | The modified Decision Process with SPF algorithm uses the metric from | |||
| Link and Prefix NLRI Attribute TLVs [RFC7752]. As a result, any | Link and Prefix NLRI Attribute TLVs [RFC7752]. As a result, any | |||
| attributes that would influence the Decision process defined in | attributes that would influence the Decision process defined in | |||
| [RFC4271] like ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are | [RFC4271] like ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are | |||
| ignored by the SPF algorithm. Furthermore, the NEXT_HOP attribute | ignored by the SPF algorithm. Furthermore, the NEXT_HOP attribute | |||
| value is preserved and validated but otherwise ignored during the SPF | value is preserved and validated but otherwise ignored during the SPF | |||
| or best-path. | or best-path. | |||
| skipping to change at page 11, line 14 ¶ | skipping to change at page 10, line 12 ¶ | |||
| with malformed TLV, a BGP speaker SHOULD log an error for further | with malformed TLV, a BGP speaker SHOULD log an error for further | |||
| analysis. | analysis. | |||
| 6. IANA Considerations | 6. IANA Considerations | |||
| This document defines a couple of AFI/SAFIs for BGP LS SPF operation and | This document defines a couple of AFI/SAFIs for BGP LS SPF operation and | |||
| requests IANA to assign the BGP-LS-SPF AFI 16388 / SAFI TBD1 and the | requests IANA to assign the BGP-LS-SPF AFI 16388 / SAFI TBD1 and the | |||
| BGP-LS-SPF-VPN AFI 16388 / SAFI TBD2 as described in [RFC4750]. | BGP-LS-SPF-VPN AFI 16388 / SAFI TBD2 as described in [RFC4750]. | |||
| This document also defines two attribute TLVs for BGP LS NLRI. We | This document also defines two attribute TLVs for BGP LS NLRI. We | |||
| request IANA to assign TLVs for the SPF capability and the Sequence | request IANA to assign TLVs for the SPF capability from the "BGP-LS | |||
| Number from the "BGP-LS Node Descriptor, Link Descriptor, Prefix | Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute | |||
| Descriptor, and Attribute TLVs" Registry. Additionally, IANA is | TLVs" Registry. Additionally, IANA is requested to create a new | |||
| requested to create a new registry for "BGP-LS SPF Capability | registry for "BGP-LS SPF Capability Algorithms" for the value of the | |||
| Algorithms" for the value of the algorithm both in the BGP-LS Node | algorithm both in the BGP-LS Node Attribute TLV and the BGP SPF | |||
| Attribute TLV and the BGP SPF Capability. The initial assignments | Capability. The initial assignments are: | |||
| are: | ||||
| +-------------+-----------------------------------+ | +-------------+-----------------------------------+ | |||
| | Value(s) | Assignment Policy | | | Value(s) | Assignment Policy | | |||
| +-------------+-----------------------------------+ | +-------------+-----------------------------------+ | |||
| | 0 | Reserved (not to be assigned) | | | 0 | Reserved (not to be assigned) | | |||
| | | | | | | | | |||
| | 1 | SPF | | | 1 | SPF | | |||
| | | | | | | | | |||
| | 2 | Strict SPF | | | 2 | Strict SPF | | |||
| | | | | | | | | |||
| End of changes. 9 change blocks. | ||||
| 80 lines changed or deleted | 31 lines changed or added | |||
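
The Phase-1 NLRI selection rules in the Section 5.1 rows above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not text from either draft: the `Candidate` type, its field names, and the `use_sequence_number` switch are hypothetical. Passing `True` follows the three rules in the -01 column; passing `False` follows the two rules in the -02 column. The sketch also assumes that "the BGP Speaker with the numerically largest BGP Router ID" refers to the peer the NLRI was received from.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """One received copy of an NLRI (hypothetical structure, for illustration)."""
    originator_router_id: int  # BGP Router ID found in the NLRI Node identifiers
    peer_router_id: int        # BGP Router ID of the speaker that sent this copy
    sequence_number: Optional[int] = None  # Sequence-Number TLV value (-01 only)


def preference_key(c: Candidate, use_sequence_number: bool) -> Tuple[int, ...]:
    """Build a sort key in which a larger tuple means a more preferred copy."""
    # Rule 1: a copy received directly from the originator is preferred.
    from_originator = int(c.peer_router_id == c.originator_router_id)
    if not use_sequence_number:
        # -02 column: the only remaining tie-breaker is the largest Router ID.
        return (from_originator, c.peer_router_id)
    # -01 column, rule 2: a copy carrying a sequence number beats one without,
    # and among copies that carry one the highest value wins.
    has_seq = int(c.sequence_number is not None)
    seq = c.sequence_number if c.sequence_number is not None else 0
    # Final tie-breaker: numerically largest BGP Router ID.
    return (from_originator, has_seq, seq, c.peer_router_id)


def select_best(candidates: Iterable[Candidate],
                use_sequence_number: bool) -> Candidate:
    """Return the preferred copy among candidates for the same NLRI."""
    return max(candidates, key=lambda c: preference_key(c, use_sequence_number))
```

For example, given two non-originator copies with sequence numbers 7 and 5, the -01 rules pick the copy with 7 regardless of Router ID, while the -02 rules pick whichever copy came from the larger Router ID.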
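
The Sequence-Number TLV that the -01 column defines in Section 4.4 (and that -02 removes) can be illustrated with a short Python sketch. It assumes the 16-bit Type and 16-bit Length fields shown in the TLV diagram, followed by the 8-octet sequence number. The TLV type is TBD in the draft, so `SEQUENCE_NUMBER_TLV_TYPE` is a placeholder and not an IANA assignment; all function names are hypothetical. The last helper shows the wrap/boot-count mechanism the draft mentions as one possible way to keep the number strictly increasing.

```python
import struct

SEQUENCE_NUMBER_TLV_TYPE = 0xFFFF  # placeholder only; the real type is TBD
MAX_U32 = 0xFFFFFFFF
MAX_U64 = 0xFFFFFFFFFFFFFFFF


def encode_sequence_tlv(seq: int) -> bytes:
    """Pack Type (2 octets), Length (2 octets) and a 64-bit sequence number."""
    if not 0 <= seq <= MAX_U64:
        raise ValueError("sequence number must fit in an unsigned 64-bit value")
    # "!" selects network byte order with no padding: 2 + 2 + 8 = 12 octets.
    return struct.pack("!HHQ", SEQUENCE_NUMBER_TLV_TYPE, 8, seq)


def decode_sequence_tlv(data: bytes) -> int:
    """Unpack a 12-octet Sequence-Number TLV and return the sequence number."""
    tlv_type, length, seq = struct.unpack("!HHQ", data)
    if tlv_type != SEQUENCE_NUMBER_TLV_TYPE or length != 8:
        raise ValueError("not a Sequence-Number TLV")
    return seq


def next_sequence(seq: int) -> int:
    """Increment as one unsigned 64-bit value.

    When the low-order 32 bits wrap, the carry lands in the high-order
    32 bits, which the draft says should then be saved in non-volatile storage.
    """
    if seq >= MAX_U64:
        raise OverflowError("64-bit sequence number would wrap")
    return seq + 1


def sequence_after_state_loss(saved_boot_count: int) -> int:
    """Restart with an incremented high-order boot count and zero low bits."""
    if not 0 <= saved_boot_count < MAX_U32:
        raise OverflowError("boot count would exceed 32 bits")
    return (saved_boot_count + 1) << 32
```

Treating the value as a single 64-bit integer keeps the increment logic trivial: the boot count is simply the upper half, so a restart only needs the last saved high-order word to produce a number larger than any previously advertised.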