< draft-keyupate-idr-bgp-spf-01.txt   draft-keyupate-idr-bgp-spf-02.txt >
Network Working Group K. Patel Network Working Group K. Patel
Internet-Draft Arrcus, Inc. Internet-Draft Arrcus, Inc.
Intended status: Standards Track A. Lindem Intended status: Standards Track A. Lindem
Expires: May 4, 2017 Cisco Systems Expires: June 24, 2017 Cisco Systems
S. Zandi S. Zandi
Linkedin Linkedin
G. Van de Velde G. Van de Velde
Nokia Nokia
October 31, 2016 December 21, 2016
Shortest Path Routing Extensions for BGP Protocol Shortest Path Routing Extensions for BGP Protocol
draft-keyupate-idr-bgp-spf-01.txt draft-keyupate-idr-bgp-spf-02.txt
Abstract Abstract
Many Massively Scaled Data Centers (MSDCs) have converged on Many Massively Scaled Data Centers (MSDCs) have converged on
simplified layer 3 routing. Furthermore, requirements for simplified layer 3 routing. Furthermore, requirements for
operational simplicity have led many of these MSDCs to converge on operational simplicity have led many of these MSDCs to converge on
BGP as their single routing protocol for both their fabric routing BGP as their single routing protocol for both their fabric routing
and their Data Center Interconnect (DCI) routing. This document and their Data Center Interconnect (DCI) routing. This document
describes a solution which leverages BGP Link-State distribution and describes a solution which leverages BGP Link-State distribution and
the Shortest Path First algorithm similar to Internal Gateway the Shortest Path First algorithm similar to Internal Gateway
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 4, 2017. This Internet-Draft will expire on June 24, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 41 skipping to change at page 2, line 41
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5
2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5 2. BGP Peering Models . . . . . . . . . . . . . . . . . . . . . 5
2.1. BGP Single-Hop Peering on Network Node Connections . . . 5 2.1. BGP Single-Hop Peering on Network Node Connections . . . 5
2.2. BGP Peering Between Directly Connected Network Nodes . . 5 2.2. BGP Peering Between Directly Connected Network Nodes . . 5
2.3. BGP Peering in Route-Reflector or Controller Topology . . 6 2.3. BGP Peering in Route-Reflector or Controller Topology . . 6
3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6 3. BGP-LS Shortest Path Routing (SPF) SAFI . . . . . . . . . . . 6
4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6 4. Extensions to BGP-LS . . . . . . . . . . . . . . . . . . . . 6
4.1. Node NLRI Usage and Modifications . . . . . . . . . . . . 6 4.1. Node NLRI Usage and Modifications . . . . . . . . . . . . 6
4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7 4.2. Link NLRI Usage . . . . . . . . . . . . . . . . . . . . . 7
4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 7 4.3. Prefix NLRI Usage . . . . . . . . . . . . . . . . . . . . 7
4.4. BGP-LS Attribute Sequence-Number TLV . . . . . . . . . . 8 5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 8
5. Decision Process with SPF Algorithm . . . . . . . . . . . . . 9 5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 8
5.1. Phase-1 BGP NLRI Selection . . . . . . . . . . . . . . . 9 5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 9
5.2. Dual Stack Support . . . . . . . . . . . . . . . . . . . 10 5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 9
5.3. NEXT_HOP Manipulation . . . . . . . . . . . . . . . . . . 10 5.4. Error Handling . . . . . . . . . . . . . . . . . . . . . 9
5.4. Error Handling . . . . . . . . . . . . . . . . . . . . . 10 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 7. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 7.1. Acknowledgements . . . . . . . . . . . . . . . . . . . . 10
7.1. Acknowledgements . . . . . . . . . . . . . . . . . . . . 11 7.2. Contributors . . . . . . . . . . . . . . . . . . . . . . 10
7.2. Contributors . . . . . . . . . . . . . . . . . . . . . . 11 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 11
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 8.1. Normative References . . . . . . . . . . . . . . . . . . 11
8.1. Normative References . . . . . . . . . . . . . . . . . . 12 8.2. Information References . . . . . . . . . . . . . . . . . 12
8.2. Information References . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12
1. Introduction 1. Introduction
Many Massively Scaled Data Centers (MSDCs) have converged on Many Massively Scaled Data Centers (MSDCs) have converged on
simplified layer 3 routing. Furthermore, requirements for simplified layer 3 routing. Furthermore, requirements for
operational simplicity have led many of these MSDCs to converge on operational simplicity have led many of these MSDCs to converge on
BGP [RFC4271] as their single routing protocol for both their fabric BGP [RFC4271] as their single routing protocol for both their fabric
routing and their Data Center Interconnect (DCI) routing. routing and their Data Center Interconnect (DCI) routing.
Requirements and procedures for using BGP are described in [RFC7938]. Requirements and procedures for using BGP are described in [RFC7938].
This document describes an alternative solution which leverages BGP- This document describes an alternative solution which leverages BGP-
skipping to change at page 8, line 8 skipping to change at page 8, line 5
4.3. Prefix NLRI Usage 4.3. Prefix NLRI Usage
Prefix NLRI is advertised with a local descriptor as described above Prefix NLRI is advertised with a local descriptor as described above
and the prefix and length used as the descriptors (TLV 265) as and the prefix and length used as the descriptors (TLV 265) as
described in [RFC7752]. The prefix metric attribute TLV (TLV 1155) described in [RFC7752]. The prefix metric attribute TLV (TLV 1155)
as well as any others required for non-SPF purposes SHOULD be as well as any others required for non-SPF purposes SHOULD be
advertised. For loopback prefixes, the metric should be 0. For non- advertised. For loopback prefixes, the metric should be 0. For non-
loopback, the setting of the metric is beyond the scope of this loopback, the setting of the metric is beyond the scope of this
document. document.
4.4. BGP-LS Attribute Sequence-Number TLV
A new BGP-LS Attribute TLV to BGP-LS NLRI types is defined to assure
the most recent version of a given NLRI is used in the SPF
computation. The TBD TLV type will be defined by IANA. The new BGP-
LS Attribute TLV will contain an 8 octet sequence number. The usage
of the Sequence Number TLV is described in Section 5.1.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (High-Order 32 Bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Low-Order 32 Bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sequence Number
The 64-bit strictly increasing sequence number is incremented for
every version of BGP-LS NLRI originated. BGP speakers implementing
this specification MUST use available mechanisms to preserve the
sequence number's strictly increasing property for the deployed life
of the BGP speaker (including cold restarts). One mechanism for
accomplishing this would be to use the high-order 32 bits of the
sequence number as a wrap/boot count that is incremented anytime the
BGP router loses its sequence number state or the low-order 32
bits wrap.
When incrementing the sequence number for each self-originated NLRI,
the sequence number should be treated as an unsigned 64-bit value.
If the lower-order 32-bit value wraps, the higher-order 32-bit value
should be incremented and saved in non-volatile storage. If by some
chance the BGP Speaker is deployed long enough that there is a
possibility that the 64-bit sequence number may wrap or a BGP Speaker
completely loses its sequence number state (e.g., the BGP speaker
hardware is replaced), the phase 1 decision function (see
Section 5.1) rules should ensure convergence, albeit not
immediately.
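As a rough illustration of the wrap/boot-count mechanism suggested in the -01 text above (Section 4.4 is removed in -02), a minimal Python sketch follows. The names SequenceCounter and encode_value are hypothetical, saving the high-order count to non-volatile storage is left to the caller, and only the 8-octet value is encoded because the TLV Type is TBD in the draft.

```python
import struct

MASK32 = 0xFFFFFFFF


class SequenceCounter:
    """Hypothetical 64-bit counter: high 32 bits = wrap/boot count, low 32 bits = per-NLRI count."""

    def __init__(self, saved_high: int) -> None:
        # Assumption: the caller passes the last high-order value read from
        # non-volatile storage; it is bumped once on every (re)start.
        self.high = (saved_high + 1) & MASK32
        self.low = 0

    def next(self) -> int:
        """Return the next sequence number for a self-originated NLRI."""
        if self.low == MASK32:
            # Low-order half wrapped: bump the high-order half.
            # The caller should save self.high to non-volatile storage here.
            self.high = (self.high + 1) & MASK32
            self.low = 0
        else:
            self.low += 1
        return (self.high << 32) | self.low


def encode_value(seq: int) -> bytes:
    """Encode the 8-octet value, high-order 32 bits first, as in the figure above."""
    return struct.pack("!II", (seq >> 32) & MASK32, seq & MASK32)
```

In this sketch a restart or a low-order wrap both advance the high-order half, so every value returned is larger than any value issued earlier, provided the saved high-order count itself is not lost.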
5. Decision Process with SPF Algorithm 5. Decision Process with SPF Algorithm
The Decision Process described in [RFC4271] takes place in three The Decision Process described in [RFC4271] takes place in three
distinct phases. The Phase 1 decision function of the Decision distinct phases. The Phase 1 decision function of the Decision
Process is responsible for calculating the degree of preference for Process is responsible for calculating the degree of preference for
each route received from a Speaker's peer. The Phase 2 decision each route received from a Speaker's peer. The Phase 2 decision
function is invoked on completion of the Phase 1 decision function function is invoked on completion of the Phase 1 decision function
and is responsible for choosing the best route out of all those and is responsible for choosing the best route out of all those
available for each distinct destination, and for installing each available for each distinct destination, and for installing each
chosen route into the Loc-RIB. The combination of the Phase 1 and 2 chosen route into the Loc-RIB. The combination of the Phase 1 and 2
decision functions is also known as a Path vector algorithm. decision functions is also known as a Path vector algorithm.
When BGP-LS-SPF NLRI is received, all that is required is to When BGP-LS-SPF NLRI is received, all that is required is to
determine whether it is the best-path by examining the Node-ID and determine whether it is the best-path by examining the Node-ID as
sequence number as described in Section 5.1. If the best-path NLRI described in Section 5.1. If the best-path NLRI has changed, it will
has changed, it will be advertised to other BGP-LS-SPF peers. If the be advertised to other BGP-LS-SPF peers. If the attributes have
attributes have changed (other than the sequence number), a BGP SPF changed, a BGP SPF calculation will be scheduled. However, a changed
calculation will be scheduled. However, a changed best-path can be best-path can be advertised to other peers immediately and propagation
advertised to other peers immediately and propagation of changes can of changes can approach IGP convergence times.
approach IGP convergence times.
The SPF based Decision process starts with selecting only those Node The SPF based Decision process starts with selecting only those Node
NLRI whose SPF capability TLV matches with the local BGP speaker's NLRI whose SPF capability TLV matches with the local BGP speaker's
SPF capability TLV value. Since Link-State NLRI always contains the SPF capability TLV value. Since Link-State NLRI always contains the
local descriptor [RFC7752], it will only be originated by a single local descriptor [RFC7752], it will only be originated by a single
BGP speaker in the BGP routing domain. These selected Node NLRI and BGP speaker in the BGP routing domain. These selected Node NLRI and
their Link/Prefix NLRI are used to build a directed graph during the their Link/Prefix NLRI are used to build a directed graph during the
SPF computation. The best paths for BGP prefixes are installed as a SPF computation. The best paths for BGP prefixes are installed as a
result of the SPF process. result of the SPF process.
skipping to change at page 9, line 50 skipping to change at page 8, line 49
5.1. Phase-1 BGP NLRI Selection 5.1. Phase-1 BGP NLRI Selection
The rules for NLRI selection are greatly simplified from [RFC4271]. The rules for NLRI selection are greatly simplified from [RFC4271].
1. If the NLRI is received from the BGP speaker originating the NLRI 1. If the NLRI is received from the BGP speaker originating the NLRI
(as determined by comparing the BGP Router ID in the NLRI Node (as determined by comparing the BGP Router ID in the NLRI Node
identifiers with the BGP speaker Router ID), then it is preferred identifiers with the BGP speaker Router ID), then it is preferred
over the same NLRI from non-originators. over the same NLRI from non-originators.
2. If the Sequence-Number TLV is present in the BGP-LS Attribute, 2. The final tie-breaker is the NLRI from the BGP Speaker with the
then the NLRI with the most recent, i.e., highest sequence number
is selected. BGP-LS NLRI with a Sequence-Number TLV will be
considered more recent than NLRI without a BGP-LS Attribute or with a
BGP-LS Attribute that doesn't include the Sequence-Number TLV.
3. The final tie-breaker is the NLRI from the BGP Speaker with the
numerically largest BGP Router ID. numerically largest BGP Router ID.
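The difference between the two revisions' Phase-1 rules can be sketched in Python as follows. This is only an illustration under stated assumptions: the rules are taken to apply strictly in the order listed, the "BGP Speaker" in the final tie-breaker is taken to be the peer the NLRI was received from, and the names Candidate, preference_key and select_best are hypothetical. Setting use_sequence_number to True models the three -01 rules; False models the two -02 rules.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """One received copy of an NLRI (hypothetical model, not from the draft)."""
    originator_router_id: int               # BGP Router ID in the NLRI Node identifiers
    peer_router_id: int                     # BGP Router ID of the advertising speaker
    sequence_number: Optional[int] = None   # -01 only; None when the TLV is absent


def preference_key(c: Candidate, use_sequence_number: bool) -> Tuple:
    # Rule 1: a copy received directly from the originator is preferred.
    from_originator = c.peer_router_id == c.originator_router_id
    # Rule 2 (-01 only): highest sequence number wins; a copy carrying the
    # TLV is considered more recent than a copy without it.
    if use_sequence_number:
        recency = (c.sequence_number is not None, c.sequence_number or 0)
    else:
        recency = (False, 0)
    # Final tie-breaker: numerically largest BGP Router ID.
    return (from_originator, recency, c.peer_router_id)


def select_best(candidates: Sequence[Candidate], use_sequence_number: bool) -> Candidate:
    """Pick the preferred copy by comparing the rule tuple left to right."""
    return max(candidates, key=lambda c: preference_key(c, use_sequence_number))
```

For example, given two non-originator copies, one carrying a sequence number and one from a peer with a larger Router ID, the -01 rules pick the first and the -02 rules pick the second.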
The modified Decision Process with SPF algorithm uses the metric from The modified Decision Process with SPF algorithm uses the metric from
Link and Prefix NLRI Attribute TLVs [RFC7752]. As a result, any Link and Prefix NLRI Attribute TLVs [RFC7752]. As a result, any
attributes that would influence the Decision process defined in attributes that would influence the Decision process defined in
[RFC4271] like ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are [RFC4271] like ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are
ignored by the SPF algorithm. Furthermore, the NEXT_HOP attribute ignored by the SPF algorithm. Furthermore, the NEXT_HOP attribute
value is preserved and validated but otherwise ignored during the SPF value is preserved and validated but otherwise ignored during the SPF
or best-path. or best-path.
skipping to change at page 11, line 14 skipping to change at page 10, line 12
with malformed TLV, a BGP speaker SHOULD log an error for further with malformed TLV, a BGP speaker SHOULD log an error for further
analysis. analysis.
6. IANA Considerations 6. IANA Considerations
This document defines a couple AFI/SAFIs for BGP LS SPF operation and This document defines a couple AFI/SAFIs for BGP LS SPF operation and
requests IANA to assign the BGP-LS-SPF AFI 16388 / SAFI TBD1 and the requests IANA to assign the BGP-LS-SPF AFI 16388 / SAFI TBD1 and the
BGP-LS-SPF-VPN AFI 16388 / SAFI TBD2 as described in [RFC4750]. BGP-LS-SPF-VPN AFI 16388 / SAFI TBD2 as described in [RFC4750].
This document also defines two attribute TLV for BGP LS NLRI. We This document also defines two attribute TLV for BGP LS NLRI. We
request IANA to assign TLVs for the SPF capability and the Sequence request IANA to assign TLVs for the SPF capability from the "BGP-LS
Number from the "BGP-LS Node Descriptor, Link Descriptor, Prefix Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute
Descriptor, and Attribute TLVs" Registry. Additionally, IANA is TLVs" Registry. Additionally, IANA is requested to create a new
requested to create a new registry for "BGP-LS SPF Capability registry for "BGP-LS SPF Capability Algorithms" for the value of the
Algorithms" for the value of the algorithm both in the BGP-LS Node algorithm both in the BGP-LS Node Attribute TLV and the BGP SPF
Attribute TLV and the BGP SPF Capability. The initial assignments Capability. The initial assignments are:
are:
+-------------+-----------------------------------+ +-------------+-----------------------------------+
| Value(s) | Assignment Policy | | Value(s) | Assignment Policy |
+-------------+-----------------------------------+ +-------------+-----------------------------------+
| 0 | Reserved (not to be assigned) | | 0 | Reserved (not to be assigned) |
| | | | | |
| 1 | SPF | | 1 | SPF |
| | | | | |
| 2 | Strict SPF | | 2 | Strict SPF |
| | | | | |
 End of changes. 9 change blocks. 
80 lines changed or deleted 31 lines changed or added
