| < draft-ietf-lsvr-applicability-01.txt | draft-ietf-lsvr-applicability-02.txt > | |||
|---|---|---|---|---|
| LSVR K. Patel | LSVR K. Patel | |||
| Internet-Draft Arrcus, Inc. | Internet-Draft Arrcus, Inc. | |||
| Intended status: Informational A. Lindem | Intended status: Informational A. Lindem | |||
| Expires: April 25, 2019 Cisco Systems | Expires: November 2, 2019 Cisco Systems | |||
| S. Zandi | S. Zandi | |||
| G. Dawra | G. Dawra | |||
| October 22, 2018 | May 1, 2019 | |||
| Usage and Applicability of Link State Vector Routing in Data Centers | Usage and Applicability of Link State Vector Routing in Data Centers | |||
| draft-ietf-lsvr-applicability-01.txt | draft-ietf-lsvr-applicability-02.txt | |||
| Abstract | Abstract | |||
| This document discusses the usage and applicability of Link State | This document discusses the usage and applicability of Link State | |||
| Vector Routing (LSVR) extensions in the CLOS architecture of Data | Vector Routing (LSVR) extensions in data center networks utilizing | |||
| Center Networks. The document is intended to provide a simplified | CLOS or Fat-Tree topologies. The document is intended to provide a | |||
| guide for the deployment of LSVR extensions. | simplified guide for the deployment of LSVR extensions. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on April 25, 2019. | This Internet-Draft will expire on November 2, 2019. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3 | 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3 | |||
| 3. Recommended Reading . . . . . . . . . . . . . . . . . . . . . 3 | 3. Recommended Reading . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 4. Common Deployment Scenario . . . . . . . . . . . . . . . . . 3 | 4. Common Deployment Scenario . . . . . . . . . . . . . . . . . 3 | |||
| 5. Justification for BGP SPF Extension . . . . . . . . . . . . . 4 | 5. Justification for BGP SPF Extension . . . . . . . . . . . . . 4 | |||
| 6. LSVR Applicability to CLOS Networks . . . . . . . . . . . . . 5 | 6. LSVR Applicability to CLOS Networks . . . . . . . . . . . . . 5 | |||
| 6.1. Usage of BGP-LS SAFI . . . . . . . . . . . . . . . . . . 5 | 6.1. Usage of BGP-LS SPF SAFI . . . . . . . . . . . . . . . . 5 | |||
| 6.1.1. Relationship to Other BGP AFI/SAFI Tuples . . . . . . 6 | 6.1.1. Relationship to Other BGP AFI/SAFI Tuples . . . . . . 6 | |||
| 6.2. Peering Models . . . . . . . . . . . . . . . . . . . . . 6 | 6.2. Peering Models . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 6.2.1. Sparse Peering Model . . . . . . . . . . . . . . . . 6 | 6.2.1. Sparse Peering Model . . . . . . . . . . . . . . . . 6 | |||
| 6.2.2. Bi-Connected Graph Heuristic . . . . . . . . . . . . 7 | 6.2.2. Bi-Connected Graph Heuristic . . . . . . . . . . . . 7 | |||
| 6.3. BGP Peer Discovery . . . . . . . . . . . . . . . . . . . 7 | 6.3. BGP Spine/Leaf Topology Policy . . . . . . . . . . . . . 7 | |||
| 6.3.1. BGP Peer Discovery Requirements . . . . . . . . . . . 7 | 6.4. BGP Peer Discovery Requirements . . . . . . . . . . . . . 8 | |||
| 6.3.2. BGP Peer Discovery Alternatives . . . . . . . . . . . 8 | 6.5. BGP Peer Discovery . . . . . . . . . . . . . . . . . . . 9 | |||
| 6.3.3. Data Center Interconnect (DCI) Applicability . . . . 8 | 6.5.1. BGP Peer Discovery Alternatives . . . . . . . . . . . 9 | |||
| 6.4. Non-CLOS/FAT Tree Topology Applicability . . . . . . . . 9 | 6.5.2. Data Center Interconnect (DCI) Applicability . . . . 9 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 | 6.6. Non-CLOS/FAT Tree Topology Applicability . . . . . . . . 10 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | 7. BGP Policy Applicability . . . . . . . . . . . . . . . . . . 10 | |||
| 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 9 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 10 | |||
| 10.1. Normative References . . . . . . . . . . . . . . . . . . 9 | 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 10.2. Informative References . . . . . . . . . . . . . . . . . 10 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 11 | |||
| 11.2. Informative References . . . . . . . . . . . . . . . . . 11 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | ||||
| 1. Introduction | 1. Introduction | |||
| This document complements [I-D.ietf-lsvr-bgp-spf] by discussing the | This document complements [I-D.ietf-lsvr-bgp-spf] by discussing the | |||
| applicability of the technology in a simple and fairly common | applicability of the technology in a simple and fairly common | |||
| deployment scenario, which is described in Section 4. | deployment scenario, which is described in Section 4. | |||
| After describing the deployment scenario, Section 5 will describe the | After describing the deployment scenario, Section 5 will describe the | |||
| reasons for BGP modifications for such deployments. | reasons for BGP modifications for such deployments. | |||
| skipping to change at page 3, line 23 ¶ | skipping to change at page 3, line 23 ¶ | |||
| 3. Recommended Reading | 3. Recommended Reading | |||
| This document assumes knowledge of existing data center networks and | This document assumes knowledge of existing data center networks and | |||
| data center network topologies [CLOS]. This document also assumes | data center network topologies [CLOS]. This document also assumes | |||
| knowledge of data center routing protocols like BGP [RFC4271], | knowledge of data center routing protocols like BGP [RFC4271], | |||
| BGP-SPF [I-D.ietf-lsvr-bgp-spf], and OSPF [RFC2328], as well as data center | BGP-SPF [I-D.ietf-lsvr-bgp-spf], and OSPF [RFC2328], as well as data center | |||
| OAM protocols like LLDP [RFC4957] and BFD [RFC5580]. | OAM protocols like LLDP [RFC4957] and BFD [RFC5580]. | |||
| 4. Common Deployment Scenario | 4. Common Deployment Scenario | |||
| Within a Data Center, a common network design to interconnect servers | Within a Data Center, servers are commonly interconnected using the CLOS | |||
| is done using the CLOS topology [CLOS]. The CLOS topology is fully | topology [CLOS]. The CLOS topology is fully non-blocking and the | |||
| non-blocking and the topology is realized using Equal Cost Multipath | topology is realized using Equal Cost Multi-Path (ECMP). In a CLOS | |||
| (ECMP). In a CLOS topology, the minimum number of parallel paths | topology, the minimum number of parallel paths between two servers is | |||
| between two servers is determined by the width of a tier-1 stage as | determined by the width of a tier-1 stage as shown in Figure 1. | |||
| shown in Figure 1. | ||||
| The following example illustrates a multistage CLOS topology. | The following example illustrates a multi-stage CLOS topology. | |||
| Tier-1 | Tier-1 | |||
| +-----+ | +-----+ | |||
| |NODE | | |NODE | | |||
| +->| 12 |--+ | +->| 12 |--+ | |||
| | +-----+ | | | +-----+ | | |||
| Tier-2 | | Tier-2 | Tier-2 | | Tier-2 | |||
| +-----+ | +-----+ | +-----+ | +-----+ | +-----+ | +-----+ | |||
| +------------>|NODE |--+->|NODE |--+--|NODE |-------------+ | +------------>|NODE |--+->|NODE |--+--|NODE |-------------+ | |||
| | +-----| 9 |--+ | 10 | +--| 11 |-----+ | | | +-----| 9 |--+ | 10 | +--| 11 |-----+ | | |||
| skipping to change at page 4, line 33 ¶ | skipping to change at page 4, line 33 ¶ | |||
| | 1 | | 2 | | 3 | | 4 | | 5 | | | 1 | | 2 | | 3 | | 4 | | 5 | | |||
| +-----+ +-----+ +-----+ +-----+ +-----+ | +-----+ +-----+ +-----+ +-----+ +-----+ | |||
| | | | | | | | | | | | | | | | | | | |||
| A O B O <- Servers -> Z O O O | A O B O <- Servers -> Z O O O | |||
| Figure 1: Illustration of the basic CLOS | Figure 1: Illustration of the basic CLOS | |||
| 5. Justification for BGP SPF Extension | 5. Justification for BGP SPF Extension | |||
| In order to simplify layer-3 routing and operations [RFC7938], many | In order to simplify layer-3 routing and operations [RFC7938], many | |||
| data centers use BGP as a routing protocol to create an overlay as | data centers use BGP as a routing protocol to create both an underlay | |||
| well as an underlay network for their CLOS Topologies. However, BGP | and overlay network for their CLOS Topologies. However, BGP is a | |||
| is a path-vector routing protocol. Since it does not create a fabric | path-vector routing protocol. Since it does not create a fabric | |||
| topology, it uses hop-by-hop EBGP peering to facilitate hop-by-hop | topology, it uses hop-by-hop EBGP peering to facilitate hop-by-hop | |||
| routing to create the underlay network and to resolve any overlay | routing to create the underlay network and to resolve any overlay | |||
| next hops. The hop-by-hop BGP peering paradigm imposes several | next hops. The hop-by-hop BGP peering paradigm imposes several | |||
| restrictions within a CLOS. It severely restricts the deployment of | restrictions within a CLOS. It severely restricts the deployment of | |||
| Route Reflectors/Route Controllers as the EBGP sessions are congruent | Route Reflectors/Route Controllers as the EBGP sessions are congruent | |||
| with the data path. The BGP best path algorithm is prefix-based and | with the data path. The BGP best-path algorithm is prefix-based and | |||
| it prevents announcements of prefixes to other BGP speakers until the | it prevents announcements of prefixes to other BGP speakers until the | |||
| best path decision process is performed for the prefix at each | best-path decision process has been performed for the prefix at each | |||
| intermediate hop. These restrictions significantly delay the overall | intermediate hop. These restrictions significantly delay the overall | |||
| convergence of the underlay network within a CLOS. | convergence of the underlay network within a CLOS network. | |||
| The LSVR SPF modifications allow BGP to overcome these limitations. | The LSVR SPF modifications allow BGP to overcome these limitations. | |||
| Furthermore, using the BGP-LS NLRI format [RFC7752] allows the LSVR | Furthermore, using the BGP-LS NLRI format [RFC7752] allows the LSVR | |||
| data to be advertised for nodes, links, and prefixes in the BGP | data to be advertised for nodes, links, and prefixes in the BGP | |||
| routing domain and used for SPF computations. | routing domain and used for SPF computations. | |||
| 6. LSVR Applicability to CLOS Networks | 6. LSVR Applicability to CLOS Networks | |||
| With the BGP SPF extensions [I-D.ietf-lsvr-bgp-spf], the BGP best | With the BGP SPF extensions [I-D.ietf-lsvr-bgp-spf], the BGP best- | |||
| path computation and route computation are replaced with OSPF-like | path computation and route computation are replaced with OSPF-like | |||
| algorithms [RFC2328] both to determine whether a BGP-LS NLRI has | algorithms [RFC2328] both to determine whether a BGP-LS SPF NLRI has | |||
| changed and needs to be re-advertised and to compute the routing | changed and needs to be re-advertised and to compute the BGP routes. | |||
| table. These modifications will significantly improve convergence of | These modifications will significantly improve convergence of the | |||
| the underlay while affording the operational benefits of a single | underlay while affording the operational benefits of a single routing | |||
| routing protocol [RFC7938]. | protocol [RFC7938]. | |||
| Data center controllers typically require visibility to the BGP | Data center controllers typically require visibility to the BGP | |||
| topology to compute traffic-engineered paths. These controllers | topology to compute traffic-engineered paths. These controllers | |||
| learn the topology and other relevant information via the BGP-LS | learn the topology and other relevant information via the BGP-LS | |||
| address family [RFC7752] which is totally independent of the underlay | address family [RFC7752] which is totally independent of the underlay | |||
| address families (usually IPv4/IPv6 unicast). Furthermore, in | address families (usually IPv4/IPv6 unicast). Furthermore, in | |||
| traditional BGP underlays, all the BGP routers will need to advertise | traditional BGP underlays, all the BGP routers will need to advertise | |||
| their BGP-LS information independently. With the BGP SPF extensions, | their BGP-LS information independently. With the BGP SPF extensions, | |||
| controllers can learn the topology using the same BGP advertisements | controllers can learn the topology using the same BGP advertisements | |||
| used to compute the underlay routes. Furthermore, these data center | used to compute the underlay routes. Furthermore, these data center | |||
| controllers can avail themselves of the convergence advantages of the BGP SPF | controllers can avail themselves of the convergence advantages of the BGP SPF | |||
| extensions. The placement of controllers can be outside of the | extensions. The placement of controllers can be outside of the | |||
| forwarding path or within the forwarding path. | forwarding path or within the forwarding path. | |||
| Alternatively, as each and every router in the BGP SPF domain will | Alternatively, as each and every router in the BGP SPF domain will | |||
| have a complete view of the topology, the operator can also choose to | have a complete view of the topology, the operator can also choose to | |||
| configure BGP sessions in the hop-by-hop peering model described in | configure BGP sessions in the hop-by-hop peering model described in | |||
| [RFC7938] along with BFD [RFC5580]. In doing so, while the hop-by- | [RFC7938] along with BFD [RFC5580]. In doing so, while the hop-by- | |||
| hop peering model lacks inherent benefits of the controller-based | hop peering model lacks the inherent benefits of the controller-based | |||
| model, BGP updates need not be serialized by the BGP best path algorithm | model, BGP updates need not be serialized by the BGP best-path algorithm | |||
| in either of these models. This helps overall network convergence. | in either of these models. This helps overall network convergence. | |||
| 6.1. Usage of BGP-LS SAFI | 6.1. Usage of BGP-LS SPF SAFI | |||
| The BGP SPF extensions [I-D.ietf-lsvr-bgp-spf] define a new BGP-LS | The BGP SPF extensions [I-D.ietf-lsvr-bgp-spf] define a new BGP-LS | |||
| SAFI for announcement of BGP SPF link-state. The NLRI format and its | SPF SAFI for announcement of BGP SPF link-state. The NLRI format and | |||
| associated attributes follow the format of BGP-LS for node, link, and | its associated attributes follow the format of BGP-LS for node, link, | |||
| prefix announcements. Whether the peering model within a CLOS | and prefix announcements. Whether the peering model within a CLOS | |||
| follows hop-by-hop peering described in [RFC7938] or any controller- | follows hop-by-hop peering described in [RFC7938] or any controller- | |||
| based or route-reflector peering, an operator can exchange BGP SPF | based or route-reflector peering, an operator can exchange BGP SPF | |||
| SAFI routes over the BGP peering by simply configuring BGP SPF SAFI | SAFI routes over the BGP peering by simply configuring BGP SPF SAFI | |||
| between the necessary BGP speakers. | between the necessary BGP speakers. | |||
| The BGP-LS SPF SAFI can also co-exist with BGP IP Unicast SAFI which | The BGP-LS SPF SAFI can also co-exist with BGP IP Unicast SAFI which | |||
| could exchange overlapping IP routes. The routes received by these | could exchange overlapping IP routes. The routes received by these | |||
| SAFIs are evaluated, stored, and announced separately according to | SAFIs are evaluated, stored, and announced independently according to | |||
| the rules of [RFC4760]. The tie-breaking of route installation is a | the rules of [RFC4760]. The tie-breaking of route installation is a | |||
| matter of the local policies and preferences of the network operator. | matter of the local policies and preferences of the network operator. | |||
| Finally, as the BGP SPF peering is done following the procedures | Finally, as the BGP SPF peering is done following the procedures | |||
| described in [RFC4271], all the existing transport security | described in [RFC4271], all the existing transport security | |||
| mechanisms including [RFC5925] are available for the BGP-LS SPF SAFI. | mechanisms including [RFC5925] are available for the BGP-LS SPF SAFI. | |||
| 6.1.1. Relationship to Other BGP AFI/SAFI Tuples | 6.1.1. Relationship to Other BGP AFI/SAFI Tuples | |||
| Normally, the BGP-LS AFI/SAFI is used solely to compute the underlay | Normally, the BGP-LS AFI/SAFI is used solely to compute the underlay | |||
| and is given preference over other AFI/SAFIs. Other BGP SAFIs, e.g., | and is given preference over other AFI/SAFIs. Other BGP SAFIs, e.g., | |||
| IPv4/IPv6 Unicast VPN would use the BGP-SPF computed routes for next | IPv4/IPv6 Unicast VPN would use the BGP-SPF computed routes for next | |||
| hop resolution. However, if BGP-LS NLRI is also being advertised for | hop resolution. However, if BGP-LS NLRI is also being advertised for | |||
| controller consumption, there is no need to replicate the Node, Link, | controller consumption, there is no need to replicate the Node, Link, | |||
| and Prefix NLRI in BGP-NLRI. Rather, additional NLRI attributes can | and Prefix NLRI in BGP-NLRI. Rather, additional NLRI attributes can | |||
| be advertised in the BGP-LS SPF AFI/SAFI as required. | be advertised in the BGP-LS SPF AFI/SAFI as required. | |||
| 6.2. Peering Models | 6.2. Peering Models | |||
| As previously stated, BGP SPF can be deployed using the existing | As previously stated, BGP SPF can be deployed using the existing | |||
| peering model where there is a single hop BGP session on each and | peering model where there is a single-hop BGP session on each and | |||
| every link in the data center fabric [RFC7938]. This provides for | every link in the data center fabric [RFC7938]. This provides for | |||
| both the advertisement of routes and the determination of link and | both the advertisement of routes and the determination of link and | |||
| neighboring switch availability. With BGP SPF, the underlay will | neighboring switch availability. With BGP SPF, the underlay will | |||
| converge faster due to changes in the decision process which will | converge faster due to changes to the decision process that will | |||
| allow NLRI changes to be advertised faster after detecting a change. | allow NLRI changes to be advertised faster after detecting a change. | |||
| 6.2.1. Sparse Peering Model | 6.2.1. Sparse Peering Model | |||
| Alternatively, BFD [RFC5580] can be used to swiftly determine the | Alternatively, BFD [RFC5580] can be used to swiftly determine the | |||
| availability of links and the BGP peering model can be significantly | availability of links and the BGP peering model can be significantly | |||
| sparser than the data center fabric. BGP SPF sessions then only be | sparser than the data center fabric. BGP SPF sessions only need to | |||
| established with enough peers to provide a bi-connected graph. If | be established with enough peers to provide a bi-connected graph. If | |||
| IEBGP is used, then the BGP routers at tier N-1 will act as route- | IEBGP is used, then the BGP routers at tier N-1 will act as route- | |||
| reflectors for the routers at tier N. | reflectors for the routers at tier N. | |||
| The obvious usage of sparse peering is to avoid parallel sessions | The obvious usage of sparse peering is to avoid parallel sessions on | |||
| between the same two BGP speakers in the data center fabric. | links between the same two BGP speakers in the data center fabric. | |||
| However, this use case is not very useful since parallel layer-3 | However, this use case is not very useful since parallel layer-3 | |||
| links between the same two BGP routers are rare in CLOS or Fat-Tree | links between the same two BGP routers are rare in CLOS or Fat-Tree | |||
| topologies. Two more interesting scenarios are described below. | topologies. Two more interesting scenarios are described below. | |||
| In current Data Center topologies, there is often a very dense mesh | In current data center topologies, there is often a very dense mesh | |||
| of links between levels, e.g., leaf and spine, providing 32-way, | of links between levels, e.g., leaf and spine, providing 32-way, | |||
| 64-way, or more Equal-Cost Multi-Path (ECMP) paths. In these | 64-way, or more Equal-Cost Multi-Path (ECMP) paths. In these | |||
| topologies, it is desirable not to have a BGP session on every link | topologies, it is desirable not to have a BGP session on every link | |||
| and techniques such as the one described below Section 6.2.2 can be | and techniques such as the one described in Section 6.2.2 can be used | |||
| used to establish sessions on some subset of northbound links. | to establish sessions on some subset of northbound links. | |||
| Alternatively, controller-based data center topologies are envisioned | Alternatively, controller-based data center topologies are envisioned | |||
| where BGP speakers within the data center only establish BGP sessions | where BGP speakers within the data center only establish BGP sessions | |||
| with two or more controllers. In these topologies, fabric nodes | with two or more controllers. In these topologies, fabric nodes | |||
| below the first tier (using [RFC7938] hierarchy) will establish BGP | below the first tier (using [RFC7938] hierarchy) will establish BGP | |||
| multi-hop sessions with the controllers. For the multi-hop sessions, | multi-hop sessions with the controllers. For the multi-hop sessions, | |||
| determining the route to the controllers without depending on BGP | determining the route to the controllers without depending on BGP | |||
| would need to be through some other means beyond the scope of this | would need to be through some other means beyond the scope of this | |||
| document. However, the BGP discovery mechanisms Section 6.3 would be | document. However, the BGP discovery mechanisms described in | |||
| one possibility. | Section 6.5 would be one possibility. | |||
| 6.2.2. Bi-Connected Graph Heuristic | 6.2.2. Bi-Connected Graph Heuristic | |||
| With this heuristic, discovery of BGP peers is assumed Section 6.3. | With this heuristic, discovery of BGP peers is assumed, e.g., as | |||
| Additionally, it is assumed that the direction of the peering can be | described in Section 6.5. Additionally, it is assumed that the | |||
| ascertained. In the context of a data center fabric, direction is | direction of the peering can be ascertained. In the context of a | |||
| either northbound (toward the spine), southbound (toward the | data center fabric, direction is either northbound (toward the | |||
| Top-Of-Rack (TOR) switches), or east-west (same level in the hierarchy). The | spine), southbound (toward the Top-Of-Rack (TOR) switches), or | |||
| determination of the direction is beyond the scope of this document. | east-west (same level in the hierarchy). The determination of the direction is | |||
| However, it would be reasonable to assume a technique where the TOR | beyond the scope of this document. However, it would be reasonable | |||
| switches can be identified and the number of hops to the TOR is used | to assume a technique where the TOR switches can be identified and | |||
| to determine the direction. | the number of hops to the TOR is used to determine the direction. | |||
| In this heuristic, BGP speakers allow passive session establishment | In this heuristic, BGP speakers allow passive session establishment | |||
| for southbound BGP sessions. For northbound sessions, BGP speakers | for southbound BGP sessions. For northbound sessions, BGP speakers | |||
| will attempt to maintain two northbound BGP sessions with different | will attempt to maintain two northbound BGP sessions with different | |||
| switches (in data center fabrics there is normally a single layer-3 | switches (in data center fabrics there is normally a single layer-3 | |||
| connection anyway). For east-west sessions, passive BGP session | connection anyway). For east-west sessions, passive BGP session | |||
| establishment is allowed. However, a BGP speaker will never actively | establishment is allowed. However, a BGP speaker will never actively | |||
| establish an east-west BGP session unless it can't establish two | establish an east-west BGP session unless it can't establish two | |||
| northbound BGP sessions. | northbound BGP sessions. | |||
| 6.3. BGP Peer Discovery | 6.3. BGP Spine/Leaf Topology Policy | |||
| 6.3.1. BGP Peer Discovery Requirements | One of the advantages of using BGP SPF as the underlay protocol is | |||
| that BGP policy can be applied at any level. In Spine/Leaf | ||||
| topologies, it is not necessary to advertise BGP-LS NLRI received by | ||||
| leaves northbound to the spine nodes at the level above. If a common | ||||
| AS is used for the spine nodes, this can easily be accomplished with | ||||
| EBGP and a simple policy to filter advertisements from the leaves to | ||||
| the spine if the first AS in the AS path is the spine AS. | ||||
| In the figure below, the leaves would not advertise any NLRI with AS | ||||
| 64512 as the first AS in the AS path. | ||||
| +--------+ +--------+ +--------+ | ||||
| AS 64512 | | | | | | | ||||
| for Spine | Spine1 +----+ Spine2 +- ......... -+ SpineN | | ||||
| Nodes at | | | | | | | ||||
| this Level +-+-+-+-++ ++-+-+-+-+ +-+-+-+-++ | ||||
| +------+ | | | | | | | | | | | | ||||
| | +-----|-|-|------+ | | | | | | | | ||||
| | | +--|-|-|--------+-|-|-----------------+ | | | | ||||
| | | | | | | +---+ | | | | | | ||||
| | | | | | | | +--|-|-------------------+ | | | ||||
| | | | | | | | | | | +------+ +----+ | ||||
| | | | | | | | | | +--------------|----------+ | | ||||
| | | | | | | | | +-------------+ | | | | ||||
| | | | | | +----|--|----------------|--|--------+ | | | ||||
| | | | | +------|--|--------------+ | | | | | | ||||
| | | | +------+ | | | | | | | | | ||||
| ++--+--++ +-+-+--++ ++-+--+-+ ++-+--+-+ | ||||
| | Leaf1 |~~~~~~| Leaf2 | ........ | LeafX | | LeafY | | ||||
| +-------+ +-------+ +-------+ +-------+ | ||||
| Figure 2: Spine/Leaf Topology Policy | ||||
| 6.4. BGP Peer Discovery Requirements | ||||
| The most basic requirement is to be able to discover the address of a | The most basic requirement is to be able to discover the address of a | |||
| single-hop peer without pre-configuration. This is being | single-hop peer without pre-configuration. This is being | |||
| accomplished today using IPv6 Router Advertisements (RA) | accomplished today using IPv6 Router Advertisements (RA) | |||
| [RFC4861] and assuming that a BGP session is desired with any | [RFC4861] and assuming that a BGP session is desired with any | |||
| discovered peer. Beyond the basic requirement, it is useful to have | discovered peer. Beyond the basic requirement, it is useful to have | |||
| the following information relating to the BGP session: | the following information relating to the BGP session: | |||
| o Autonomous System (AS) and BGP Identifier of a potential peer. | o Autonomous System (AS) and BGP Identifier of a potential peer. | |||
| The latter can be used for debugging and to decrease the | The latter can be used for debugging and to decrease the | |||
| skipping to change at page 8, line 12 ¶ | skipping to change at page 9, line 4 ¶ | |||
| authentication, the security capabilities and possibly a key-chain | authentication, the security capabilities and possibly a key-chain | |||
| [RFC8177] to be used. | [RFC8177] to be used. | |||
| o Session Policy Identifier - A group number or name used to | o Session Policy Identifier - A group number or name used to | |||
| associate common session parameters with the peer. For example, | associate common session parameters with the peer. For example, | |||
| in a data center, BGP sessions with a Top of Rack (ToR) device | in a data center, BGP sessions with a Top of Rack (ToR) device | |||
| could have different parameters than BGP sessions between leaf and spine. | could have different parameters than BGP sessions between leaf and spine. | |||
| In a data center fabric, it is often useful to know whether a peer is | In a data center fabric, it is often useful to know whether a peer is | |||
| southbound (towards the servers) or northbound (towards the spine or | southbound (towards the servers) or northbound (towards the spine or | |||
| super-spine) Section 6.2.2. A potential requirement would also be to | super-spine), e.g., Section 6.2.2. A potential requirement would be | |||
| determine this dynamically. One mechanism, without specifying all | to determine this dynamically. One mechanism, without specifying all | |||
| the details, might be for the ToRs to be identified when installed | the details, might be for the ToRs to be identified when installed | |||
| and for the other switches in the fabric to determine their level | and for the other switches in the fabric to determine their level | |||
| based on the distance from the closest ToR. | based on the distance from the closest ToR. | |||
| If there are multiple links between BGP speakers or the links between | If there are multiple links between BGP speakers or the links between | |||
| BGP speakers are unnumbered, it is also useful to be able to | BGP speakers are unnumbered, it is also useful to be able to | |||
| establish multi-hop sessions using the loopback addresses. This will | establish multi-hop sessions using the loopback addresses. This will | |||
| often require the discovery protocol to install route(s) toward the | often require the discovery protocol to install route(s) toward the | |||
| potential peer loopback addresses prior to BGP session establishment. | potential peer loopback addresses prior to BGP session establishment. | |||
| Finally, a simple BGP discovery protocol could also be used to | Finally, a simple BGP discovery protocol could also be used to | |||
| establish a multi-hop session with one or more controllers by | establish a multi-hop session with one or more controllers by | |||
| advertising connectivity to one or more controllers. However, once | advertising connectivity to one or more controllers. However, once | |||
| the multi-hop session actually traverses multiple nodes, it is | the multi-hop session actually traverses multiple nodes, it is | |||
| bordering on a distance-vector routing protocol, and this is possibly | bordering on a distance-vector routing protocol, and this is possibly | |||
| not a good requirement for the discovery protocol. | not a good requirement for the discovery protocol. | |||
| 6.3.2. BGP Peer Discovery Alternatives | 6.5. BGP Peer Discovery | |||
| 6.5.1. BGP Peer Discovery Alternatives | ||||
| While BGP peer discovery is not part of [I-D.ietf-lsvr-bgp-spf], | While BGP peer discovery is not part of [I-D.ietf-lsvr-bgp-spf], | |||
| there are, at least, three proposals for BGP peer discovery. At | there are, at least, three proposals for BGP peer discovery. At | |||
| least one of these mechanisms will be adopted and will be applicable | least one of these mechanisms will be adopted and will be applicable | |||
| to deployments other than the data center. It is strongly | to deployments other than the data center. It is strongly | |||
| RECOMMENDED that the accepted mechanism be used in conjunction with | RECOMMENDED that the accepted mechanism be used in conjunction with | |||
| BGP SPF in data centers. The BGP discovery mechanism should | BGP SPF in data centers. The BGP discovery mechanism should | |||
| discover both peer addresses and endpoints for BFD discovery. | discover both peer addresses and endpoints for BFD discovery. | |||
| Additionally, it would be great if there were a heuristic for | Additionally, it would be great if there were a heuristic for | |||
| determining whether the peer is at a tier above or below the | determining whether the peer is at a tier above or below the | |||
| discovering BGP speaker (refer to Section 6.2.2). | discovering BGP speaker (refer to Section 6.2.2). | |||
| The BGP discovery mechanisms under consideration are | The BGP discovery mechanisms under consideration are | |||
| [I-D.acee-idr-lldp-peer-discovery], | [I-D.acee-idr-lldp-peer-discovery], | |||
| [I-D.xu-idr-neighbor-autodiscovery], and [I-D.ymbk-lsvr-lsoe]. | [I-D.xu-idr-neighbor-autodiscovery], and [I-D.ietf-lsvr-l3dl]. | |||
| 6.3.3. Data Center Interconnect (DCI) Applicability | 6.5.2. Data Center Interconnect (DCI) Applicability | |||
| Since BGP SPF is to be used for the routing underlay and DCI gateway | Since BGP SPF is to be used for the routing underlay and DCI gateway | |||
| boxes typically have direct or very simple connectivity, BGP external | boxes typically have direct or very simple connectivity, BGP external | |||
| sessions would typically not include the BGP SPF SAFI. | sessions would typically not include the BGP SPF SAFI. | |||
| 6.4. Non-CLOS/FAT Tree Topology Applicability | 6.6. Non-CLOS/FAT Tree Topology Applicability | |||
| The BGP SPF extensions [I-D.ietf-lsvr-bgp-spf] can be used in other | The BGP SPF extensions [I-D.ietf-lsvr-bgp-spf] can be used in other | |||
| topologies to obtain the inherent convergence improvements. | topologies to obtain the inherent convergence improvements. | |||
| Additionally, sparse peering techniques may be utilized (Section 6.2). | Additionally, sparse peering techniques may be utilized (Section 6.2). | |||
| However, determining whether or not to establish a BGP session is more | However, determining whether or not to establish a BGP session is more | |||
| complex and the heuristic described in Section 6.2.2 cannot be used. | complex and the heuristic described in Section 6.2.2 cannot be used. | |||
| In such topologies, other techniques such as those described in | In such topologies, other techniques such as those described in | |||
| [I-D.li-lsr-dynamic-flooding] may be employed. One potential | [I-D.ietf-lsr-dynamic-flooding] may be employed. One potential | |||
| deployment would be the underlay for a Service Provider (SP) backbone | deployment would be the underlay for a Service Provider (SP) backbone | |||
| where usage of a single protocol, i.e., BGP, is desired. | where usage of a single protocol, i.e., BGP, is desired. | |||
| 7. IANA Considerations | 7. BGP Policy Applicability | |||
| Existing BGP policy including aggregation and prefix filtering may be | ||||
| used in conjunction with the BGP-LS SPF SAFI. When aggregation | ||||
| policy is used, BGP-LS SPF prefix NLRI will be originated for the | ||||
| aggregate prefix and BGP-LS SPF prefix NLRI for components will be | ||||
| filtered. Additionally, link and node NLRI may be filtered and then | ||||
| abstracted by the prefix NLRI. | ||||
| When BGP policy is used with the BGP-LS SPF SAFI, BGP speakers in the | ||||
| BGP-LS SPF routing domain will not all have the same set of NLRI and | ||||
| will compute a different BGP local routing table. Consequently, care | ||||
| must be taken to assure routing is consistent and blackholes or | ||||
| routing loops do not ensue. However, this is no different than if | ||||
| traditional BGP routing using the IPv4 and IPv6 address families were | ||||
| used. | ||||
| 8. IANA Considerations | ||||
| No IANA updates are requested by this document. | No IANA updates are requested by this document. | |||
| 8. Security Considerations | 9. Security Considerations | |||
| This document introduces no new security considerations above and | This document introduces no new security considerations above and | |||
| beyond those already specified in [RFC4271] and | beyond those already specified in [RFC4271] and | |||
| [I-D.ietf-lsvr-bgp-spf]. | [I-D.ietf-lsvr-bgp-spf]. | |||
| 9. Acknowledgements | 10. Acknowledgements | |||
| The authors would like to thank Alvaro Retana and Yan Filyurin for | The authors would like to thank Alvaro Retana and Yan Filyurin for | |||
| the review and comments. | the review and comments. | |||
| 10. References | 11. References | |||
| 10.1. Normative References | 11.1. Normative References | |||
| [I-D.ietf-lsvr-bgp-spf] | [I-D.ietf-lsvr-bgp-spf] | |||
| Patel, K., Lindem, A., Zandi, S., and W. Henderickx, | Patel, K., Lindem, A., Zandi, S., and W. Henderickx, | |||
| "Shortest Path Routing Extensions for BGP Protocol", | "Shortest Path Routing Extensions for BGP Protocol", | |||
| draft-ietf-lsvr-bgp-spf-03 (work in progress), September | draft-ietf-lsvr-bgp-spf-04 (work in progress), December | |||
| 2018. | 2018. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | |||
| editor.org/info/rfc2119>. | editor.org/info/rfc2119>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 10.2. Informative References | 11.2. Informative References | |||
| [CLOS] "A Study of Non-Blocking Switching Networks", The Bell | [CLOS] "A Study of Non-Blocking Switching Networks", The Bell | |||
| System Technical Journal, Vol. 32(2), DOI | System Technical Journal, Vol. 32(2), DOI | |||
| 10.1002/j.1538-7305.1953.tb01433.x, March 1953. | 10.1002/j.1538-7305.1953.tb01433.x, March 1953. | |||
| [I-D.acee-idr-lldp-peer-discovery] | [I-D.acee-idr-lldp-peer-discovery] | |||
| Lindem, A., Patel, K., Zandi, S., Haas, J., and X. Xu, | Lindem, A., Patel, K., Zandi, S., Haas, J., and X. Xu, | |||
| "BGP Logical Link Discovery Protocol (LLDP) Peer | "BGP Logical Link Discovery Protocol (LLDP) Peer | |||
| Discovery", draft-acee-idr-lldp-peer-discovery-03 (work in | Discovery", draft-acee-idr-lldp-peer-discovery-04 (work in | |||
| progress), June 2018. | progress), December 2018. | |||
| [I-D.li-lsr-dynamic-flooding] | [I-D.ietf-lsr-dynamic-flooding] | |||
| Li, T., Psenak, P., Ginsberg, L., Przygienda, T., and D. | Li, T., Psenak, P., Ginsberg, L., Przygienda, T., Cooper, | |||
| Cooper, "Dynamic Flooding on Dense Graphs", draft-li-lsr- | D., Jalil, L., and S. Dontula, "Dynamic Flooding on Dense | |||
| dynamic-flooding-01 (work in progress), October 2018. | Graphs", draft-ietf-lsr-dynamic-flooding-00 (work in | |||
| progress), February 2019. | ||||
| [I-D.ietf-lsvr-l3dl] | ||||
| Bush, R., Austein, R., and K. Patel, "Layer 3 Discovery | ||||
| and Liveness", draft-ietf-lsvr-l3dl-00 (work in progress), | ||||
| April 2019. | ||||
| [I-D.xu-idr-neighbor-autodiscovery] | [I-D.xu-idr-neighbor-autodiscovery] | |||
| Xu, X., Talaulikar, K., Bi, K., Tantsura, J., and N. | Xu, X., Talaulikar, K., Bi, K., Tantsura, J., and N. | |||
| Triantafillis, "BGP Neighbor Discovery", draft-xu-idr- | Triantafillis, "BGP Neighbor Discovery", draft-xu-idr- | |||
| neighbor-autodiscovery-10 (work in progress), October | neighbor-autodiscovery-11 (work in progress), April 2019. | |||
| 2018. | ||||
| [I-D.ymbk-lsvr-lsoe] | ||||
| Bush, R. and K. Patel, "Link State Over Ethernet", draft- | ||||
| ymbk-lsvr-lsoe-01 (work in progress), July 2018. | ||||
| [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | |||
| DOI 10.17487/RFC2328, April 1998, <https://www.rfc- | DOI 10.17487/RFC2328, April 1998, <https://www.rfc- | |||
| editor.org/info/rfc2328>. | editor.org/info/rfc2328>. | |||
| [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | |||
| Border Gateway Protocol 4 (BGP-4)", RFC 4271, | Border Gateway Protocol 4 (BGP-4)", RFC 4271, | |||
| DOI 10.17487/RFC4271, January 2006, <https://www.rfc- | DOI 10.17487/RFC4271, January 2006, <https://www.rfc- | |||
| editor.org/info/rfc4271>. | editor.org/info/rfc4271>. | |||
| End of changes. 46 change blocks. | ||||
| 95 lines changed or deleted | 149 lines changed or added | |||
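The dynamic tier-determination idea mentioned in the peer discovery requirements (ToRs identified when installed, other switches deriving their level from the distance to the closest ToR) might be sketched as a multi-source breadth-first search. This is only an illustration, not a mechanism defined by either draft revision: the function names `assign_levels` and `direction`, the adjacency-map representation, and the example switch names are all assumptions made for the sketch.

```python
from collections import deque


def assign_levels(adjacency, tors):
    """Return {switch: level}, where level is the hop count to the closest ToR.

    adjacency: dict mapping each switch name to an iterable of neighbor names
               (hypothetical representation of discovered links).
    tors:      list of switches identified as ToRs when installed.
    """
    levels = {tor: 0 for tor in tors}      # every ToR is at level 0
    queue = deque(tors)                    # multi-source BFS starts at all ToRs
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in levels:     # first visit gives the shortest distance
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels


def direction(levels, local, peer):
    """Classify a peer relative to the local switch using the computed levels."""
    if levels[peer] > levels[local]:
        return "northbound"
    if levels[peer] < levels[local]:
        return "southbound"
    return "same-tier"


# Hypothetical three-tier fabric: two ToRs, two spines, one super-spine.
fabric = {
    "tor1": ["spine1", "spine2"],
    "tor2": ["spine1", "spine2"],
    "spine1": ["tor1", "tor2", "super1"],
    "spine2": ["tor1", "tor2", "super1"],
    "super1": ["spine1", "spine2"],
}
levels = assign_levels(fabric, ["tor1", "tor2"])
print(levels)
print(direction(levels, "spine1", "super1"))
print(direction(levels, "spine1", "tor1"))
```

In this sketch a peer with a higher level is treated as northbound and one with a lower level as southbound. A real deployment would also have to handle switches with no path to any ToR, which simply do not appear in the returned map here.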