idnits 2.17.1

draft-keyupate-lsvr-applicability-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 13, 2018) is 2173 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-17) exists of
     draft-acee-idr-lldp-peer-discovery-02

  == Outdated reference: A later version (-05) exists of
     draft-li-dynamic-flooding-04

  == Outdated reference: A later version (-12) exists of
     draft-xu-idr-neighbor-autodiscovery-06

  == Outdated reference: A later version (-03) exists of
     draft-ymbk-lsvr-lsoe-00

  -- Obsolete informational reference (is this intentional?): RFC 7752
     (Obsoleted by RFC 9552)

     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

LSVR                                                            K. Patel
Internet-Draft                                              Arrcus, Inc.
Intended status: Informational                                 A. Lindem
Expires: November 14, 2018                                 Cisco Systems
                                                                S. Zandi
                                                                G. Dawra
                                                                Linkedin
                                                            May 13, 2018

  Usage and Applicability of Link State Vector Routing in Data Centers
                draft-keyupate-lsvr-applicability-01.txt

Abstract

   This document discusses the usage and applicability of Link State
   Vector Routing (LSVR) extensions in the CLOS architecture of Data
   Center Networks.  The document is intended to provide a simplified
   guide for the deployment of LSVR extensions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 14, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Recommended Reading
   4.  Common Deployment Scenario
   5.  Justification for BGP SPF Extension
   6.  LSVR Applicability to CLOS Networks
     6.1.  Usage of BGP-LS SAFI
       6.1.1.  Relationship to Other BGP AFI/SAFI Tuples
     6.2.  Peering Models
       6.2.1.  Bi-Connected Graph Heuristic
     6.3.  BGP Peer Discovery
     6.4.  Data Center Interconnect (DCI) Applicability
     6.5.  Non-CLOS/FAT Tree Topology Applicability
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document complements [I-D.keyupate-lsvr-bgp-spf] by discussing
   the applicability of the technology in a simple and fairly common
   deployment scenario, which is described in Section 4.

   After describing the deployment scenario, Section 5 will describe the
   reasons for BGP modifications for such deployments.

   Once the control plane routing protocol requirements are described,
   Section 6 will cover the LSVR protocol enhancements to BGP to meet
   these requirements and their applicability to Data Center CLOS
   networks.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Recommended Reading

   This document assumes knowledge of existing data center networks and
   data center network topologies [CLOS].  This document also assumes
   knowledge of data center routing protocols like BGP [RFC4271],
   BGP-SPF [I-D.keyupate-lsvr-bgp-spf], and OSPF [RFC2328], as well as
   data center OAM protocols like LLDP [RFC4957] and BFD [RFC5580].

4.  Common Deployment Scenario

   Within a data center, servers are commonly interconnected using the
   CLOS topology [CLOS].  The CLOS topology is fully non-blocking and is
   realized using Equal Cost Multipath (ECMP).  In a CLOS topology, the
   minimum number of parallel paths between two servers is determined by
   the width of a tier-1 stage, as shown in Figure 1.

   The following example illustrates a multistage CLOS topology.

                                Tier-1
                               +-----+
                               |NODE |
                            +->| 12  |--+
                            |  +-----+  |
                    Tier-2  |           |   Tier-2
                   +-----+  |  +-----+  |  +-----+
     +------------>|NODE |--+->|NODE |--+--|NODE |-------------+
     |       +-----|  9  |--+  | 10  |  +--| 11  |-----+       |
     |       |     +-----+     +-----+     +-----+     |       |
     |       |                                         |       |
     |       |     +-----+     +-----+     +-----+     |       |
     | +-----+---->|NODE |--+  |NODE |  +--|NODE |-----+-----+ |
     | |     | +---|  6  |--+->|  7  |--+--|  8  |---+ |     | |
     | |     | |   +-----+  |  +-----+  |  +-----+   | |     | |
     | |     | |            |           |            | |     | |
   +-----+ +-----+          |  +-----+  |          +-----+ +-----+
   |NODE | |NODE | Tier-3   +->|NODE |--+   Tier-3 |NODE | |NODE |
   |  1  | |  2  |             |  3  |             |  4  | |  5  |
   +-----+ +-----+             +-----+             +-----+ +-----+
     | |     | |                                     | |     | |
     A O     B O            <- Servers ->            Z O     O O

                Figure 1: Illustration of the basic CLOS

5.  Justification for BGP SPF Extension

   Many data centers use BGP as a routing protocol to create an overlay
   as well as an underlay network for their CLOS topologies to simplify
   layer-3 routing and operations [RFC7938].  However, BGP is a
   path-vector routing protocol.  Since it does not create a fabric
   topology, it uses hop-by-hop EBGP peering to facilitate hop-by-hop
   routing to create the underlay network and to resolve any overlay
   next hops.  The hop-by-hop BGP peering paradigm imposes several
   restrictions within a CLOS.  It severely limits the deployment of
   Route Reflectors/Route Controllers, as the EBGP peerings are in line
   with the data path.  The BGP best path algorithm is prefix-based, and
   it prevents the announcement of a prefix to other BGP speakers until
   the best path decision process has been performed for that prefix at
   each intermediate hop.  These restrictions significantly delay the
   overall convergence of the underlay network within a CLOS.

   The LSVR SPF modifications allow BGP to overcome these limitations.
   Furthermore, using the BGP-LS NLRI format [RFC7752] allows the LSVR
   data to be advertised for nodes, links, and prefixes in the BGP
   routing domain and used for SPF computations.

6.  LSVR Applicability to CLOS Networks

   With the BGP SPF extensions [I-D.keyupate-lsvr-bgp-spf], the BGP best
   path computation and route computation are replaced with OSPF-like
   algorithms [RFC2328] both to determine whether a BGP-LS NLRI has
   changed and needs to be re-advertised and to compute the routing
   table.  These modifications will significantly improve convergence of
   the underlay while affording the operational benefits of a single
   routing protocol [RFC7938].

   Data center controllers typically require visibility into the BGP
   topology to compute traffic-engineered paths.
   These controllers learn the topology and other relevant information
   via the BGP-LS address family [RFC7752], which is independent of the
   underlay address families (usually IPv4/IPv6 unicast).  Furthermore,
   in traditional BGP underlays, all the BGP routers need to advertise
   their BGP-LS information independently.  With the BGP SPF extensions,
   controllers can learn the topology using the same BGP advertisements
   used to compute the underlay routes.  In addition, these data center
   controllers can benefit from the convergence advantages of the BGP
   SPF extensions.  The placement of controllers can be outside of the
   forwarding path or within the forwarding path.

   Alternatively, as each and every router in the BGP SPF domain will
   have a complete view of the topology, the operator can also choose to
   configure BGP sessions in the hop-by-hop peering model described in
   [RFC7938] along with BFD [RFC5580].  In doing so, while the
   hop-by-hop peering model lacks the inherent benefits of the
   controller-based model, BGP updates need not be serialized by the BGP
   best path algorithm in either of these models.  This helps overall
   network convergence.

6.1.  Usage of BGP-LS SAFI

   The BGP SPF extensions [I-D.keyupate-lsvr-bgp-spf] define a new
   BGP-LS SAFI for announcement of BGP SPF link-state information.  The
   NLRI format and its associated attributes follow the format of BGP-LS
   for node, link, and prefix announcements.  Whether the peering model
   within a CLOS follows the hop-by-hop peering described in [RFC7938]
   or any controller-based or route-reflector peering, an operator can
   exchange BGP SPF SAFI routes over the BGP peering by simply
   configuring the BGP SPF SAFI between the necessary BGP speakers.

   The BGP-LS SPF SAFI can also co-exist with the BGP IP Unicast SAFI,
   which could exchange overlapping IP routes.
   The routes received by these SAFIs are evaluated, stored, and
   announced separately according to the rules of [RFC4760].  The
   tie-breaking of route installation is a matter of the local policies
   and preferences of the network operator.

   Finally, as the BGP SPF peering is done following the procedures
   described in [RFC4271], all the existing transport security
   mechanisms including [RFC5925] are available for the BGP-LS SPF SAFI.

6.1.1.  Relationship to Other BGP AFI/SAFI Tuples

   Normally, the BGP-LS AFI/SAFI is used solely to compute the underlay
   and is given preference over other AFI/SAFIs.  Other BGP SAFIs, e.g.,
   IPv4/IPv6 Unicast VPN, would use the BGP-SPF computed routes for next
   hop resolution.  However, if BGP-LS NLRI is also being advertised for
   controller consumption, there is no need to replicate the Node, Link,
   and Prefix NLRI in BGP-NLRI.  Rather, additional NLRI attributes can
   be advertised in the BGP-LS SPF AFI/SAFI as required.

6.2.  Peering Models

   As previously stated, BGP SPF can be deployed using the existing
   peering model where there is a single-hop BGP session on each and
   every link in the data center fabric [RFC7938].  This provides for
   both the advertisement of routes and the determination of link and
   neighboring switch availability.  With BGP SPF, the underlay will
   converge faster because the modified decision process allows NLRI
   changes to be readvertised once a change is detected.

   Alternatively, BFD [RFC5580] can be used to swiftly determine the
   availability of links, and the BGP peering model can be significantly
   sparser than the data center fabric.  BGP SPF sessions then need only
   be established with enough peers to provide a bi-connected graph.  If
   IBGP is used, then the BGP routers at tier N-1 will act as
   route-reflectors for the routers at tier N.

6.2.1.  Bi-Connected Graph Heuristic

   With this heuristic, discovery of BGP peers is assumed (Section 6.3).
   Additionally, it is assumed that the direction of the peering can be
   ascertained.  In the context of a data center fabric, direction is
   either northbound (toward the spine), southbound (toward the
   Top-Of-Rack (TOR) switches), or east-west (same level in the
   hierarchy).  The determination of the direction is beyond the scope
   of this document.  However, it would be reasonable to assume a
   technique where the TOR switches can be identified and the number of
   hops to the TOR is used to determine the direction.

   In this heuristic, BGP speakers allow passive session establishment
   for southbound BGP sessions.  For northbound sessions, BGP speakers
   will attempt to maintain two northbound BGP sessions with different
   switches (in data center fabrics there is normally a single layer-3
   connection anyway).  For east-west sessions, passive BGP session
   establishment is allowed.  However, a BGP speaker will never actively
   establish an east-west BGP session unless it can't establish two
   northbound BGP sessions.

6.3.  BGP Peer Discovery

   While BGP peer discovery is not part of [I-D.keyupate-lsvr-bgp-spf],
   there are at least three proposals for BGP peer discovery.  At least
   one of these mechanisms will be adopted and will be applicable to
   deployments other than the data center.  It is strongly RECOMMENDED
   that the accepted mechanism be used in conjunction with BGP SPF in
   data centers.  The BGP discovery mechanism should discover both peer
   addresses and endpoints for BFD discovery.  Additionally, it would be
   desirable to have a heuristic for determining whether the peer is at
   a tier above or below the discovering BGP speaker (refer to
   Section 6.2.1).
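
   As an illustration only, the session-selection rules of the heuristic
   in Section 6.2.1 can be sketched in Python as follows.  This is a
   minimal sketch under stated assumptions, not part of any
   specification: the names Peer, Direction, NORTHBOUND_TARGET,
   allows_passive, and select_active_sessions are hypothetical, the
   peering direction and a per-peer reachability flag (for example, as
   reported by BFD) are assumed to be supplied by the discovery
   mechanism, and falling back to just enough east-west sessions to
   reach two sessions in total is an assumption, since the text does not
   state how many east-west sessions are opened.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    NORTH = "northbound"      # toward the spine
    SOUTH = "southbound"      # toward the TOR switches
    EAST_WEST = "east-west"   # same level in the hierarchy


@dataclass(frozen=True)
class Peer:
    switch_id: str            # hypothetical identifier of the neighboring switch
    direction: Direction      # assumed known; how it is determined is out of scope
    reachable: bool = True    # assumed input, e.g. link state reported by BFD


# Two sessions toward different switches are what the heuristic aims for.
NORTHBOUND_TARGET = 2


def allows_passive(direction):
    """Passive establishment is allowed for southbound and east-west sessions.

    The text is silent on passive northbound sessions; returning False for
    them is an assumption of this sketch.
    """
    return direction in (Direction.SOUTH, Direction.EAST_WEST)


def select_active_sessions(peers):
    """Return the peers toward which this speaker actively opens sessions."""
    chosen = []
    seen = set()

    def take(direction):
        # Pick reachable peers of the given direction, one per switch,
        # until the target number of sessions is reached.
        for peer in peers:
            if len(chosen) == NORTHBOUND_TARGET:
                return
            if (peer.direction is direction and peer.reachable
                    and peer.switch_id not in seen):
                chosen.append(peer)
                seen.add(peer.switch_id)

    take(Direction.NORTH)
    # East-west sessions are opened actively only when two northbound
    # sessions could not be found (assumed fallback policy).
    take(Direction.EAST_WEST)
    return chosen
```

   For example, a TOR switch with reachable northbound peers "node6" and
   "node9" (hypothetical identifiers) would actively open exactly those
   two sessions; if "node6" became unreachable, the sketch would fall
   back to one reachable east-west peer, if any.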

   The BGP discovery mechanisms under consideration are
   [I-D.acee-idr-lldp-peer-discovery],
   [I-D.xu-idr-neighbor-autodiscovery], and [I-D.ymbk-lsvr-lsoe].

6.4.  Data Center Interconnect (DCI) Applicability

   Since BGP SPF is to be used for the routing underlay and DCI gateway
   boxes typically have direct or very simple connectivity, BGP external
   sessions would typically not include the BGP SPF SAFI.

6.5.  Non-CLOS/FAT Tree Topology Applicability

   The BGP SPF extensions [I-D.keyupate-lsvr-bgp-spf] can be used in
   other topologies and benefit from the inherent convergence
   improvements.  Additionally, sparse peering techniques may be
   utilized (Section 6.2).  However, determining whether or not to
   establish a BGP session is more complex, and the heuristic described
   in Section 6.2.1 cannot be used.  In such topologies, other
   techniques such as those described in [I-D.li-dynamic-flooding] may
   be employed.  One potential deployment would be the underlay for a
   Service Provider (SP) backbone where usage of a single protocol,
   i.e., BGP, is desired.

7.  IANA Considerations

   No IANA updates are requested by this document.

8.  Security Considerations

   This document introduces no new security considerations above and
   beyond those already specified in [RFC4271] and
   [I-D.keyupate-lsvr-bgp-spf].

9.  Acknowledgements

   The authors would like to thank Alvaro Retana and Yan Filyurin for
   their review and comments.

10.  References

10.1.  Normative References

   [I-D.keyupate-lsvr-bgp-spf]
              Patel, K., Lindem, A., Zandi, S., and W. Henderickx,
              "Shortest Path Routing Extensions for BGP Protocol",
              draft-keyupate-lsvr-bgp-spf-00 (work in progress), March
              2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

10.2.  Informative References

   [CLOS]     "A Study of Non-Blocking Switching Networks", The Bell
              System Technical Journal, Vol. 32(2), DOI
              10.1002/j.1538-7305.1953.tb01433.x, March 1953.

   [I-D.acee-idr-lldp-peer-discovery]
              Lindem, A., Patel, K., Zandi, S., Haas, J., and X. Xu,
              "BGP Logical Link Discovery Protocol (LLDP) Peer
              Discovery", draft-acee-idr-lldp-peer-discovery-02 (work in
              progress), December 2017.

   [I-D.li-dynamic-flooding]
              Li, T., "Dynamic Flooding on Dense Graphs",
              draft-li-dynamic-flooding-04 (work in progress), March
              2018.

   [I-D.xu-idr-neighbor-autodiscovery]
              Xu, X., Bi, K., Tantsura, J., Triantafillis, N., and K.
              Talaulikar, "BGP Neighbor Autodiscovery",
              draft-xu-idr-neighbor-autodiscovery-06 (work in progress),
              April 2018.

   [I-D.ymbk-lsvr-lsoe]
              Bush, R. and K. Patel, "Link State Over Ethernet",
              draft-ymbk-lsvr-lsoe-00 (work in progress), March 2018.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC4957]  Krishnan, S., Ed., Montavont, N., Njedjou, E., Veerepalli,
              S., and A. Yegin, Ed., "Link-Layer Event Notifications for
              Detecting Network Attachments", RFC 4957,
              DOI 10.17487/RFC4957, August 2007.

   [RFC5580]  Tschofenig, H., Ed., Adrangi, F., Jones, M., Lior, A., and
              B. Aboba, "Carrying Location Objects in RADIUS and
              Diameter", RFC 5580, DOI 10.17487/RFC5580, August 2009.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC7938]  Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of
              BGP for Routing in Large-Scale Data Centers", RFC 7938,
              DOI 10.17487/RFC7938, August 2016.

Authors' Addresses

   Keyur Patel
   Arrcus, Inc.
   2077 Gateway Pl
   San Jose, CA  95110
   USA

   Email: keyur@arrcus.com

   Acee Lindem
   Cisco Systems
   301 Midenhall Way
   Cary, NC  27513
   USA

   Email: acee@cisco.com

   Shawn Zandi
   Linkedin
   222 2nd Street
   San Francisco, CA  94105
   USA

   Email: szandi@linkedin.com

   Gaurav Dawra
   Linkedin
   222 2nd Street
   San Francisco, CA  94105
   USA

   Email: gdawra@linkedin.com