idnits 2.17.1

draft-ietf-lsr-isis-spine-leaf-ext-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 18, 2018) is 1949 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO10589'

  == Outdated reference: A later version (-17) exists of
     draft-ietf-isis-reverse-metric-07

  ** Obsolete normative reference: RFC 5306 (Obsoleted by RFC 8706)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Networking Working Group                                         N. Shen
Internet-Draft                                               L. Ginsberg
Intended status: Standards Track                           Cisco Systems
Expires: June 21, 2019                                  S. Thyamagundalu
                                                       December 18, 2018

                 IS-IS Routing for Spine-Leaf Topology
                 draft-ietf-lsr-isis-spine-leaf-ext-00

Abstract

   This document describes a mechanism for routers and switches in a
   Spine-Leaf type topology to have non-reciprocal Intermediate System
   to Intermediate System (IS-IS) routing relationships between the
   leafs and spines.  The leaf nodes do not need to have the topology
   information of other nodes and exact prefixes in the network.  This
   extension also has application in the Internet of Things (IoT).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 21, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Motivations
   3.  Spine-Leaf (SL) Extension
     3.1.  Topology Examples
     3.2.  Applicability Statement
     3.3.  Spine-Leaf TLV
       3.3.1.  Spine-Leaf Sub-TLVs
         3.3.1.1.  Leaf-Set Sub-TLV
         3.3.1.2.  Info-Req Sub-TLV
       3.3.2.  Advertising IPv4/IPv6 Reachability
       3.3.3.  Advertising Connection to RF-Leaf Node
     3.4.  Mechanism
       3.4.1.  Pure CLOS Topology
     3.5.  Implementation and Operation
       3.5.1.  CSNP PDU
       3.5.2.  Overload Bit
       3.5.3.  Spine Node Hostname
       3.5.4.  IS-IS Reverse Metric
       3.5.5.  Spine-Leaf Traffic Engineering
       3.5.6.  Other End-to-End Services
       3.5.7.  Address Family and Topology
       3.5.8.  Migration
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   The IS-IS routing protocol defined by [ISO10589] has been widely
   deployed in provider networks, data centers and enterprise campus
   environments.  In the data center and enterprise switching networks,
   a Spine-Leaf topology is commonly used.  This document describes a
   mechanism where IS-IS routing can be optimized for a Spine-Leaf
   topology.

   In a Spine-Leaf topology, normally a leaf node connects to a number
   of spine nodes.  Data traffic going from one leaf node to another
   leaf node needs to pass through one of the spine nodes.  Also, the
   decision to choose one of the spine nodes is usually part of equal
   cost multi-path (ECMP) load sharing.  The spine nodes can be
   considered as gateway devices to reach destinations on other leaf
   nodes.  In this type of topology, the spine nodes have to know the
   topology and routing information of the entire network, but the leaf
   nodes only need to know how to reach the gateway devices, which are
   the spine nodes to which they are uplinked.

   This document describes the IS-IS Spine-Leaf extension that allows
   the spine nodes to have all the topology and routing information,
   while keeping the leaf nodes free of topology information other than
   the default gateway routing information.  The leaf nodes do not even
   need to run a Shortest Path First (SPF) calculation since they have
   no topology information.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Motivations

   o  The leaf nodes in a Spine-Leaf topology do not require complete
      topology and routing information of the entire domain since their
      forwarding decision is to use ECMP with spine nodes as default
      gateways.

   o  The spine nodes in a Spine-Leaf topology are richly connected to
      leaf nodes, which introduces significant flooding duplication if
      they flood all Link State PDUs (LSPs) to all the leaf nodes.  It
      saves both spine and leaf nodes' CPU and link bandwidth resources
      if flooding is blocked to leaf nodes.  For small Top of the Rack
      (ToR) leaf switches in data centers, it is meaningful to prevent
      full topology routing information and massive database flooding
      through those devices.

   o  When a spine node advertises a topology change, every leaf node
      connected to it will flood the update to all the other spine
      nodes, and those spine nodes will further flood them to all the
      leaf nodes, causing an O(n^2) flooding storm which is largely
      redundant.

   o  Similar to some of the overlay technologies which are popular in
      data centers, the edge devices (leaf nodes) may not need to
      contain all the routing and forwarding information on the device's
      control and forwarding planes.  "Conversational Learning" can be
      utilized to get the specific routing and forwarding information in
      the case of pure CLOS topology and in the event of link and node
      failures.

   o  Small devices and appliances of Internet of Things (IoT) can be
      considered as leafs in the routing topology sense.  They have CPU
      and memory constraints in design, and those IoT devices do not
      have to know the exact network topology and prefixes as long as
      there are ways to reach the cloud servers or other devices.

3.  Spine-Leaf (SL) Extension

3.1.  Topology Examples

   [Diagram not recoverable from the source text.  It shows spine nodes
   Spine1, Spine2, ..., SpineN connected to one another horizontally,
   with links from the spine nodes down to the leaf nodes Leaf1, Leaf2,
   ..., LeafX, LeafY.]

                    Figure 1: A Spine-Leaf Topology

   [Diagram not recoverable from the source text.  It shows spine nodes
   Spine1 and Spine2, with no link between them, and links from the
   spine nodes down to the leaf nodes Leaf1, Leaf2, Leaf3 and Leaf4.]

                       Figure 2: A CLOS Topology

3.2.  Applicability Statement

   This extension assumes the network is a Spine-Leaf topology, and it
   should not be applied in an arbitrary network setup.  The spine nodes
   can be viewed as the aggregation layer of the network, and the leaf
   nodes as the access layer of the network.  The leaf nodes use a load
   sharing algorithm with spine nodes as nexthops in routing and
   forwarding.

   This extension works when the spine nodes are inter-connected, and it
   also works with a pure CLOS or Fat Tree topology based network where
   the spines are NOT horizontally interconnected.

   Although the example diagram in Figure 1 shows a fully meshed Spine-
   Leaf topology, this extension also works in the case where they are
   partially meshed.
   For instance, leaf1 through leaf10 may be fully meshed with spine1
   through spine5 while leaf11 through leaf20 are fully meshed with
   spine4 through spine8, and all the spines are inter-connected in a
   redundant fashion.

   This extension can also work in a multi-level spine-leaf topology.
   The lower level spine node can be a 'leaf' node to the upper level
   spine node.  A spine-leaf 'Tier' can be exchanged with IS-IS hello
   packets to allow tier X to be connected with tier X+1 using this
   extension.  Normally tier-0 will be the TOR routers and switches if
   provisioned.

   This extension also works with normal IS-IS routing in a topology
   with more than two layers of spine and leaf.  For instance, in
   example diagrams Figure 1 and Figure 2, there can be another Core
   layer of routers/switches on top of the aggregation layer.  From an
   IS-IS routing point of view, the Core nodes are not affected by this
   extension and will have the complete topology and routing information
   just like the spine nodes.  To make the network even more scalable,
   the Core layer can operate as a level-2 IS-IS sub-domain while the
   Spine and Leaf layers stay in the level-1 IS-IS domain.

   This extension assumes the links between the spine and leaf nodes are
   point-to-point, or point-to-point over LAN [RFC5309].  The links
   among the spine nodes or the links between the leaf nodes can be of
   any type.

3.3.  Spine-Leaf TLV

   This extension introduces a new TLV, the Spine-Leaf TLV, which may be
   advertised in IS-IS Hello (IIH) PDUs, LSPs, or in Circuit Scoped Link
   State PDUs (CS-LSP) [RFC7356].  It is used by both spine and leaf
   nodes in this Spine-Leaf mechanism.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |            SL Flag            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | .. Optional Sub-TLVs
   +-+-+-+-+-+-+-+-+-....

   The fields of this TLV are defined as follows:

      Type: 1 octet.  Suggested value 150 (to be assigned by IANA).

      Length: 1 octet (2 + length of sub-TLVs).

      SL Flags: 16 bits

             0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            | Tier  |    Reserved     |T|R|L|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Tier: A value from 0 to 15.  It represents the spine-leaf
            tier level.  The value 15 is reserved to indicate the
            tier level is unknown.  This value is only valid when
            the 'T' bit (see below) is set.  If the 'T' bit is
            clear, this value MUST be set to zero on transmission,
            and it MUST be ignored on receipt.

         L bit (0x01): Only a leaf node sets this bit.  If the L bit
            is set in the SL flag, the node indicates it is in 'Leaf-
            Mode'.

         R bit (0x02): Only a spine node sets this bit.  If the R bit
            is set, the node indicates to the leaf neighbor that it
            can be used as the default route gateway.

         T bit (0x04): If set, the value in the "Tier" field (see
            above) is valid.

      Optional Sub-TLV: Not defined in this document, for future
         extension.

         Sub-TLVs MAY be included when the TLV is in a CS-LSP.
         Sub-TLVs MUST NOT be included when the TLV is in an IIH.

3.3.1.  Spine-Leaf Sub-TLVs

   If the data center topology is a pure CLOS or Fat Tree, there are no
   link connections among the spine nodes.  If we also assume there is
   not another Core layer on top of the aggregation layer, then the
   traffic from one leaf node to another may have a problem if there is
   a link outage between a spine node and a leaf node.
   For instance, in the diagram of Figure 2, if Leaf1 sends data traffic
   to Leaf3 through the Spine1 node, and the Spine1-Leaf3 link is down,
   the data traffic will be dropped on the Spine1 node.

   To address this issue, spine and leaf nodes may send/request specific
   reachability information via the sub-TLVs defined below.

   Two Spine-Leaf sub-TLVs are defined: the Leaf-Set sub-TLV and the
   Info-Req sub-TLV.

3.3.1.1.  Leaf-Set Sub-TLV

   This sub-TLV is used by spine nodes to optionally advertise Leaf
   neighbors to other Leaf nodes.  The fields of this sub-TLV are
   defined as follows:

      Type: 1 octet.  Suggested value 1 (to be assigned by IANA).

      Length: 1 octet.  It MUST be a multiple of 6 octets.

      Leaf-Set: A list of IS-IS System-IDs of the leaf node neighbors of
         this spine node.

3.3.1.2.  Info-Req Sub-TLV

   This sub-TLV is used by leaf nodes to request the advertisement of
   more specific prefix information from a selected spine node.  The
   list of leaf nodes in this sub-TLV reflects the current set of leaf
   nodes for which not all spine node neighbors have indicated the
   presence of connectivity in the Leaf-Set sub-TLV (see
   Section 3.3.1.1).  The fields of this sub-TLV are defined as follows:

      Type: 1 octet.  Suggested value 2 (to be assigned by IANA).

      Length: 1 octet.  It MUST be a multiple of 6 octets.

      Info-Req: List of IS-IS System-IDs of leaf nodes for which
         connectivity information is being requested.

3.3.2.  Advertising IPv4/IPv6 Reachability

   In cases where connectivity between a leaf node and a spine node is
   down, the leaf node MAY request reachability information from a spine
   node as described in Section 3.3.1.2.  The spine node utilizes TLVs
   135 [RFC5305] and TLVs 236 [RFC5308] to advertise this information.
   These TLVs MAY be included either in IIHs or CS-LSPs [RFC7356] sent
   from the spine to the requesting leaf node.
   Sending such information in IIHs has limited scale: all reachability
   information MUST fit within a single IIH.  It is therefore
   recommended that CS-LSPs be used.

3.3.3.  Advertising Connection to RF-Leaf Node

   For links between Spine and Leaf Nodes on which the Spine Node has
   set the R-bit and the Leaf node has set the L-bit in their respective
   Spine-Leaf TLVs, spine nodes may advertise the link with a bit in the
   "link-attribute" sub-TLV [RFC5029] to indicate that this link is not
   used for LSP flooding.  This information can be used by nodes
   computing a flooding topology, e.g., [DYNAMIC-FLOODING], to exclude
   the RF-Leaf nodes from the computed flooding topology.

3.4.  Mechanism

   Leaf nodes in a spine-leaf application using this extension are
   provisioned with two attributes:

   1) Tier level of 0.  This indicates the node is a Leaf Node.  The
      value 0 is advertised in the Tier field of the Spine-Leaf TLV
      defined above.

   2) Flooding reduction enabled/disabled.  If flooding reduction is
      enabled, the L-bit is set to one in the Spine-Leaf TLV defined
      above.

   A spine node does not need explicit configuration.  Spine nodes can
   dynamically discover their tier level by computing the number of hops
   to a leaf node.  Until a spine node determines its tier level it MUST
   advertise level 15 (unknown tier level) in the Spine-Leaf TLV defined
   above.  Each tier level can also be statically provisioned on the
   node.

   When a spine node receives an IIH which includes the Spine-Leaf TLV
   with Tier level 0 and the 'L' bit set, it labels the point-to-point
   interface and adjacency to be a 'Reduced Flooding Leaf-Peer (RF-
   Leaf)'.  IIHs sent by a spine node on a link to an RF-Leaf include
   the Spine-Leaf TLV with the 'R' bit set in the flags field.  The 'R'
   bit indicates to the RF-Leaf neighbor that the spine node can be used
   as a default routing nexthop.
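
   As an illustration of the TLV layout in Section 3.3 and the RF-Leaf
   labeling rule above, the following is a minimal Python sketch, not a
   definitive implementation.  The names SpineLeafTLV, encode_sl_tlv,
   decode_sl_tlv and is_rf_leaf are hypothetical.  The sketch assumes
   that the Tier field occupies the four high-order bits of the 16-bit
   SL Flags field, that the field is sent in network byte order, and
   that an RF-Leaf is only recognized when the 'T' bit marks the Tier
   value as valid.  The type code 150 is only the suggested value.

```python
import struct
from dataclasses import dataclass

SL_TLV_TYPE = 150          # suggested value only; not assigned by IANA
L_BIT, R_BIT, T_BIT = 0x01, 0x02, 0x04
TIER_UNKNOWN = 15          # reserved Tier value meaning "unknown"


@dataclass
class SpineLeafTLV:        # hypothetical container, not part of the draft
    tier: int = 0
    t_bit: bool = False
    r_bit: bool = False
    l_bit: bool = False
    sub_tlvs: bytes = b""


def encode_sl_tlv(tlv: SpineLeafTLV) -> bytes:
    """Pack Type, Length and SL Flags, followed by any sub-TLV bytes."""
    # Tier MUST be zero on transmission when the T bit is clear.
    tier = tlv.tier & 0xF if tlv.t_bit else 0
    flags = tier << 12     # assumption: Tier is the high-order nibble
    flags |= T_BIT if tlv.t_bit else 0
    flags |= R_BIT if tlv.r_bit else 0
    flags |= L_BIT if tlv.l_bit else 0
    length = 2 + len(tlv.sub_tlvs)   # 2 octets of flags + sub-TLVs
    return struct.pack("!BBH", SL_TLV_TYPE, length, flags) + tlv.sub_tlvs


def decode_sl_tlv(data: bytes) -> SpineLeafTLV:
    """Parse one Spine-Leaf TLV from the start of data."""
    if len(data) < 4:
        raise ValueError("truncated Spine-Leaf TLV")
    tlv_type, length, flags = struct.unpack_from("!BBH", data)
    if tlv_type != SL_TLV_TYPE or length < 2 or len(data) < 2 + length:
        raise ValueError("not a well-formed Spine-Leaf TLV")
    t_bit = bool(flags & T_BIT)
    # Tier MUST be ignored on receipt when the T bit is clear.
    tier = (flags >> 12) & 0xF if t_bit else 0
    return SpineLeafTLV(tier, t_bit, bool(flags & R_BIT),
                        bool(flags & L_BIT), data[4:2 + length])


def is_rf_leaf(tlv: SpineLeafTLV) -> bool:
    """Spine-side check: Tier 0 with the L bit set marks an RF-Leaf."""
    return tlv.t_bit and tlv.tier == 0 and tlv.l_bit
```

   A spine node following this sketch would call decode_sl_tlv on the
   TLV found in a received IIH and, when is_rf_leaf returns True, mark
   the adjacency as RF-Leaf and set r_bit in the Spine-Leaf TLV of its
   own IIHs on that link.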

   There is no change to the IS-IS adjacency bring-up mechanism for
   Spine-Leaf peers.

   A spine node blocks LSP flooding to RF-Leaf adjacencies, except for
   the LSP PDUs in which the IS-IS System-ID matches the System-ID of
   the RF-Leaf neighbor.  This exception is needed since when the leaf
   node reboots, the spine node needs to forward to the leaf node non-
   purged LSPs from the RF-Leaf's previous incarnation.

   Leaf nodes will perform IS-IS LSP flooding as normal over all of
   their IS-IS adjacencies, but in the case of RF-Leafs only self-
   originated LSPs will exist in their LSP databases.

   Spine nodes will receive all the LSP PDUs in the network, including
   those of all the spine nodes and leaf nodes.  They will perform
   Shortest Path First (SPF) as a normal IS-IS node does.  There is no
   change to the route calculation and forwarding on the spine nodes.

   The LSPs of a node are flooded only northbound towards the upper
   layer spine nodes.  The default route is generated with load sharing
   also towards the upper layer spine nodes.

   An RF-Leaf node does not have any LSP in the network except for its
   own.  Therefore there is no need to perform SPF calculation on the
   RF-Leaf node.  It only needs to download the default route with the
   nexthops of those Spine Neighbors which have the 'R' bit set in the
   Spine-Leaf TLV in IIH PDUs.  IS-IS can perform equal cost or unequal
   cost load sharing while using the spine nodes as nexthops.  The
   aggregated metric of the outbound interface and the 'Reverse Metric'
   [REVERSE-METRIC] can be used for this purpose.

3.4.1.  Pure CLOS Topology

   In a data center where the topology is pure CLOS or Fat Tree, there
   is no interconnection among the spine nodes, and there is not another
   Core layer above the aggregation layer with reachability to the leaf
   nodes.
   When flooding reduction to RF-Leafs is in use, if the link between a
   spine and a leaf goes down, there is then a possibility of black
   holing the data traffic in the network.

   As in the diagram Figure 2, if the link Spine1-Leaf3 goes down, there
   needs to be a way for Leaf1, Leaf2 and Leaf4 to avoid Spine1 if the
   destination of the data traffic is Leaf3.

   In the above example, Spine1 and Spine2 are provisioned to advertise
   the Leaf-Set sub-TLV of the Spine-Leaf TLV.  Originally both Spines
   will advertise Leaf1 through Leaf4 as their Leaf-Set.  When the
   Spine1-Leaf3 link is down, Spine1 will only have Leaf1, Leaf2 and
   Leaf4 in its Leaf-Set.  This allows the other leaf nodes to know that
   Spine1 has lost connectivity to Leaf3.

   Each RF-Leaf node can select another spine node from which to request
   prefix information associated with the lost leaf node.  In the
   diagram of Figure 2, there are only two spine nodes (a Spine-Leaf
   topology can have more than two spine nodes in general).  Each RF-
   Leaf node can independently select a spine node for the leaf
   information.  The RF-Leaf nodes will include the Info-Req sub-TLV in
   the Spine-Leaf TLV in hellos sent to the selected spine node, Spine2
   in this case.

   The spine node, upon receiving the request from one or more leaf
   nodes, will find the IPv6/IPv4 prefixes advertised by the leaf nodes
   listed in the Info-Req sub-TLV.  The spine node will use the
   mechanism defined in Section 3.3.2 to advertise these prefixes to the
   RF-Leaf node.  For instance, it will include the IPv4 loopback prefix
   of leaf3 based on the policy configured or administrative tag
   attached to the prefixes.  When the leaf nodes receive the more
   specific prefixes, they will install the advertised prefixes towards
   the other spine nodes (Spine2 in this example).
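
   The leaf-side procedure above can be sketched as follows.  This is a
   hedged Python illustration only: the function names
   compute_info_req and encode_info_req are hypothetical, and the rule
   for selecting a spine (the first candidate in sorted order) is an
   assumption, since the text leaves the selection to each RF-Leaf.
   The sub-TLV type 2 is only the suggested value.

```python
INFO_REQ_SUBTLV = 2   # suggested value only; not assigned by IANA


def compute_info_req(leaf_sets, self_id):
    """Map each selected spine to the leaf System-IDs to request from it.

    leaf_sets: dict of spine name -> iterable of 6-octet leaf System-IDs
               taken from that spine's Leaf-Set sub-TLV.
    self_id:   System-ID of the local leaf node.
    """
    sets = {spine: set(ids) for spine, ids in leaf_sets.items()}
    if not sets:
        return {}
    all_leaves = set().union(*sets.values())
    common = set.intersection(*sets.values())
    # Leaves that at least one spine, but not every spine, can reach.
    partial = all_leaves - common - {self_id}
    requests = {}
    for leaf in sorted(partial):
        candidates = sorted(s for s, ids in sets.items() if leaf in ids)
        spine = candidates[0]          # assumption: any policy may be used
        requests.setdefault(spine, []).append(leaf)
    return requests


def encode_info_req(system_ids):
    """Build an Info-Req sub-TLV; Length is a multiple of 6 octets."""
    if any(len(sid) != 6 for sid in system_ids):
        raise ValueError("System-IDs must be 6 octets long")
    body = b"".join(system_ids)
    if len(body) > 255:
        raise ValueError("too many System-IDs for a 1-octet Length")
    return bytes([INFO_REQ_SUBTLV, len(body)]) + body


# Figure 2 example after the Spine1-Leaf3 link goes down, seen from Leaf1.
LEAF1, LEAF2, LEAF3, LEAF4 = (bytes([0, 0, 0, 0, 0, n]) for n in (1, 2, 3, 4))
leaf_sets = {
    "Spine1": [LEAF1, LEAF2, LEAF4],
    "Spine2": [LEAF1, LEAF2, LEAF3, LEAF4],
}
requests = compute_info_req(leaf_sets, LEAF1)
```

   In this example only Spine2 still lists Leaf3, so Leaf1 would place
   encode_info_req(requests["Spine2"]) in the Spine-Leaf TLV of the
   hellos it sends to Spine2.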

   For instance, in the data center overlay scenario, when any IP
   destination or MAC destination uses leaf3's loopback as the tunnel
   nexthop, the overlay tunnel from leaf nodes will only select Spine2
   as the gateway to reach leaf3 as long as the Spine1-Leaf3 link is
   still down.

   In cases where multiple links or nodes fail at the same time, the RF-
   Leaf node may need to send the Info-Req to multiple upper layer spine
   nodes in order to obtain reachability information for all the
   partially connected nodes.

   This negative routing is more useful between tier 0 and tier 1 spine-
   leaf levels in a multi-level spine-leaf topology when the reduced
   flooding extension is in use.  Nodes in tiers 1 or greater may have
   much richer topology information and alternative paths.

3.5.  Implementation and Operation

3.5.1.  CSNP PDU

   In the Spine-Leaf extension, the Complete Sequence Number PDU (CSNP)
   does not need to be transmitted over the Spine-Leaf link to an RF-
   Leaf.  Some IS-IS implementations send periodic CSNPs after the
   initial adjacency bring-up over a point-to-point interface.  There is
   no need for this optimization here since the RF-Leaf does not need to
   receive any other LSPs from the network, and the only LSP transmitted
   across the Spine-Leaf link is the leaf node LSP.

   Also in the graceful restart case [RFC5306], for the same reason,
   there is no need to send the CSNPs over the Spine-Leaf interface to
   an RF-Leaf.  Spine nodes only need to set the SRMflag on the LSPs
   belonging to the RF-Leaf.

3.5.2.  Overload Bit

   The leaf node SHOULD set the 'overload' bit on its LSP PDU, since if
   the spine nodes were to forward traffic not meant for the local node,
   the leaf node does not have the topology information to prevent a
   routing/forwarding loop.

3.5.3.  Spine Node Hostname

   This extension creates a non-reciprocal relationship between the
   spine node and leaf node.
   The spine node will receive the leaf's LSP and will know the leaf's
   hostname, but the leaf does not have the spine's LSP.  This extension
   allows the Dynamic Hostname TLV [RFC5301] to be optionally included
   in the spine's IIH PDU when sending to a 'Leaf-Peer'.  This is useful
   in troubleshooting cases.

3.5.4.  IS-IS Reverse Metric

   This metric is part of the aggregated metric for the leaf's default
   route installation with load sharing among the spine nodes.  When a
   spine node is in 'overload' condition, it should use the IS-IS
   Reverse Metric TLV in IIH [REVERSE-METRIC] to set this metric to
   maximum to discourage the leaf from using it as part of the load
   sharing.

   In some cases, certain spine nodes may have less bandwidth in link
   provisioning or in real-time condition, and they can use this metric
   to signal this to the leaf nodes dynamically.

   In other cases, such as when the spine node loses a link to a
   particular leaf node, although it can redirect the traffic to other
   spine nodes to reach that destination leaf node, it MAY want to
   increase this metric value if the inter-spine connection becomes over
   utilized, or the latency becomes an issue.

   In the leaf-leaf link as a backup gateway use case, the 'Reverse
   Metric' SHOULD always be set to a very high value.

3.5.5.  Spine-Leaf Traffic Engineering

   Besides using the IS-IS Reverse Metric by the spine nodes to affect
   the traffic pattern for leaf default gateway towards multiple spine
   nodes, the IPv6/IPv4 Info-Advertise sub-TLVs can be selectively used
   by traffic engineering controllers to move data traffic around the
   data center fabric to alleviate congestion and to reduce the latency
   of a certain class of traffic pairs.  By injecting more specific leaf
   node prefixes, it will allow the spine nodes to attract more traffic
   on some underutilized links.

3.5.6.  Other End-to-End Services

   Losing the topology information will have an impact on some of the
   end-to-end network services, for instance, MPLS TE or end-to-end
   segment routing.  Other mechanisms, such as a PCE [RFC4655] based
   solution, may be used.  In this Spine-Leaf extension, the role of the
   leaf node is not much different from multi-level IS-IS routing, where
   the level-1 IS-IS nodes only have the default route information
   towards the node which has the Attach Bit (ATT) set, and the level-2
   backbone does not have any topology information of the level-1 areas.
   The exact mechanism to enable certain end-to-end network services in
   a Spine-Leaf network is outside the scope of this document.

3.5.7.  Address Family and Topology

   IPv6 address family [RFC5308], Multi-Topology (MT) [RFC5120] and
   Multi-Instance (MI) [RFC8202] information is carried over the IIH
   PDU.  Since the goal is to simplify the operation of the IS-IS
   network, for the simplicity of this extension, the Spine-Leaf
   mechanism is applied the same way to all the address families, MTs
   and MIs.

3.5.8.  Migration

   For this extension to be deployed in existing networks, a simple
   migration scheme is needed.  To support any leaf node in the network,
   all the involved spine nodes have to be upgraded first.  So the first
   step is to migrate all the involved spine nodes to support this
   extension, then the leaf nodes can be enabled with 'Leaf-Mode' one by
   one.  No flag day is needed for the extension migration.

4.  IANA Considerations

   A new TLV codepoint is defined in this document and needs to be
   assigned by IANA from the "IS-IS TLV Codepoints" registry.  It is
   referred to as the Spine-Leaf TLV and the suggested value is 150.
   This TLV is only to be optionally inserted either in the IIH PDU or
   in the Circuit Flooding Scoped LSP PDU.
   IANA is also requested to maintain the SL-flag bit values in this
   TLV; the 0x01, 0x02 and 0x04 bits are defined in this document.

      Value  Name                   IIH  LSP  SNP  Purge  CS-LSP
      -----  ---------------------  ---  ---  ---  -----  -------
      150    Spine-Leaf             y    y    n    n      y

   This extension also proposes to have the Dynamic Hostname TLV,
   already assigned as code 137, to be allowed in the IIH PDU.

      Value  Name                   IIH  LSP  SNP  Purge
      -----  ---------------------  ---  ---  ---  -----
      137    Dynamic Name           y    y    n    y

   Two new sub-TLVs are defined in this document and need to be assigned
   by IANA from the "IS-IS TLV Codepoints".  They are referred to in
   this document as the Leaf-Set sub-TLV and the Info-Req sub-TLV.  It
   is suggested to have the values 1 and 2 respectively.

   This document also requests that IANA allocate from the registry of
   link-attribute bit values for sub-TLV 19 of TLV 22 (Extended IS
   reachability TLV).  This new bit is referred to as the "Connect to
   RF-Leaf Node" bit.

      Value  Name                      Reference
      -----  -----                     ----------
      0x3    Connect to RF-Leaf Node   This document

5.  Security Considerations

   Security concerns for IS-IS are addressed in [ISO10589], [RFC5304],
   [RFC5310], and [RFC7602].  This extension does not raise additional
   security issues.

6.  Acknowledgments

   The authors would like to thank Tony Przygienda for his discussion
   and contributions.  The authors also would like to thank Acee Lindem,
   Russ White, Christian Hopps and Aijun Wang for their review of and
   comments on this document.

7.  References

7.1.  Normative References

   [ISO10589]
              ISO "International Organization for Standardization",
              "Intermediate system to Intermediate system intra-domain
              routeing information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473), ISO/IEC
              10589:2002, Second Edition.", Nov 2002.

   [REVERSE-METRIC]
              Shen, N., Amante, S., and M. Abrahamsson, "IS-IS Routing
              with Reverse Metric", draft-ietf-isis-reverse-metric-07
              (work in progress), 2017.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5029]  Vasseur, JP. and S. Previdi, "Definition of an IS-IS Link
              Attribute Sub-TLV", RFC 5029, DOI 10.17487/RFC5029,
              September 2007.

   [RFC5120]  Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
              Topology (MT) Routing in Intermediate System to
              Intermediate Systems (IS-ISs)", RFC 5120,
              DOI 10.17487/RFC5120, February 2008.

   [RFC5301]  McPherson, D. and N. Shen, "Dynamic Hostname Exchange
              Mechanism for IS-IS", RFC 5301, DOI 10.17487/RFC5301,
              October 2008.

   [RFC5304]  Li, T. and R. Atkinson, "IS-IS Cryptographic
              Authentication", RFC 5304, DOI 10.17487/RFC5304, October
              2008.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
              2008.

   [RFC5306]  Shand, M. and L. Ginsberg, "Restart Signaling for IS-IS",
              RFC 5306, DOI 10.17487/RFC5306, October 2008.

   [RFC5308]  Hopps, C., "Routing IPv6 with IS-IS", RFC 5308,
              DOI 10.17487/RFC5308, October 2008.

   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
              and M. Fanto, "IS-IS Generic Cryptographic
              Authentication", RFC 5310, DOI 10.17487/RFC5310, February
              2009.

   [RFC7356]  Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
              Scope Link State PDUs (LSPs)", RFC 7356,
              DOI 10.17487/RFC7356, September 2014.

   [RFC7602]  Chunduri, U., Lu, W., Tian, A., and N. Shen, "IS-IS
              Extended Sequence Number TLV", RFC 7602,
              DOI 10.17487/RFC7602, July 2015.

   [RFC8202]  Ginsberg, L., Previdi, S., and W. Henderickx, "IS-IS
              Multi-Instance", RFC 8202, DOI 10.17487/RFC8202, June
              2017.

7.2.  Informative References

   [DYNAMIC-FLOODING]
              Li, T., "Dynamic Flooding on Dense Graphs", draft-li-
              dynamic-flooding (work in progress), 2018.

   [RFC4655]  Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
              Element (PCE)-Based Architecture", RFC 4655,
              DOI 10.17487/RFC4655, August 2006.

   [RFC5309]  Shen, N., Ed. and A. Zinin, Ed., "Point-to-Point Operation
              over LAN in Link State Routing Protocols", RFC 5309,
              DOI 10.17487/RFC5309, October 2008.

Authors' Addresses

   Naiming Shen
   Cisco Systems
   560 McCarthy Blvd.
   Milpitas, CA  95035
   US

   Email: naiming@cisco.com


   Les Ginsberg
   Cisco Systems
   821 Alder Drive
   Milpitas, CA  95035
   US

   Email: ginsberg@cisco.com


   Sanjay Thyamagundalu

   Email: tsanjay@gmail.com