< draft-shen-isis-spine-leaf-ext-02.txt   draft-shen-isis-spine-leaf-ext-03.txt >
Networking Working Group N. Shen Networking Working Group N. Shen
Internet-Draft S. Thyamagundalu Internet-Draft L. Ginsberg
Intended status: Standards Track Cisco Systems Intended status: Standards Track Cisco Systems
Expires: April 30, 2017 October 27, 2016 Expires: September 3, 2017 S. Thyamagundalu
March 2, 2017
IS-IS Routing for Spine-Leaf Topology IS-IS Routing for Spine-Leaf Topology
draft-shen-isis-spine-leaf-ext-02 draft-shen-isis-spine-leaf-ext-03
Abstract Abstract
This document describes a mechanism for routers and switches in This document describes a mechanism for routers and switches in a
Spine-Leaf type topology to have non-reciprocal Intermediate System Spine-Leaf type topology to have non-reciprocal Intermediate System
to Intermediate System (IS-IS) routing relationships between the to Intermediate System (IS-IS) routing relationships between the
leafs and spines. The leaf nodes do not need to have the topology leafs and spines. The leaf nodes do not need to have the topology
information of other nodes and exact prefixes in the network. This information of other nodes and exact prefixes in the network. This
extension also has application in the Internet of Things (IoT). extension also has application in the Internet of Things (IoT).
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
skipping to change at page 1, line 35 skipping to change at page 1, line 36
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 30, 2017. This Internet-Draft will expire on September 3, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3
2. Motivations . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Motivations . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Spine-Leaf (SL) Extension . . . . . . . . . . . . . . . . . . 4 3. Spine-Leaf (SL) Extension . . . . . . . . . . . . . . . . . . 4
3.1. Topology Example . . . . . . . . . . . . . . . . . . . . 4 3.1. Topology Examples . . . . . . . . . . . . . . . . . . . . 4
3.2. Applicability Statement . . . . . . . . . . . . . . . . . 4 3.2. Applicability Statement . . . . . . . . . . . . . . . . . 5
3.3. Extension Encoding . . . . . . . . . . . . . . . . . . . 5 3.3. Extension Encoding . . . . . . . . . . . . . . . . . . . 6
3.4. Mechanism . . . . . . . . . . . . . . . . . . . . . . . . 6 3.3.1. Spine-Leaf Sub-TLVs . . . . . . . . . . . . . . . . . 7
3.5. Implementation and Operation . . . . . . . . . . . . . . 7 3.3.1.1. Leaf-Set Sub-TLV . . . . . . . . . . . . . . . . 7
3.5.1. CSNP PDU . . . . . . . . . . . . . . . . . . . . . . 7 3.3.1.2. Info-Req Sub-TLV . . . . . . . . . . . . . . . . 7
3.5.2. Leaf to Leaf connection . . . . . . . . . . . . . . . 7 3.3.2. Advertising IPv4/IPv6 Reachability . . . . . . . . . 8
3.5.3. Overload Bit . . . . . . . . . . . . . . . . . . . . 8 3.4. Mechanism . . . . . . . . . . . . . . . . . . . . . . . . 8
3.5.4. Spine Node Hostname . . . . . . . . . . . . . . . . . 8 3.4.1. Pure CLOS Topology . . . . . . . . . . . . . . . . . 9
3.5.5. IS-IS Reverse Metric . . . . . . . . . . . . . . . . 8 3.5. Implementation and Operation . . . . . . . . . . . . . . 10
3.5.6. Other End-to-End Services . . . . . . . . . . . . . . 9 3.5.1. CSNP PDU . . . . . . . . . . . . . . . . . . . . . . 10
3.5.7. Address Family and Topology . . . . . . . . . . . . . 9 3.5.2. Leaf to Leaf connection . . . . . . . . . . . . . . . 10
3.5.8. Migration . . . . . . . . . . . . . . . . . . . . . . 9 3.5.3. Overload Bit . . . . . . . . . . . . . . . . . . . . 11
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 3.5.4. Spine Node Hostname . . . . . . . . . . . . . . . . . 11
5. Security Considerations . . . . . . . . . . . . . . . . . . . 10 3.5.5. IS-IS Reverse Metric . . . . . . . . . . . . . . . . 11
6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 3.5.6. Spine-Leaf Traffic Engineering . . . . . . . . . . . 12
7. Document Change Log . . . . . . . . . . . . . . . . . . . . . 10 3.5.7. Other End-to-End Services . . . . . . . . . . . . . . 12
7.1. Changes to draft-shen-isis-spine-leaf-ext-02.txt . . . . 10 3.5.8. Address Family and Topology . . . . . . . . . . . . . 12
7.2. Changes to draft-shen-isis-spine-leaf-ext-01.txt . . . . 10 3.5.9. Migration . . . . . . . . . . . . . . . . . . . . . . 12
7.3. Changes to draft-shen-isis-spine-leaf-ext-00.txt . . . . 10 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 10 5. Security Considerations . . . . . . . . . . . . . . . . . . . 13
8.1. Normative References . . . . . . . . . . . . . . . . . . 10 6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13
8.2. Informative References . . . . . . . . . . . . . . . . . 12 7. Document Change Log . . . . . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 7.1. Changes to draft-shen-isis-spine-leaf-ext-03.txt . . . . 13
7.2. Changes to draft-shen-isis-spine-leaf-ext-02.txt . . . . 14
7.3. Changes to draft-shen-isis-spine-leaf-ext-01.txt . . . . 14
7.4. Changes to draft-shen-isis-spine-leaf-ext-00.txt . . . . 14
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 14
8.1. Normative References . . . . . . . . . . . . . . . . . . 14
8.2. Informative References . . . . . . . . . . . . . . . . . 15
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16
1. Introduction 1. Introduction
The IS-IS routing protocol defined by [ISO10589] has been widely The IS-IS routing protocol defined by [ISO10589] has been widely
deployed in provider networks, data centers and enterprise campus deployed in provider networks, data centers and enterprise campus
environments. In the data center and enterprise switching networks, environments. In the data center and enterprise switching networks,
Spine-Leaf topology is commonly used. This document describes the a Spine-Leaf topology is commonly used. This document describes a
mechanism where IS-IS routing can be optimized to take the advantage mechanism where IS-IS routing can be optimized for a Spine-Leaf
of the unique Spine-Leaf topology. topology.
When the network is in Spine-Leaf topology, normally a leaf node In a Spine-Leaf topology, normally a leaf node connects to a number
connects to a number of spine nodes. Data traffic going from one of spine nodes. Data traffic going from one leaf node to another
leaf node to another leaf node needs to pass through one of the spine leaf node needs to pass through one of the spine nodes. Also, the
nodes. Also, the decision to choose one of the spine nodes is decision to choose one of the spine nodes is usually part of equal
usually part of the equal cost multi-path (ECMP) load sharing. The cost multi-path (ECMP) load sharing. The spine nodes can be
spine nodes can be considered as gateway devices to reach the considered as gateway devices to reach destinations on other leaf
destination leaf nodes. In this type of topologies, the spine nodes nodes. In this type of topology, the spine nodes have to know the
have to know the topology and routing information of the entire topology and routing information of the entire network, but the leaf
network, but the leaf nodes only need to know how to reach the nodes only need to know how to reach the gateway devices, which are
gateway devices which are the spine nodes they are uplinked to. the spine nodes to which they are uplinked.
This document describes the IS-IS Spine-Leaf extension that allows This document describes the IS-IS Spine-Leaf extension that allows
the spine nodes to have all the topology and routing information, the spine nodes to have all the topology and routing information,
while keeping the leaf nodes free of topology information other than while keeping the leaf nodes free of topology information other than
the default gateway routing information. The leaf nodes do not even the default gateway routing information. The leaf nodes do not even
need to run their Shortest Path First (SPF) since there is no network need to run a Shortest Path First (SPF) calculation since they have
topology to run for. no topology information.
1.1. Requirements Language 1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
2. Motivations 2. Motivations
o The leaf nodes in Spine-Leaf topology do not benefit much to have o The leaf nodes in a Spine-Leaf topology do not require complete
the complete topology and routing information of the entire domain topology and routing information of the entire domain since their
while the forwarding actions are only to use ECMP with spine nodes forwarding decision is to use ECMP with spine nodes as default
as nexthops. gateways.
o The spine nodes in Spine-Leaf topology are richly connected to o The spine nodes in a Spine-Leaf topology are richly connected to
leaf nodes, and they need to flood every Link State PDUs (LSPs) to leaf nodes, which introduces significant flooding duplication if
all the leaf nodes. It saves the spine nodes' CPU and link they flood all Link State PDUs (LSPs) to all the leaf nodes. It
bandwidth resources if the flooding is blocked to those leaf saves both spine and leaf nodes' CPU and link bandwidth resources
nodes. if flooding is blocked to leaf nodes. For small Top of the Rack
(ToR) leaf switches in data centers, it is meaningful to prevent
full topology routing information and massive database flooding
through those devices.
o During the time a spine node has a network problem, every leaf o When a spine node advertises a topology change, every leaf node
node connected to it will generate its LSP update to report the connected to it will flood the update to all the other spine
problem to all the other spine nodes, and those spine nodes will nodes, and those spine nodes will further flood them to all the
further flood them to all the leaf nodes, causing an O(n^2) leaf nodes, causing an O(n^2) flooding storm which is largely
flooding storm unnecessarily since every leaf node already knows redundant.
that spine node having problem.
o Similar to some of the overlay technologies which are popular in
data centers, the edge devices (leaf nodes) may not need to
contain all the routing and forwarding information on the device's
control and forwarding planes. "Conversational Learning" can be
utilized to get the specific routing and forwarding information in
the case of pure CLOS topology and in the events of link and node
down.
o Small devices and appliances of Internet of Things (IoT) can be o Small devices and appliances of Internet of Things (IoT) can be
considered as leafs in the routing topology sense. They have CPU considered as leafs in the routing topology sense. They have CPU
and memory constraints in design, and those IoT devices do not have and memory constraints in design, and those IoT devices do not have
to know the exact network topology and prefixes as long as there to know the exact network topology and prefixes as long as there
are ways to reach the cloud servers or other devices and they want are ways to reach the cloud servers or other devices.
to be part of the dynamic routing.
3. Spine-Leaf (SL) Extension 3. Spine-Leaf (SL) Extension
3.1. Topology Example 3.1. Topology Examples
+--------+ +--------+ +--------+ +--------+ +--------+ +--------+
| | | | | | | | | | | |
| Spine1 +----+ Spine2 +- ......... -+ SpineN | | Spine1 +----+ Spine2 +- ......... -+ SpineN |
| | | | | | | | | | | |
+-+-+-+-++ ++-+-+-+-+ +-+-+-+-++ +-+-+-+-++ ++-+-+-+-+ +-+-+-+-++
+------+ | | | | | | | | | | | +------+ | | | | | | | | | | |
| +-----|-|-|------+ | | | | | | | | +-----|-|-|------+ | | | | | | |
| | +--|-|-|--------+-|-|-----------------+ | | | | | +--|-|-|--------+-|-|-----------------+ | | |
| | | | | | +---+ | | | | | | | | | | | +---+ | | | | |
skipping to change at page 4, line 31 skipping to change at page 5, line 5
| | | | | | | | +-------------+ | | | | | | | | | | | +-------------+ | | |
| | | | | +----|--|----------------|--|--------+ | | | | | | | +----|--|----------------|--|--------+ | |
| | | | +------|--|--------------+ | | | | | | | | | +------|--|--------------+ | | | | |
| | | +------+ | | | | | | | | | | | +------+ | | | | | | | |
++--+--++ +-+-+--++ ++-+--+-+ ++-+--+-+ ++--+--++ +-+-+--++ ++-+--+-+ ++-+--+-+
| Leaf1 +~~~~~~+ Leaf2 | ........ | LeafX | | LeafY | | Leaf1 +~~~~~~+ Leaf2 | ........ | LeafX | | LeafY |
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
Figure 1: A Spine-Leaf Topology Figure 1: A Spine-Leaf Topology
+---------+ +--------+
| Spine1 | | Spine2 |
+-+-+-+-+-+ +-+-+-+-++
| | | | | | | |
| | | +-----------------|-|-|-|-+
| | +------------+ | | | | |
+--------+ +-+ | | | | | |
| +----------------------------+ | | | |
| | | +------------------+ | +----+
| | | | | +-------+ | |
| | | | | | | |
+-+---+-+ +--+--+-+ +-+--+--+ +--+--+-+
| Leaf1 | | Leaf2 | | Leaf3 | | Leaf4 |
+-------+ +-------+ +-------+ +-------+
Figure 2: A Fat Tree Topology
3.2. Applicability Statement 3.2. Applicability Statement
This extension assumes the network is a basic Spine-Leaf topology, This extension assumes the network is a Spine-Leaf topology, and it
and it will not work in an arbitrary network setup. The spine nodes should not be applied in an arbitrary network setup. The spine nodes
can be viewed as the aggregation layer of the network, and the leaf can be viewed as the aggregation layer of the network, and the leaf
nodes as the access layer of the network. The leaf nodes use load nodes as the access layer of the network. The leaf nodes use a load
sharing algorithm with spine nodes as nexthops in routing and sharing algorithm with spine nodes as nexthops in routing and
forwarding. forwarding.
This extension assumes the spine nodes are inter-connected. Spine This extension works when the spine nodes are inter-connected, and it
nodes exchanges normal IS-IS topology and routing information among works with a pure CLOS or Fat Tree topology based network where the
themselves. This extension does not apply in the case where spine spines are NOT interconnected.
nodes only have links to leaf nodes but not to themselves.
Although the example diagram in Figure 1 shows a fully meshed Spine- Although the example diagram in Figure 1 shows a fully meshed Spine-
Leaf topology, but this extension also works in the case where they Leaf topology, this extension also works in the case where they are
are partially meshed. For instance, the leaf1 through leaf10 are partially meshed. For instance, leaf1 through leaf10 may be fully
fully meshed with spine1 through spine5; and leaf11 through leaf20 meshed with spine1 through spine5 while leaf11 through leaf20 are
are fully meshed with spine4 through spine8, and all the spines are fully meshed with spine4 through spine8, and all the spines are
inter-connected in a redundant fashion. inter-connected in a redundant fashion.
This extension also works with the topology with more than the This extension also works with a topology with more than the typical
typical two layers of spine and leaf. For instance, in example two layers of spine and leaf. For instance, in example diagrams
diagram Figure 1, there can be another Core layer of routers/switches Figure 1 and Figure 2, there can be another Core layer of routers/
on top of the aggregation layer. From an IS-IS routing point of switches on top of the aggregation layer. From an IS-IS routing
view, the Core nodes are not affected by this extension and will have point of view, the Core nodes are not affected by this extension and
the complete topology and routing information just like the spine will have the complete topology and routing information just like the
nodes. To make the network even more scalable, the Core layer can be spine nodes. To make the network even more scalable, the Core layer
run at the level-2 IS-IS domain while the Spine layer and the Leaf can operate as a level-2 IS-IS sub-domain while the Spine and Leaf
layer staying at the level-1 IS-IS domain. layers operate as a level-1 IS-IS sub-domain.
This extension also supports the leaf nodes having local connections This extension also supports the leaf nodes having local connections
to other leaf nodes. In the example diagram Figure 1 there is a to other leaf nodes. In the example diagram Figure 1 there is a
connection between 'Leaf1' node and 'Leaf2' node, and an external connection between 'Leaf1' node and 'Leaf2' node, and an external
host can be dual homed into both of the leaf nodes. host can be dual homed into both of the leaf nodes.
This extension assumes the links between the spine and leaf nodes are This extension assumes the links between the spine and leaf nodes are
point-to-point, or point-to-point over LAN [RFC5309]. The links point-to-point, or point-to-point over LAN [RFC5309]. The links
connecting the spine nodes, or the links between the leaf nodes can connecting the spine nodes or the links between the leaf nodes
be any type. can be any type.
3.3. Extension Encoding 3.3. Extension Encoding
This extension introduces one TLV for IS-IS Hello (IIH) PDU and it is This extension introduces one TLV which may be used in IS-IS Hello
used by both spine and leaf nodes in the Spine-Leaf mechanism. (IIH) PDUs or in Circuit Scoped Link State PDUs (CS-LSP) [RFC7356].
It is used by both spine and leaf nodes in this Spine-Leaf mechanism.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | SL Flag | | Type | Length | SL Flag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| .. Optional Sub-TLVs | .. Optional Sub-TLVs
+-+-+-+-+-+-+-+-+-.... +-+-+-+-+-+-+-+-+-....
The fields of this TLV are defined as follows: The fields of this TLV are defined as follows:
Type: TBD. 8 bits value, suggested value 150. Type: 1 octet Suggested value 150 (to be assigned by IANA)
Length: Variable. 8 bits value. The mandatory part is 6 octets. Length: 1 octet (2 + length of sub-TLVs).
SL Flag: 16 bits value field of following flags: Flags 16 bits
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |B|R|L| | Reserved |B|R|L|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
L bit (0x01): Only a leaf node sets this bit. If the L bit is L bit (0x01): Only a leaf node sets this bit. If the L bit is
set in the SL flag, the node indicates it is in 'Leaf- set in the SL flag, the node indicates it is in 'Leaf-
Mode'. Mode'.
R bit (0x02): Only a Spine node sets this bit. If the R bit is R bit (0x02): Only a Spine node sets this bit. If the R bit is
set, the node indicates to the leaf neighbor that it set, the node indicates to the leaf neighbor that it
can be used as the default route gateway. can be used as the default route gateway.
B bit (0x04): Only a leaf node sets this bit, on a Leaf-Leaf link, B bit (0x04): Only a leaf node sets this bit, on a Leaf-Leaf link,
in addition to the 'L' bit setting. If the B bit is in addition to the 'L' bit setting. If the B bit is
set, the node indicates to its leaf neighbor that it set, the node indicates to its leaf neighbor that it
can be used as the backup default route gateway. can be used as the backup default route gateway.
Optional Sub-TLV: Not defined in this document, for future Optional Sub-TLV: Not defined in this document, for future
extension on SL. extension
sub-TLVs MAY be included when the TLV is in a CS-LSP.
sub-TLVs MUST NOT be included when the TLV is in an IIH
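The encoding above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the draft: the function names and the `in_iih` parameter are hypothetical, the Type value 150 is only the suggested code point, and the Length rule follows the -03 column (2 + length of sub-TLVs).

```python
import struct

# Suggested value only; the final code point is to be assigned by IANA.
SL_TLV_TYPE = 150

# SL Flag bits as listed in the flags diagram.
FLAG_L = 0x01  # set by a leaf node ("Leaf-Mode")
FLAG_R = 0x02  # set by a spine node (usable as default route gateway)
FLAG_B = 0x04  # set by a leaf node on a Leaf-Leaf link (backup gateway)


def encode_sl_tlv(flags, sub_tlvs=b"", in_iih=True):
    """Build a Spine-Leaf TLV (hypothetical helper).

    Layout: Type (1 octet), Length (1 octet), SL Flag (2 octets,
    network byte order), then optional sub-TLVs.
    """
    if in_iih and sub_tlvs:
        # The -03 text says sub-TLVs MUST NOT be included in an IIH.
        raise ValueError("sub-TLVs are not allowed when the TLV is in an IIH")
    length = 2 + len(sub_tlvs)
    if length > 255:
        raise ValueError("TLV value does not fit in a 1-octet Length")
    return struct.pack("!BBH", SL_TLV_TYPE, length, flags) + sub_tlvs


def decode_sl_tlv(data):
    """Return (flags, sub_tlv_bytes) from an encoded Spine-Leaf TLV."""
    if len(data) < 4:
        raise ValueError("truncated Spine-Leaf TLV")
    tlv_type, length, flags = struct.unpack("!BBH", data[:4])
    if tlv_type != SL_TLV_TYPE:
        raise ValueError("not a Spine-Leaf TLV")
    if length < 2 or len(data) < 2 + length:
        raise ValueError("bad Spine-Leaf TLV length")
    return flags, data[4:2 + length]
```

A leaf would call `encode_sl_tlv(FLAG_L)` for its IIHs, and a spine would answer with `encode_sl_tlv(FLAG_R)` on a link to a Leaf-Peer.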
3.3.1. Spine-Leaf Sub-TLVs
If the data center topology is a pure CLOS or Fat Tree, there are no
link connections among the spine nodes. If we also assume there is
not another Core layer on top of the aggregation layer, then the
traffic from one leaf node to another may have a problem if there is
a link outage between a spine node and a leaf node. For instance, in
the diagram of Figure 2, if Leaf1 sends data traffic to Leaf3 through
Spine1 node, and the Spine1-Leaf3 link is down, the data traffic will
be dropped on the Spine1 node.
To address this issue spine and leaf nodes may send/request specific
reachability information via the sub-TLVs defined below.
Two Spine-Leaf sub-TLVs are defined. The Leaf-Set sub-TLV and the
Info-Req sub-TLV.
3.3.1.1. Leaf-Set Sub-TLV
This sub-TLV is used by spine nodes to optionally advertise Leaf
neighbors to other Leaf nodes. The fields of this sub-TLV are
defined as follows:
Type: 1 octet Suggested value 1 (to be assigned by IANA)
Length: 1 octet MUST be a multiple of 6 octets.
Leaf-Set: A list of the IS-IS System-IDs of the leaf node neighbors of
this spine node.
3.3.1.2. Info-Req Sub-TLV
This sub-TLV is used by leaf nodes to request more specific prefix
information from a selected spine node upon detecting that one of the
spine nodes has lost the connection to a leaf node. The fields of
this sub-TLV are defined as follows:
Type: 1 octet Suggested value 2 (to be assigned by IANA)
Length: 1 octet. It MUST be a multiple of 6 octets.
Info-Req: List of IS-IS System-IDs of leaf nodes for which
connectivity information is being requested.
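Both sub-TLVs carry the same kind of value, a list of 6-octet System-IDs, so one helper can illustrate them. The sketch below is an assumption-laden example: the helper names are hypothetical and the sub-TLV types 1 and 2 are only the suggested values.

```python
# Suggested sub-TLV types only; final values are to be assigned by IANA.
SUB_TLV_LEAF_SET = 1
SUB_TLV_INFO_REQ = 2

SYSTEM_ID_LEN = 6  # each listed System-ID occupies 6 octets


def encode_id_list_sub_tlv(sub_type, system_ids):
    """Encode a Leaf-Set or Info-Req sub-TLV (hypothetical helper)."""
    for sid in system_ids:
        if len(sid) != SYSTEM_ID_LEN:
            raise ValueError("System-ID must be exactly 6 octets")
    value = b"".join(system_ids)
    if len(value) > 255:
        raise ValueError("too many System-IDs for a 1-octet Length")
    return bytes([sub_type, len(value)]) + value


def decode_id_list_sub_tlv(data):
    """Return (sub_type, [system_id, ...]) from an encoded sub-TLV."""
    if len(data) < 2:
        raise ValueError("truncated sub-TLV header")
    sub_type, length = data[0], data[1]
    if length % SYSTEM_ID_LEN != 0:
        # The Length MUST be a multiple of 6 octets.
        raise ValueError("Length is not a multiple of 6")
    value = data[2:2 + length]
    if len(value) != length:
        raise ValueError("truncated sub-TLV value")
    ids = [value[i:i + SYSTEM_ID_LEN] for i in range(0, length, SYSTEM_ID_LEN)]
    return sub_type, ids
```

The encoded bytes would be carried in the sub-TLV portion of the Spine-Leaf TLV when that TLV is sent in a CS-LSP.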
3.3.2. Advertising IPv4/IPv6 Reachability
In cases where connectivity between a leaf node and a spine node is
down, the leaf node MAY request reachability information from a spine
node as described in Section 3.3.1.2. The spine node utilizes TLVs
135 [RFC5305] and TLVs 236 [RFC5308] to advertise this information.
These TLVs MAY be included either in IIHs or CS-LSPs sent from the
spine to the requesting leaf node. Sending such information in IIHs
has limited scale - all reachability information MUST fit within a
single IIH. It is therefore recommended that CS-LSPs be used.
3.4. Mechanism 3.4. Mechanism
Each leaf node is provisioned by network operators as in IS-IS 'Leaf- Each leaf node is provisioned by network operators as an IS-IS 'Leaf-
Mode'. A spine node does not need explicit configuration. A leaf Node'. A spine node does not need explicit configuration. A leaf
node inserts the Spine-Leaf TLV and sets the 'L' bit in the SL flag node inserts the Spine-Leaf TLV in IIHs it originates. The TLV has
field when sending out its IIH PDU over all its links. the 'L' bit set in the flags field.
The spine node when receiving the IIH with the SL TLV and 'L' bit When a spine node receives an IIH with the SL TLV and 'L' bit set, it
set, it labels the point-to-point interface and adjacency to be a labels the point-to-point interface and adjacency to be a 'Leaf-
'Leaf-Peer'. When the spine node sending out IIH PDU to the 'Leaf- Peer'. IIHs sent by the spine node on a link to a Leaf-Peer include
Peer', it will also insert the Spine-Leaf TLV and set the 'R' bit in the Spine-Leaf TLV with the 'R' bit set in the flags field. The 'R'
the SL flag field. This 'R' bit indicates to the 'Leaf-Peer' bit indicates to the 'Leaf-Peer' neighbor that the spine node can be
neighbor that the spine node can be used as a default routing used as a default routing nexthop.
nexthop.
There is no change to the IS-IS adjacency bring-up mechanism for the There is no change to the IS-IS adjacency bring-up mechanism for
point-to-point interface. Spine-Leaf peers.
For the spine node with 'Leaf-Peer' adjacencies, the IS-IS LSP For the spine node with 'Leaf-Peer' adjacencies, the IS-IS LSP
flooding is blocked to the 'Leaf-Peer' interface, except for the LSP flooding is blocked to the 'Leaf-Peer' interface, except for the LSP
PDUs in which the IS-IS System-ID matches the System-ID of the 'Leaf- PDUs in which the IS-IS System-ID matches the System-ID of the 'Leaf-
Peer' adjacency. This exception is needed since when the leaf node Peer' adjacency. This exception is needed since when the leaf node
reboots, the spine node needs to forward to the leaf node its reboots, the spine node needs to forward to the leaf node its
previous generation of LSP. No other LSP PDU needs to be flooded previous generation of LSP. No other LSP PDU needs to be flooded
over this 'Leaf-Peer' interface. over this 'Leaf-Peer' interface.
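The flooding rule for a spine node can be expressed as a simple predicate. This is a sketch under stated assumptions: the function name and the interface model are hypothetical, and the LSP ID is taken to be the usual 8 octets (6-octet System-ID, 1-octet pseudonode ID, 1-octet LSP number).

```python
SYSTEM_ID_LEN = 6  # assumes the common 6-octet System-ID length


def spine_should_flood(lsp_id, is_leaf_peer, peer_system_id):
    """Decide whether a spine floods an LSP on one interface.

    lsp_id:         8-octet LSP ID of the LSP being considered
    is_leaf_peer:   True if the adjacency was labelled 'Leaf-Peer'
    peer_system_id: 6-octet System-ID of the neighbor on that interface
    """
    if not is_leaf_peer:
        # Normal IS-IS flooding applies on non-leaf interfaces.
        return True
    # On a 'Leaf-Peer' interface only the leaf's own LSPs are sent,
    # so a rebooted leaf can learn its previous LSP generation.
    return lsp_id[:SYSTEM_ID_LEN] == peer_system_id
```

For example, an LSP originated by Leaf1 would pass this check on the interface towards Leaf1 but be blocked on the interface towards Leaf2.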
The leaf node will perform IS-IS LSP flooding as normal over all of The leaf node will perform IS-IS LSP flooding as normal over all of
skipping to change at page 7, line 28 skipping to change at page 9, line 33
In summary, this extension requires a leaf node to insert the Spine- In summary, this extension requires a leaf node to insert the Spine-
Leaf TLV in its IIHs, set the 'L' bit in the SL flag, and install an Leaf TLV in its IIHs, set the 'L' bit in the SL flag, and install an
IS-IS default route using as nexthops the spine nodes whose 'Spine- IS-IS default route using as nexthops the spine nodes whose 'Spine-
Peer' IIH PDUs have the 'R' bit set. It requires a spine node to Peer' IIH PDUs have the 'R' bit set. It requires a spine node to
respond to a 'Leaf-Peer' by inserting the Spine-Leaf TLV in its IIH, respond to a 'Leaf-Peer' by inserting the Spine-Leaf TLV in its IIH,
setting the 'R' bit in the SL flag, and blocking LSP flooding, with setting the 'R' bit in the SL flag, and blocking LSP flooding, with
the exception that it will set the SRMflag on the LSPs that belong the exception that it will set the SRMflag on the LSPs that belong
to the 'Leaf-Peer' over that interface. to the 'Leaf-Peer' over that interface.
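The leaf side of this summary can be sketched as follows. The data model (a mapping from neighbor name to the SL Flag value received in its IIH) and the function name are hypothetical. Falling back to 'B' bit neighbors only when no 'R' bit neighbor is available is an assumption about how a backup gateway would be used, not a rule stated in the text.

```python
FLAG_L = 0x01
FLAG_R = 0x02
FLAG_B = 0x04


def default_route_nexthops(neighbor_flags):
    """Pick the ECMP nexthops for a leaf's default route.

    neighbor_flags maps a neighbor name to the SL Flag value most
    recently received from that neighbor in an IIH.
    """
    # Spine neighbors that advertised the 'R' bit are default gateways.
    primary = sorted(n for n, f in neighbor_flags.items() if f & FLAG_R)
    if primary:
        return primary
    # Assumption: leaf neighbors with the 'B' bit are used only as backup.
    return sorted(n for n, f in neighbor_flags.items() if f & FLAG_B)
```

With two spines setting 'R' and one leaf neighbor setting 'L' and 'B', the default route would load share across the two spines only.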
3.4.1. Pure CLOS Topology
In a data center where the topology is a pure CLOS or Fat Tree, there
is no interconnection among the spine nodes. If there is also no
Core layer above the aggregation layer, then when the link between a
spine and a leaf goes down, data traffic may be black holed in the
network.
As in the diagram Figure 2, if the link Spine1-Leaf3 goes down, there
needs to be a way for Leaf1, Leaf2 and Leaf4 to avoid Spine1 when
the destination of the data traffic is the Leaf3 node.
In the above example, the Spine1 and Spine2 are provisioned to
advertise the Spine-Leaf sub-TLV of Leaf-Set. Originally both Spines
will advertise Leaf1 through Leaf4 as their Leaf-Set. When the
Spine1-Leaf3 link is down, Spine1 will only have Leaf1, Leaf2 and
Leaf4 in its Leaf-Set. This allows the other leaf nodes to know that
Spine1 has lost the leaf node of Leaf3.
Each leaf node can select another spine node from which to request
prefix information associated with the lost leaf node. In this
diagram of Figure 2, there are only two spine nodes (Spine-Leaf
topology can have more than two spine nodes in general). Each leaf
node can independently select a spine node for the leaf information.
The leaf nodes will include the Info-Req sub-TLV in the Spine-Leaf
TLV towards that spine node, Spine2 in this case.
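The Leaf-Set comparison and spine selection described above can be sketched as set operations. This is only an illustration: the function names are hypothetical, and picking the first candidate in sorted order is an arbitrary stand-in for whatever selection policy a leaf actually uses.

```python
def lost_leaves(previous_leaf_set, current_leaf_set):
    """Leaves that a spine advertised before but no longer advertises."""
    return set(previous_leaf_set) - set(current_leaf_set)


def pick_spine_for_info_req(lost_leaf, leaf_sets, failed_spine):
    """Choose a spine to send the Info-Req sub-TLV to.

    leaf_sets maps each spine name to its currently advertised Leaf-Set.
    Only spines that still list the lost leaf are candidates. The
    sorted-order tie-break is an assumption made for this sketch.
    """
    candidates = sorted(
        spine
        for spine, leaves in leaf_sets.items()
        if spine != failed_spine and lost_leaf in leaves
    )
    return candidates[0] if candidates else None
```

In the Figure 2 example, Spine1's Leaf-Set shrinks from Leaf1 through Leaf4 to Leaf1, Leaf2 and Leaf4, so `lost_leaves` yields Leaf3 and the only remaining candidate is Spine2.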
Upon receiving the request from one or more leaf nodes, the spine
node will find the IPv4/IPv6 prefixes associated with the requested
leaf node, and it will include the IPv6/IPv4 Info-Advertise sub-TLV
when sending messages towards the requesting leaf nodes.
For instance, it will include the IPv4 loopback prefix of the leaf3
based on the policy configured or administrative tag attached to the
prefixes. When the leaf nodes receive the more specific prefixes,
they will install the advertised prefixes towards the other spine
nodes (Spine2 in this example).
For instance in the data center overlay scenario, when any IP
destination or MAC destination uses the leaf3's loopback as the
tunnel nexthop, the overlay tunnel from leaf nodes will only select
Spine2 as the gateway to reach leaf3 as long as the Spine1-Leaf3 link
is still down.
3.5. Implementation and Operation 3.5. Implementation and Operation
3.5.1. CSNP PDU 3.5.1. CSNP PDU
In the Spine-Leaf extension, a Complete Sequence Number PDU (CSNP) In the Spine-Leaf extension, a Complete Sequence Number PDU (CSNP)
does not need to be transmitted over the Spine-Leaf link. Some IS-IS does not need to be transmitted over the Spine-Leaf link. Some IS-IS
implementations send CSNPs after the initial adjacency bring-up over implementations send CSNPs after the initial adjacency bring-up over
a point-to-point interface. There is no need for this optimization a point-to-point interface. There is no need for this optimization
here since the Leaf does not need to receive any other LSPs from the here since the Leaf does not need to receive any other LSPs from the
network, and the only LSPs transmitted across the Spine-Leaf link is network, and the only LSPs transmitted across the Spine-Leaf link is
skipping to change at page 9, line 5 skipping to change at page 12, line 5
In other cases, such as when the spine node loses a link to a In other cases, such as when the spine node loses a link to a
particular leaf node, although it can redirect the traffic to other particular leaf node, although it can redirect the traffic to other
spine nodes to reach that destination leaf node, it MAY want to spine nodes to reach that destination leaf node, it MAY want to
increase this metric value if the inter-spine connection becomes increase this metric value if the inter-spine connection becomes
overutilized or the latency becomes an issue. overutilized or the latency becomes an issue.
In the leaf-leaf link as a backup gateway use case, the 'Reverse In the leaf-leaf link as a backup gateway use case, the 'Reverse
Metric' SHOULD always be set to very high value. Metric' SHOULD always be set to very high value.
3.5.6. Other End-to-End Services 3.5.6. Spine-Leaf Traffic Engineering
Besides the spine nodes using the IS-IS Reverse Metric to influence
how leaf default-gateway traffic is spread across multiple spine
nodes, the IPv6/IPv4 Info-Advertise sub-TLVs can be used selectively
by traffic engineering controllers to move data traffic around the
data center fabric, to alleviate congestion and to reduce the latency
of a certain class of traffic pairs. Injecting more specific leaf
node prefixes allows the spine nodes to attract more traffic onto
underutilized links.
3.5.7. Other End-to-End Services
Losing the topology information will have an impact on some of the Losing the topology information will have an impact on some of the
end-to-end network services, for instance, MPLS TE or end-to-end end-to-end network services, for instance, MPLS TE or end-to-end
segment routing. Some other mechanisms such as those described in segment routing. Some other mechanisms such as those described in
PCE [RFC4655] based solution may be used. In this Spine-Leaf PCE [RFC4655] based solution may be used. In this Spine-Leaf
extension, the role of the leaf node is not too much different from extension, the role of the leaf node is not too much different from
the multi-level IS-IS routing while the level-1 IS-IS nodes only have the multi-level IS-IS routing while the level-1 IS-IS nodes only have
the default route information towards the node which has the Attach the default route information towards the node which has the Attach
Bit (ATT) set, and the level-2 backbone does not have any topology Bit (ATT) set, and the level-2 backbone does not have any topology
information of the level-1 areas. The exact mechanism to enable information of the level-1 areas. The exact mechanism to enable
certain end-to-end network services in Spine-Leaf network is outside certain end-to-end network services in Spine-Leaf network is outside
the scope of this document. the scope of this document.
3.5.7. Address Family and Topology 3.5.8. Address Family and Topology
IPv6 Address families[RFC5308], Multi-Topology (MT)[RFC5120] and IPv6 Address families[RFC5308], Multi-Topology (MT)[RFC5120] and
Multi-Instance (MI)[RFC6822] information is carried over the IIH PDU. Multi-Instance (MI)[RFC6822] information is carried over the IIH PDU.
Since the goal is to simplify the operation of IS-IS network, for the Since the goal is to simplify the operation of IS-IS network, for the
simplicity of this extension, the Spine-Leaf mechanism is applied the simplicity of this extension, the Spine-Leaf mechanism is applied the
same way to all the address families, MTs and MIs. same way to all the address families, MTs and MIs.
3.5.8. Migration 3.5.9. Migration
For this extension to be deployed in existing networks, a simple For this extension to be deployed in existing networks, a simple
migration scheme is needed. To support any leaf node in the network, migration scheme is needed. To support any leaf node in the network,
all the involved spine nodes have to be upgraded first. So the first all the involved spine nodes have to be upgraded first. So the first
step is to migrate all the involved spine nodes to support this step is to migrate all the involved spine nodes to support this
extension, then the leaf nodes can be enabled with 'Leaf-Mode' one by extension, then the leaf nodes can be enabled with 'Leaf-Mode' one by
one. No flag day is needed for the extension migration. one. No flag day is needed for the extension migration.
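The ordering constraint in the migration text (spines first, then leaves one by one) can be expressed as a simple readiness check. This is only a sketch under assumed inputs: the mapping of leaf names to connected spines, the set of upgraded spines, and the function name are all hypothetical and not defined by the draft.

```python
def leaves_ready_for_leaf_mode(leaf_to_spines, upgraded_spines):
    """Return the leaves whose connected spines all support the extension.

    leaf_to_spines: dict mapping a leaf name to the set of spines it attaches to.
    upgraded_spines: set of spine names already migrated to the extension.
    """
    return sorted(
        leaf
        for leaf, spines in leaf_to_spines.items()
        if spines and spines <= upgraded_spines
    )


topology = {
    "Leaf1": {"Spine1", "Spine2"},
    "Leaf2": {"Spine1", "Spine2"},
    "Leaf3": {"Spine2", "Spine3"},
}

# Spine3 has not been upgraded yet, so Leaf3 must stay in normal mode.
ready = leaves_ready_for_leaf_mode(topology, {"Spine1", "Spine2"})
```

Because each leaf is checked independently, leaves can be switched to 'Leaf-Mode' incrementally as their spines are upgraded, which matches the statement that no flag day is needed.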
4. IANA Considerations 4. IANA Considerations
A new TLV codepoint is defined in this document and needs to be A new TLV codepoint is defined in this document and needs to be
assigned by IANA from the "IS-IS TLV Codepoints" registry. It is assigned by IANA from the "IS-IS TLV Codepoints" registry. It is
referred to as the Spine-Leaf TLV and the suggested value is 150. referred to as the Spine-Leaf TLV and the suggested value is 150.
This TLV is only to be optionally inserted in the IIH PDU. This This TLV is only to be optionally inserted either in the IIH PDU or
document does not propose any sub-TLV out of this Spine-Leaf TLV. in the Circuit Flooding Scoped LSP PDU. IANA is also requested to
IANA is also requested to maintain the SL-flag bit values in this maintain the SL-flag bit values in this TLV, and 0x01, 0x02 and 0x04
TLV, and 0x01, 0x02 and 0x04 bits are defined in this document. bits are defined in this document.
Value Name IIH LSP SNP Purge Value Name IIH LSP SNP Purge CS-LSP
----- --------------------- --- --- --- ----- ----- --------------------- --- --- --- ----- -------
150 Spine-Leaf y n n n 150 Spine-Leaf y n n n y
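As a rough illustration of the codepoint in the table above, the sketch below packs and unpacks a TLV with the suggested type value 150, using the usual IS-IS layout of a one-octet type and a one-octet length. The 16-bit width of the SL-flag field and the function names are assumptions made for this example; the TLV body format is not shown in this part of the text. Only the bit values 0x01, 0x02 and 0x04 are taken from the text.

```python
import struct

SPINE_LEAF_TLV_TYPE = 150                    # suggested value from the text
DEFINED_SL_FLAG_BITS = (0x01, 0x02, 0x04)    # bits defined in this document


def encode_spine_leaf_tlv(sl_flags, sub_tlvs=b""):
    """Pack type, length and value; assumes a 16-bit SL-flag field."""
    if not 0 <= sl_flags <= 0xFFFF:
        raise ValueError("SL-flag does not fit the assumed 16-bit field")
    value = struct.pack("!H", sl_flags) + sub_tlvs
    if len(value) > 255:
        raise ValueError("TLV value exceeds 255 octets")
    return struct.pack("!BB", SPINE_LEAF_TLV_TYPE, len(value)) + value


def decode_spine_leaf_tlv(data):
    """Return (sl_flags, sub_tlv_bytes) from an encoded Spine-Leaf TLV."""
    if len(data) < 2:
        raise ValueError("truncated TLV header")
    tlv_type, length = struct.unpack_from("!BB", data, 0)
    if tlv_type != SPINE_LEAF_TLV_TYPE:
        raise ValueError("not a Spine-Leaf TLV")
    if length < 2 or len(data) < 2 + length:
        raise ValueError("truncated TLV value")
    (sl_flags,) = struct.unpack_from("!H", data, 2)
    return sl_flags, data[4:2 + length]


def defined_bits_set(sl_flags):
    """List which of the defined SL-flag bits are set."""
    return [bit for bit in DEFINED_SL_FLAG_BITS if sl_flags & bit]


encoded = encode_spine_leaf_tlv(0x01 | 0x04)
flags, rest = decode_spine_leaf_tlv(encoded)
```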
This extension also proposes to have the Dynamic Hostname TLV, This extension also proposes to have the Dynamic Hostname TLV,
already assigned as code 137, to be allowed in IIH PDU. already assigned as code 137, to be allowed in IIH PDU.
Value Name IIH LSP SNP Purge Value Name IIH LSP SNP Purge
----- --------------------- --- --- --- ----- ----- --------------------- --- --- --- -----
137 Dynamic Name y y n y 137 Dynamic Name y y n y
Two new sub-TLVs are defined in this document and need to be
assigned by IANA from the "IS-IS TLV Codepoints" registry. They are
referred to in this document as the Leaf-Set sub-TLV and the Info-Req
sub-TLV. The suggested values are 1 and 2, respectively.
5. Security Considerations 5. Security Considerations
Security concerns for IS-IS are addressed in [ISO10589], [RFC5304], Security concerns for IS-IS are addressed in [ISO10589], [RFC5304],
[RFC5310], and [RFC7602]. This extension does not raise additional [RFC5310], and [RFC7602]. This extension does not raise additional
security issues. security issues.
6. Acknowledgments 6. Acknowledgments
TBD. TBD.
7. Document Change Log 7. Document Change Log
7.1. Changes to draft-shen-isis-spine-leaf-ext-02.txt 7.1. Changes to draft-shen-isis-spine-leaf-ext-03.txt
o Submitted March 2017.
o Added the Spine-Leaf sub-TLVs to handle the case of data center
pure CLOS topology and mechanism.
o Specified that the Spine-Leaf TLV and sub-TLVs can be optionally
inserted in either the IIH PDU or the CS-LSP PDU.
o Allow use of prefix Reachability TLVs 135 and 236 in IIHs/CS-LSPs
sent from spine to leaf.
7.2. Changes to draft-shen-isis-spine-leaf-ext-02.txt
o Submitted October 2016. o Submitted October 2016.
o Removed the 'Default Route Metric' field in the Spine-Leaf TLV and o Removed the 'Default Route Metric' field in the Spine-Leaf TLV and
changed to using the IS-IS Reverse Metric in IIH. changed to using the IS-IS Reverse Metric in IIH.
7.2. Changes to draft-shen-isis-spine-leaf-ext-01.txt 7.3. Changes to draft-shen-isis-spine-leaf-ext-01.txt
o Submitted April 2016. o Submitted April 2016.
o No change. Refresh the draft version. o No change. Refresh the draft version.
7.3. Changes to draft-shen-isis-spine-leaf-ext-00.txt 7.4. Changes to draft-shen-isis-spine-leaf-ext-00.txt
o Initial version of the draft is published in November 2015. o Initial version of the draft is published in November 2015.
8. References 8. References
8.1. Normative References 8.1. Normative References
[ISO10589] [ISO10589]
ISO "International Organization for Standardization", ISO "International Organization for Standardization",
"Intermediate system to Intermediate system intra-domain "Intermediate system to Intermediate system intra-domain
skipping to change at page 11, line 29 skipping to change at page 15, line 9
<http://www.rfc-editor.org/info/rfc5120>. <http://www.rfc-editor.org/info/rfc5120>.
[RFC5301] McPherson, D. and N. Shen, "Dynamic Hostname Exchange [RFC5301] McPherson, D. and N. Shen, "Dynamic Hostname Exchange
Mechanism for IS-IS", RFC 5301, DOI 10.17487/RFC5301, Mechanism for IS-IS", RFC 5301, DOI 10.17487/RFC5301,
October 2008, <http://www.rfc-editor.org/info/rfc5301>. October 2008, <http://www.rfc-editor.org/info/rfc5301>.
[RFC5304] Li, T. and R. Atkinson, "IS-IS Cryptographic [RFC5304] Li, T. and R. Atkinson, "IS-IS Cryptographic
Authentication", RFC 5304, DOI 10.17487/RFC5304, October Authentication", RFC 5304, DOI 10.17487/RFC5304, October
2008, <http://www.rfc-editor.org/info/rfc5304>. 2008, <http://www.rfc-editor.org/info/rfc5304>.
[RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic
Engineering", RFC 5305, DOI 10.17487/RFC5305, October
2008, <http://www.rfc-editor.org/info/rfc5305>.
[RFC5306] Shand, M. and L. Ginsberg, "Restart Signaling for IS-IS", [RFC5306] Shand, M. and L. Ginsberg, "Restart Signaling for IS-IS",
RFC 5306, DOI 10.17487/RFC5306, October 2008, RFC 5306, DOI 10.17487/RFC5306, October 2008,
<http://www.rfc-editor.org/info/rfc5306>. <http://www.rfc-editor.org/info/rfc5306>.
[RFC5308] Hopps, C., "Routing IPv6 with IS-IS", RFC 5308, [RFC5308] Hopps, C., "Routing IPv6 with IS-IS", RFC 5308,
DOI 10.17487/RFC5308, October 2008, DOI 10.17487/RFC5308, October 2008,
<http://www.rfc-editor.org/info/rfc5308>. <http://www.rfc-editor.org/info/rfc5308>.
[RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R., [RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
and M. Fanto, "IS-IS Generic Cryptographic and M. Fanto, "IS-IS Generic Cryptographic
Authentication", RFC 5310, DOI 10.17487/RFC5310, February Authentication", RFC 5310, DOI 10.17487/RFC5310, February
2009, <http://www.rfc-editor.org/info/rfc5310>. 2009, <http://www.rfc-editor.org/info/rfc5310>.
[RFC6822] Previdi, S., Ed., Ginsberg, L., Shand, M., Roy, A., and D. [RFC6822] Previdi, S., Ed., Ginsberg, L., Shand, M., Roy, A., and D.
Ward, "IS-IS Multi-Instance", RFC 6822, Ward, "IS-IS Multi-Instance", RFC 6822,
DOI 10.17487/RFC6822, December 2012, DOI 10.17487/RFC6822, December 2012,
<http://www.rfc-editor.org/info/rfc6822>. <http://www.rfc-editor.org/info/rfc6822>.
[RFC7356] Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
Scope Link State PDUs (LSPs)", RFC 7356,
DOI 10.17487/RFC7356, September 2014,
<http://www.rfc-editor.org/info/rfc7356>.
[RFC7602] Chunduri, U., Lu, W., Tian, A., and N. Shen, "IS-IS [RFC7602] Chunduri, U., Lu, W., Tian, A., and N. Shen, "IS-IS
Extended Sequence Number TLV", RFC 7602, Extended Sequence Number TLV", RFC 7602,
DOI 10.17487/RFC7602, July 2015, DOI 10.17487/RFC7602, July 2015,
<http://www.rfc-editor.org/info/rfc7602>. <http://www.rfc-editor.org/info/rfc7602>.
8.2. Informative References 8.2. Informative References
[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
Element (PCE)-Based Architecture", RFC 4655, Element (PCE)-Based Architecture", RFC 4655,
DOI 10.17487/RFC4655, August 2006, DOI 10.17487/RFC4655, August 2006,
skipping to change at page 12, line 27 skipping to change at page 16, line 15
Authors' Addresses Authors' Addresses
Naiming Shen Naiming Shen
Cisco Systems Cisco Systems
560 McCarthy Blvd. 560 McCarthy Blvd.
Milpitas, CA 95035 Milpitas, CA 95035
US US
Email: naiming@cisco.com Email: naiming@cisco.com
Sanjay Thyamagundalu Les Ginsberg
Cisco Systems Cisco Systems
3625 Cisco Way 821 Alder Drive
San Jose, CA 95134 Milpitas, CA 95035
US US
Email: sanjayt@cisco.com Email: ginsberg@cisco.com
Sanjay Thyamagundalu
Email: tsanjay@gmail.com
 End of changes. 46 change blocks. 
120 lines changed or deleted 293 lines changed or added
