< draft-ietf-rift-applicability-03.txt   draft-ietf-rift-applicability-04.txt >
RIFT WG Yuehua. Wei, Ed. RIFT WG Yuehua. Wei, Ed.
Internet-Draft Zheng. Zhang Internet-Draft Zheng. Zhang
Intended status: Informational ZTE Corporation Intended status: Informational ZTE Corporation
Expires: 16 April 2021 Dmitry. Afanasiev Expires: 24 July 2021 Dmitry. Afanasiev
Yandex Yandex
Tom. Verhaeg Tom. Verhaeg
Juniper Networks Juniper Networks
Jaroslaw. Kowalczyk Jaroslaw. Kowalczyk
Orange Polska Orange Polska
P. Thubert P. Thubert
Cisco Systems Cisco Systems
13 October 2020 20 January 2021
RIFT Applicability RIFT Applicability
draft-ietf-rift-applicability-03 draft-ietf-rift-applicability-04
Abstract Abstract
This document discusses the properties, applicability and operational This document discusses the properties, applicability and operational
considerations of RIFT in different network scenarios. It intends to considerations of RIFT in different network scenarios. It intends to
provide a rough guide how RIFT can be deployed to simplify routing provide a rough guide how RIFT can be deployed to simplify routing
operations in Clos topologies and their variations. operations in Clos topologies and their variations.
Status of This Memo Status of This Memo
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 16 April 2021. This Internet-Draft will expire on 24 July 2021.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Simplified BSD License.
skipping to change at page 2, line 26 skipping to change at page 2, line 26
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Problem Statement of Routing in Modern IP Fabric Fat Tree 2. Problem Statement of Routing in Modern IP Fabric Fat Tree
Networks . . . . . . . . . . . . . . . . . . . . . . . . 3 Networks . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Applicability of RIFT to Clos IP Fabrics . . . . . . . . . . 3 3. Applicability of RIFT to Clos IP Fabrics . . . . . . . . . . 3
3.1. Overview of RIFT . . . . . . . . . . . . . . . . . . . . 4 3.1. Overview of RIFT . . . . . . . . . . . . . . . . . . . . 4
3.2. Applicable Topologies . . . . . . . . . . . . . . . . . . 6 3.2. Applicable Topologies . . . . . . . . . . . . . . . . . . 6
3.2.1. Horizontal Links . . . . . . . . . . . . . . . . . . 6 3.2.1. Horizontal Links . . . . . . . . . . . . . . . . . . 6
3.2.2. Vertical Shortcuts . . . . . . . . . . . . . . . . . 6 3.2.2. Vertical Shortcuts . . . . . . . . . . . . . . . . . 6
3.2.3. Generalizing to any Directed Acyclic Graph . . . . . 7 3.2.3. Generalizing to any Directed Acyclic Graph . . . . . 7
3.3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 8 3.3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3.1. DC Fabrics . . . . . . . . . . . . . . . . . . . . . 8 3.3.1. Data Center Fabrics . . . . . . . . . . . . . . . . . 8
3.3.2. Metro Fabrics . . . . . . . . . . . . . . . . . . . . 8 3.3.2. Metro Fabrics . . . . . . . . . . . . . . . . . . . . 8
3.3.3. Building Cabling . . . . . . . . . . . . . . . . . . 8 3.3.3. Building Cabling . . . . . . . . . . . . . . . . . . 8
3.3.4. Internal Router Switching Fabrics . . . . . . . . . . 9 3.3.4. Internal Router Switching Fabrics . . . . . . . . . . 9
3.3.5. CloudCO . . . . . . . . . . . . . . . . . . . . . . . 9 3.3.5. CloudCO . . . . . . . . . . . . . . . . . . . . . . . 9
4. Deployment Considerations . . . . . . . . . . . . . . . . . . 11 4. Deployment Considerations . . . . . . . . . . . . . . . . . . 11
4.1. South Reflection . . . . . . . . . . . . . . . . . . . . 12 4.1. South Reflection . . . . . . . . . . . . . . . . . . . . 12
4.2. Suboptimal Routing on Link Failures . . . . . . . . . . . 12 4.2. Suboptimal Routing on Link Failures . . . . . . . . . . . 12
4.3. Black-Holing on Link Failures . . . . . . . . . . . . . . 14 4.3. Black-Holing on Link Failures . . . . . . . . . . . . . . 14
4.4. Zero Touch Provisioning (ZTP) . . . . . . . . . . . . . . 15 4.4. Zero Touch Provisioning (ZTP) . . . . . . . . . . . . . . 15
4.5. Miscabling Examples . . . . . . . . . . . . . . . . . . . 15 4.5. Mis-cabling Examples . . . . . . . . . . . . . . . . . . 15
4.6. Positive vs. Negative Disaggregation . . . . . . . . . . 18 4.6. Positive vs. Negative Disaggregation . . . . . . . . . . 17
4.7. Mobile Edge and Anycast . . . . . . . . . . . . . . . . . 19 4.7. Mobile Edge and Anycast . . . . . . . . . . . . . . . . . 19
4.8. IPv4 over IPv6 . . . . . . . . . . . . . . . . . . . . . 21 4.8. IPv4 over IPv6 . . . . . . . . . . . . . . . . . . . . . 21
4.9. In-Band Reachability of Nodes . . . . . . . . . . . . . . 22 4.9. In-Band Reachability of Nodes . . . . . . . . . . . . . . 22
4.10. Dual Homing Servers . . . . . . . . . . . . . . . . . . . 23 4.10. Dual Homing Servers . . . . . . . . . . . . . . . . . . . 23
4.11. Fabric With A Controller . . . . . . . . . . . . . . . . 24 4.11. Fabric With A Controller . . . . . . . . . . . . . . . . 24
4.11.1. Controller Attached to ToFs . . . . . . . . . . . . 24 4.11.1. Controller Attached to ToFs . . . . . . . . . . . . 24
4.11.2. Controller Attached to Leaf . . . . . . . . . . . . 24 4.11.2. Controller Attached to Leaf . . . . . . . . . . . . 25
4.12. Internet Connectivity With Underlay . . . . . . . . . . . 25 4.12. Internet Connectivity With Underlay . . . . . . . . . . . 25
4.12.1. Internet Default on the Leaf . . . . . . . . . . . . 25 4.12.1. Internet Default on the Leaf . . . . . . . . . . . . 25
4.12.2. Internet Default on the ToFs . . . . . . . . . . . . 25 4.12.2. Internet Default on the ToFs . . . . . . . . . . . . 25
4.13. Subnet Mismatch and Address Families . . . . . . . . . . 25 4.13. Subnet Mismatch and Address Families . . . . . . . . . . 25
4.14. Anycast Considerations . . . . . . . . . . . . . . . . . 26 4.14. Anycast Considerations . . . . . . . . . . . . . . . . . 26
5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 4.15. IoT Applicability . . . . . . . . . . . . . . . . . . . . 27
6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 27 5. Security Considerations . . . . . . . . . . . . . . . . . . . 27
7. Normative References . . . . . . . . . . . . . . . . . . . . 27 6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 28
8. Informative References . . . . . . . . . . . . . . . . . . . 28 7. Normative References . . . . . . . . . . . . . . . . . . . . 28
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28 8. Informative References . . . . . . . . . . . . . . . . . . . 29
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 30
1. Introduction 1. Introduction
This document intends to explain the properties and applicability of This document intends to explain the properties and applicability of
"Routing in Fat Trees" [RIFT] in different deployment scenarios and "Routing in Fat Trees" [RIFT] in different deployment scenarios and
highlight the operational simplicity of the technology compared to highlight the operational simplicity of the technology compared to
traditional routing solutions. It also documents special traditional routing solutions. It also documents special
considerations when RIFT is used with or without overlays, considerations when RIFT is used with or without overlays, with or
controllers and corrects topology miscablings and/or node and link without controllers, corrects topology mis-cablings, and node or link
failures. failures.
2. Problem Statement of Routing in Modern IP Fabric Fat Tree Networks 2. Problem Statement of Routing in Modern IP Fabric Fat Tree Networks
Clos and Fat-Tree topologies have gained prominence in today's Clos [CLOS] and fat tree [FATTREE] topologies have gained prominence
networking, primarily as result of the paradigm shift towards a in today's networking, primarily as a result of the paradigm shift
centralized data-center based architecture that is poised to deliver towards a centralized data-center based architecture that delivers a
a majority of computation and storage services in the future. majority of computation and storage services.
Today's current routing protocols were geared towards a network with Today's current routing protocols were geared towards a network with
an irregular topology and low degree of connectivity originally. an irregular topology and low degree of connectivity originally.
When they are applied to Fat-Tree topologies: When they are applied to fat tree topologies:
* they tend to need extensive configuration or provisioning during * They tend to need extensive configuration or provisioning during
bring up and re-dimensioning. bring up and re-dimensioning.
* spine and leaf nodes have the entire network topology and routing * Spine and leaf nodes have the entire network topology and routing
information, which is in fact, not needed on the leaf nodes during information which is in fact not needed on the leaf nodes during
normal operation. normal operation.
* significant Link State PDUs (LSPs) flooding duplication between * Significant Link State PDUs (LSPs) flooding duplication between
spine nodes and leaf nodes occurs during network bring up and spine nodes and leaf nodes occurs during network bring up and
topology updates. It consumes both spine and leaf nodes' CPU and topology updates. It consumes both spine and leaf nodes' CPU and
link bandwidth resources and with that limits protocol link bandwidth resources.
scalability.
3. Applicability of RIFT to Clos IP Fabrics 3. Applicability of RIFT to Clos IP Fabrics
Further content of this document assumes that the reader is familiar Further content of this document assumes that the reader is familiar
with the terms and concepts used in OSPF [RFC2328] and IS-IS with the terms and concepts used in OSPF [RFC2328] and IS-IS
[ISO10589-Second-Edition] link-state protocols and at least the [ISO10589-Second-Edition] link-state protocols. The sections of RIFT
sections of [RIFT] outlining the requirement of routing in IP fabrics [RIFT] outline the requirements of routing in IP fabrics and RIFT
and RIFT protocol concepts. protocol concepts.
3.1. Overview of RIFT 3.1. Overview of RIFT
RIFT is a dynamic routing protocol for Clos and fat-tree network RIFT is a dynamic routing protocol for Clos and fat tree network
topologies. It defines a link-state protocol when "pointing north" topologies. It defines a link-state protocol when "pointing north"
and path-vector protocol when "pointing south". and path-vector protocol when "pointing south".
It floods flat link-state information northbound only so that each It floods flat link-state information northbound only so that each
level obtains the full topology of levels south of it. That level obtains the full topology of levels south of it. That
information is never flooded east-west or back South again. So a top information is never flooded east-west or back south again. So a top
tier node has full set of prefixes from the SPF calculation. tier node has full set of prefixes from the Shortest Path First (SPF)
calculation.
In the southbound direction the protocol operates like a "fully In the southbound direction, the protocol operates like a "fully
summarizing, unidirectional" path vector protocol or rather a summarizing, unidirectional" path vector protocol or rather a
distance vector with implicit split horizon whereas the information distance vector with implicit split horizon. Routing information,
propagates one hop south and is 're-advertised' by nodes at next normally just the default route, propagates one hop south and is 're-
lower level, normally just the default route. advertised' by nodes at next lower level.
+-----------+ +-----------+ +-----------+ +-----------+
| ToF | | ToF | LEVEL 2 | ToF | | ToF | LEVEL 2
+ +-----+--+--+ +-+--+------+ + +-----+--+--+ +-+--+------+
| | | | | | | | | ^ | | | | | | | | | ^
+ | | | +-------------------------+ | + | | | +-------------------------+ |
Distance | +-------------------+ | | | | | Distance | +-------------------+ | | | | |
Vector | | | | | | | | + Vector | | | | | | | | +
South | | | | +--------+ | | | Link-state South | | | | +--------+ | | | Link-state
+ | | | | | | | | Flooding + | | | | | | | | Flooding
skipping to change at page 4, line 47 skipping to change at page 4, line 48
Distance | +-------+ | | +--------+ | | | E Distance | +-------+ | | +--------+ | | | E
Vector | | | | | | | | | +------> Vector | | | | | | | | | +------>
South | +-------+ | | | +-------+ | | | | South | +-------+ | | | +-------+ | | | |
+ | | | | | | | | | + + | | | | | | | | | +
v ++--++ +-+-++ ++-+-+ +-+--++ + v ++--++ +-+-++ ++-+-+ +-+--++ +
|LEAF| |LEAF| |LEAF| |LEAF | LEVEL 0 |LEAF| |LEAF| |LEAF| |LEAF | LEVEL 0
+----+ +----+ +----+ +-----+ +----+ +----+ +----+ +-----+
Figure 1: Rift overview Figure 1: Rift overview
A middle tier node has only information necessary for its level, A spine node has only information necessary for its level, which is
which are all destinations south of the node based on SPF all destinations south of the node based on SPF calculation, default
calculation, default route and potential disaggregated routes. route, and potential disaggregated routes.
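The split between northbound link-state and southbound default routes can be illustrated with a minimal sketch. It is not taken from the RIFT specification: the node names, prefixes and the two helper functions are hypothetical, and the model ignores disaggregation, flooding scopes and timers.

```python
# Toy three-level fabric. All names and prefixes are hypothetical.
# Northbound adjacencies: node -> list of its parents one level up.
NORTH = {
    "leaf1": ["spine1", "spine2"],
    "leaf2": ["spine1", "spine2"],
    "spine1": ["tof1", "tof2"],
    "spine2": ["tof1", "tof2"],
}
# Prefixes attached to the leaves.
PREFIXES = {"leaf1": ["10.0.1.0/24"], "leaf2": ["10.0.2.0/24"]}


def northbound_view(node):
    """Prefixes a node knows from link-state flooded up from the south."""
    learned = set(PREFIXES.get(node, []))
    for child, parents in NORTH.items():
        if node in parents:
            learned |= northbound_view(child)
    return learned


def routing_table(node):
    """Specific routes for everything south, plus a default pointing north."""
    local = set(PREFIXES.get(node, []))
    routes = {p: "local" if p in local else "south"
              for p in northbound_view(node)}
    if NORTH.get(node):  # every node below the top level gets a default
        routes["0.0.0.0/0"] = "north"
    return routes
```

In this model a leaf holds only its own prefix and a default route, a spine holds the prefixes south of it and a default route, and a ToF node holds the full set of prefixes and no default.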
RIFT combines the advantage of both link-state and distance vector: RIFT combines the advantage of both link-state and distance vector:
* Fastest Possible Convergence * Fastest possible convergence
* Automatic Detection of Topology * Automatic detection of topology
* Minimal Routes/Info on TORs * Minimal routes/info on ToRs
* High Degree of ECMP * High degree of ECMP
* Fast De-commissioning of Nodes * Fast de-commissioning of nodes
* Maximum Propagation Speed with Flexible Prefixes in an Update * Maximum propagation speed with flexible prefixes in an update
And RIFT eliminates the disadvantages of link-state or distance And RIFT eliminates the disadvantages of link-state or distance
vector: vector:
* Reduced and Balanced Flooding * Reduced and balanced flooding
* Automatic Neighbor Detection * Automatic neighbor detection
So there are two types of link-state database which are "north So there are two types of link-state database which are "north
representation" N-TIEs and "south representation" S-TIEs. The N-TIEs representation" North Topology Information Elements (N-TIEs) and
contain a link-state topology description of lower levels and S-TIEs "south representation" South Topology Information Elements (S-TIEs).
carry simply default routes for the lower levels. The N-TIEs contain a link-state topology description of lower levels
and S-TIEs carry simply default routes for the lower levels.
There are a bunch of more advantages unique to RIFT listed below There are more advantages unique to RIFT listed below which could be
which could be understood if you read the details of [RIFT]. understood if you read the details of RIFT [RIFT].
* True ZTP * True ZTP
* Minimal Blast Radius on Failures * Minimal blast radius on failures
* Can Utilize All Paths Through Fabric Without Looping * Can utilize all paths through fabric without looping
* Automatic Disaggregation on Failures * Automatic disaggregation on failures
* Simple Leaf Implementation that Can Scale Down to Servers * Simple leaf implementation that can scale down to servers
* Key-Value Store * Key-Value store
* Horizontal Links Used for Protection Only * Horizontal links used for protection only
* Supports Non-Equal Cost Multipath and Can Replace MC-LAG * Supports non-equal cost multipath and can replace MC-LAG
* Optimal Flooding Reduction and Load-Balancing * Optimal flooding reduction and load-balancing
3.2. Applicable Topologies 3.2. Applicable Topologies
Albeit RIFT is specified primarily for "proper" Clos or "fat-tree" Albeit RIFT is specified primarily for "proper" Clos or "fat tree"
structures, it already supports PoD concepts which are strictly structures, it already supports Points of Delivery (PoD) concepts
speaking not found in original Clos concepts. which are strictly speaking not found in original Clos concepts.
Further, the specification explains and supports operations of multi- Further, the specification explains and supports operations of multi-
plane Clos variants where the protocol relies on set of rings to plane Clos variants where the protocol relies on set of rings to
allow the reconciliation of topology view of different planes as most allow the reconciliation of topology view of different planes as most
desirable solution making proper disaggregation viable in case of desirable solution making proper disaggregation viable in case of
failures. These observations hold not only in case of RIFT but in failures. These observations hold not only in case of RIFT but also
the generic case of dynamic routing on Clos variants with multiple in the generic case of dynamic routing on Clos variants with multiple
planes and failures in bi-sectional bandwidth, especially on the planes and failures in bi-sectional bandwidth, especially on the
leafs. leafs.
3.2.1. Horizontal Links 3.2.1. Horizontal Links
RIFT is not limited to pure Clos divided into PoD and multi-planes RIFT is not limited to pure Clos divided into PoD and multi-planes
but supports horizontal links below the top of fabric level. Those but supports horizontal links below the top of fabric level. Those
links are used however only as routes of last resort northbound when links are used only as routes of last resort northbound when a spine
a spine loses all northbound links or cannot compute a default route loses all northbound links or cannot compute a default route through
through them. them.
A possible configuration is a "ring" of horizontal links at a level. A possible configuration is a "ring" of horizontal links at a level.
In presence of such a "ring" in any level (except ToF level) neither In presence of such a "ring" in any level (except Top of Fabric (ToF)
N-SPF nor S-SPF will provide a "ring-based protection" scheme since level) neither North SPF (N-SPF) nor South SPF (S-SPF) will provide a
such a computation would have to deal necessarily with breaking of "ring-based protection" scheme since such a computation would have to
"loops" in Dijkstra sense; an application for which RIFT is not deal necessarily with breaking of "loops" in Dijkstra sense; an
intended. application for which RIFT is not intended.
A full-mesh connectivity between nodes on the same level can be A full-mesh connectivity between nodes on the same level can be
employed and that allows N-SPF to provide for any node losing all employed and that allows N-SPF to provide for any node losing all
its northbound adjacencies (as long as any of the other nodes in the its northbound adjacencies (as long as any of the other nodes in the
level are northbound connected) to still participate in northbound level are northbound connected) to still participate in northbound
forwarding. forwarding.
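The "last resort" role of horizontal links can be sketched as a next-hop selection rule. This is an illustration only: the function name and its inputs are hypothetical, and a real implementation derives this result from its N-SPF computation rather than from boolean flags.

```python
def default_next_hops(north_links, horizontal_links):
    """Pick next hops for the northbound default route.

    Both arguments map a neighbor name to True when that neighbor
    currently offers a usable default route (a simplifying assumption).
    """
    north = sorted(n for n, usable in north_links.items() if usable)
    if north:
        # Normal case: load-balance over the northbound adjacencies.
        return north
    # Last resort: siblings on the same level that are still
    # northbound connected.
    return sorted(n for n, usable in horizontal_links.items() if usable)
```

A horizontal neighbor is returned only when no northbound neighbor is usable, so the east-west links carry no traffic in normal operation.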
3.2.2. Vertical Shortcuts 3.2.2. Vertical Shortcuts
Through relaxations of the specified adjacency forming rules RIFT Through relaxations of the specified adjacency forming rules, RIFT
implementations can be extended to support vertical "shortcuts" as implementations can be extended to support vertical "shortcuts" as
proposed by e.g. [I-D.white-distoptflood]. The RIFT specification proposed by e.g. [I-D.white-distoptflood]. The RIFT specification
itself does not provide the exact details since the resulting itself does not provide the exact details since the resulting
solution suffers from either much larger blast radius with increased solution suffers from either much larger blast radius with increased
flooding volumes or in case of maximum aggregation routing bow-tie flooding volumes or in case of maximum aggregation routing bow-tie
problems. problems.
3.2.3. Generalizing to any Directed Acyclic Graph 3.2.3. Generalizing to any Directed Acyclic Graph
RIFT is an anisotropic routing protocol, meaning that it has a sense RIFT is an anisotropic routing protocol, meaning that it has a sense
of direction (northbound, southbound, east-west) and that it operates of direction (northbound, southbound, east-west) and that it operates
differently depending on the direction. differently depending on the direction.
* Northbound, RIFT operates as a link-state IGP, whereby the control * Northbound, RIFT operates as a link-state IGP, whereby the control
packets are reflooded first all the way North and only interpreted packets are reflooded first all the way north and only interpreted
later. All the individual fine grained routes are advertised. later. All the individual fine grained routes are advertised.
* Southbound, RIFT operates as a distance vector IGP, whereby the * Southbound, RIFT operates as a distance vector IGP, whereby the
control packets are flooded only one hop, interpreted, and the control packets are flooded only one hop, interpreted, and the
consequence of that computation is what gets flooded on more hop consequence of that computation is what gets flooded one more hop
South. In the most common use-cases, a ToF node can reach most of south. In the most common use-cases, a ToF node can reach most of
the prefixes in the fabric. If that is the case, the ToF node the prefixes in the fabric. If that is the case, the ToF node
advertises the fabric default and disaggregates the prefixes that advertises the fabric default and disaggregates the prefixes that
it cannot reach. On the other hand, a ToF Node that can reach it cannot reach. On the other hand, a ToF node that can reach
only a small subset of the prefixes in the fabric will preferably only a small subset of the prefixes in the fabric will preferably
advertise those prefixes and refrain from aggregating. advertise those prefixes and refrain from aggregating.
In the general case, what gets advertised South is in more In the general case, what gets advertised south is in more
details: details:
1. A fabric default that aggregates all the prefixes that are 1. A fabric default that aggregates all the prefixes that are
reachable within the fabric, and that could be a default route reachable within the fabric, and that could be a default route
or a prefix that is dedicated to this particular fabric. or a prefix that is dedicated to this particular fabric.
2. The loopback addresses of the northbound nodes, e.g., for 2. The loopback addresses of the northbound nodes, e.g., for
inband management. inband management.
3. The disaggregated prefixes for the dynamic exceptions to the 3. The disaggregated prefixes for the dynamic exceptions to the
fabric Default, advertised to route around the black hole that fabric default, advertised to route around the black hole that
may form may form.
* east-west routing can optionally be used, with specific * East-west routing can optionally be used, with specific
restrictions. It is useful in particular when a sibling has restrictions. It is useful in particular when a sibling has
access to the fabric default but this node does not. access to the fabric default but this node does not.
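The southbound choice described for a ToF node can be sketched as follows. The function, the dictionary keys and the `majority` threshold are assumptions made for illustration; the text only says "most of the prefixes" and does not define a number, and loopback advertisement is left out.

```python
def southbound_advertisement(fabric_prefixes, reachable, majority=0.5):
    """Return what a ToF node advertises south in this simplified model.

    `majority` is an illustrative stand-in for "most of the prefixes".
    """
    fabric = set(fabric_prefixes)
    reachable = set(reachable) & fabric
    unreachable = fabric - reachable
    if fabric and len(reachable) / len(fabric) > majority:
        # Aggregate into the fabric default and list the exceptions,
        # that is, the prefixes this node cannot reach.
        return {"default": True,
                "exceptions": sorted(unreachable),
                "specifics": []}
    # Reaches only a small subset: advertise those prefixes as they are
    # and refrain from aggregating.
    return {"default": False,
            "exceptions": [],
            "specifics": sorted(reachable)}
```

With four fabric prefixes and three of them reachable, the sketch returns the fabric default plus one exception; with only one reachable prefix, it returns that single prefix and no default.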
A Directed Acyclic Graph (DAG) provides a sense of North (the A Directed Acyclic Graph (DAG) provides a sense of north (the
direction of the DAG) and of South (the reverse), which can be used direction of the DAG) and of south (the reverse), which can be used
to apply RIFT. For the purpose of RIFT, a vertex in the DAG that has to apply RIFT. For the purpose of RIFT, a vertex in the DAG that has
only incoming edges is a ToF node. only incoming edges is a ToF node.
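As a minimal sketch, assuming the DAG is given as a list of links that each point north, the ToF nodes are the nodes with no outgoing link. The function name and the edge-list representation are hypothetical.

```python
def tof_nodes(edges):
    """Find the ToF nodes of a DAG.

    `edges` is an iterable of (south_node, north_node) pairs, each
    pointing north. A node that never appears as the southern end of a
    link has only incoming links and is treated as a ToF node.
    """
    nodes, has_outgoing = set(), set()
    for south, north in edges:
        nodes.update((south, north))
        has_outgoing.add(south)
    return sorted(nodes - has_outgoing)
```

For example, `tof_nodes([("leaf1", "spine1"), ("spine1", "tof1"), ("spine1", "tof2")])` returns `["tof1", "tof2"]`.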
There are a number of caveats though: There are a number of caveats though:
* The DAG structure must exist before RIFT starts, so there is a * The DAG structure must exist before RIFT starts, so there is a
need for a companion protocol to establish the logical DAG need for a companion protocol to establish the logical DAG
structure. structure.
* A generic DAG does not have a sense of east and west. The * A generic DAG does not have a sense of east and west. The
skipping to change at page 8, line 18 skipping to change at page 8, line 18
* In order to aggregate and disaggregate routes, RIFT requires that * In order to aggregate and disaggregate routes, RIFT requires that
all the ToF nodes share the full knowledge of the prefixes in the all the ToF nodes share the full knowledge of the prefixes in the
fabric. This can be achieved with a ring as suggested by the RIFT fabric. This can be achieved with a ring as suggested by the RIFT
main specification, by some preconfiguration, or using a main specification, by some preconfiguration, or using a
synchronization with a common repository where all the active synchronization with a common repository where all the active
prefixes are registered. prefixes are registered.
3.3. Use Cases 3.3. Use Cases
3.3.1. DC Fabrics 3.3.1. Data Center Fabrics
RIFT is largely driven by demands and hence ideally suited for RIFT is largely driven by demands and hence ideally suited for
application in underlay of data center IP fabrics, vast majority of applying in data center (DC) IP fabrics underlay routing, vast
which seem to be currently (and for the foreseeable future) Clos majority of which seem to be currently (and for the foreseeable
architectures. It significantly simplifies operation and deployment future) Clos architectures. It significantly simplifies operation
of such fabrics as described in Section 4 for environments compared and deployment of such fabrics as described in Section 4 for
to extensive proprietary provisioning and operational solutions. environments compared to extensive proprietary provisioning and
operational solutions.
3.3.2. Metro Fabrics 3.3.2. Metro Fabrics
The demand for bandwidth is increasing steadily, driven primarily by The demand for bandwidth is increasing steadily, driven primarily by
environments close to content producers (server farms connection via environments close to content producers (server farms connection via
DC fabrics) but in proximity to content consumers as well. Consumers DC fabrics) but in proximity to content consumers as well. Consumers
are often clustered in metro areas with their own network are often clustered in metro areas with their own network
architectures that can benefit from simplified, regular Clos architectures that can benefit from simplified, regular Clos
structures and hence RIFT. structures and hence RIFT.
3.3.3. Building Cabling 3.3.3. Building Cabling
Commercial edifices are often cabled in topologies that are either Commercial edifices are often cabled in topologies that are either
Clos or its isomorphic equivalents. With many floors the Clos can Clos or its isomorphic equivalents. The Clos can grow rather high
grow rather high and with that present a challenge for traditional with many floors. That presents a challenge for traditional routing
routing protocols (except BGP and by now largely phased-out PNNI) protocols (except BGP and by now largely phased-out PNNI) which do
which do not support an arbitrary number of levels which RIFT does not support an arbitrary number of levels which RIFT does naturally.
naturally. Moreover, due to limited sizes of forwarding tables in Moreover, due to the limited sizes of forwarding tables in network
active elements of building cabling the minimum FIB size RIFT elements of building cabling, the minimum FIB size RIFT
maintains under normal conditions can prove particularly cost- maintains under normal conditions is cost-effective in terms of
effective in terms of hardware and operational costs. hardware and operational costs.
3.3.4. Internal Router Switching Fabrics 3.3.4. Internal Router Switching Fabrics
It is common in high-speed communications switching and routing It is common in high-speed communications switching and routing
devices to use fabrics when a crossbar is not feasible due to cost, devices to use fabrics when a crossbar is not feasible due to cost,
head-of-line blocking or size trade-offs. Normally such fabrics are head-of-line blocking or size trade-offs. Normally such fabrics are
not self-healing or rely on 1:/+1 protection schemes but it is not self-healing or rely on 1:/+1 protection schemes but it is
conceivable to use RIFT to operate Clos fabrics that can deal conceivable to use RIFT to operate Clos fabrics that can deal
effectively with interconnections or subsystem failures in such effectively with interconnections or subsystem failures in such
module. RIFT is not IP specific and hence any link addressing module. RIFT is not IP specific and hence any link addressing
connecting internal device subnets is conceivable. connecting internal device subnets is conceivable.
3.3.5. CloudCO 3.3.5. CloudCO
The Cloud Central Office (CloudCO) is a new stage of telecom Central The Cloud Central Office (CloudCO) is a new stage of telecom Central
Office. It takes the advantage of Software Defined Networking (SDN) Office. It takes the advantage of Software Defined Networking (SDN)
and Network Function Virtualization (NFV) in conjunction with general and Network Function Virtualization (NFV) in conjunction with general
purpose hardware to optimize current networks. The following figure purpose hardware to optimize current networks. The following figure
illustrates this architecture at a high level. It describes a single illustrates this architecture at a high level. It describes a single
instance or macro-node of cloud CO. An Access I/O module faces a instance or macro-node of cloud CO. An Access I/O module faces a
Cloud CO Access Node, and the CPEs behind it. A Network I/O module Cloud CO access node, and the Customer Premises Equipments (CPEs)
is facing the core network. The two I/O modules are interconnected behind it. A Network I/O module is facing the core network. The two
by a leaf and spine fabric. [TR-384] I/O modules are interconnected by a leaf and spine fabric. [TR-384]
+---------------------+ +----------------------+ +---------------------+ +----------------------+
| Spine | | Spine | | Spine | | Spine |
| Switch | | Switch | | Switch | | Switch |
+------+---+------+-+-+ +--+-+-+-+-----+-------+ +------+---+------+-+-+ +--+-+-+-+-----+-------+
| | | | | | | | | | | | | | | | | | | | | | | |
| | | | | +-------------------------------+ | | | | | | +-------------------------------+ |
| | | | | | | | | | | | | | | | | | | | | | | |
| | | | +-------------------------+ | | | | | | | +-------------------------+ | | |
| | | | | | | | | | | | | | | | | | | | | | | |
| | +----------------------+ | | | | | | | | | | +----------------------+ | | | | | | | |
skipping to change at page 11, line 12 skipping to change at page 11, line 12
The Spine-Leaf architecture deployed inside CloudCO meets the network The Spine-Leaf architecture deployed inside CloudCO meets the network
requirements of being adaptable, agile, scalable and dynamic. requirements of being adaptable, agile, scalable and dynamic.
4. Deployment Considerations 4. Deployment Considerations
RIFT presents the opportunity for organizations building and RIFT presents the opportunity for organizations building and
operating IP fabrics to simplify their operation and deployments operating IP fabrics to simplify their operation and deployments
while achieving many desirable properties of dynamic routing on while achieving many desirable properties of dynamic routing on
such a substrate: such a substrate:
* RIFT design follows minimum blast radius and minimum necessary * RIFT only floods routing information to the devices that absolutely
epistemological scope philosophy which leads to very good scaling need it. RIFT design follows minimum blast radius and minimum
properties while delivering maximum reactiveness. necessary epistemological scope philosophy which leads to good
scaling properties while delivering maximum reactiveness.
* RIFT allows for extensive Zero Touch Provisioning within the * RIFT allows for extensive Zero Touch Provisioning within the
protocol. In its most extreme version RIFT does not rely on any protocol. In its most extreme version RIFT does not rely on any
specific addressing and for IP fabric can operate using IPv6 ND specific addressing and for IP fabric can operate using IPv6 ND
[RFC4861] only. [RFC4861] only.
* RIFT has provisions to detect common IP fabric mis-cabling * RIFT has provisions to detect common IP fabric mis-cabling
scenarios. scenarios.
* RIFT automatically negotiates BFD per link, thereby allowing for * RIFT automatically negotiates BFD per link, thereby allowing for
IP and micro-BFD [RFC7130] to replace LAGs which do hide bandwidth IP and micro-BFD [RFC7130] to replace Link Aggregation Groups
imbalances in case of constituent failures. Further automatic (LAGs) which do hide bandwidth imbalances in case of constituent
link validation techniques similar to [RFC5357] could be supported failures. Further automatic link validation techniques similar to
as well. [RFC5357] could be supported as well.
* RIFT inherently solves many difficult problems associated with the * RIFT inherently solves many difficult problems associated with the
use of traditional routing topologies with dense meshes and high use of traditional routing topologies with dense meshes and high
degrees of ECMP by including automatic bandwidth balancing, flood degrees of ECMP by including automatic bandwidth balancing, flood
reduction and automatic disaggregation on failures while providing reduction and automatic disaggregation on failures while providing
maximum aggregation of prefixes in default scenarios. maximum aggregation of prefixes in default scenarios.
* RIFT reduces FIB size towards the bottom of the IP fabric where * RIFT reduces FIB size towards the bottom of the IP fabric where
most nodes reside and allows with that for cheaper hardware on the most nodes reside and allows with that for cheaper hardware on the
edges and introduction of modern IP fabric architectures that edges and introduction of modern IP fabric architectures that
skipping to change at page 12, line 14 skipping to change at page 12, line 17
* Many further operational and design points collected over many * Many further operational and design points collected over many
years of routing protocol deployments have been incorporated in years of routing protocol deployments have been incorporated in
RIFT such as fast flooding rates, protection of information RIFT such as fast flooding rates, protection of information
lifetimes and operationally easily recognizable remote ends of lifetimes and operationally easily recognizable remote ends of
links and node names. links and node names.
4.1. South Reflection 4.1. South Reflection
South reflection is a mechanism whereby South Node TIEs are "reflected" South reflection is a mechanism whereby South Node TIEs are "reflected"
back up north to allow nodes in same level without E-W links to "see" back up north to allow nodes in same level without East-west links to
each other. "see" each other.
For example, Spine111\Spine112\Spine121\Spine122 reflects Node S-TIEs For example, Spine111\Spine112\Spine121\Spine122 reflects Node S-TIEs
from ToF21 to ToF22 separately. Respectively, from ToF21 to ToF22 separately. Respectively,
Spine111\Spine112\Spine121\Spine122 reflects Node S-TIEs from ToF22 Spine111\Spine112\Spine121\Spine122 reflects Node S-TIEs from ToF22
to ToF21 separately. So ToF22 and ToF21 see each other's node to ToF21 separately. So ToF22 and ToF21 see each other's node
information as level 2 nodes. information as level 2 nodes.
In an equivalent fashion, as the result of the south reflection In an equivalent fashion, as the result of the south reflection
between Spine121-Leaf121-Spine122 and Spine121-Leaf122-Spine122, between Spine121-Leaf121-Spine122 and Spine121-Leaf122-Spine122,
Spine121 and Spine122 know each other at level 1. Spine121 and Spine122 know each other at level 1.
4.2. Suboptimal Routing on Link Failures 4.2. Suboptimal Routing on Link Failures
+--------+ +--------+ +--------+ +--------+
| ToF21 | | ToF22 | LEVEL 2 | ToF21 | | ToF22 | LEVEL 2
++--+-+-++ ++-+--+-++ ++--+-+-++ ++-+--+-++
| | | | | | | + | | | | | | | +
| | | | | | | linkTS8 | | | | | | | linkTS8
+-------------+ | +-+linkTS3+-+ | | | +--------------+ +-------------+ | +-+linkTS3+-+ | | | +-------------+
| | | | | | + | | | | | | | + |
| +----------------------------+ | linkTS7 | | +----------------------------+ | linkTS7 |
| | | | + + + | | | | | + + + |
| | | +-------+linkTS4+------------+ | | | | +-------+linkTS4+------------+ |
| | | + + | | | | | | + + | | |
| | | +------------+--+ | | | | | +------------+--+ | |
| | | | | linkTS6 | | | | | | | linkTS6 | |
+-+----++ ++-----++ ++------+ ++-----++ +-+----+-+ +-----+--+ ++--------+ +-+----+-+
|Spin111| |Spin112| |Spin121| |Spin122| LEVEL 1 |Spine111| |Spine112| |Spine121 | |Spine122| LEVEL 1
+-+---+-+ ++----+-+ +-+---+-+ ++---+--+ +-+---+--+ +----+---+ +-+---+---+ +-+---+--+
| | | | | | | | | | | | | | | |
| +--------------+ | + ++XX+linkSL6+---+ + | +--------------+ | + ++XX+linkSL6+---+ +
| | | | linkSL5 | | linkSL8 | | | | linkSL5 | | linkSL8
| +------------+ | | + +---+linkSL7+-+ | + | +------------+ | | + +---+linkSL7+-+ | +
| | | | | | | | | | | | | | | |
+-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+ +-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+
|Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0 |Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0
+-+-----+ ++------+ +-----+-+ +-+-----+ +-+-----+ ++------+ +-----+-+ +-+-----+
+ + + + + + + +
Prefix111 Prefix112 Prefix121 Prefix122 Prefix111 Prefix112 Prefix121 Prefix122
skipping to change at page 14, line 10 skipping to change at page 14, line 10
S-TIE PrefixesElement(prefix122, cost 1). The packet from leaf121 to S-TIE PrefixesElement(prefix122, cost 1). The packet from leaf121 to
prefix122 will only be sent to linkSL7 following a longest-prefix prefix122 will only be sent to linkSL7 following a longest-prefix
match to prefix122 directly and then go down through linkSL8 to Leaf122. match to prefix122 directly and then go down through linkSL8 to Leaf122.
4.3. Black-Holing on Link Failures 4.3. Black-Holing on Link Failures
+--------+ +--------+ +--------+ +--------+
| ToF 21 | | ToF 22 | LEVEL 2 | ToF 21 | | ToF 22 | LEVEL 2
++-+--+-++ ++-+--+-++ ++-+--+-++ ++-+--+-++
| | | | | | | | | | | | | | | +
| | | | | | | linkTS8 | | | | | | | linkTS8
+--------------+ | +--linkTS3-X+ | | | +--------------+ +--------------+ | +-+linkTS3+X+ | | | +--------------+
linkTS1 | | | | | | | linkTS1 | | | | | + |
| +-----------------------------+ | linkTS7 | + +-----------------------------+ | linkTS7 |
| | | | | | | | | | + | + + + |
| | linkTS2 +--------linkTS4-X-----------+ | | | linkTS2 +-------+linkTS4+X+----------+ |
| | | | | | | | | + + + + | | |
| linkTS5 +-+ +---------------+ | | | linkTS5 +-+ +------------+--+ | |
| | | | | linkTS6 | | | + | | | linkTS6 | |
+-+----++ +-+-----+ ++----+-+ ++-----++ +-+----+-+ +-+----+-+ ++-------+ +-+-----++
|Spin111| |Spin112| |Spin121| |Spin122| LEVEL 1 |Spine111| |Spine112| |Spine121| |Spine122| LEVEL 1
+-+---+-+ ++----+-+ +-+---+-+ ++---+--+ +-+---+--+ ++----+--+ +-+---+--+ +-+---+--+
| | | | | | | | | | | | | | | |
| +---------------+ | | +----linkSL6----+ | + +---------------+ | + +---+linkSL6+---+ +
linkSL1 | | | linkSL5 | | linkSL8 linkSL1 | | | linkSL5 | | linkSL8
| +---linkSL3---+ | | | +----linkSL7--+ | | + +--+linkSL3+--+ | | + +---+linkSL7+-+ | +
| | | | | | | | | | | | | | | |
+-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+ +-+---+-+ +--+--+-+ +-+---+-+ +--+-+--+
|Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0 |Leaf111| |Leaf112| |Leaf121| |Leaf122| LEVEL 0
+-+-----+ ++------+ +-----+-+ +-+-----+ +-+-----+ ++------+ +-----+-+ +-+-----+
+ + + + + + + +
Prefix111 Prefix112 Prefix121 Prefix122 Prefix111 Prefix112 Prefix121 Prefix122
Figure 4: Black-holing upon link failure use case Figure 4: Black-holing upon link failure use case
This scenario illustrates a case when double link failure occurs and This scenario illustrates a case when double link failure occurs and
skipping to change at page 15, line 12 skipping to change at page 15, line 12
with prefix 121 and prefix 122, that is flooded to spines 111, 112, with prefix 121 and prefix 122, that is flooded to spines 111, 112,
121 and 122. 121 and 122.
The packet from leaf111 to prefix122 will not be routed to linkTS1 or The packet from leaf111 to prefix122 will not be routed to linkTS1 or
linkTS2. The packet from leaf111 to prefix122 will only be routed to linkTS2. The packet from leaf111 to prefix122 will only be routed to
linkTS5 or linkTS7 following a longest-prefix match to prefix122. linkTS5 or linkTS7 following a longest-prefix match to prefix122.
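The forwarding effect of a disaggregated prefix described above can be sketched as a longest-prefix-match lookup. The Python fragment below is only an illustration under stated assumptions: the function name, the table layout and the address 192.0.2.0/24 standing in for prefix122 are hypothetical, and the default route is assumed to be installed over all four links named in the text.

```python
import ipaddress


def longest_prefix_match(fib, destination):
    """Return the next hops of the most specific FIB entry covering destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hops in fib.items():
        net = ipaddress.ip_network(prefix)
        # Only compare entries of the same address family.
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hops)
    return best[1] if best else []


# Hypothetical table: a default route over all links plus one
# disaggregated, more specific route that stands in for prefix122.
fib = {
    "0.0.0.0/0": ["linkTS1", "linkTS2", "linkTS5", "linkTS7"],
    "192.0.2.0/24": ["linkTS5", "linkTS7"],
}
```

A destination inside the more specific prefix resolves only to linkTS5 and linkTS7, while any other destination still follows the default route over all four links.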
4.4. Zero Touch Provisioning (ZTP) 4.4. Zero Touch Provisioning (ZTP)
Each RIFT node may operate in zero touch provisioning (ZTP) mode. It Each RIFT node may operate in zero touch provisioning (ZTP) mode. It
has no configuration (unless it is a Top-of-Fabric at the top of the has no configuration (unless it is a ToF at the top of the topology
topology or it is desired to confine it to leaf role w/o leaf-2-leaf or it is desired to confine it to leaf role w/o leaf-2-leaf
procedures). In such case RIFT will fully configure the node's level procedures). In such case RIFT will fully configure the node's level
after it is attached to the topology. after it is attached to the topology.
The most import component for ZTP is the automatic level derivation The most important component for ZTP is the automatic level
procedure. All the Top-of-Fabric nodes are explicitly marked with derivation procedure. All the ToF nodes are explicitly marked with
TOP_OF_FABRIC flag which are initial 'seeds' needed for other ZTP TOP_OF_FABRIC flag which are initial 'seeds' needed for other ZTP
nodes to derive their level in the topology. The derivation of the nodes to derive their level in the topology. The derivation of the
level of each node happens then based on LIEs received from its level of each node happens then based on Link Information Elements
neighbors whereas each node (with possibly exceptions of configured (LIEs) received from its neighbors whereas each node (with possibly
leafs) tries to attach at the highest possible point in the fabric. exceptions of configured leafs) tries to attach at the highest
This guarantees that even if the diffusion front reaches a node from possible point in the fabric. This guarantees that even if the
"below" faster than from "above", it will greedily abandon already diffusion front reaches a node from "below" faster than from "above",
negotiated level derived from nodes topologically below it and it will greedily abandon already negotiated level derived from nodes
properly peer with nodes above. topologically below it and properly peer with nodes above.
4.5. Miscabling Examples 4.5. Mis-cabling Examples
+----------------+ +-----------------+ +----------------+ +-----------------+
| ToF21 | +------+ ToF22 | LEVEL 2 | ToF21 | +------+ ToF22 | LEVEL 2
+-------+----+---+ | +----+---+--------+ +-------+----+---+ | +----+---+--------+
| | | | | | | | | | | | | | | | | |
| | | +----------------------------+ | | | | +----------------------------+ |
| +---------------------------+ | | | | | +---------------------------+ | | | |
| | | | | | | | | | | | | | | | | |
| | | | +-----------------------+ | | | | | | +-----------------------+ | |
| | +------------------------+ | | | | | +------------------------+ | | |
| | | | | | | | | | | | | | | | | |
+-+---+-+ +-+---+-+ | +-+---+-+ +-+---+-+ +-+---+--+ +-+---+--+ | +--+---+-+ +--+---+-+
|Spin111| |Spin112| | |Spin121| |Spin122| LEVEL 1 |Spine111| |Spine112| | |Spine121| |Spine122| LEVEL 1
+-+---+-+ ++----+-+ | +-+---+-+ ++----+-+ +-+---+--+ ++----+--+ | +--+---+-+ +-+----+-+
| | | | | | | | | | | | | | | | | |
| +---------+ | link-M | +---------+ | | +---------+ | link-M | +---------+ |
| | | | | | | | | | | | | | | | | |
| +-------+ | | | | +-------+ | | | +-------+ | | | | +-------+ | |
| | | | | | | | | | | | | | | | | |
+-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+ +-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+
|Leaf111| |Leaf112+-----+ |Leaf121| |Leaf122| LEVEL 0 |Leaf111| |Leaf112+-----+ |Leaf121| |Leaf122| LEVEL 0
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
Figure 5: A single plane miscabling example Figure 5: A single plane mis-cabling example
Figure 5 shows a single plane miscabling example. It's a perfect Figure 5 shows a single plane mis-cabling example. It's a perfect
fat-tree fabric except link-M connecting Leaf112 to ToF22. fat tree fabric except link-M connecting Leaf112 to ToF22.
The RIFT control protocol can discover the physical links The RIFT control protocol can discover the physical links
automatically and be able to detect cabling that violates fat-tree automatically and be able to detect cabling that violates fat tree
topology constraints. It react accordingly to such mis-cabling topology constraints. It reacts accordingly to such mis-cabling
attempts, at a minimum preventing adjacencies between nodes from attempts, at a minimum preventing adjacencies between nodes from
being formed and traffic from being forwarded on those mis-cabled being formed and traffic from being forwarded on those mis-cabled
links. Leaf112 will in such scenario use link-M to derive its level links. Leaf112 will in such scenario use link-M to derive its level
(unless it is leaf) and can report links to spines 111 and 112 as (unless it is leaf) and can report links to Spine111 and Spine112 as
miscabled unless the implementation allows horizontal links. mis-cabled unless the implementation allows horizontal links.
Figure 6 shows a multiple plane miscabling example. Since Leaf112 Figure 6 shows a multiple plane mis-cabling example. Since Leaf112
and Spine121 belong to two different PoDs, the adjacency between and Spine121 belong to two different PoDs, the adjacency between
Leaf112 and Spine121 cannot be formed. link-W would be detected and Leaf112 and Spine121 cannot be formed. link-W would be detected and
prevented. prevented.
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
|ToF A1| |ToF A2| |ToF B1| |ToF B2| LEVEL 2 |ToF A1| |ToF A2| |ToF B1| |ToF B2| LEVEL 2
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
| | | | | | | | | | | | | | | |
| | | +-----------------+ | | | | | | +-----------------+ | | |
| +--------------------------+ | | | | | +--------------------------+ | | | |
| | | | | | | | | | | | | | | |
| +------+ | | | +------+ | | +------+ | | | +------+ |
| | +-----------------+ | | | | | | | +-----------------+ | | | | |
| | | +--------------------------+ | | | | | +--------------------------+ | |
| A | | B | | A | | B | | A | | B | | A | | B |
+-----+-+ +-+---+-+ +-+---+-+ +-+-----+ +-----+--+ +-+---+--+ +--+---+-+ +--+-----+
|Spin111| |Spin112| +----+Spin121| |Spin122| LEVEL 1 |Spine111| |Spine112| +---+Spine121| |Spine122| LEVEL 1
+-+---+-+ ++----+-+ | +-+---+-+ ++----+-+ +-+---+--+ ++----+--+ | +--+---+-+ +-+----+-+
| | | | | | | | | | | | | | | | | |
| +---------+ | | | +---------+ | | +---------+ | | | +---------+ |
| | | | link-W | | | | | | | | link-W | | | |
| +-------+ | | | | +-------+ | | | +-------+ | | | | +-------+ | |
| | | | | | | | | | | | | | | | | |
+-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+ +-+---+-+ +--+--+-+ | +-+---+-+ +--+--+-+
|Leaf111| |Leaf112+------+ |Leaf121| |Leaf122| LEVEL 0 |Leaf111| |Leaf112+------+ |Leaf121| |Leaf122| LEVEL 0
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
+--------PoD#1----------+ +---------PoD#2---------+ +--------PoD#1----------+ +---------PoD#2---------+
Figure 6: A multiple plane miscabling example Figure 6: A multiple plane mis-cabling example
RIFT provides an optional level determination procedure in its Zero RIFT provides an optional level determination procedure in its Zero
Touch Provisioning mode. Nodes in the fabric without their level Touch Provisioning mode. Nodes in the fabric without their level
configured determine it automatically. This can have possibly configured determine it automatically. This can have possibly
counter-intuitive consequences however. One extreme failure scenario counter-intuitive consequences however. One extreme failure scenario
is depicted in Figure 7 and it shows that if all northbound links of is depicted in Figure 7 and it shows that if all northbound links of
Spine11 fail at the same time, Spine11 negotiates a lower level than Spine11 fail at the same time, Spine11 negotiates a lower level than
Leaf11 and Leaf12. Leaf11 and Leaf12.
To prevent such scenario where leafs are expected to act as switches, To prevent such scenario where leafs are expected to act as switches,
LEAF_ONLY flag can be set for Leaf111 and Leaf112. Since level -1 is LEAF_ONLY flag can be set for Leaf111 and Leaf112. Since level -1 is
invalid, Spine11 would not derive a valid level from the topology in invalid, Spine11 would not derive a valid level from the topology in
Figure 7. It will be isolated from the whole fabric and it would be Figure 7. It will be isolated from the whole fabric and it would be
up to the leafs to declare the links towards such spine as miscabled. up to the leafs to declare the links towards such spine as mis-
cabled.
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
|ToF A1| |ToF A2| |ToF A1| |ToF A2| |ToF A1| |ToF A2| |ToF A1| |ToF A2|
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
| | | | | | | | | | | |
| +-------+ | | | | +-------+ | | |
+ + | | ====> | | + + | | ====> | |
X X +------+ | +------+ | X X +------+ | +------+ |
+ + | | | | + + | | | |
+----+--+ +-+-----+ +-+-----+ +----+--+ +-+-----+ +-+-----+
skipping to change at page 18, line 8 skipping to change at page 17, line 47
| | | |
+-+---+-+ +-+---+-+
|Spine11| |Spine11|
+-------+ +-------+
Figure 7: Fallen spine Figure 7: Fallen spine
4.6. Positive vs. Negative Disaggregation 4.6. Positive vs. Negative Disaggregation
Disaggregation is the procedure whereby [RIFT] advertises a more Disaggregation is the procedure whereby [RIFT] advertises a more
specific route Southwards as an exception to the aggregated fabric- specific route southwards as an exception to the aggregated fabric-
default North. Disaggregation is useful when a prefix within the default north. Disaggregation is useful when a prefix within the
aggregation is reachable via some of the parents but not the others aggregation is reachable via some of the parents but not the others
at the same level of the fabric. It is mandatory when the level is at the same level of the fabric. It is mandatory when the level is
the ToF since a ToF node that cannot reach a prefix becomes a black the ToF since a ToF node that cannot reach a prefix becomes a black
hole for that prefix. The hard problem is to know which prefixes are hole for that prefix. The hard problem is to know which prefixes are
reachable by whom. reachable by whom.
In the general case, [RIFT] solves that problem by interconnecting In the general case, [RIFT] solves that problem by interconnecting
the ToF nodes so they can exchange the full list of prefixes that the ToF nodes. So the ToF nodes can exchange the full list of
exist in the fabric and figure out when a ToF node lacks reachability prefixes that exist in the fabric and figure out when a ToF node lacks
to an existing prefix. This requires additional ports at the ToF, reachability to an existing prefix. This requires additional ports
typically 2 ports per ToF node to form a ToF-spanning ring. [RIFT] at the ToF, typically 2 ports per ToF node to form a ToF-spanning
also defines the southbound reflection procedure that enables a ring. [RIFT] also defines the southbound reflection procedure that
parent to explore the direct connectivity of its peers, meaning their enables a parent to explore the direct connectivity of its peers,
own parents and children; based on the advertisements received from meaning their own parents and children; based on the advertisements
the shared parents and children, it may enable the parent to infer received from the shared parents and children, it may enable the
the prefixes its peers can reach. parent to infer the prefixes its peers can reach.
When a parent lacks reachability to a prefix, it may disaggregate the When a parent lacks reachability to a prefix, it may disaggregate the
prefix negatively, i.e., advertise that this parent can be used to prefix negatively, i.e., advertise that this parent can be used to
reach any prefix in the aggregation except that one. The Negative reach any prefix in the aggregation except that one. The Negative
Disaggregation signaling is simple and functions transitively from Disaggregation signaling is simple and functions transitively from
ToF to ToP and then from Top to Leaf. But it is hard for a parent to ToF to top-of-pod (ToP) and then from ToP to Leaf. But it is hard
figure which prefix it needs to disaggregate, because it does not for a parent to figure which prefix it needs to disaggregate, because
know what it does not know; it results that the use of a spanning it does not know what it does not know; it results that the use of a
ring at the ToF is required to operate the Negative Disaggregation. spanning ring at the ToF is required to operate the Negative
Also, though it is only an implementation problem, the programming Disaggregation. Also, though it is only an implementation problem,
of the FIB is complex compared to normal routes, and may incur the programming of the FIB is complex compared to normal routes,
recursions. and may incur recursions.
The more classical alternative is, for the parents that can reach a The more classical alternative is, for the parents that can reach a
prefix that peers at the same level cannot, to advertise a more prefix that peers at the same level cannot, to advertise a more
specific route to that prefix. This leverages the normal longest specific route to that prefix. This leverages the normal longest
prefix match in the FIB, and does not require a special prefix match in the FIB, and does not require a special
implementation. But as opposed to the Negative Disaggregation, the implementation. But as opposed to the Negative Disaggregation, the
Positive Disaggregation is difficult and inefficient to operate Positive Disaggregation is difficult and inefficient to operate
transitively. transitively.
Transitivity is not needed to a grandchild if all its parents Transitivity is not needed to a grandchild if all its parents
skipping to change at page 19, line 45 skipping to change at page 19, line 37
meantime. In the case of Negative Disaggregation, the last ToF meantime. In the case of Negative Disaggregation, the last ToF
node(s) that injects the route may also incur an incast issue; this node(s) that injects the route may also incur an incast issue; this
problem would occur if a prefix that becomes totally unreachable is problem would occur if a prefix that becomes totally unreachable is
disaggregated, but doing so is mostly useless and is not recommended. disaggregated, but doing so is mostly useless and is not recommended.
4.7. Mobile Edge and Anycast 4.7. Mobile Edge and Anycast
When a physical or a virtual node changes its point of attachment in When a physical or a virtual node changes its point of attachment in
the fabric from a previous-leaf to a next-leaf, new routes must be the fabric from a previous-leaf to a next-leaf, new routes must be
installed that supersede the old ones. Since the flooding flows installed that supersede the old ones. Since the flooding flows
Northwards, the nodes (if any) between the previous-leaf and the northwards, the nodes (if any) between the previous-leaf and the
common parent are not immediately aware that the path via previous- common parent are not immediately aware that the path via previous-
leaf is obsolete, and a stale route may exist for a while. The leaf is obsolete, and a stale route may exist for a while. The
common parent needs to select the freshest route advertisement in common parent needs to select the freshest route advertisement in
order to install the correct route via the next-leaf. This requires order to install the correct route via the next-leaf. This requires
that the fabric determines the sequence of the movements of the that the fabric determines the sequence of the movements of the
mobile node. mobile node.
On the one hand, a classical sequence counter provides a total order On the one hand, a classical sequence counter provides a total order
for a while but it will eventually wrap. On the other hand, a for a while but it will eventually wrap. On the other hand, a
timestamp provides a permanent order but it may miss a movement that timestamp provides a permanent order but it may miss a movement that
happens too quickly vs. the granularity of the timing information. happens too quickly vs. the granularity of the timing information.
It is not envisioned in the short term that the average fabric It is not envisioned in the short term that the average fabric
supports a Precision Time Protocol, and the precision that may be supports a Precision Time Protocol [IEEEstd1588], and the precision
available with the Network Time Protocol [RFC5905], in the order of that may be available with the Network Time Protocol [RFC5905], in
100 to 200ms, may not be necessarily enough to cover, e.g., the fast the order of 100 to 200ms, may not be necessarily enough to cover,
mobility of a Virtual Machine. e.g., the fast mobility of a Virtual Machine.
Section 4.3.3. "Mobility" of [RIFT] specifies a hybrid method that Section 4.3.3. "Mobility" of [RIFT] specifies a hybrid method that
combines a sequence counter from the mobile node and a timestamp from combines a sequence counter from the mobile node and a timestamp from
the network taken at the leaf when the route is injected. If the the network taken at the leaf when the route is injected. If the
timestamps of the concurrent advertisements are comparable (i.e., timestamps of the concurrent advertisements are comparable (i.e.,
more distant than the precision of the timing protocol), then the more distant than the precision of the timing protocol), then the
timestamp alone is used to determine the relative freshness of the timestamp alone is used to determine the relative freshness of the
routes. Otherwise, the sequence counter from the mobile node, if routes. Otherwise, the sequence counter from the mobile node, if
available, is used. One caveat is that the sequence counter must not available, is used. One caveat is that the sequence counter must not
wrap within the precision of the timing protocol. Another is that wrap within the precision of the timing protocol. Another is that
the mobile node may not even provide a sequence counter, in which the mobile node may not even provide a sequence counter, in which
case the mobility itself must be slower than the precision of the case the mobility itself must be slower than the precision of the
timing. timing.
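The hybrid comparison described above can be sketched as follows. This is a minimal illustration and not the encoding or algorithm from [RIFT]: the class and function names are hypothetical, timestamps are plain seconds, and the sequence counter is compared as an ordinary integer on the assumption (stated in the text) that it does not wrap within the timing precision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Advertisement:
    leaf: str
    timestamp: float          # taken at the leaf when the route is injected
    sequence: Optional[int]   # from the mobile node; None if not provided


def fresher(a, b, precision):
    """Return the fresher advertisement, or None if they cannot be ordered."""
    # Timestamps are comparable only if they differ by more than the
    # precision of the timing protocol.
    if abs(a.timestamp - b.timestamp) > precision:
        return a if a.timestamp > b.timestamp else b
    # Otherwise fall back to the sequence counter, if both carry one.
    if a.sequence is not None and b.sequence is not None:
        if a.sequence != b.sequence:
            return a if a.sequence > b.sequence else b
    return None


old = Advertisement("previous-leaf", timestamp=10.00, sequence=7)
new = Advertisement("next-leaf", timestamp=10.05, sequence=8)
```

With a precision of 0.2 s the two timestamps above are not comparable, so the sequence counter decides in favor of the next-leaf. Without a sequence counter the same pair cannot be ordered, which is the second caveat in the text.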
Mobility must not be confused with Anycast. In both cases, the same Mobility must not be confused with anycast. In both cases, the same
address is injected in RIFT at different leaves. In the case of address is injected in RIFT at different leaves. In the case of
mobility, only the freshest route must be conserved, since the mobile mobility, only the freshest route must be conserved, since the mobile
node changed its point of attachment from one leaf to the next. In the node changed its point of attachment from one leaf to the next. In the
case of anycast, the node may be either multihomed (attached to case of anycast, the node may be either multihomed (attached to
multiple leaves in parallel) or reachable beyond the fabric via multiple leaves in parallel) or reachable beyond the fabric via
multiple routes that are redistributed to different leaves; either multiple routes that are redistributed to different leaves; either
way, in the case of anycast, the multiple routes are equally valid way, in the case of anycast, the multiple routes are equally valid
and should be conserved. Without further information from the and should be conserved. Without further information from the
redistributed routing protocol, it is impossible to sort out a redistributed routing protocol, it is impossible to sort out a
movement from a redistribution that happens asynchronously on movement from a redistribution that happens asynchronously on
skipping to change at page 20, line 50 skipping to change at page 20, line 50
advertised within the timing precision, which is typically the case advertised within the timing precision, which is typically the case
with a low-precision timing and a multihomed node. Beyond that time with a low-precision timing and a multihomed node. Beyond that time
interval, RIFT interprets the lag as a mobility and only the freshest interval, RIFT interprets the lag as a mobility and only the freshest
route is retained. route is retained.
When using IPv6 [RFC8200], RIFT suggests leveraging "Registration When using IPv6 [RFC8200], RIFT suggests leveraging "Registration
Extensions for IPv6 over Low-Power Wireless Personal Area Network Extensions for IPv6 over Low-Power Wireless Personal Area Network
(6LoWPAN) Neighbor Discovery (ND)" [RFC8505] as the IPv6 ND (6LoWPAN) Neighbor Discovery (ND)" [RFC8505] as the IPv6 ND
interaction between the mobile node and the leaf. This provides not interaction between the mobile node and the leaf. This provides not
only a sequence counter but also a lifetime and a security token that only a sequence counter but also a lifetime and a security token that
may be used to protect the ownership of an address. When using may be used to protect the ownership of an address [RFC8928]. When
[RFC8505], the parallel registration of an anycast address to using [RFC8505], the parallel registration of an anycast address to
multiple leaves is done with the same sequence counter, whereas the multiple leaves is done with the same sequence counter, whereas the
sequence counter is incremented when the point of attachment sequence counter is incremented when the point of attachment
changes. This way, it is possible to differentiate a mobile node changes. This way, it is possible to differentiate a mobile node
from a multihomed node, even when the mobility happens within the from a multihomed node, even when the mobility happens within the
timing precision. It is also possible for a mobile node to be timing precision. It is also possible for a mobile node to be
multihomed as well, e.g., to change only one of its points of multihomed as well, e.g., to change only one of its points of
attachment. attachment.
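The sequence-counter rule in the paragraph above can be sketched as follows. This is only an illustration, not RIFT or [RFC8505] code: the `Registration` record and the `select_routes` function are hypothetical names, the addresses are documentation examples, and the counter is compared as a plain integer, whereas a real counter wraps and needs a wrap-aware comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    leaf: str     # leaf that received the registration (hypothetical name)
    address: str  # registered IPv6 address
    seq: int      # sequence counter carried in the registration


def select_routes(registrations):
    """Keep, per address, only the leaves that hold the freshest counter."""
    best = {}
    for reg in registrations:
        current = best.get(reg.address)
        if current is None or reg.seq > current[0]:
            # A higher counter means the node moved: drop older attachments.
            best[reg.address] = (reg.seq, {reg.leaf})
        elif reg.seq == current[0]:
            # Same counter on another leaf: parallel (multihomed) registration.
            current[1].add(reg.leaf)
    return {addr: sorted(leaves) for addr, (_, leaves) in best.items()}


# Multihomed or anycast: same counter on both leaves, both routes are kept.
multihomed = [Registration("leaf1", "2001:db8::a", 5),
              Registration("leaf2", "2001:db8::a", 5)]
# Mobility: the counter was incremented at the new point of attachment.
moved = [Registration("leaf1", "2001:db8::b", 5),
         Registration("leaf3", "2001:db8::b", 6)]

print(select_routes(multihomed))
print(select_routes(moved))
```

Because the decision depends only on the counter, the sketch gives the same answer whether or not the two registrations arrive within the timing precision.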
4.8. IPv4 over IPv6 4.8. IPv4 over IPv6
RIFT allows advertising IPv4 prefixes over an IPv6 RIFT network. IPv6 RIFT allows advertising IPv4 prefixes over an IPv6 RIFT network. IPv6
AF configures via the usual ND mechanisms and then V4 can use V6 Address Family (AF) configures via the usual Neighbor Discovery (ND)
nexthops analogous to RFC5549. It is expected that the whole fabric mechanisms and then V4 can use V6 nexthops analogous to [RFC5549].
supports the same type of forwarding of address families on all the It is expected that the whole fabric supports the same type of
links. RIFT provides an indication whether a node is v4 forwarding forwarding of address families on all the links. RIFT provides an
capable and implementations are possible where different routing indication whether a node is v4 forwarding capable and
tables are computed per address family as long as the computation implementations are possible where different routing tables are
remains loop-free. computed per address family as long as the computation remains loop-
free.
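As a rough illustration of an IPv4 prefix that resolves through an IPv6 next hop, the sketch below models a single forwarding entry. The `Route` record, the `install` helper, the interface name, and the use of a link-local next hop are assumptions made for the example; they are not taken from the RIFT specification.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network    # IPv4 destination advertised by a leaf
    next_hop: ipaddress.IPv6Address  # IPv6 address of the adjacent node
    interface: str                   # scopes a link-local next hop


def install(fib, prefix, next_hop, interface):
    """Add an IPv4 route whose next hop is an IPv6 address."""
    route = Route(ipaddress.ip_network(prefix),
                  ipaddress.ip_address(next_hop),
                  interface)
    if route.prefix.version != 4 or route.next_hop.version != 6:
        raise ValueError("expected an IPv4 prefix with an IPv6 next hop")
    fib[route.prefix] = route
    return route


fib = {}
route = install(fib, "192.0.2.0/24", "fe80::1", "eth0")
print(route.prefix, "via", route.next_hop, "dev", route.interface)
```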
+-----+ +-----+ +-----+ +-----+
+---+---+ | ToF | | ToF | +---+---+ | ToF | | ToF |
^ +--+--+ +-----+ ^ +--+--+ +-----+
| | | | | | | | | |
| | +-------------+ | | | +-------------+ |
| | +--------+ | | | | +--------+ | |
| | | | | + | | | |
V6 +-----+ +-+---+ V6 +-----+ +-+---+
Forwarding |SPINE| |SPINE| Forwarding |Spine| |Spine|
| +--+--+ +-----+ + +--+--+ +-----+
| | | | | | | | | |
| | +-------------+ | | | +-------------+ |
| | +--------+ | | | | +--------+ | |
| | | | | | | | | |
v +-----+ +-+---+ v +-----+ +-+---+
+---+---+ |LEAF | | LEAF| +---+---+ |Leaf | | Leaf|
+--+--+ +--+--+ +--+--+ +--+--+
| | | |
IPv4 prefixes| |IPv4 prefixes IPv4 prefixes| |IPv4 prefixes
| | | |
+---+----+ +---+----+ +---+----+ +---+----+
| V4 | | V4 | | V4 | | V4 |
| subnet | | subnet | | subnet | | subnet |
+--------+ +--------+ +--------+ +--------+
Figure 8: IPv4 over IPv6 Figure 8: IPv4 over IPv6
4.9. In-Band Reachability of Nodes 4.9. In-Band Reachability of Nodes
RIFT does not require that nodes of the fabric have reachable RIFT does not require that nodes of the fabric have reachable
addresses, but there may be operational reasons to reach the internal addresses, but there may be operational reasons to reach the internal
nodes. Figure 9 shows an example where the NMS attaches to LEAF1. nodes. Figure 9 shows an example where the network management
station (NMS) attaches to Leaf1.
+-------+ +-------+ +-------+ +-------+
| ToF1 | | ToF2 | | ToF1 | | ToF2 |
++---- ++ ++-----++ ++---- ++ ++-----++
| | | | | | | |
| +----------+ | | +----------+ |
| +--------+ | | | +--------+ | |
| | | | | | | |
++-----++ +--+---++ ++-----++ +--+---++
|SPINE1 | |SPINE2 | |Spine1 | |Spine2 |
++-----++ ++-----++ ++-----++ ++-----++
| | | | | | | |
| +----------+ | | +----------+ |
| +--------+ | | | +--------+ | |
| | | | | | | |
++-----++ +--+---++ ++-----++ +--+---++
| LEAF1 | | LEAF2 | | Leaf1 | | Leaf2 |
+---+---+ +-------+ +---+---+ +-------+
| |
|NMS |NMS
Figure 9: In-Band reachability of node Figure 9: In-Band reachability of node
If the NMS wants to access LEAF2, it simply works because the loopback If the NMS wants to access Leaf2, it simply works because the loopback
address of LEAF2 is flooded in its Prefix North TIE. address of Leaf2 is flooded in its Prefix North TIE.
If the NMS wants to access SPINE2, it simply works too, because a spine If the NMS wants to access Spine2, it simply works too, because a spine
node always advertises its loopback address in the Prefix North TIE. node always advertises its loopback address in the Prefix North TIE.
The NMS may reach SPINE2 via LEAF1-SPINE2 or LEAF1-SPINE1-ToF1/ The NMS may reach Spine2 via Leaf1-Spine2 or Leaf1-Spine1-ToF1/
ToF2-SPINE2. ToF2-Spine2.
If NMS wants to access ToF2, ToF2's loopback address needs to be If NMS wants to access ToF2, ToF2's loopback address needs to be
injected into its Prefix South TIE. Otherwise, the traffic from NMS injected into its Prefix South TIE. This TIE must be seen by all
may be sent to ToF1. nodes at the level below - the spine nodes in Figure 9 - that must
form a ceiling for all the traffic coming from below (south).
Otherwise, the traffic from NMS may follow the default route to the
wrong ToF Node, e.g., ToF1.
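The effect described in the paragraph above can be illustrated with a longest-prefix-match lookup at a spine node. This is a minimal sketch under stated assumptions: the `lookup` helper, the loopback address, and the dictionary-based FIB are hypothetical, and RIFT's actual route computation is not modeled.

```python
import ipaddress


def lookup(fib, destination):
    """Return the next hops of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [net for net in fib if dest in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]


default = ipaddress.ip_network("::/0")
tof2_loopback = ipaddress.ip_network("2001:db8::2/128")  # hypothetical address

# Spine FIB holding only the default route learned from both ToF nodes.
fib_default_only = {default: ["ToF1", "ToF2"]}
# Spine FIB once ToF2's loopback is also advertised southbound.
fib_with_loopback = {default: ["ToF1", "ToF2"], tof2_loopback: ["ToF2"]}

print(lookup(fib_default_only, "2001:db8::2"))
print(lookup(fib_with_loopback, "2001:db8::2"))
```

With only the default route, the spine may pick either ToF node; the more specific loopback route pins the traffic to ToF2.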
And in case of failure between ToF2 and spine nodes, ToF2's loopback In a fully connected ToF, in case of failure between ToF2 and spine
address must be sent all the way down to the leaves. nodes, ToF2's loopback address must be disaggregated recursively all
the way to the leaves.
In a partitioned ToF, a ToF node is only reachable within its Plane,
and the disaggregation to the leaves is also required. A possible
alternative is to use the ring that interconnects the ToF nodes to
transmit packets between them for their loopback addresses only. The
idea is that this is mostly control traffic and should not alter the
load balancing properties of the fabric.
4.10. Dual Homing Servers 4.10. Dual Homing Servers
Each RIFT node may operate in zero touch provisioning (ZTP) mode. It Each RIFT node may operate in Zero Touch Provisioning (ZTP) mode. It
has no configuration (unless it is a Top-of-Fabric at the top of the has no configuration (unless it is a Top-of-Fabric at the top of the
topology or it must operate in the topology as leaf and/or support topology or it must operate in the topology as leaf and/or support
leaf-2-leaf procedures) and it will fully configure itself after leaf-2-leaf procedures) and it will fully configure itself after
being attached to the topology. being attached to the topology.
+---+ +---+ +---+ +---+ +---+ +---+
|ToF| |ToF| |ToF| |ToF| |ToF| |ToF| ToF
+---+ +---+ +---+ +---+ +---+ +---+
| | | | | | | | | | | |
| +----------------+ | | | +----------------+ | |
| | | | | | | | | | | |
| +----------------+ | | +----------------+ |
| | | | | | | | | | | |
+----------+--+ +--+----------+ +----------+--+ +--+----------+
| Spine|ToR1 | | Spine|ToR2 | | ToR1 | | ToR2 | Spine
+--+------+---+ +--+-------+--+ +--+------+---+ +--+-------+--+
+---+ | | | | | | +---+ +---+ | | | | | | +---+
| | | | | | | | | | | | | | | |
| +-----------------+ | | | | +-----------------+ | | |
| | | +-------------+ | | | | | +-------------+ | |
+ | + | | |-----------------+ | + | + | | |-----------------+ |
X | X | +--------x-----+ | X | X | X | +--------x-----+ | X |
+ | + | | | + | + | + | | | + |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
| | | | | | | | | | | | | | | |
+---+ +---+ ...............+---+ +---+ +---+ +---+ ...............+---+ +---+
SV(1) SV(2) SV(n+1) SV(n) SV(1) SV(2) SV(n+1) SV(n) Leaf
Figure 10: Dual-homing servers Figure 10: Dual-homing servers
In the single plane, the worst condition is disaggregation of every In the single plane, the worst condition is disaggregation of every
other server at the same level. Suppose the links from ToR1 to all other server at the same level. Suppose the links from ToR1 (Top of
the leaves become unavailable. All the servers' routes are Rack) to all the leaves become unavailable. All the servers'
disaggregated and the FIB of the servers will be expanded with n-1 routes are disaggregated and the FIB of the servers will be expanded
more specific routes. with n-1 more specific routes.
Sometimes, people may prefer to disaggregate from ToR to servers from Sometimes, people may prefer to disaggregate from ToR to servers from
the start, i.e., the servers have a few tens of routes in the FIB from the start, i.e., the servers have a few tens of routes in the FIB from
the start besides the default routes, to avoid breakages at rack level. the start besides the default routes, to avoid breakages at rack level.
Full disaggregation of the fabric could be achieved by configuration Full disaggregation of the fabric could be achieved by configuration
supported by RIFT. supported by RIFT.
4.11. Fabric With A Controller 4.11. Fabric With A Controller
There are many different ways to deploy the controller. One There are many different ways to deploy the controller. One
skipping to change at page 24, line 24 skipping to change at page 24, line 30
| | | |
| | | |
+----++ ++----+ +----++ ++----+
------- | ToF | | ToF | ------- | ToF | | ToF |
| +--+--+ +-----+ | +--+--+ +-----+
| | | | | | | | | |
| | +-------------+ | | | +-------------+ |
| | +--------+ | | | | +--------+ | |
| | | | | | | | | |
+-----+ +-+---+ +-----+ +-+---+
RIFT domain |SPINE| |SPINE| RIFT domain |Spine| |Spine|
+--+--+ +-----+ +--+--+ +-----+
| | | | | | | | | |
| | +-------------+ | | | +-------------+ |
| | +--------+ | | | | +--------+ | |
| | | | | | | | | |
| +-----+ +-+---+ | +-----+ +-+---+
------- |LEAF | | LEAF| ------- |Leaf | | Leaf|
+-----+ +-----+ +-----+ +-----+
Figure 11: Fabric with a controller Figure 11: Fabric with a controller
4.11.1. Controller Attached to ToFs 4.11.1. Controller Attached to ToFs
If a controller is attaching to the RIFT domain from ToF, it usually If a controller is attaching to the RIFT domain from ToF, it usually
uses dual-homing connections. The loopback prefix of the controller uses dual-homing connections. The loopback prefix of the controller
should be advertised down by the ToF and spine to leaves. If the should be advertised down by the ToF and spine to leaves. If the
controller loses link to ToF, make sure the ToF withdraw the prefix controller loses link to ToF, make sure the ToF withdraw the prefix
skipping to change at page 26, line 48 skipping to change at page 26, line 48
+ traffic + traffic
Figure 13: Anycast Figure 13: Anycast
If the traffic comes from ToF to Leaf111 or Leaf121, which both have the If the traffic comes from ToF to Leaf111 or Leaf121, which both have the
anycast prefix PrefixA, RIFT can deal with this case well. But if the anycast prefix PrefixA, RIFT can deal with this case well. But if the
traffic comes from Leaf122, it arrives at Spine21 or Spine22 at level 1. traffic comes from Leaf122, it arrives at Spine21 or Spine22 at level 1.
Spine21 and Spine22 do not know about the other PrefixA attached to Spine21 and Spine22 do not know about the other PrefixA attached to
Leaf111, so the traffic will always get to Leaf121 and never to Leaf111. Leaf111, so the traffic will always get to Leaf121 and never to Leaf111.
If the intention is that the traffic should be offloaded to If the intention is that the traffic should be offloaded to
Leaf111, then use policy guided prefixes [PGP reference]. Leaf111, then use policy guided prefixes defined in "Routing in Fat
Trees" [RIFT].
4.15. IoT Applicability
The design of RIFT inherits from RPL [RFC6550] the anisotropic design
of a default route upwards (northwards); it also inherits the
capability to inject external host routes at the Leaf level using
Wireless ND (WiND) [RFC8505][RFC8928] between a RIFT-agnostic host
and a RIFT router. Both the RPL and the RIFT protocols are meant for
large scale, and WiND enables device mobility at the edge the same
way in both cases.
The main difference between RIFT and RPL is that with RPL, there's a
single Root, whereas RIFT has many ToF nodes. This adds huge
capabilities for leaf-2-leaf ECMP paths, but additional complexity
with the need to disaggregate. Also RIFT uses Link State flooding
northwards, and is not designed for low-power operation.
Still, nothing prevents the IP devices connected at the Leaf from being
IoT (Internet of Things) devices, which typically expose their
address using WiND, an upgrade from 6LoWPAN ND [RFC6775].
A network that serves high speed/high power IoT devices should
typically provide deterministic capabilities for applications such as
high speed control loops or movement detection. The Fat Tree is
highly reliable, and in normal conditions provides an equivalent
multipath operation; but the ECMP doesn't provide hard guarantees for
either delivery or latency. As long as the fabric is non-blocking
the result is the same; but there can be load imbalances resulting in
incast and possibly congestion loss that will prevent the delivery
within bounded latency.
This could be alleviated with Packet Replication, Elimination, and
Ordering Functions (PREOF) [RFC8655] leaf-2-leaf but PREOF is hard to provide
at the scale of all flows, and the replication may increase the
probability of the overload that it attempts to solve.
Note that the load balancing is not RIFT's problem, but it is key to
serve IoT adequately.
5. Security Considerations
This document presents applicability of RIFT. As such, it does not
introduce any security considerations. However, a number of security
concerns are discussed in [RIFT].
5. Acknowledgements
6. Contributors 6. Contributors
The following people (listed in alphabetical order) contributed The following people (listed in alphabetical order) contributed
significantly to the content of this document and should be significantly to the content of this document and should be
considered co-authors: considered co-authors:
Tony Przygienda Tony Przygienda
Juniper Networks Juniper Networks
skipping to change at page 28, line 11 skipping to change at page 29, line 11
Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)", Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
RFC 5357, DOI 10.17487/RFC5357, October 2008, RFC 5357, DOI 10.17487/RFC5357, October 2008,
<https://www.rfc-editor.org/info/rfc5357>. <https://www.rfc-editor.org/info/rfc5357>.
[RFC7130] Bhatia, M., Ed., Chen, M., Ed., Boutros, S., Ed., [RFC7130] Bhatia, M., Ed., Chen, M., Ed., Boutros, S., Ed.,
Binderberger, M., Ed., and J. Haas, Ed., "Bidirectional Binderberger, M., Ed., and J. Haas, Ed., "Bidirectional
Forwarding Detection (BFD) on Link Aggregation Group (LAG) Forwarding Detection (BFD) on Link Aggregation Group (LAG)
Interfaces", RFC 7130, DOI 10.17487/RFC7130, February Interfaces", RFC 7130, DOI 10.17487/RFC7130, February
2014, <https://www.rfc-editor.org/info/rfc7130>. 2014, <https://www.rfc-editor.org/info/rfc7130>.
[RFC5549] Le Faucheur, F. and E. Rosen, "Advertising IPv4 Network
Layer Reachability Information with an IPv6 Next Hop",
RFC 5549, DOI 10.17487/RFC5549, May 2009,
<https://www.rfc-editor.org/info/rfc5549>.
[RFC6550] Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
Low-Power and Lossy Networks", RFC 6550,
DOI 10.17487/RFC6550, March 2012,
<https://www.rfc-editor.org/info/rfc6550>.
[RFC6775] Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
Bormann, "Neighbor Discovery Optimization for IPv6 over
Low-Power Wireless Personal Area Networks (6LoWPANs)",
RFC 6775, DOI 10.17487/RFC6775, November 2012,
<https://www.rfc-editor.org/info/rfc6775>.
[RFC8655] Finn, N., Thubert, P., Varga, B., and J. Farkas,
"Deterministic Networking Architecture", RFC 8655,
DOI 10.17487/RFC8655, October 2019,
<https://www.rfc-editor.org/info/rfc8655>.
[RIFT] Przygienda, T., Sharma, A., Thubert, P., Rijsman, B., and [RIFT] Przygienda, T., Sharma, A., Thubert, P., Rijsman, B., and
D. Afanasiev, "RIFT: Routing in Fat Trees", Work in D. Afanasiev, "RIFT: Routing in Fat Trees", Work in
Progress, Internet-Draft, draft-ietf-rift-rift-12, 26 May Progress, Internet-Draft, draft-ietf-rift-rift-12, 26 May
2020, 2020,
<https://tools.ietf.org/html/draft-ietf-rift-rift-12>. <https://tools.ietf.org/html/draft-ietf-rift-rift-12>.
[I-D.white-distoptflood] [I-D.white-distoptflood]
White, R., Hegde, S., and S. Zandi, "IS-IS Optimal White, R., Hegde, S., and S. Zandi, "IS-IS Optimal
Distributed Flooding for Dense Topologies", Work in Distributed Flooding for Dense Topologies", Work in
Progress, Internet-Draft, draft-white-distoptflood-04, 27 Progress, Internet-Draft, draft-white-distoptflood-04, 27
July 2020, July 2020,
<https://tools.ietf.org/html/draft-white-distoptflood-04>. <https://tools.ietf.org/html/draft-white-distoptflood-04>.
8. Informative References 8. Informative References
[IEEEstd1588]
IEEE standard for Information Technology, "IEEE Standard
for a Precision Clock Synchronization Protocol for
Networked Measurement and Control Systems",
<https://standards.ieee.org/standard/1588-2019.html>.
[CLOS] Yuan, X., "On Nonblocking Folded-Clos Networks in Computer
Communication Environments", IEEE International Parallel &
Distributed Processing Symposium, 2011.
[FATTREE] Leiserson, C. E., "Fat-Trees: Universal Networks for
Hardware-Efficient Supercomputing", 1985.
[RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
"Network Time Protocol Version 4: Protocol and Algorithms "Network Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
<https://www.rfc-editor.org/info/rfc5905>. <https://www.rfc-editor.org/info/rfc5905>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", STD 86, RFC 8200, (IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017, DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>. <https://www.rfc-editor.org/info/rfc8200>.
[RFC8505] Thubert, P., Ed., Nordmark, E., Chakrabarti, S., and C. [RFC8505] Thubert, P., Ed., Nordmark, E., Chakrabarti, S., and C.
Perkins, "Registration Extensions for IPv6 over Low-Power Perkins, "Registration Extensions for IPv6 over Low-Power
Wireless Personal Area Network (6LoWPAN) Neighbor Wireless Personal Area Network (6LoWPAN) Neighbor
Discovery", RFC 8505, DOI 10.17487/RFC8505, November 2018, Discovery", RFC 8505, DOI 10.17487/RFC8505, November 2018,
<https://www.rfc-editor.org/info/rfc8505>. <https://www.rfc-editor.org/info/rfc8505>.
[RFC8928] Thubert, P., Ed., Sarikaya, B., Sethi, M., and R. Struik,
"Address-Protected Neighbor Discovery for Low-Power and
Lossy Networks", RFC 8928, DOI 10.17487/RFC8928, November
2020, <https://www.rfc-editor.org/info/rfc8928>.
Authors' Addresses Authors' Addresses
Yuehua Wei (editor) Yuehua Wei (editor)
ZTE Corporation ZTE Corporation
No.50, Software Avenue No.50, Software Avenue
Nanjing Nanjing
210012 210012
China China
Email: wei.yuehua@zte.com.cn Email: wei.yuehua@zte.com.cn
 End of changes. 104 change blocks. 
264 lines changed or deleted 367 lines changed or added
