| < draft-bookham-rtgwg-nfix-arch-00.txt | draft-bookham-rtgwg-nfix-arch-01.txt > | |||
|---|---|---|---|---|
| RTG Working Group C. Bookham, Ed. | RTG Working Group C. Bookham, Ed. | |||
| Internet-Draft A. Stone | Internet-Draft A. Stone | |||
| Intended status: Informational Nokia | Intended status: Informational Nokia | |||
| Expires: September 10, 2020 J. Tantsura | Expires: December 26, 2020 J. Tantsura | |||
| Apstra | Apstra | |||
| M. Durrani | M. Durrani | |||
| Equinix Inc | Equinix Inc | |||
| March 9, 2020 | B. Decraene | |||
| | Orange | |||
| | June 24, 2020 | |||
| An Architecture for Network Function Interconnect | An Architecture for Network Function Interconnect | |||
| draft-bookham-rtgwg-nfix-arch-00 | draft-bookham-rtgwg-nfix-arch-01 | |||
| Abstract | Abstract | |||
| The emergence of technologies such as 5G, the Internet of Things | The emergence of technologies such as 5G, the Internet of Things | |||
| (IoT), and Industry 4.0, coupled with the move towards network | (IoT), and Industry 4.0, coupled with the move towards network | |||
| functionvirtualization, means that the service requirements demanded | function virtualization, means that the service requirements demanded | |||
| from networks are changing. This document describes an architecture | from networks are changing. This document describes an architecture | |||
| for a Network Function Interconnect (NFIX) that allows for | for a Network Function Interconnect (NFIX) that allows for | |||
| interworking of physical and virtual network functions in a unified | interworking of physical and virtual network functions in a unified | |||
| and scalable manner across wide-area network and data center domains | and scalable manner across wide-area network and data center domains | |||
| while maintaining the ability to deliver against SLAs. | while maintaining the ability to deliver against SLAs. | |||
| Requirements Language | Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| skipping to change at page 1, line 48 ¶ | skipping to change at page 2, line 4 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on September 10, 2020. | This Internet-Draft will expire on December 26, 2020. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 30 ¶ | skipping to change at page 2, line 31 ¶ | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 3. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 3. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 6 | 4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 5. Theory of Operation . . . . . . . . . . . . . . . . . . . . . 7 | 5. Theory of Operation . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 5.1. VNF Assumptions . . . . . . . . . . . . . . . . . . . . . 7 | 5.1. VNF Assumptions . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 5.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . 8 | 5.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 5.3. Use of a Centralized Controller . . . . . . . . . . . . . 9 | 5.3. Use of a Centralized Controller . . . . . . . . . . . . . 9 | |||
| 5.4. Transport Layer . . . . . . . . . . . . . . . . . . . . . 11 | 5.4. Routing and LSP Underlay . . . . . . . . . . . . . . . . 11 | |||
| 5.4.1. Intra-Domain Routing . . . . . . . . . . . . . . . . 11 | 5.4.1. Intra-Domain Routing . . . . . . . . . . . . . . . . 11 | |||
| 5.4.2. Intra-Domain Routing . . . . . . . . . . . . . . . . 11 | 5.4.2. Inter-Domain Routing . . . . . . . . . . . . . . . . 13 | |||
| 5.4.3. Inter-Domain Routing . . . . . . . . . . . . . . . . 12 | 5.4.3. Intra-Domain and Inter-Domain Traffic-Engineering . . 14 | |||
| 5.4.4. Intra-Domain and Inter-Domain Traffic-Engineering . . 13 | 5.5. Service Layer . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 5.5. Service Layer . . . . . . . . . . . . . . . . . . . . . . 15 | 5.6. Service Differentiation . . . . . . . . . . . . . . . . . 19 | |||
| 5.6. Service Differentiation . . . . . . . . . . . . . . . . . 16 | 5.7. Automated Service Activation . . . . . . . . . . . . . . 20 | |||
| 5.7. Automated Service Activation . . . . . . . . . . . . . . 17 | 5.8. Service Function Chaining . . . . . . . . . . . . . . . . 21 | |||
| 5.8. Service Function Chaining . . . . . . . . . . . . . . . . 18 | 5.9. Stability and Availability . . . . . . . . . . . . . . . 23 | |||
| 5.9. Stability and Availability . . . . . . . . . . . . . . . 20 | 5.9.1. IGP Reconvergence . . . . . . . . . . . . . . . . . . 23 | |||
| 5.9.1. IGP Reconvergence . . . . . . . . . . . . . . . . . . 20 | 5.9.2. Data Center Reconvergence . . . . . . . . . . . . . . 23 | |||
| 5.9.2. Data Center Reconvergence . . . . . . . . . . . . . . 21 | 5.9.3. Exchange of Inter-Domain Routes . . . . . . . . . . . 24 | |||
| 5.9.3. Exchange of Inter-Domain Routes . . . . . . . . . . . 21 | 5.9.4. Controller Redundancy . . . . . . . . . . . . . . . . 24 | |||
| 5.9.4. Controller Redundancy . . . . . . . . . . . . . . . . 22 | 5.9.5. Path and Segment Liveliness . . . . . . . . . . . . . 26 | |||
| 5.9.5. Path and Segment Liveliness . . . . . . . . . . . . . 24 | 5.10. Scalability . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
| 5.10. Scalability . . . . . . . . . . . . . . . . . . . . . . . 25 | 5.10.1. Asymmetric Model B for VPN Families . . . . . . . . 30 | |||
| 5.10.1. Asymmetric Model B for VPN Families . . . . . . . . 27 | 6. Illustration of Use . . . . . . . . . . . . . . . . . . . . . 32 | |||
| 6. Illustration of Use . . . . . . . . . . . . . . . . . . . . . 29 | 6.1. Reference Topology . . . . . . . . . . . . . . . . . . . 32 | |||
| 6.1. Reference Topology . . . . . . . . . . . . . . . . . . . 29 | 6.2. PNF to PNF Connectivity . . . . . . . . . . . . . . . . . 34 | |||
| 6.2. PNF to PNF Connectivity . . . . . . . . . . . . . . . . . 31 | 6.3. VNF to PNF Connectivity . . . . . . . . . . . . . . . . . 35 | |||
| 6.3. VNF to PNF Connectivity . . . . . . . . . . . . . . . . . 32 | 6.4. VNF to VNF Connectivity . . . . . . . . . . . . . . . . . 36 | |||
| 6.4. VNF to VNF Connectivity . . . . . . . . . . . . . . . . . 33 | ||||
| 7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 34 | 7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 35 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 38 | |||
| 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 38 | |||
| 10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 35 | 10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
| 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 | 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 36 | 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 12.1. Normative References . . . . . . . . . . . . . . . . . . 36 | 12.1. Normative References . . . . . . . . . . . . . . . . . . 39 | |||
| 12.2. Informative References . . . . . . . . . . . . . . . . . 36 | 12.2. Informative References . . . . . . . . . . . . . . . . . 40 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
| 1. Introduction | 1. Introduction | |||
| With the introduction of technologies such as 5G, the Internet of | With the introduction of technologies such as 5G, the Internet of | |||
| Things (IoT), and Industry 4.0, service requirements are changing. | Things (IoT), and Industry 4.0, service requirements are changing. | |||
| In addition to the ever-increasing demand for more capacity, these | In addition to the ever-increasing demand for more capacity, these | |||
| services have other stringent service requirements that need to be | services have other stringent service requirements that need to be | |||
| met such as ultra-reliable and/or low-latency communication. | met such as ultra-reliable and/or low-latency communication. | |||
| Parallel to this, there is a continued trend to move towards network | Parallel to this, there is a continued trend to move towards network | |||
| skipping to change at page 4, line 26 ¶ | skipping to change at page 4, line 26 ¶ | |||
| wide-area network. | wide-area network. | |||
| o A virtualized network function (VNF) refers to a network device | o A virtualized network function (VNF) refers to a network device | |||
| such as a provider edge (PE) router that is hosted on an | such as a provider edge (PE) router that is hosted on an | |||
| application server. The VNF may be bare-metal in that it consumes | application server. The VNF may be bare-metal in that it consumes | |||
| the entire resources of the server, or it may be one of numerous | the entire resources of the server, or it may be one of numerous | |||
| virtual functions instantiated as a VM or number of containers on | virtual functions instantiated as a VM or number of containers on | |||
| a given server that is controlled by a hypervisor or container | a given server that is controlled by a hypervisor or container | |||
| management platform. | management platform. | |||
| o A Data Center Interconnect (DCI) refers to the network function | o A Data Center Border (DCB) router refers to the network function | |||
| that spans the border between the wide-area and the data center | that spans the border between the wide-area and the data center | |||
| networks, typically interworking the different encapsulation | networks, typically interworking the different encapsulation | |||
| techniques employed within each domain. | techniques employed within each domain. | |||
| o An Interconnect controller is the controller responsible for | o An Interconnect controller is the controller responsible for | |||
| managing the NFIX fabric and services. | managing the NFIX fabric and services. | |||
| o A DC controller is the term used for a controller that resides | o A DC controller is the term used for a controller that resides | |||
| within an SDN-enabled data center and is responsible for the DC | within an SDN-enabled data center and is responsible for the DC | |||
| network(s) | network(s) | |||
| skipping to change at page 5, line 34 ¶ | skipping to change at page 5, line 35 ¶ | |||
| the core. Examples of this include cloud-RAN or Software-Defined | the core. Examples of this include cloud-RAN or Software-Defined | |||
| Access Networks. | Access Networks. | |||
| Historically service providers have architected data centers | Historically service providers have architected data centers | |||
| independently from the wide-area network, creating two independent | independently from the wide-area network, creating two independent | |||
| domains or islands. As VNFs become part of the service landscape the | domains or islands. As VNFs become part of the service landscape the | |||
| service data-path must be extended across the WAN into the data | service data-path must be extended across the WAN into the data | |||
| center infrastructure, but in a manner that still allows operators to | center infrastructure, but in a manner that still allows operators to | |||
| meet deterministic performance requirements. Methods for stitching | meet deterministic performance requirements. Methods for stitching | |||
| WAN and DC infrastructures together with some form of service- | WAN and DC infrastructures together with some form of service- | |||
| interworking at the data center interconnect have been implemented | interworking at the data center border have been implemented and | |||
| and deployed, but this service-interworking approach has several | deployed, but this service-interworking approach has several | |||
| limitations: | limitations: | |||
| o The data center environment typically uses encapsulation | o The data center environment typically uses encapsulation | |||
| techniques such as VXLAN or NVGRE while the WAN typically uses | techniques such as VXLAN or NVGRE while the WAN typically uses | |||
| encapsulation techniques such as MPLS [RFC3031]. Underlying | encapsulation techniques such as MPLS [RFC3031]. Underlying | |||
| optical infrastructure might also need to be programmed. These | optical infrastructure might also need to be programmed. These | |||
| are incompatible and require interworking at the service layer. | are incompatible and require interworking at the service layer. | |||
| o It typically requires heavy-touch service provisioning on the data | o It typically requires heavy-touch service provisioning on the data | |||
| center interconnect. In an end-to-end service, midpoint | center border. In an end-to-end service, midpoint provisioning is | |||
| provisioning is undesirable and should be avoided. | undesirable and should be avoided. | |||
| o Automation is difficult; largely due to the first two points but | o Automation is difficult; largely due to the first two points but | |||
| with additional contributing factors. In the virtualization world | with additional contributing factors. In the virtualization world | |||
| automation is a must-have capability. | automation is a must-have capability. | |||
| o When a service is operating at Layer 3 in a data center with | o When a service is operating at Layer 3 in a data center with | |||
| redundant interconnects the risk of routing loops exists. There | redundant interconnects the risk of routing loops exists. There | |||
| is no inherent loop avoidance mechanism when redistributing routes | is no inherent loop avoidance mechanism when redistributing routes | |||
| between address families so extreme care must be taken. Proposals | between address families so extreme care must be taken. Proposals | |||
| such as the Domain Path (D-PATH) attribute | such as the Domain Path (D-PATH) attribute | |||
| skipping to change at page 7, line 16 ¶ | skipping to change at page 7, line 16 ¶ | |||
| o Provide a solution that allows for optimal end-to-end path | o Provide a solution that allows for optimal end-to-end path | |||
| placement; where optimal not only meets the requirements of the | placement; where optimal not only meets the requirements of the | |||
| path in question but also meets the global network objectives. | path in question but also meets the global network objectives. | |||
| o Support varying types of VNF physical network attachment and | o Support varying types of VNF physical network attachment and | |||
| logical (underlay/overlay) connectivity. | logical (underlay/overlay) connectivity. | |||
| o Facilitate automation of service provision. As such the solution | o Facilitate automation of service provision. As such the solution | |||
| should avoid heavy-touch service provisioning and decapsulation/ | should avoid heavy-touch service provisioning and decapsulation/ | |||
| encapsulation at data center interconnects. | encapsulation at data center border routers. | |||
| o Provide a framework for delivering logical end-to-end networks | o Provide a framework for delivering logical end-to-end networks | |||
| using differentiated logical topologies and/or constraints. | using differentiated logical topologies and/or constraints. | |||
| o Provide a high level of stability; faults in one domain should not | o Provide a high level of stability; faults in one domain should not | |||
| propagate to another domain. | propagate to another domain. | |||
| o Provide a mechanism for homogeneous end-to-end OAM. | o Provide a mechanism for homogeneous end-to-end OAM. | |||
| o Hide/localize instabilities in the different domains that | o Hide/localize instabilities in the different domains that | |||
| skipping to change at page 7, line 50 ¶ | skipping to change at page 7, line 50 ¶ | |||
| This section describes the NFIX architecture including the building | This section describes the NFIX architecture including the building | |||
| blocks and protocol machinery that is used to form the fabric. Where | blocks and protocol machinery that is used to form the fabric. Where | |||
| considered appropriate, rationale is given for selection of an | considered appropriate, rationale is given for selection of an | |||
| architectural component where other seemingly applicable choices | architectural component where other seemingly applicable choices | |||
| could have been made. | could have been made. | |||
| 5.1. VNF Assumptions | 5.1. VNF Assumptions | |||
| For the sake of simplicity, references to VNF are made in a broad | For the sake of simplicity, references to VNF are made in a broad | |||
| sense. The way in which a VNF is instantiated and provided network | sense. Equally, the differences between VNF and Container Network | |||
| connectivity will differ based on environment and VNF capability, but | Function (CNF) are largely immaterial for the purposes of this | |||
| for conciseness this is not explicitly detailed with every reference | document, therefore VNF is used to represent both. The way in which | |||
| to a VNF. Common examples of VNF variants include but are not | a VNF is instantiated and provided network connectivity will differ | |||
| limited to: | based on environment and VNF capability, but for conciseness this is | |||
| | not explicitly detailed with every reference to a VNF. Common | |||
| | examples of VNF variants include but are not limited to: | |||
| o o A VNF that functions as a routing device and has full IP routing | o A VNF that functions as a routing device and has full IP routing | |||
| and MPLS capabilities. It can be connected simultaneously to the | and MPLS capabilities. It can be connected simultaneously to the | |||
| data center fabric underlay and overlay and serves as the NVO | data center fabric underlay and overlay and serves as the NVO | |||
| tunnel endpoint [RFC8014]. Examples of this might be a | tunnel endpoint [RFC8014]. Examples of this might be a | |||
| virtualized PE router, or a virtualized Broadband Network Gateway | virtualized PE router, or a virtualized Broadband Network Gateway | |||
| (BNG). | (BNG). | |||
| o A VNF that functions as a device (host or router) with limited IP | o A VNF that functions as a device (host or router) with limited IP | |||
| routing capability. It does not connect directly to the data | routing capability. It does not connect directly to the data | |||
| center fabric underlay but rather connects to one or more external | center fabric underlay but rather connects to one or more external | |||
| physical or virtual devices that serve as the NVO tunnel | physical or virtual devices that serve as the NVO tunnel | |||
| endpoint(s). It may however have single or multiple connections | endpoint(s). It may however have single or multiple connections | |||
| to the overlay. Examples of this might be a mobile network | to the overlay. Examples of this might be a mobile network | |||
| control or management plane function. | control or management plane function. | |||
| o A VNF that has no routing capability. It is a virtualized | o A VNF that has no routing capability. It is a virtualized | |||
| function hosted within an application server and is managed by a | function hosted within an application server and is managed by a | |||
| hypervisor or container host. The hypervisor/container host acts | hypervisor or container host. The hypervisor/container host acts | |||
| as the NVO endpoint and interfaces to some form of SDN controller | as the NVO endpoint and interfaces to some form of SDN controller | |||
| responsible for programming the forwarding plane of the | responsible for programming the forwarding plane of the | |||
| virtualization host using, for example, OpenFlow. Examples of | virtualization host using, for example, OpenFlow. Examples of | |||
| this might be an Enterprise application server, or a web server. | this might be an Enterprise application server or a web server | |||
| | running as a virtual machine and front-ended by a virtual routing | |||
| | function such as OVS/xVRS/VTF. | |||
| Where considered necessary exceptions to the examples provided above | Where considered necessary exceptions to the examples provided above | |||
| or focus on a particular scenario will be highlighted. | or focus on a particular scenario will be highlighted. | |||
| 5.2. Overview | 5.2. Overview | |||
| The NFIX architecture makes no assumptions about how the network is | The NFIX architecture makes no assumptions about how the network is | |||
| physically composed, nor does it impose any dependencies upon it. It | physically composed, nor does it impose any dependencies upon it. It | |||
| also makes no assumptions about IGP hierarchies. The use of areas/ | also makes no assumptions about IGP hierarchies and the use of areas/ | |||
| levels or discrete IGP instances within the WAN is fully endorsed to | levels or discrete IGP instances within the WAN is fully endorsed to | |||
| enhance scalability and constrain fault propagation. The overall | enhance scalability and constrain fault propagation. This could | |||
| architecture uses the constructs of seamless MPLS as a baseline and | apply for instance to a hierarchical WAN from core to edge or from | |||
| extends upon that. The concept of decomposing the network into | WAN to LAN connections. The overall architecture uses the constructs | |||
| multiple domains is one that has been widely deployed and has been | of seamless MPLS as a baseline and extends upon that. The concept of | |||
| proven to scale in networks with large numbers of nodes. | decomposing the network into multiple domains is one that has been | |||
| | widely deployed and has been proven to scale in networks with large | |||
| | numbers of nodes. | |||
| The proposed architecture uses segment routing (SR) as its preferred | The proposed architecture uses segment routing (SR) as its preferred | |||
| choice of transport. Segment routing is chosen for construction of | choice of transport. Segment routing is chosen for construction of | |||
| end-to-end LSPs given its ability to traffic-engineer through source- | end-to-end LSPs given its ability to traffic-engineer through source- | |||
| routing while concurrently scaling exceptionally well due to its lack | routing while concurrently scaling exceptionally well due to its lack | |||
| of network state other than at the ingress node. This document uses SR | of network state other than at the ingress node. This document uses SR | |||
| instantiated on an MPLS forwarding plane (SR-MPLS), although it does | instantiated on an MPLS forwarding plane (SR-MPLS), although it does | |||
| not preclude the use of SRv6 either now or at some point in the | not preclude the use of SRv6 either now or at some point in the | |||
| future. The rationale for selecting SR-MPLS is simply maturity and | future. The rationale for selecting SR-MPLS is simply maturity and | |||
| more widespread applicability across a potentially broad range of | more widespread applicability across a potentially broad range of | |||
| skipping to change at page 10, line 31 ¶ | skipping to change at page 10, line 37 ¶ | |||
| Reflectors in order to learn topology information. | Reflectors in order to learn topology information. | |||
| Where BGP link-state is used to learn the topology of a data center | Where BGP link-state is used to learn the topology of a data center | |||
| (or any IGP routing domain) the BGP-LS Instance Identifier (Instance- | (or any IGP routing domain) the BGP-LS Instance Identifier (Instance- | |||
| ID) is carried within Node/Link/Prefix NLRI and is used to identify a | ID) is carried within Node/Link/Prefix NLRI and is used to identify a | |||
| given IGP routing domain. Where labeled unicast BGP is used to | given IGP routing domain. Where labeled unicast BGP is used to | |||
| discover the topology of one or more data center domains there is no | discover the topology of one or more data center domains there is no | |||
| equivalent way for the Interconnect controller to achieve a level of | equivalent way for the Interconnect controller to achieve a level of | |||
| routing domain correlation. The controller may learn some splintered | routing domain correlation. The controller may learn some splintered | |||
| connectivity map consisting of 10 leaf switches, four spine switches, | connectivity map consisting of 10 leaf switches, four spine switches, | |||
| and four DCIs, but it needs some form of key to inform it that leaf | and four DCBs, but it needs some form of key to inform it that leaf | |||
| switches 1-5, spine switches 1 and 2, and DCIs 1 and 2 belong to | switches 1-5, spine switches 1 and 2, and DCBs 1 and 2 belong to | |||
| data center 1, while leaf switches 6-10, spine switches 3 and 4, and | data center 1, while leaf switches 6-10, spine switches 3 and 4, and | |||
| DCIs 3 and 4 belong to data center 2. What is needed is a form of | DCBs 3 and 4 belong to data center 2. What is needed is a form of | |||
| 'data center membership identification' to provide this correlation. | 'data center membership identification' to provide this correlation. | |||
| Optionally this could be achieved at BGP level using a standard | Optionally this could be achieved at BGP level using a standard | |||
| community to represent each data center, or it could be done at a | community to represent each data center, or it could be done at a | |||
| more abstract level where for example the DC controller provides the | more abstract level where for example the DC controller provides the | |||
| membership identification to the Interconnect controller through an | membership identification to the Interconnect controller through an | |||
| application programming interface (API). | application programming interface (API). | |||
| Understanding real-time network state is an important part of the | Understanding real-time network state is an important part of the | |||
| Interconnect controller's role, and only with this information is the | Interconnect controller's role, and only with this information is the | |||
| controller able to make informed decisions and take preventive or | controller able to make informed decisions and take preventive or | |||
| corrective actions as necessary. There are numerous methods | corrective actions as necessary. There are numerous methods | |||
| implemented and deployed that allow for harvesting of network state, | implemented and deployed that allow for harvesting of network state, | |||
| including (but not limited to) IPFIX [RFC7011], Netconf/YANG | including (but not limited to) IPFIX [RFC7011], Netconf/YANG | |||
| [RFC6241][RFC6020], streaming telemetry, and the BGP Monitoring | [RFC6241][RFC6020], streaming telemetry, BGP link-state [RFC7752] | |||
| Protocol (BMP) [RFC7854]. | [I-D.ietf-idr-te-lsp-distribution], and the BGP Monitoring Protocol | |||
| | (BMP) [RFC7854]. | |||
| 5.4. Transport Layer | 5.4. Routing and LSP Underlay | |||
| This section describes the mechanisms and protocols that are used to | This section describes the mechanisms and protocols that are used to | |||
| establish end-to-end transport LSPs; where end-to-end refers to VNF- | establish end-to-end LSPs; where end-to-end refers to VNF-to-VNF, | |||
| to-VNF, PNF-to-PNF, or VNF-to-PNF. | PNF-to-PNF, or VNF-to-PNF. | |||
| 5.4.1. Intra-Domain Routing | 5.4.1. Intra-Domain Routing | |||
| This section describes the mechanisms and protocols that are used to | ||||
| establish end-to-end transport LSPs; where end-to-end refers to VNF- | ||||
| to-VNF, PNF-to-PNF, or VNF-to-PNF. | ||||
| 5.4.2. Intra-Domain Routing | ||||
| In a seamless MPLS architecture domains are based on geographic | In a seamless MPLS architecture domains are based on geographic | |||
| dispersion (core, aggregation, access). Within this document a | dispersion (core, aggregation, access). Within this document a | |||
| domain is considered as any entity with a captive topology; be it a | domain is considered as any entity with a captive topology; be it a | |||
| link-state topology or otherwise. Where reference is made to the | link-state topology or otherwise. Where reference is made to the | |||
| wide-area network domain, it refers to one or more domains that | wide-area network domain, it refers to one or more domains that | |||
| constitute the wide-area network domain. | constitute the wide-area network domain. | |||
| This section discusses the basic building blocks required within the | This section discusses the basic building blocks required within the | |||
| wide-area network and the data center, noting from above that the | wide-area network and the data center, noting from above that the | |||
| wide-area network may itself consist of multiple domains. | wide-area network may itself consist of multiple domains. | |||
| 5.4.2.1. Wide-Area Network Domains | 5.4.1.1. Wide-Area Network Domains | |||
| The wide-area network includes all levels of hierarchy (core, | The wide-area network includes all levels of hierarchy (core, | |||
| aggregation, access) that constitute the network's MPLS footprint as | aggregation, access) that constitute the network's MPLS footprint as | |||
| well as the Data Center Interconnects (DCIs). Each domain that | well as the data center border routers. Each domain that constitutes | |||
| constitutes part of the wide-area network runs a link-state interior | part of the wide-area network runs a link-state interior gateway | |||
| gateway protocol (IGP) such as ISIS or OSPF, and each domain may use | protocol (IGP) such as ISIS or OSPF, and each domain may use IGP- | |||
| IGP-inherent hierarchy (OSPF areas, ISIS levels) with an assumption | inherent hierarchy (OSPF areas, ISIS levels) with an assumption that | |||
| that visibility is domain-wide using, for example, L2 to L1 | visibility is domain-wide using, for example, L2 to L1 | |||
| redistribution. Alternatively, or additionally, there may be | redistribution. Alternatively, or additionally, there may be | |||
| multiple domains that are split by using separate and distinct | multiple domains that are split by using separate and distinct | |||
| instances of IGP. There is no requirement for IGP redistribution of | instances of IGP. There is no requirement for IGP redistribution of | |||
| any link or loopback addresses between domains. | any link or loopback addresses between domains. | |||
| Each IGP should be enabled with the relevant extensions for segment | Each IGP should be enabled with the relevant extensions for segment | |||
| routing [RFC8667][RFC8665], and each SR-capable router should | routing [RFC8667][RFC8665], and each SR-capable router should | |||
| advertise a Node-SID for its loopback address, and an Adjacency-SID | advertise a Node-SID for its loopback address, and an Adjacency-SID | |||
| (Adj-SID) for every connected interface (unidirectional adjacency) | (Adj-SID) for every connected interface (unidirectional adjacency) | |||
| belonging to the SR domain. SR Global Blocks (SRGB) can be allocated | belonging to the SR domain. SR Global Blocks (SRGB) can be allocated | |||
| to each domain as deemed appropriate to specific network | to each domain as deemed appropriate to specific network | |||
| requirements. Border routers belonging to multiple domains have an | requirements. Border routers belonging to multiple domains have an | |||
| SRGB for each domain. | SRGB for each domain. | |||
| The default forwarding path for intra-domain transport LSPs that do | The default forwarding path for intra-domain LSPs that do not require | |||
| not require TE is simply an SR LSP containing a single label | TE is simply an SR LSP containing a single label advertised by the | |||
| advertised by the destination as a Node-SID and representing the | destination as a Node-SID and representing the ECMP-aware shortest | |||
| ECMP-aware shortest path to that destination. Intra-domain TE | path to that destination. Intra-domain TE LSPs are constructed as | |||
| transport LSPs are constructed as required by the Interconnect | required by the Interconnect controller. Once a path is calculated | |||
| controller. Once a path is calculated it is advertised as an | it is advertised as an explicit SR Policy | |||
| explicit SR Policy [I-D.ietf-spring-segment-routing-policy] | [I-D.ietf-spring-segment-routing-policy] containing one or more paths | |||
| containing one or more paths expressed as one or more segment-lists. | expressed as one or more segment-lists, which may optionally contain | |||
| An SR Policy is identified through the tuple [headend, color, | binding SIDs if requirements dictate. An SR Policy is identified | |||
| endpoint] and this tuple is used extensively by the Interconnect | through the tuple [headend, color, endpoint] and this tuple is used | |||
| controller to associate services with an underlying SR Policy that | extensively by the Interconnect controller to associate services with | |||
| meets its objectives. | an underlying SR Policy that meets its objectives. | |||
| 5.4.2.2. Data Center Domain | To provide support for ECMP the Entropy Label [RFC6790][RFC8662] | |||
| should be utilized. Entropy Label Capability (ELC) should be | ||||
| advertised into the IGP using the IS-IS Prefix Attributes TLV | ||||
| [I-D.ietf-isis-mpls-elc] or the OSPF Extended Prefix TLV | ||||
| [I-D.ietf-ospf-mpls-elc] coupled with the Node MSD Capability sub-TLV | ||||
| to advertise Entropy Readable Label Depth (ERLD) [RFC8491][RFC8476] | ||||
| and the base MPLS Imposition (BMI). Equally, support for ELC | ||||
| together with the supported ERLD should be signaled in BGP using the | ||||
| BGP Next-Hop Capability [I-D.ietf-idr-next-hop-capability]. Ingress | ||||
| nodes and/or DCBs should ensure sufficient entropy is applied to | ||||
| packets to exercise available ECMP links. | ||||
| 5.4.1.2. Data Center Domain | ||||
| The data center domain includes all fabric switches, network | The data center domain includes all fabric switches, network | |||
| virtualization edge (NVE), and the Data Center Interconnects. The | virtualization edge (NVE), and the data center border routers. The | |||
| data center routing design may align with the framework of [RFC7938] | data center routing design may align with the framework of [RFC7938] | |||
| running eBGP single-hop sessions established over direct point-to- | running eBGP single-hop sessions established over direct point-to- | |||
| point links, or it may use an IGP for dissemination of topology | point links, or it may use an IGP for dissemination of topology | |||
| information. | information. This document focuses on the former, simply because the | |||
| use of an IGP largely makes the data center's behaviour analogous to | ||||
| that of a wide-area network domain. | ||||
| The chosen method of transport or encapsulation within the data | The chosen method of transport or encapsulation within the data | |||
| center for NFIX is SR-MPLS over IP/UDP [RFC8663] or, where possible, | center for NFIX is SR-MPLS over IP/UDP [RFC8663] or, where possible, | |||
| native SR-MPLS. The choice of SR-MPLS over IP/UDP or native SR-MPLS | native SR-MPLS. The choice of SR-MPLS over IP/UDP or native SR-MPLS | |||
| allows for good entropy to maximize the use of equal-cost Clos fabric | allows for good entropy to maximize the use of equal-cost Clos fabric | |||
| links and allows for a lightweight interworking function at the DCI | links. Native SR-MPLS encapsulation provides entropy through use of | |||
| without the requirement for midpoint service provisioning. Loopback | the Entropy Label, and, like the wide-area network, support for ELC | |||
| addresses of network elements within the data center are advertised | together with the supported ERLD should be signaled using the BGP Next- | |||
| using labeled unicast BGP with the addition of SR Prefix SID | Hop Capability attribute. As described in [RFC6790] the ELC is an | |||
| extensions [RFC8669] containing a globally unique and persistent | indication from the egress node of an MPLS tunnel to the ingress node | |||
| of the MPLS tunnel that it is capable of processing an Entropy Label. | ||||
| The BGP Next-Hop Capability is a non-transitive attribute which is | ||||
| modified or deleted when the next-hop is changed to reflect the | ||||
| capabilities of the new next-hop. If we assume that the path of a | ||||
| BGP-signaled LSP transits through multiple ASNs, and/or a single ASN | ||||
| with multiple next-hops, then it is not possible for the ingress node | ||||
| to determine the ELC of the egress node. Without this end-to-end | ||||
| signaling capability the entropy label must only be used when it is | ||||
| explicitly known, through configuration or other means, that the | ||||
| egress node has support for it. Entropy for SR-MPLS over IP/UDP | ||||
| encapsulation uses the source UDP port for IPv4 and the Flow Label | ||||
| for IPv6. Again, the ingress network function should ensure | ||||
| sufficient entropy is applied to exercise available ECMP links. | ||||
| Another significant advantage of the use of native SR-MPLS or SR-MPLS | ||||
| over IP/UDP is that it allows for a lightweight interworking function | ||||
| at the DCB without the requirement for midpoint provisioning; | ||||
| interworking between the data center and the wide-area network | ||||
| domains becomes an MPLS label swap/continue action. | ||||
| Loopback addresses of network elements within the data center are | ||||
| advertised using labeled unicast BGP with the addition of SR Prefix | ||||
| SID extensions [RFC8669] containing a globally unique and persistent | ||||
| Prefix-SID. The data-plane encapsulation of SR-MPLS over IP/UDP or | Prefix-SID. The data-plane encapsulation of SR-MPLS over IP/UDP or | |||
| native SR-MPLS allows network elements within the data center to | native SR-MPLS allows network elements within the data center to | |||
| consume BGP Prefix-SIDs and legitimately use those in the | consume BGP Prefix-SIDs and legitimately use those in the | |||
| encapsulation. | ||||
| 5.4.3. Inter-Domain Routing | 5.4.2. Inter-Domain Routing | |||
| Inter-domain routing is responsible for establishing connectivity | Inter-domain routing is responsible for establishing connectivity | |||
| between any domains that form the wide-area network, and between the | between any domains that form the wide-area network, and between the | |||
| wide-area network and data center domains. It is considered unlikely | wide-area network and data center domains. It is considered unlikely | |||
| that every end-to-end LSP will require a TE path, hence there is a | that every end-to-end LSP will require a TE path, hence there is a | |||
| requirement for a default end-to-end forwarding path. This default | requirement for a default end-to-end forwarding path. This default | |||
| forwarding path may also become the path of last resort in the event | forwarding path may also become the path of last resort in the event | |||
| of a non-recoverable failure of a TE path. Similar to the seamless | of a non-recoverable failure of a TE path. Similar to the seamless | |||
| MPLS architecture this inter-domain MPLS connectivity is realized | MPLS architecture this inter-domain MPLS connectivity is realized | |||
| using labeled unicast BGP [RFC8277] with the addition of SR Prefix | using labeled unicast BGP [RFC8277] with the addition of SR Prefix | |||
| SID extensions. | SID extensions. | |||
| Within each wide-area network domain all service edge routers, DCIs, | Within each wide-area network domain all service edge routers, DCBs, | |||
| and ABRs/ASBRs form part of the labeled BGP mesh, which can be either | and ABRs/ASBRs form part of the labeled BGP mesh, which can be either | |||
| full-mesh, or more likely based on the use of route-reflection. Each | full-mesh, or more likely based on the use of route-reflection. Each | |||
| of these routers advertises its respective loopback addresses into | of these routers advertises its respective loopback addresses into | |||
| labeled BGP together with an MPLS label and a globally unique Prefix- | labeled BGP together with an MPLS label and a globally unique Prefix- | |||
| SID. Routes are advertised between wide-area network domains by | SID. Routes are advertised between wide-area network domains by | |||
| ABRs/ASBRs that impose next-hop-self on advertised routes. The | ABRs/ASBRs that impose next-hop-self on advertised routes. The | |||
| function of imposing next-hop-self for labeled routes means that the | function of imposing next-hop-self for labeled routes means that the | |||
| ABR/ASBR allocates a new label for advertised routes and programs a | ABR/ASBR allocates a new label for advertised routes and programs a | |||
| label-swap entry in the forwarding plane for received and advertised | label-swap entry in the forwarding plane for received and advertised | |||
| routes. In short it becomes part of the forwarding path. | routes. In short it becomes part of the forwarding path. | |||
| DCI routers have labeled BGP sessions towards the wide-area network | DCB routers have labeled BGP sessions towards the wide-area network | |||
| and labeled BGP sessions towards the data center. Routes are | and labeled BGP sessions towards the data center. Routes are | |||
| bidirectionally advertised between the domains subject to policy, | bidirectionally advertised between the domains subject to policy, | |||
| with the DCI imposing itself as next-hop on advertised routes. As | with the DCB imposing itself as next-hop on advertised routes. As | |||
| above, the function of imposing next-hop-self for labeled routes | above, the function of imposing next-hop-self for labeled routes | |||
| implies allocation of a new label for advertised routes and a label- | implies allocation of a new label for advertised routes and a label- | |||
| swap entry being programmed in the forwarding plane for received and | swap entry being programmed in the forwarding plane for received and | |||
| advertised labels. The DCI thereafter becomes the anchor point | advertised labels. The DCB thereafter becomes the anchor point | |||
| between the wide-area network domain and the data center domain. | between the wide-area network domain and the data center domain. | |||
| Within the wide-area network next-hops for labeled unicast routes | Within the wide-area network next-hops for labeled unicast routes | |||
| containing Prefix-SIDs are resolved to SR LSPs, and within the data | containing Prefix-SIDs are resolved to SR LSPs, and within the data | |||
| center domain next-hops for labeled unicast routes containing Prefix- | center domain next-hops for labeled unicast routes containing Prefix- | |||
| SIDs are resolved to SR LSPs or IP/UDP tunnels. This provides end- | SIDs are resolved to SR LSPs or IP/UDP tunnels. This provides end- | |||
| to-end connectivity without a traffic-engineering capability. | to-end connectivity without a traffic-engineering capability. | |||
| 5.4.4. Intra-Domain and Inter-Domain Traffic-Engineering | +---------------+ +----------------+ +---------------+ | |||
| | Data Center | | Wide-Area | | Wide-Area | | ||||
| | +-----+ Domain 1 +-----+ Domain 'n' | | ||||
| | | DCB | | ABR | | | ||||
| | +-----+ +-----+ | | ||||
| | | | | | | | ||||
| +---------------+ +----------------+ +---------------+ | ||||
| <-- SR/SRoUDP --> <---- IGP/SR ----> <--- IGP/SR ----> | ||||
| <--- BGP-LU ---> NHS <--- BGP-LU ---> NHS <--- BGP-LU ---> | ||||
| A capability to traffic-engineer intra- and inter-domain end-to-end | Default Inter-Domain Forwarding Path | |||
| Figure 1 | ||||
| 5.4.3. Intra-Domain and Inter-Domain Traffic-Engineering | ||||
| The capability to traffic-engineer intra- and inter-domain end-to-end | ||||
| paths is considered a key requirement in order to meet the service | paths is considered a key requirement in order to meet the service | |||
| objectives previously outlined. To achieve optimal end-to-end path | objectives previously outlined. To achieve optimal end-to-end path | |||
| placement the key components to be considered are path calculation, | placement the key components to be considered are path calculation, | |||
| path activation, and FEC-to-path binding procedures. | path activation, and FEC-to-path binding procedures. | |||
| In the NFIX architecture end-to-end path calculation is performed by | In the NFIX architecture end-to-end path calculation is performed by | |||
| the Interconnect controller. The mechanics of how the objectives of | the Interconnect controller. The mechanics of how the objectives of | |||
| each path are calculated are beyond the scope of this document. Once a | each path are calculated are beyond the scope of this document. Once a | |||
| path is calculated based upon its objectives and constraints, the | path is calculated based upon its objectives and constraints, the | |||
| path is advertised from the controller to the LSP headend as an | path is advertised from the controller to the LSP headend as an | |||
| skipping to change at page 14, line 18 ¶ | skipping to change at page 15, line 24 ¶ | |||
| instantiating BSID anchors as necessary at path midpoints when | instantiating BSID anchors as necessary at path midpoints when | |||
| calculating and activating a path. The use of BSID is considered | calculating and activating a path. The use of BSID is considered | |||
| fundamental to segment routing as described in | fundamental to segment routing as described in | |||
| [I-D.filsfils-spring-sr-policy-considerations]. It provides opacity | [I-D.filsfils-spring-sr-policy-considerations]. It provides opacity | |||
| between domains, ensuring that any segment churn is constrained to a | between domains, ensuring that any segment churn is constrained to a | |||
| single domain. It also reduces the number of segments/labels that | single domain. It also reduces the number of segments/labels that | |||
| the headend needs to impose, which is particularly important given | the headend needs to impose, which is particularly important given | |||
| that network elements within a data center generally have limited | that network elements within a data center generally have limited | |||
| label imposition capabilities. In the context of the NFIX | label imposition capabilities. In the context of the NFIX | |||
| architecture it is also the vehicle that allows for removal of heavy | architecture it is also the vehicle that allows for removal of heavy | |||
| midpoint provisioning at the DCI. | midpoint provisioning at the DCB. | |||
| For example, assume that VNF1 is situated in data center 1, which is | For example, assume that VNF1 is situated in data center 1, which is | |||
| interconnected to the wide-area network via DCI1. VNF1 requires | interconnected to the wide-area network via DCB1. VNF1 requires | |||
| connectivity to VNF2, situated in data center 2, which is | connectivity to VNF2, situated in data center 2, which is | |||
| interconnected to the wide-area network via DCI2. Assuming there is | interconnected to the wide-area network via DCB2. Assuming there is | |||
| no existing TE path that meets VNF1's requirements, the Interconnect | no existing TE path that meets VNF1's requirements, the Interconnect | |||
| controller will: | controller will: | |||
| o Instantiate an SR Policy on DCI1 with BSID n and a segment-list | o Instantiate an SR Policy on DCB1 with BSID n and a segment-list | |||
| containing the relevant segments of a TE path to DCI2. DCI1 | containing the relevant segments of a TE path to DCB2. DCB1 | |||
| therefore becomes a BSID anchor. | therefore becomes a BSID anchor. | |||
| o Instantiate an SR Policy on VNF1 with BSID m and a segment-list | o Instantiate an SR Policy on VNF1 with BSID m and a segment-list | |||
| containing segments {DCI1, n, VNF2}. | containing segments {DCB1, n, VNF2}. | |||
| +---------------+ +----------------+ +---------------+ | +---------------+ +----------------+ +---------------+ | |||
| | Data Center 1 | | Wide-Area | | Data Center 2 | | | Data Center 1 | | Wide-Area | | Data Center 2 | | |||
| | +----+ +----+ 3 +----+ +----+ | | | +----+ +----+ 3 +----+ +----+ | | |||
| | |VNF1| |DCI1|-1 / \ 5--|DCI2| |VNF2| | | | |VNF1| |DCB1|-1 / \ 5--|DCB2| |VNF2| | | |||
| | +----+ +----+ \ / \ / +----+ +----+ | | | +----+ +----+ \ / \ / +----+ +----+ | | |||
| | | | 2 4 | | | | | | | 2 4 | | | | |||
| +---------------+ +----------------+ +---------------+ | +---------------+ +----------------+ +---------------+ | |||
| SR Policy SR Policy | SR Policy SR Policy | |||
| BSID m BSID n | BSID m BSID n | |||
| {DCI1,n,VNF2} {1,2,3,4,5,DCI2} | {DCB1,n,VNF2} {1,2,3,4,5,DCB2} | |||
| Traffic-Engineered Path using BSID | Traffic-Engineered Path using BSID | |||
| Figure 1 | Figure 2 | |||
| In the above figure a single DCB is used to interconnect two domains. | ||||
| Similarly, in the case of two wide-area domains the DCB would be | ||||
| represented as an ABR or ASBR. In some single operator environments | ||||
| domains may be interconnected using adjacent ASBRs connected via a | ||||
| distinct physical link. In this scenario the procedures outlined | ||||
| above may be extended to incorporate the mechanisms used in Egress | ||||
| Peer Engineering (EPE) [I-D.ietf-spring-segment-routing-central-epe] | ||||
| to form a traffic-engineered path spanning distinct domains. | ||||
| 5.4.3.1. Traffic-Engineering and ECMP | ||||
| Where the Interconnect controller is used to place SR policies, | ||||
| providing support for ECMP requires some consideration. An SR Policy | ||||
| is described with one or more segment-lists, and each of those | ||||
| segment-lists may or may not provide ECMP as a sum instruction and | ||||
| each SID itself may or may not support ECMP forwarding. When an | ||||
| individual SID is a BSID, an ECMP path may or may not also be nested | ||||
| within. The Interconnect controller may choose to place a path | ||||
| consisting entirely of non-ECMP-aware Adj-SIDs (each SID representing | ||||
| a single adjacency) such that the controller has explicit hop-by-hop | ||||
| knowledge of where that SR-TE LSP is routed. This is beneficial to | ||||
| allow the controller to take corrective action if the criteria that | ||||
| were used to initially select a particular link in a particular path | ||||
| subsequently change. For example, a path should be rerouted if the | ||||
| latency of a link increases or a link becomes congested. | ||||
| If ECMP-aware SIDs are used in the SR policy segment-list (including | ||||
| Node-SIDs, Adj-SIDs representing parallel links, and Anycast SIDs) SR | ||||
| routers are able to make autonomous decisions about where traffic is | ||||
| forwarded. As a result, it is not possible for the controller to | ||||
| fully understand the impact of a change in network state and react to | ||||
| it. With this in mind there are a number of approaches that could be | ||||
| adopted: | ||||
| o If there is no requirement for the Interconnect controller to | ||||
| explicitly track paths on a hop-by-hop basis, ECMP-aware SIDs may | ||||
| be used in the SR policy segment-list. This approach may require | ||||
| multiple [ELI, EL] pairs to be inserted at the ingress node; for | ||||
| example, above and below a BSID to provide entropy in multiple | ||||
| domains. | ||||
| o If there is a requirement for the Interconnect controller to | ||||
| explicitly track paths on a hop-by-hop basis to provide the capability | ||||
| to reroute them based on changes in network state, SR policy | ||||
| segment-lists should be constructed of non-ECMP-aware Adj-SIDs. | ||||
| o A hybrid approach that allows for a level of ECMP (at the headend) | ||||
| together with the ability for the Interconnect controller to | ||||
| explicitly track paths is to instantiate an SR policy consisting | ||||
| of a set of segment-lists, each containing non-ECMP-aware Adj- | ||||
| SIDs. Each segment-list will be assigned a weight to allow for | ||||
| ECMP or UCMP. This approach does however imply computation and | ||||
| programming of two paths instead of one. | ||||
| o Another hybrid approach might work as follows. Redundant DCBs | ||||
| advertise an Anycast-SID 'A' into the data center, and also | ||||
| instantiate an SR policy with a segment-list consisting of non- | ||||
| ECMP-aware Adj-SIDs meeting the required connectivity and SLA. | ||||
| The BSID value of this SR policy 'B' must be common to both | ||||
| redundant DCBs, but the calculated paths are diverse. Indeed, | ||||
| multiple segment-lists could be used in this SR policy. A VNF | ||||
| could then instantiate an SR policy with a segment-list of {A, B} | ||||
| to achieve ECMP in the data center and TE in the wide-area network | ||||
| with the option of ECMP at the BSID anchor. | ||||
| 5.5. Service Layer | 5.5. Service Layer | |||
| The service layer is intended to deliver Layer 2 and/or Layer 3 VPN | The service layer is intended to deliver Layer 2 and/or Layer 3 VPN | |||
| connectivity between network functions to create an overlay utilizing | connectivity between network functions to create an overlay utilizing | |||
| the transport layer described in section 5.4. To do this the | the routing and LSP underlay described in section 5.4. To do this | |||
| solution employs the EVPN and/or VPN-IPv4/IPv6 address families to | the solution employs the EVPN and/or VPN-IPv4/IPv6 address families | |||
| exchange Layer 2 and Layer 3 Network Layer Reachability Information | to exchange Layer 2 and Layer 3 Network Layer Reachability | |||
| (NLRI). When these NLRI are exchanged between domains it is typical | Information (NLRI). When these NLRI are exchanged between domains it | |||
| for the border router to set next-hop-self on advertised routes. | is typical for the border router to set next-hop-self on advertised | |||
| With the proposed transport layer however, this is not required and | routes. With the proposed routing and LSP underlay however, this is | |||
| EVPN/VPN-IPv4/IPv6 routes should be passed end-to-end without transit | not required and EVPN/VPN-IPv4/IPv6 routes should be passed end-to- | |||
| routers modifying the next-hop attribute. | end without transit routers modifying the next-hop attribute. | |||
| Section 5.4.2 describes the use of labeled unicast BGP to exchange | Section 5.4.2 describes the use of labeled unicast BGP to exchange | |||
| inter-domain routes to establish a default forwarding path. Labeled- | inter-domain routes to establish a default forwarding path. Labeled- | |||
| unicast BGP is used to exchange prefix reachability between service | unicast BGP is used to exchange prefix reachability between service | |||
| edge routers, with domain border routers imposing next-hop-self on | edge routers, with domain border routers imposing next-hop-self on | |||
| routes advertised between domains. This provides a default inter- | routes advertised between domains. This provides a default inter- | |||
| domain forwarding path and provides the required connectivity to | domain forwarding path and provides the required connectivity to | |||
| establish inter-domain BGP sessions between service edges for the | establish inter-domain BGP sessions between service edges for the | |||
| exchange of EVPN and/or VPN-IPv4/IPv6 NLRI. If route-reflection is | exchange of EVPN and/or VPN-IPv4/IPv6 NLRI. If route-reflection is | |||
| used for the EVPN and/or VPN-IPv4/IPv6 address families within one or | used for the EVPN and/or VPN-IPv4/IPv6 address families within one or | |||
| skipping to change at page 15, line 45 ¶ | skipping to change at page 18, line 27 ¶ | |||
| +----+ | RR | +----+ | RR | +----+ | RR | +----+ | +----+ | RR | +----+ | RR | +----+ | RR | +----+ | |||
| | NF | +----+ | DCI| +----+ | DCI| +----+ | NF | | | NF | +----+ | DCI| +----+ | DCI| +----+ | NF | | |||
| +----+ +----+ +----+ +----+ | +----+ +----+ +----+ +----+ | |||
| | Domain | | Domain | | Domain | | | Domain | | Domain | | Domain | | |||
| +----------------+ +----------------+ +----------------+ | +----------------+ +----------------+ +----------------+ | |||
| <-------> <-----> NHS <-- BGP-LU ---> NHS <-----> <------> | <-------> <-----> NHS <-- BGP-LU ---> NHS <-----> <------> | |||
| <-------> <--------- EVPN/VPN-IPv4/v6 ----------> <------> | <-------> <--------- EVPN/VPN-IPv4/v6 ----------> <------> | |||
| Inter-Domain Service Layer | Inter-Domain Service Layer | |||
| Figure 2 | Figure 3 | |||
| EVPN and/or VPN-IPv4/v6 routes received from a peer in a different | EVPN and/or VPN-IPv4/v6 routes received from a peer in a different | |||
| domain will contain a next-hop equivalent to the router that sourced | domain will contain a next-hop equivalent to the router that sourced | |||
| the route. The next-hop of these routes can be resolved to a labeled- | the route. The next-hop of these routes can be resolved to a labeled- | |||
| unicast route (default forwarding path) or to an SR policy (traffic- | unicast route (default forwarding path) or to an SR policy (traffic- | |||
| engineered forwarding path) as appropriate to the service | engineered forwarding path) as appropriate to the service | |||
| requirements. The exchange of EVPN and/or VPN-IPv4/IPv6 routes in | requirements. The exchange of EVPN and/or VPN-IPv4/IPv6 routes in | |||
| this manner implies that Route-Distinguisher and Route-Target values | this manner implies that Route-Distinguisher and Route-Target values | |||
| remain intact end-to-end. | remain intact end-to-end. | |||
| skipping to change at page 16, line 22 ¶ | skipping to change at page 19, line 6 ¶ | |||
| without the imposition of next-hop-self at border routers complements | without the imposition of next-hop-self at border routers complements | |||
| the gateway-less transport layer architecture. It negates the | the gateway-less transport layer architecture. It negates the | |||
| requirement for midpoint service provisioning and as such provides | requirement for midpoint service provisioning and as such provides | |||
| the following benefits: | the following benefits: | |||
| o Avoids the translation of MAC/IP EVPN routes to IP-VPN routes (and | o Avoids the translation of MAC/IP EVPN routes to IP-VPN routes (and | |||
| vice versa) that is typically associated with service | vice versa) that is typically associated with service | |||
| interworking. | interworking. | |||
| o Avoids instantiation of MAC-VRFs and IP-VPNs for each tenant | o Avoids instantiation of MAC-VRFs and IP-VPNs for each tenant | |||
| resident in the DCI. | resident in the DCB. | |||
| o Avoids provisioning of demarcation functions between the data | o Avoids provisioning of demarcation functions between the data | |||
| center and wide-area network such as QoS, access-control, | center and wide-area network such as QoS, access-control, | |||
| aggregation and isolation. | aggregation and isolation. | |||
| 5.6. Service Differentiation | 5.6. Service Differentiation | |||
| As discussed in section 5.4.3, the use of TE paths is a key | As discussed in section 5.4.3, the use of TE paths is a key | |||
| capability of the NFIX solution framework described in this document. | capability of the NFIX solution framework described in this document. | |||
| The Interconnect controller computes end-to-end TE paths between NFs | The Interconnect controller computes end-to-end TE paths between NFs | |||
| and programs DC nodes, DCIs, ABR/ASBRs, via SR Policy, with the | and programs DC nodes, DCBs, ABR/ASBRs, via SR Policy, with the | |||
| necessary label forwarding entries for each [headend, color, | necessary label forwarding entries for each [headend, color, | |||
| endpoint]. The collection of [headend, endpoint] pairs for the same | endpoint]. The collection of [headend, endpoint] pairs for the same | |||
| color constitutes a logical network topology, where each topology | color constitutes a logical network topology, where each topology | |||
| satisfies a given SLA requirement. | satisfies a given SLA requirement. | |||
| The Interconnect controller discovers the endpoints associated to a | The Interconnect controller discovers the endpoints associated to a | |||
| given topology (color) upon the reception of EVPN or IPVPN routes | given topology (color) upon the reception of EVPN or IPVPN routes | |||
| advertised by the endpoint. The EVPN and IPVPN NLRIs are advertised | advertised by the endpoint. The EVPN and IPVPN NLRIs are advertised | |||
| by the endpoint nodes along with a color extended community which | by the endpoint nodes along with a color extended community which | |||
| identifies the topology to which the owner of the NLRI belongs. At a | identifies the topology to which the owner of the NLRI belongs. At a | |||
| skipping to change at page 21, line 38 ¶ | skipping to change at page 24, line 19 ¶ | |||
| declared down. As links between network elements predominantly use | declared down. As links between network elements predominantly use | |||
| direct point-to-point fiber, a link failure should be detected within | direct point-to-point fiber, a link failure should be detected within | |||
| milliseconds. BFD is also commonly used to detect IP layer failures. | milliseconds. BFD is also commonly used to detect IP layer failures. | |||
| 5.9.3. Exchange of Inter-Domain Routes | 5.9.3. Exchange of Inter-Domain Routes | |||
| Labeled unicast BGP together with SR Prefix-SID extensions is used | Labeled unicast BGP together with SR Prefix-SID extensions is used | |||
| to exchange PNF and/or VNF endpoints between domains to create end- | to exchange PNF and/or VNF endpoints between domains to create end- | |||
| to-end connectivity without TE. When advertising between domains we | to-end connectivity without TE. When advertising between domains we | |||
| assume that a given BGP prefix is advertised by at least two border | assume that a given BGP prefix is advertised by at least two border | |||
| routers (DCIs, ABRs, ASBRs) making prefixes reachable via at least | routers (DCBs, ABRs, ASBRs) making prefixes reachable via at least | |||
| two next-hops. | two next-hops. | |||
| BGP Prefix Independent Convergence (PIC) [I-D.ietf-rtgwg-bgp-pic] | BGP Prefix Independent Convergence (PIC) [I-D.ietf-rtgwg-bgp-pic] | |||
| allows failover to a pre-computed and pre-installed secondary next- | allows failover to a pre-computed and pre-installed secondary next- | |||
| hop when the primary next-hop fails and is independent of the number | hop when the primary next-hop fails and is independent of the number | |||
| of destination prefixes that are affected by the failure. When the | of destination prefixes that are affected by the failure. When the | |||
| primary BGP next-hop fails, it should be clear that BGP PIC depends | primary BGP next-hop fails, it should be clear that BGP PIC depends | |||
| on the availability of a secondary next-hop in the Pathlist. To | on the availability of a secondary next-hop in the Pathlist. To | |||
| ensure that multiple paths to the same destination are visible, | ensure that multiple paths to the same destination are visible, | |||
| BGP ADD-PATH [RFC7911] can be used to allow for advertisement of | BGP ADD-PATH [RFC7911] can be used to allow for advertisement of | |||
| skipping to change at page 23, line 50 | skipping to change at page 26, line 37 | |||
| transitioned active controller. LSP state is not impacted unless | transitioned active controller. LSP state is not impacted unless | |||
| redelegation is not possible before the state timeout interval | redelegation is not possible before the state timeout interval | |||
| expires. | expires. | |||
| When BGP is used for instantiation of SR policies every headend | When BGP is used for instantiation of SR policies every headend | |||
| should establish a BGP session with the master and standby controller | should establish a BGP session with the master and standby controller | |||
| capable of exchanging SR TE Policy SAFI. Candidate paths of SR | capable of exchanging SR TE Policy SAFI. Candidate paths of SR | |||
| policies are advertised only by the active controller. If the master | policies are advertised only by the active controller. If the master | |||
| controller should experience a failure, then SR policies learnt from | controller should experience a failure, then SR policies learnt from | |||
| that controller may be removed before they are re-advertised by the | that controller may be removed before they are re-advertised by the | |||
| standby (or newly-active) controller. To avoid this possibility two | standby (or newly-active) controller. To minimize this possibility | |||
| options are possible: | BGP speakers that advertise and instantiate SR policies can implement | |||
| Long Lived Graceful Restart (LLGR) [I-D.ietf-idr-long-lived-gr], also | ||||
| known as BGP persistence, to retain existing routes treated as least- | ||||
| preferred until the new route arrives. In the absence of LLGR, two | ||||
| other alternatives are possible: | ||||
| o Provide a static backup SR policy. | o Provide a static backup SR policy. | |||
| o Fall back to the default forwarding path. | o Fall back to the default forwarding path. | |||
| 5.9.5. Path and Segment Liveliness | 5.9.5. Path and Segment Liveliness | |||
| When using traffic-engineered SR paths only the ingress router holds | When using traffic-engineered SR paths only the ingress router holds | |||
| any state. The exception here is where BSIDs are used, which also | any state. The exception here is where BSIDs are used, which also | |||
| implies some state is maintained at the BSID anchor. As there is no | implies some state is maintained at the BSID anchor. As there is no | |||
| skipping to change at page 24, line 29 | skipping to change at page 27, line 19 | |||
| withdrawing an SR policy if a suitable candidate path is already in | withdrawing an SR policy if a suitable candidate path is already in | |||
| place, or simply sending a new SR policy with a different segment- | place, or simply sending a new SR policy with a different segment- | |||
| list and a higher preference value assigned to it. | list and a higher preference value assigned to it. | |||
| Verification of data plane liveliness is the responsibility of the | Verification of data plane liveliness is the responsibility of the | |||
| path headend. A given SR policy may be associated with multiple | path headend. A given SR policy may be associated with multiple | |||
| candidate paths and for the sake of clarity, we'll assume two for | candidate paths and for the sake of clarity, we'll assume two for | |||
| redundancy purposes (which can be diversely routed). Verification of | redundancy purposes (which can be diversely routed). Verification of | |||
| the liveliness of these paths can be achieved using seamless BFD | the liveliness of these paths can be achieved using seamless BFD | |||
| (S-BFD) [RFC7880], which provides an in-band failure detection | (S-BFD) [RFC7880], which provides an in-band failure detection | |||
| mechanism capable of detecting failure in the order of milliseconds. | mechanism capable of detecting failure in the order of tens of | |||
| Upon failure of the active path, failover to a secondary candidate | milliseconds. Upon failure of the active path, failover to a | |||
| path can be activated at the path headend. Details of the actual | secondary candidate path can be activated at the path headend. | |||
| failover and revert mechanisms are a local implementation matter. | Details of the actual failover and revert mechanisms are a local | |||
| implementation matter. | ||||
| S-BFD provides a fast and scalable failure detection mechanism but is | S-BFD provides a fast and scalable failure detection mechanism but is | |||
| unlikely to be implemented in many VNFs given their inability to | unlikely to be implemented in many VNFs given their inability to | |||
| offload the process to purpose-built hardware. In the absence of an | offload the process to purpose-built hardware. In the absence of an | |||
| active failure detection mechanism such as S-BFD the failover from | active failure detection mechanism such as S-BFD the failover from | |||
| active path to secondary candidate path can be triggered using | active path to secondary candidate path can be triggered using | |||
| continuous path validity checks. One of the criteria that a | continuous path validity checks. One of the criteria that a | |||
| candidate path uses to determine its validity is the ability to | candidate path uses to determine its validity is the ability to | |||
| perform path resolution for the first SID to one or more outgoing | perform path resolution for the first SID to one or more outgoing | |||
| interface(s) and next-hop(s). From the perspective of the VNF | interface(s) and next-hop(s). From the perspective of the VNF | |||
| headend the first SID in the segment-list will very likely be the DCI | headend the first SID in the segment-list will very likely be the DCB | |||
| (as BSID anchor) but could equally be another Prefix-SID hop within | (as BSID anchor) but could equally be another Prefix-SID hop within | |||
| the data center. Should this segment experience a non-recoverable | the data center. Should this segment experience a non-recoverable | |||
| failure, the headend will be unable to resolve the first SID and the | failure, the headend will be unable to resolve the first SID and the | |||
| path will be considered invalid. This will trigger a failover action | path will be considered invalid. This will trigger a failover action | |||
| to a secondary candidate path. | to a secondary candidate path. | |||
| Injection of S-BFD packets is not just constrained to the source of | Injection of S-BFD packets is not just constrained to the source of | |||
| an end-to-end LSP. When an S-BFD packet is injected into an SR | an end-to-end LSP. When an S-BFD packet is injected into an SR | |||
| policy path it is encapsulated with the label stack of the associated | policy path it is encapsulated with the label stack of the associated | |||
| segment-list. It is possible therefore to run S-BFD from a BSID | segment-list. It is possible therefore to run S-BFD from a BSID | |||
| anchor for just that section of the end-to-end path (for example, | anchor for just that section of the end-to-end path (for example, | |||
| from DCI to DCI). This allows a BSID anchor to detect failure of a | from DCB to DCB). This allows a BSID anchor to detect failure of a | |||
| path and take corrective action, while maintaining opacity between | path and take corrective action, while maintaining opacity between | |||
| domains. | domains. | |||
| 5.10. Scalability | 5.10. Scalability | |||
| There are many aspects to consider regarding scalability of the NFIX | There are many aspects to consider regarding scalability of the NFIX | |||
| architecture. The building blocks of NFIX are standards-based | architecture. The building blocks of NFIX are standards-based | |||
| technologies individually designed to scale for internet provider | technologies individually designed to scale for internet provider | |||
| networks. When combined they provide a flexible and scalable | networks. When combined they provide a flexible and scalable | |||
| solution: | solution: | |||
| skipping to change at page 28, line 36 | skipping to change at page 31, line 29 | |||
| <-------- <-------------------------- NHS <------ <------ | <-------- <-------------------------- NHS <------ <------ | |||
| EVPN/VPN-IPv4/v6(colored) | EVPN/VPN-IPv4/v6(colored) | |||
| +-----------------------------------> +-------------> | +-----------------------------------> +-------------> | |||
| TE path to DCI2 ECMP path to VNF2 | TE path to DCI2 ECMP path to VNF2 | |||
| (BSID to segment-list | (BSID to segment-list | |||
| expansion on DCI1) | expansion on DCI1) | |||
| Asymmetric Model B Service Layer | Asymmetric Model B Service Layer | |||
| Figure 3 | Figure 4 | |||
| Consider that the n different topologies needed between VNF1 and VNF2 are | Consider that the n different topologies needed between VNF1 and VNF2 are | |||
| really only relevant to the different TE paths that exist in the WAN. | really only relevant to the different TE paths that exist in the WAN. | |||
| The WAN is the domain in the network where there can be significant | The WAN is the domain in the network where there can be significant | |||
| differences in latency, throughput or packet loss depending on the | differences in latency, throughput or packet loss depending on the | |||
| sequence of nodes and links the traffic goes through. Based on that | sequence of nodes and links the traffic goes through. Based on that | |||
| assumption for traffic from VNF1 to DCI2 in Figure 3, traffic from | assumption, for traffic from VNF1 to DCB2 in Figure 4, traffic from | |||
| DCI2 to VNF2 can simply take an ECMP path. In this case an | DCB2 to VNF2 can simply take an ECMP path. In this case an | |||
| asymmetric model B Service layer can significantly relieve the scale | asymmetric model B Service layer can significantly relieve the scale | |||
| pressure on VNF1. | pressure on VNF1. | |||
| From a service layer perspective, the NFIX architecture described up | From a service layer perspective, the NFIX architecture described up | |||
| to now can be considered 'symmetric', meaning that the EVPN/IPVPN | to now can be considered 'symmetric', meaning that the EVPN/IPVPN | |||
| advertisements from e.g., VNF2 in Figure 1, are received on VNF1 with | advertisements from e.g., VNF2 in Figure 2, are received on VNF1 with | |||
| the next-hop of VNF2, and vice versa for VNF1's routes on VNF2. SR | the next-hop of VNF2, and vice versa for VNF1's routes on VNF2. SR | |||
| Policies to each VNF2 [endpoint, color] are then required on the | Policies to each VNF2 [endpoint, color] are then required on the | |||
| VNF1. | VNF1. | |||
| In the 'asymmetric' service design illustrated in Figure 3, VNF2's | In the 'asymmetric' service design illustrated in Figure 4, VNF2's | |||
| EVPN/IPVPN routes are received on VNF1 with the next-hop of DCI2, and | EVPN/IPVPN routes are received on VNF1 with the next-hop of DCB2, and | |||
| VNF1's routes are received on VNF2 with next-hop of DCI1. Now SR | VNF1's routes are received on VNF2 with next-hop of DCB1. Now SR | |||
| policies instantiated on VNFs can be reduced to only the number of TE | policies instantiated on VNFs can be reduced to only the number of TE | |||
| paths required to reach the remote DCI. For example, considering n | paths required to reach the remote DCB. For example, considering n | |||
| topologies, in a symmetric model VNF1 has to be instantiated with n | topologies, in a symmetric model VNF1 has to be instantiated with n | |||
| SR policy paths per remote VNF in DC2, whereas in the asymmetric | SR policy paths per remote VNF in DC2, whereas in the asymmetric | |||
| model of Figure 3, VNF1 only requires n SR policy paths per DC, i.e., | model of Figure 4, VNF1 only requires n SR policy paths per DC, i.e., | |||
| to DCI2. | to DCB2. | |||
| Asymmetric model B is a simple design choice that only requires the | Asymmetric model B is a simple design choice that only requires the | |||
| ability (on the DCI nodes) to set next-hop-self on the EVPN/IPVPN | ability (on the DCB nodes) to set next-hop-self on the EVPN/IPVPN | |||
| routes advertised to the WAN neighbors and not do next-hop-self for | routes advertised to the WAN neighbors and not do next-hop-self for | |||
| routes advertised to the DC neighbors. With this option, the | routes advertised to the DC neighbors. With this option, the | |||
| Interconnect controller only needs to establish TE paths from VNFs to | Interconnect controller only needs to establish TE paths from VNFs to | |||
| remote DCIs, as opposed to VNFs to remote VNFs. | remote DCBs, as opposed to VNFs to remote VNFs. | |||
| 6. Illustration of Use | 6. Illustration of Use | |||
| For the purpose of illustration, this section provides some examples | For the purpose of illustration, this section provides some examples | |||
| of how different end-to-end tunnels are instantiated (including the | of how different end-to-end tunnels are instantiated (including the | |||
| relevant protocols, SID values/label stacks etc.) and how services | relevant protocols, SID values/label stacks etc.) and how services | |||
| are then overlaid onto those LSPs. | are then overlaid onto those LSPs. | |||
| 6.1. Reference Topology | 6.1. Reference Topology | |||
| skipping to change at page 30, line 23 | skipping to change at page 33, line 23 | |||
| +----+ | L=5 +----+ L=5 / | +----+ +----+ | +----+ | L=5 +----+ L=5 / | +----+ +----+ | |||
| | Sn | | +-------| R4 |--------+ | |AGN2| | Dn | | | Sn | | +-------| R4 |--------+ | |AGN2| | Dn | | |||
| +----+ | / M=20 +----+ M=20 | +----+ +----+ | +----+ | / M=20 +----+ M=20 | +----+ +----+ | |||
| ~ | / | | ~ | ~ | / | | ~ | |||
| ~ +----+ +----+ +----+ +----+ +----+ ~ | ~ +----+ +----+ +----+ +----+ +----+ ~ | |||
| ~ ~ ~ ~ | R5 |-----| R6 |----| R7 |-----| R8 |-----|AGN3| ~ ~ ~ ~ | ~ ~ ~ ~ | R5 |-----| R6 |----| R7 |-----| R8 |-----|AGN3| ~ ~ ~ ~ | |||
| +----+ +----+ +----+ +----+ +----+ | +----+ +----+ +----+ +----+ +----+ | |||
| Reference Topology | Reference Topology | |||
| Figure 4 | Figure 5 | |||
| The following applies to the reference topology in Figure 4: | The following applies to the reference topology in Figure 5: | |||
| o Data center 1 and data center 2 both run BGP/SR. Both data | o Data center 1 and data center 2 both run BGP/SR. Both data | |||
| centers run leaf/spine topologies, which are not shown for the | centers run leaf/spine topologies, which are not shown for the | |||
| purpose of clarity. | purpose of clarity. | |||
| o R1 and R5 function as data center interconnects for DC 1. AGN1 | o R1 and R5 function as data center border routers for DC 1. AGN1 | |||
| and AGN3 function as data center interconnects for DC 2. | and AGN3 function as data center border routers for DC 2. | |||
| o Routers R1 through R8 form an independent ISIS-OSPF/SR instance. | o Routers R1 through R8 form an independent ISIS-OSPF/SR instance. | |||
| o Routers R3, R8, AGN1, AGN2, and AGN3 form an independent ISIS- | o Routers R3, R8, AGN1, AGN2, and AGN3 form an independent ISIS- | |||
| OSPF/SR instance. | OSPF/SR instance. | |||
| o All IGP link metrics within the wide area network are metric 10 | o All IGP link metrics within the wide area network are metric 10 | |||
| except for links R5-R4 and R4-R3 which are both metric 20. | except for links R5-R4 and R4-R3 which are both metric 20. | |||
| o All links have a unidirectional latency of 10 milliseconds except | o All links have a unidirectional latency of 10 milliseconds except | |||
| skipping to change at page 34, line 20 | skipping to change at page 37, line 20 | |||
| o As in the previous example the Interconnect controller also learns | o As in the previous example the Interconnect controller also learns | |||
| the MAC Advertisement Route advertised by D2 in order to correlate | the MAC Advertisement Route advertised by D2 in order to correlate | |||
| the service overlay with the underlying transport LSPs, creating | the service overlay with the underlying transport LSPs, creating | |||
| or optimizing them as required. | or optimizing them as required. | |||
| 7. Conclusions | 7. Conclusions | |||
| The NFIX architecture provides an evolutionary path to a unified | The NFIX architecture provides an evolutionary path to a unified | |||
| network fabric. It uses the base constructs of seamless-MPLS and | network fabric. It uses the base constructs of seamless-MPLS and | |||
| adds end-to-end transport LSPs capable of delivering against SLAs, | adds end-to-end LSPs capable of delivering against SLAs, seamless | |||
| seamless data center interconnect, service differentiation, service | data center interconnect, service differentiation, service function | |||
| function chaining, and a Layer-2/Layer-3 infrastructure capable of | chaining, and a Layer-2/Layer-3 infrastructure capable of | |||
| interconnecting PNF-to-PNF, PNF-to-VNF, and VNF-to-VNF. | interconnecting PNF-to-PNF, PNF-to-VNF, and VNF-to-VNF. | |||
| NFIX establishes a dynamic, seamless, and automated connectivity | NFIX establishes a dynamic, seamless, and automated connectivity | |||
| model that overcomes the operational barriers and interworking issues | model that overcomes the operational barriers and interworking issues | |||
| between data centers and the wide-area network and delivers the | between data centers and the wide-area network and delivers the | |||
| following using standards-based protocols: | following using standards-based protocols: | |||
| o A unified routing control plane: Multiprotocol BGP (MP-BGP) to | o A unified routing control plane: Multiprotocol BGP (MP-BGP) to | |||
| acquire inter-domain NLRI from the IP/MPLS transport underlay and | acquire inter-domain NLRI from the IP/MPLS underlay and the | |||
| the virtualized IP-VPN/EVPN service overlay. | virtualized IP-VPN/EVPN service overlay. | |||
| o A unified forwarding control plane: SR provides dynamic service | o A unified forwarding control plane: SR provides dynamic service | |||
| tunnels with fast restoration options to meet deterministic | tunnels with fast restoration options to meet deterministic | |||
| bandwidth, latency and path diversity constraints. SR utilizes | bandwidth, latency and path diversity constraints. SR utilizes | |||
| the appropriate data path encapsulation for seamless, end-to-end | the appropriate data path encapsulation for seamless, end-to-end | |||
| connectivity between distributed edge and core data centers across | connectivity between distributed edge and core data centers across | |||
| the wide-area network. | the wide-area network. | |||
| o Service Function Chaining: Leverage SFC extensions for BGP and | o Service Function Chaining: Leverage SFC extensions for BGP and | |||
| segment routing to interconnect network and service functions into | segment routing to interconnect network and service functions into | |||
| SFPs, with support for various data path implementations. | SFPs, with support for various data path implementations. | |||
| o Service Differentiation: Provide a framework that allows for | o Service Differentiation: Provide a framework that allows for | |||
| construction of logical end-to-end networks with differentiated | construction of logical end-to-end networks with differentiated | |||
| logical topologies and/or constraints through use of SR policies | logical topologies and/or constraints through use of SR policies | |||
| and coloring. | and coloring. | |||
| o Automation: Facilitates automation of service provisioning and | o Automation: Facilitates automation of service provisioning and | |||
| avoids heavy service interworking at DCIs. | avoids heavy service interworking at DCBs. | |||
| NFIX is deployable on existing data center and wide-area network | NFIX is deployable on existing data center and wide-area network | |||
| infrastructures and allows the underlying data forwarding plane to | infrastructures and allows the underlying data forwarding plane to | |||
| evolve with minimal impact on the services plane. | evolve with minimal impact on the services plane. | |||
| 8. Security Considerations | 8. Security Considerations | |||
| The NFIX architecture based on SR-MPLS is subject to the same | The NFIX architecture based on SR-MPLS is subject to the same | |||
| security concerns as any MPLS network. No new protocols are | security concerns as any MPLS network. No new protocols are | |||
| introduced, hence security issues of the protocols encompassed by | introduced, hence security issues of the protocols encompassed by | |||
| skipping to change at page 35, line 37 | skipping to change at page 38, line 37 | |||
| PCEPS [RFC8253] is recommended to provide confidentiality to PCEP | PCEPS [RFC8253] is recommended to provide confidentiality to PCEP | |||
| communication using Transport Layer Security (TLS). | communication using Transport Layer Security (TLS). | |||
| 9. Acknowledgements | 9. Acknowledgements | |||
| The authors would like to acknowledge Mustapha Aissaoui, Wim | The authors would like to acknowledge Mustapha Aissaoui, Wim | |||
| Henderickx, and Gunter Van de Velde. | Henderickx, and Gunter Van de Velde. | |||
| 10. Contributors | 10. Contributors | |||
| The following people contributed to the content of this document | The following people contributed to the content of this document and | |||
| should be considered co-authors. | ||||
| Juan Rodriguez | Juan Rodriguez | |||
| Nokia | Nokia | |||
| United States of America | United States of America | |||
| Email: juan.rodriguez@nokia.com | Email: juan.rodriguez@nokia.com | |||
| Jorge Rabadan | Jorge Rabadan | |||
| Nokia | Nokia | |||
| United States of America | United States of America | |||
| Email: jorge.rabadan@nokia.com | Email: jorge.rabadan@nokia.com | |||
| Figure 5 | Nick Morris | |||
| Verizon | ||||
| United States of America | ||||
| Email: nicklous.morris@verizonwireless.com | ||||
| Eddie Leyton | ||||
| Verizon | ||||
| United States of America | ||||
| Email: edward.leyton@verizonwireless.com | ||||
| Figure 6 | ||||
| 11. IANA Considerations | 11. IANA Considerations | |||
| This memo does not include any requests to IANA for allocation. | This memo does not include any requests to IANA for allocation. | |||
| 12. References | 12. References | |||
| 12.1. Normative References | 12.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| skipping to change at page 36, line 39 | skipping to change at page 40, line 10 | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 12.2. Informative References | 12.2. Informative References | |||
| [I-D.ietf-nvo3-geneve] | [I-D.ietf-nvo3-geneve] | |||
| Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic | Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic | |||
| Network Virtualization Encapsulation", draft-ietf- | Network Virtualization Encapsulation", draft-ietf- | |||
| nvo3-geneve-14 (work in progress), September 2019. | nvo3-geneve-16 (work in progress), March 2020. | |||
| [I-D.ietf-mpls-seamless-mpls] | [I-D.ietf-mpls-seamless-mpls] | |||
| Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz, | Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz, | |||
| M., and D. Steinberg, "Seamless MPLS Architecture", draft- | M., and D. Steinberg, "Seamless MPLS Architecture", draft- | |||
| ietf-mpls-seamless-mpls-07 (work in progress), June 2014. | ietf-mpls-seamless-mpls-07 (work in progress), June 2014. | |||
| [I-D.ietf-bess-evpn-ipvpn-interworking] | [I-D.ietf-bess-evpn-ipvpn-interworking] | |||
| Rabadan, J. and A. Sajassi, "EVPN Interworking with | Rabadan, J., Sajassi, A., Rosen, E., Drake, J., Lin, W., | |||
| IPVPN", draft-ietf-bess-evpn-ipvpn-interworking-02 (work | Uttaro, J., and A. Simpson, "EVPN Interworking with | |||
| in progress), November 2019. | IPVPN", draft-ietf-bess-evpn-ipvpn-interworking-03 (work | |||
| in progress), May 2020. | ||||
| [I-D.ietf-spring-segment-routing-policy] | [I-D.ietf-spring-segment-routing-policy] | |||
| Filsfils, C., Sivabalan, S., Voyer, D., Bogdanov, A., and | Filsfils, C., Sivabalan, S., Voyer, D., Bogdanov, A., and | |||
| P. Mattes, "Segment Routing Policy Architecture", draft- | P. Mattes, "Segment Routing Policy Architecture", draft- | |||
| ietf-spring-segment-routing-policy-06 (work in progress), | ietf-spring-segment-routing-policy-07 (work in progress), | |||
| December 2019. | May 2020. | |||
| [I-D.ietf-rtgwg-segment-routing-ti-lfa] | [I-D.ietf-rtgwg-segment-routing-ti-lfa] | |||
| Litkowski, S., Bashandy, A., Filsfils, C., Decraene, B., | Litkowski, S., Bashandy, A., Filsfils, C., Decraene, B., | |||
| Francois, P., Voyer, D., Clad, F., and P. Camarillo, | Francois, P., Voyer, D., Clad, F., and P. Camarillo, | |||
| "Topology Independent Fast Reroute using Segment Routing", | "Topology Independent Fast Reroute using Segment Routing", | |||
| draft-ietf-rtgwg-segment-routing-ti-lfa-03 (work in | draft-ietf-rtgwg-segment-routing-ti-lfa-03 (work in | |||
| progress), March 2020. | progress), March 2020. | |||
| [I-D.ietf-bess-nsh-bgp-control-plane] | [I-D.ietf-bess-nsh-bgp-control-plane] | |||
| Farrel, A., Drake, J., Rosen, E., Uttaro, J., and L. | Farrel, A., Drake, J., Rosen, E., Uttaro, J., and L. | |||
| Jalil, "BGP Control Plane for NSH SFC", draft-ietf-bess- | Jalil, "BGP Control Plane for the Network Service Header | |||
| nsh-bgp-control-plane-13 (work in progress), December | in Service Function Chaining", draft-ietf-bess-nsh-bgp- | |||
| 2019. | control-plane-15 (work in progress), June 2020. | |||
| [I-D.ietf-idr-te-lsp-distribution] | [I-D.ietf-idr-te-lsp-distribution] | |||
| Previdi, S., Talaulikar, K., Dong, J., Chen, M., Gredler, | Previdi, S., Talaulikar, K., Dong, J., Chen, M., Gredler, | |||
| H., and J. Tantsura, "Distribution of Traffic Engineering | H., and J. Tantsura, "Distribution of Traffic Engineering | |||
| (TE) Policies and State using BGP-LS", draft-ietf-idr-te- | (TE) Policies and State using BGP-LS", draft-ietf-idr-te- | |||
| lsp-distribution-12 (work in progress), October 2019. | lsp-distribution-13 (work in progress), April 2020. | |||
| [I-D.barth-pce-segment-routing-policy-cp] | [I-D.barth-pce-segment-routing-policy-cp] | |||
| Koldychev, M., Sivabalan, S., Barth, C., Li, C., and H. | Koldychev, M., Sivabalan, S., Barth, C., Peng, S., and H. | |||
| Bidgoli, "PCEP extension to support Segment Routing Policy | Bidgoli, "PCEP extension to support Segment Routing Policy | |||
| Candidate Paths", draft-barth-pce-segment-routing-policy- | Candidate Paths", draft-barth-pce-segment-routing-policy- | |||
| cp-04 (work in progress), October 2019. | cp-06 (work in progress), June 2020. | |||
| [I-D.filsfils-spring-sr-policy-considerations] | [I-D.filsfils-spring-sr-policy-considerations] | |||
| Filsfils, C., Talaulikar, K., Krol, P., Horneffer, M., and | Filsfils, C., Talaulikar, K., Krol, P., Horneffer, M., and | |||
| P. Mattes, "SR Policy Implementation and Deployment | P. Mattes, "SR Policy Implementation and Deployment | |||
| Considerations", draft-filsfils-spring-sr-policy- | Considerations", draft-filsfils-spring-sr-policy- | |||
| considerations-04 (work in progress), October 2019. | considerations-05 (work in progress), April 2020. | |||
| [I-D.ietf-rtgwg-bgp-pic] | [I-D.ietf-rtgwg-bgp-pic] | |||
| Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix | Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix | |||
| Independent Convergence", draft-ietf-rtgwg-bgp-pic-11 | Independent Convergence", draft-ietf-rtgwg-bgp-pic-11 | |||
| (work in progress), February 2020. | (work in progress), February 2020. | |||
| [I-D.ietf-isis-mpls-elc] | ||||
| Xu, X., Kini, S., Psenak, P., Filsfils, C., Litkowski, S., | ||||
| and M. Bocci, "Signaling Entropy Label Capability and | ||||
| Entropy Readable Label Depth Using IS-IS", draft-ietf- | ||||
| isis-mpls-elc-13 (work in progress), May 2020. | ||||
| [I-D.ietf-ospf-mpls-elc] | ||||
| Xu, X., Kini, S., Psenak, P., Filsfils, C., Litkowski, S., | ||||
| and M. Bocci, "Signaling Entropy Label Capability and | ||||
| Entropy Readable Label Depth Using OSPF", draft-ietf-ospf- | ||||
| mpls-elc-15 (work in progress), June 2020. | ||||
| [I-D.ietf-idr-next-hop-capability] | ||||
| Decraene, B., Kompella, K., and W. Henderickx, "BGP Next- | ||||
| Hop dependent capabilities", draft-ietf-idr-next-hop- | ||||
| capability-05 (work in progress), June 2019. | ||||
| [I-D.ietf-spring-segment-routing-central-epe] | ||||
| Filsfils, C., Previdi, S., Dawra, G., Aries, E., and D. | ||||
| Afanasiev, "Segment Routing Centralized BGP Egress Peer | ||||
| Engineering", draft-ietf-spring-segment-routing-central- | ||||
| epe-10 (work in progress), December 2017. | ||||
| [I-D.ietf-idr-long-lived-gr] | ||||
| Uttaro, J., Chen, E., Decraene, B., and J. Scudder, | ||||
| "Support for Long-lived BGP Graceful Restart", draft-ietf- | ||||
| idr-long-lived-gr-00 (work in progress), September 2019. | ||||
| [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of | [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of | |||
| BGP for Routing in Large-Scale Data Centers", RFC 7938, | BGP for Routing in Large-Scale Data Centers", RFC 7938, | |||
| DOI 10.17487/RFC7938, August 2016, | DOI 10.17487/RFC7938, August 2016, | |||
| <https://www.rfc-editor.org/info/rfc7938>. | <https://www.rfc-editor.org/info/rfc7938>. | |||
| [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | |||
| S. Ray, "North-Bound Distribution of Link-State and | S. Ray, "North-Bound Distribution of Link-State and | |||
| Traffic Engineering (TE) Information Using BGP", RFC 7752, | Traffic Engineering (TE) Information Using BGP", RFC 7752, | |||
| DOI 10.17487/RFC7752, March 2016, | DOI 10.17487/RFC7752, March 2016, | |||
| <https://www.rfc-editor.org/info/rfc7752>. | <https://www.rfc-editor.org/info/rfc7752>. | |||
| skipping to change at page 40, line 47 ¶ | skipping to change at page 44, line 47 ¶ | |||
| [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP | [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP | |||
| Authentication Option", RFC 5925, DOI 10.17487/RFC5925, | Authentication Option", RFC 5925, DOI 10.17487/RFC5925, | |||
| June 2010, <https://www.rfc-editor.org/info/rfc5925>. | June 2010, <https://www.rfc-editor.org/info/rfc5925>. | |||
| [RFC8253] Lopez, D., Gonzalez de Dios, O., Wu, Q., and D. Dhody, | [RFC8253] Lopez, D., Gonzalez de Dios, O., Wu, Q., and D. Dhody, | |||
| "PCEPS: Usage of TLS to Provide a Secure Transport for the | "PCEPS: Usage of TLS to Provide a Secure Transport for the | |||
| Path Computation Element Communication Protocol (PCEP)", | Path Computation Element Communication Protocol (PCEP)", | |||
| RFC 8253, DOI 10.17487/RFC8253, October 2017, | RFC 8253, DOI 10.17487/RFC8253, October 2017, | |||
| <https://www.rfc-editor.org/info/rfc8253>. | <https://www.rfc-editor.org/info/rfc8253>. | |||
| [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and | ||||
| L. Yong, "The Use of Entropy Labels in MPLS Forwarding", | ||||
| RFC 6790, DOI 10.17487/RFC6790, November 2012, | ||||
| <https://www.rfc-editor.org/info/rfc6790>. | ||||
| [RFC8662] Kini, S., Kompella, K., Sivabalan, S., Litkowski, S., | ||||
| Shakir, R., and J. Tantsura, "Entropy Label for Source | ||||
| Packet Routing in Networking (SPRING) Tunnels", RFC 8662, | ||||
| DOI 10.17487/RFC8662, December 2019, | ||||
| <https://www.rfc-editor.org/info/rfc8662>. | ||||
| [RFC8491] Tantsura, J., Chunduri, U., Aldrin, S., and L. Ginsberg, | ||||
| "Signaling Maximum SID Depth (MSD) Using IS-IS", RFC 8491, | ||||
| DOI 10.17487/RFC8491, November 2018, | ||||
| <https://www.rfc-editor.org/info/rfc8491>. | ||||
| [RFC8476] Tantsura, J., Chunduri, U., Aldrin, S., and P. Psenak, | ||||
| "Signaling Maximum SID Depth (MSD) Using OSPF", RFC 8476, | ||||
| DOI 10.17487/RFC8476, December 2018, | ||||
| <https://www.rfc-editor.org/info/rfc8476>. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Colin Bookham (editor) | Colin Bookham (editor) | |||
| Nokia | Nokia | |||
| 740 Waterside Drive | 740 Waterside Drive | |||
| Almondsbury, Bristol | Almondsbury, Bristol | |||
| UK | UK | |||
| Email: colin.bookham@nokia.com | Email: colin.bookham@nokia.com | |||
| Andrew Stone | Andrew Stone | |||
| Nokia | Nokia | |||
| skipping to change at page 41, line 27 ¶ | skipping to change at page 46, line 4 ¶ | |||
| Email: andrew.stone@nokia.com | Email: andrew.stone@nokia.com | |||
| Jeff Tantsura | Jeff Tantsura | |||
| Apstra | Apstra | |||
| 333 Middlefield Road #200 | 333 Middlefield Road #200 | |||
| Menlo Park, CA 94025 | Menlo Park, CA 94025 | |||
| USA | USA | |||
| Email: jefftant.ietf@gmail.com | Email: jefftant.ietf@gmail.com | |||
| Muhammad Durrani | Muhammad Durrani | |||
| Equinix Inc | Equinix Inc | |||
| 1188 Arques Ave | 1188 Arques Ave | |||
| Sunnyvale CA | Sunnyvale CA | |||
| USA | USA | |||
| Email: mdurrani@equinix.com | Email: mdurrani@equinix.com | |||
| Bruno Decraene | ||||
| Orange | ||||
| 38-40 Rue du General Leclerc | ||||
| 92794 Issy Moulineaux cedex 9 | ||||
| France | ||||
| Email: bruno.decraene@orange.com | ||||
| End of changes. 83 change blocks. | ||||
| 169 lines changed or deleted | 356 lines changed or added | |||