| < draft-ietf-bess-bgp-multicast-controller-08.txt | draft-ietf-bess-bgp-multicast-controller-09.txt > | |||
|---|---|---|---|---|
| BESS Z. Zhang | BESS Z. Zhang | |||
| Internet-Draft Juniper Networks | Internet-Draft Juniper Networks | |||
| Intended status: Standards Track R. Raszuk | Intended status: Standards Track R. Raszuk | |||
| Expires: October 8, 2022 NTT Network Innovations | Expires: October 13, 2022 NTT Network Innovations | |||
| D. Pacella | D. Pacella | |||
| Verizon | Verizon | |||
| A. Gulko | A. Gulko | |||
| Edward Jones Wealth Management | Edward Jones Wealth Management | |||
| April 6, 2022 | April 11, 2022 | |||
| Controller Based BGP Multicast Signaling | Controller Based BGP Multicast Signaling | |||
| draft-ietf-bess-bgp-multicast-controller-08 | draft-ietf-bess-bgp-multicast-controller-09 | |||
| Abstract | Abstract | |||
| This document specifies a way that one or more centralized | This document specifies a way that one or more centralized | |||
| controllers can use BGP to set up multicast distribution trees | controllers can use BGP to set up multicast distribution trees | |||
| (identified by either IP source/destination address pair, mLDP FEC, | (identified by either IP source/destination address pair, mLDP FEC, | |||
| or SR-P2MP Tree-ID) in a network. Since the controllers calculate | or SR-P2MP Tree-ID) in a network. Since the controllers calculate | |||
| the trees, they can use sophisticated algorithms and constraints to | the trees, they can use sophisticated algorithms and constraints to | |||
| achieve traffic engineering. The controllers directly signal dynamic | achieve traffic engineering. The controllers directly signal dynamic | |||
| replication state to tree nodes, leading to very simple multicast | replication state to tree nodes, leading to very simple multicast | |||
| skipping to change at page 2, line 7 | skipping to change at page 2, line 7 | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on October 8, 2022. | This Internet-Draft will expire on October 13, 2022. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 37 | skipping to change at page 2, line 37 | |||
| 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 3 | 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.2. Resilience . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Resilience . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.3. Signaling . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Signaling . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.4. Label Allocation . . . . . . . . . . . . . . . . . . . . 6 | 1.4. Label Allocation . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.4.1. Using a Common per-tree Label for All Routers . . . . 7 | 1.4.1. Using a Common per-tree Label for All Routers . . . . 7 | |||
| 1.4.2. Upstream-assignment from Controller's Local Label | 1.4.2. Upstream-assignment from Controller's Local Label | |||
| Space . . . . . . . . . . . . . . . . . . . . . . . . 8 | Space . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 1.5. Determining Root/Leaves . . . . . . . . . . . . . . . . . 9 | 1.5. Determining Root/Leaves . . . . . . . . . . . . . . . . . 9 | |||
| 1.5.1. PIM-SSM/Bidir or mLDP . . . . . . . . . . . . . . . . 9 | 1.5.1. PIM-SSM/Bidir or mLDP . . . . . . . . . . . . . . . . 9 | |||
| 1.5.2. PIM ASM . . . . . . . . . . . . . . . . . . . . . . . 9 | 1.5.2. PIM ASM . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 1.6. Multiple Domains . . . . . . . . . . . . . . . . . . . . 9 | 1.6. Multiple Domains . . . . . . . . . . . . . . . . . . . . 10 | |||
| 1.7. SR-P2MP . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 1.7. SR-P2MP . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 2. Alternative to BGP-MVPN . . . . . . . . . . . . . . . . . . . 11 | 2. Alternative to BGP-MVPN . . . . . . . . . . . . . . . . . . . 11 | |||
| 3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 13 | 3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.1. Enhancements to TEA . . . . . . . . . . . . . . . . . . . 13 | 3.1. Enhancements to TEA . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.1.1. Any-Encapsulation Tunnel . . . . . . . . . . . . . . 13 | 3.1.1. Any-Encapsulation Tunnel . . . . . . . . . . . . . . 13 | |||
| 3.1.2. Load-balancing Tunnel . . . . . . . . . . . . . . . . 13 | 3.1.2. Load-balancing Tunnel . . . . . . . . . . . . . . . . 14 | |||
| 3.1.3. Receiving MPLS Label Stack . . . . . . . . . . . . . 14 | 3.1.3. Segment List Tunnel . . . . . . . . . . . . . . . . . 14 | |||
| 3.1.4. RPF Sub-TLV . . . . . . . . . . . . . . . . . . . . . 14 | 3.1.4. Receiving MPLS Label Stack . . . . . . . . . . . . . 14 | |||
| 3.1.5. Tree Label Stack sub-TLV . . . . . . . . . . . . . . 15 | 3.1.5. RPF Sub-TLV . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 3.1.6. Backup Tunnel sub-TLV . . . . . . . . . . . . . . . . 15 | 3.1.6. Tree Label Stack sub-TLV . . . . . . . . . . . . . . 15 | |||
| 3.2. Context Label TLV in BGP-LS Node Attribute . . . . . . . 16 | 3.1.7. Backup Tunnel sub-TLV . . . . . . . . . . . . . . . . 16 | |||
| 3.2. Context Label TLV in BGP-LS Node Attribute . . . . . . . 17 | ||||
| 3.3. Replicate State Route Type . . . . . . . . . . . . . . . 17 | 3.3. Replicate State Route Type . . . . . . . . . . . . . . . 17 | |||
| 3.4. SR P2MP Signaling . . . . . . . . . . . . . . . . . . . . 17 | 3.4. SR P2MP Signaling . . . . . . . . . . . . . . . . . . . . 18 | |||
| 3.4.1. Replication State Route for SR P2MP . . . . . . . . . 18 | 3.4.1. Replication State Route for SR P2MP . . . . . . . . . 18 | |||
| 3.4.2. BGP Community Container for SR P2MP Policy . . . . . 18 | 3.4.2. BGP Community Container for SR P2MP Policy . . . . . 19 | |||
| 3.4.3. Tunnel Encapsulation Attribute for SR-P2MP . . . . . 19 | 3.4.3. Tunnel Encapsulation Attribute . . . . . . . . . . . 20 | |||
| 3.4.3.1. TEA with Tunnel TLVs Being Replication Branches . 19 | ||||
| 3.4.3.2. TEA with a Single SR-P2MP Policy Tunnel . . . . . 20 | ||||
| 3.5. Replication State Route with Label Stack for Tree | 3.5. Replication State Route with Label Stack for Tree | |||
| Identification . . . . . . . . . . . . . . . . . . . . . 20 | Identification . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 4. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 21 | 4. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 22 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 | |||
| 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 22 | 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 8.1. Normative References . . . . . . . . . . . . . . . . . . 22 | 8.1. Normative References . . . . . . . . . . . . . . . . . . 23 | |||
| 8.2. Informative References . . . . . . . . . . . . . . . . . 23 | 8.2. Informative References . . . . . . . . . . . . . . . . . 24 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
| 1. Overview | 1. Overview | |||
| 1.1. Introduction | 1.1. Introduction | |||
| [I-D.ietf-bess-bgp-multicast] describes a way to use BGP as a | [I-D.ietf-bess-bgp-multicast] describes a way to use BGP as a | |||
| replacement signaling for PIM [RFC7761] or mLDP [RFC6388]. The BGP- | replacement signaling for PIM [RFC7761] or mLDP [RFC6388]. The BGP- | |||
| based multicast signaling described there provides a mechanism for | based multicast signaling described there provides a mechanism for | |||
| setting up both (s,g)/(*,g) multicast trees (as PIM does, but | setting up both (s,g)/(*,g) multicast trees (as PIM does, but | |||
| optionally with labels) and labeled (MPLS) multicast tunnels (as mLDP | optionally with labels) and labeled (MPLS) multicast tunnels (as mLDP | |||
| skipping to change at page 3, line 45 | skipping to change at page 3, line 44 | |||
| for more than one flow. In the latter case, the trees are often | for more than one flow. In the latter case, the trees are often | |||
| referred to as "multicast tunnels" or "multipoint tunnels", and | referred to as "multicast tunnels" or "multipoint tunnels", and | |||
| specifically in this document they are mLDP tunnels (except that they | specifically in this document they are mLDP tunnels (except that they | |||
| are set up with BGP signaling). While it actually does not have to | are set up with BGP signaling). While it actually does not have to | |||
| be restricted to mLDP tunnels, mLDP FEC is conveniently borrowed to | be restricted to mLDP tunnels, mLDP FEC is conveniently borrowed to | |||
| identify the tunnel. In the rest of the document, the terms tree and | identify the tunnel. In the rest of the document, the terms tree and | |||
| tunnel are used interchangeably. | tunnel are used interchangeably. | |||
| The trees/tunnels are set up using the "receiver-initiated join" | The trees/tunnels are set up using the "receiver-initiated join" | |||
| technique of PIM/mLDP, hop by hop from downstream routers towards the | technique of PIM/mLDP, hop by hop from downstream routers towards the | |||
| root. The BGP messages are either sent hop by hop between downstream | root. The BGP messages of MCAST-TREE SAFI are either sent hop by hop | |||
| routers and their upstream neighbors, or can be reflected by Route | between downstream routers and their upstream neighbors, or can be | |||
| Reflectors (RRs). | reflected by Route Reflectors (RRs). | |||
| As an alternative to each hop independently determining its upstream | As an alternative to each hop independently determining its upstream | |||
| router and signaling upstream towards the root (following the PIM/mLDP | router and signaling upstream towards the root (following the PIM/mLDP | |||
| model), the entire tree can be calculated by a centralized | model), the entire tree can be calculated by a centralized | |||
| controller, and the signaling can be entirely done from the | controller, and the signaling can be entirely done from the | |||
| controller. For that, some additional procedures and optimizations | controller using the same MCAST-TREE SAFI. For that, some additional | |||
| are specified in this document. | procedures and optimizations are specified in this document. | |||
| [I-D.ietf-bess-bgp-multicast] uses S-PMSI, Leaf, and Source Active | [I-D.ietf-bess-bgp-multicast] uses S-PMSI, Leaf, and Source Active | |||
| Auto-Discovery (A-D) routes because the main procedures and concepts | Auto-Discovery (A-D) routes because the main procedures and concepts | |||
| are borrowed from the BGP-MVPN [RFC6514]. While the same Leaf A-D | are borrowed from the BGP-MVPN [RFC6514]. While the same Leaf A-D | |||
| routes can be used to signal replication state to tree nodes from | routes can be used to signal replication state to tree nodes from | |||
| controllers, this document introduces a new route type "Replication | controllers, this document introduces a new route type "Replication | |||
| State" for the same functionality, so that familiarity with the BGP- | State" for the same functionality, so that familiarity with the BGP- | |||
| MVPN concepts is not required. | MVPN concepts is not required. | |||
| While it is outside the scope of this document, signaling from the | While it is outside the scope of this document, signaling from the | |||
| skipping to change at page 5, line 32 | skipping to change at page 5, line 31 | |||
| the route, which indicates that this router is the target and | the route, which indicates that this router is the target and | |||
| consumer of the route; hence it should not be re-advertised further. | consumer of the route; hence it should not be re-advertised further. | |||
| The route includes the forwarding information in the form of Tunnel | The route includes the forwarding information in the form of Tunnel | |||
| Encapsulation Attributes (TEA) [RFC9012], with enhancements specified | Encapsulation Attributes (TEA) [RFC9012], with enhancements specified | |||
| in this document. | in this document. | |||
| Suppose that for a particular tree, there are two downstream routers | Suppose that for a particular tree, there are two downstream routers | |||
| D1 and D2 for a particular upstream router U. A controller C sends | D1 and D2 for a particular upstream router U. A controller C sends | |||
| one Replication State route to U, with the Tree Node's IP Address | one Replication State route to U, with the Tree Node's IP Address | |||
| field (see Section 3.3) set to U's IP address and the TEA specifying | field (see Section 3.3) set to U's IP address and the TEA specifying | |||
| both downstreams and its upstream (see Section 3.1.4). In | both downstreams and its upstream (see Section 3.1.5). In | |||
| this case, the Originating Router's Address field of the Replication | this case, the Originating Router's Address field of the Replication | |||
| State route is set to the controller's address. Note that for a TEA | State route is set to the controller's address. Note that for a TEA | |||
| attached to a unicast NLRI, only one of the tunnels in a TEA is used | attached to a unicast NLRI, only one of the tunnels in a TEA is used | |||
| for forwarding a particular packet, while all the tunnels in a TEA | for forwarding a particular packet, while all the tunnels in a TEA | |||
| are used to reach multiple endpoints when it is attached to a | are used to reach multiple endpoints when it is attached to a | |||
| multicast NLRI. | multicast NLRI. | |||
| U may need to replicate to many downstream routers, say D1 through | ||||
| D1000. In that case, it may not be possible to encode all those | ||||
| branches in a single TEA, or it may not be optimal to update a large | ||||
| TEA whenever a branch is added or removed. C may then send multiple | ||||
| Replication State routes, each with a different Originating Router's | ||||
| Address field and a different TEA that encodes a subset of the | ||||
| branches. This provides a flexible way to optimize the encoding of a | ||||
| large number of branches and incremental updates of branches. | ||||
| Notice that, in the case of labeled trees, the (x,g), mLDP FEC, or SR- | Notice that, in the case of labeled trees, the (x,g), mLDP FEC, or SR- | |||
| P2MP tree identification (Section 1.7) signaling is actually not | P2MP tree identification (Section 1.7) signaling is actually not | |||
| needed by transit routers but only by the tunnel root/leaves. | needed by transit routers but only by the tunnel root/leaves. | |||
| However, for consistency among the root/leaf/transit nodes, and for | However, for consistency among the root/leaf/transit nodes, and for | |||
| consistency with the hop-by-hop signaling, the same signaling (with | consistency with the hop-by-hop signaling, the same signaling (with | |||
| tree identification encoded in the NLRI) is used for all routers. | tree identification encoded in the NLRI) is used for all routers. | |||
| Nonetheless, a new NLRI route type is defined to encode label/SID | Nonetheless, a new NLRI route type of the MCAST-TREE SAFI is defined | |||
| instead of tree identification in the NLRI, for scenarios where there | to encode label/SID instead of tree identification in the NLRI, for | |||
| is really no need to signal tree identification, e.g. as described in | scenarios where there is really no need to signal tree | |||
| Section 2. On a tunnel root, the tree's binding SID can be encoded | identification, e.g. as described in Section 2. On a tunnel root, | |||
| in the NLRI. | the tree's binding SID can be encoded in the NLRI. | |||
| For a tree node to acknowledge to the controller that it has received | For a tree node to acknowledge to the controller that it has received | |||
| the signaling and installed corresponding forwarding state, it | the signaling and installed corresponding forwarding state, it | |||
| advertises a corresponding Replication State route, with the | advertises a corresponding Replication State route, with the | |||
| Originating Router's IP Address set to itself and with a Route Target | Originating Router's IP Address set to itself and with a Route Target | |||
| to match the controller. For comparison, the tree signaling | to match the controller. For comparison, the tree signaling | |||
| Replication State route from the controller has the Originating | Replication State route from the controller has the Originating | |||
| Router's IP Address set to the controller and the Route Target | Router's IP Address set to the controller and the Route Target | |||
| matching the tree node. The two Replication State routes (for | matching the tree node. The two Replication State routes (for | |||
| controller to signal to a tree node and for a tree node to | controller to signal to a tree node and for a tree node to | |||
| skipping to change at page 8, line 41 | skipping to change at page 9, line 4 | |||
| corresponding to each controller C, a context label that identifies | corresponding to each controller C, a context label that identifies | |||
| the upstream-assigned label space used by that controller. This | the upstream-assigned label space used by that controller. This | |||
| label, call it Lc-D, is communicated by D to C via BGP-LS [RFC7752]. | label, call it Lc-D, is communicated by D to C via BGP-LS [RFC7752]. | |||
| Suppose a controller is setting up unidirectional tree T. It assigns | Suppose a controller is setting up unidirectional tree T. It assigns | |||
| that tree the label Lt, and assigns label Lu to identify router U | that tree the label Lt, and assigns label Lu to identify router U | |||
| which is the upstream of router D on tree T. C needs to tell U: "to | which is the upstream of router D on tree T. C needs to tell U: "to | |||
| send a packet on the given tree/tunnel, one of the things you have to | send a packet on the given tree/tunnel, one of the things you have to | |||
| do is push Lt onto the packet's label stack, then push Lu, then push | do is push Lt onto the packet's label stack, then push Lu, then push | |||
| Lc-D onto the packet's label stack, then unicast the packet to D". | Lc-D onto the packet's label stack, then unicast the packet to D". | |||
| Controller C also needs to inform router D of the correspondence | Controller C also needs to inform router D of the correspondence | |||
| between <Lc-D, Lu, Lt> and tree T. | between <Lc-D, Lu, Lt> and tree T. | |||
| To achieve that, when C sends a Replication State route, for each | To achieve that, when C sends a Replication State route, for each | |||
| tunnel in the TEA, it may include a label stack Sub-TLV [RFC9012], | tunnel in the TEA, it may include a label stack Sub-TLV [RFC9012], | |||
| with the outer label being the context label Lc-D (received by the | with the outer label being the context label Lc-D (received by the | |||
| controller from the corresponding downstream), the next label being | controller from the corresponding downstream), the next label being | |||
| the upstream neighbor label Lu, and the inner label being the label | the upstream neighbor label Lu, and the inner label being the label | |||
| Lt assigned by the controller for the tree. The router receiving the | Lt assigned by the controller for the tree. The router receiving the | |||
| route will use the label stacks to send traffic to its downstreams. | route will use the label stacks to send traffic to its downstreams. | |||
| For C to signal the expected label stack for D to receive traffic | For C to signal the expected label stack for D to receive traffic | |||
| with, we overload a tunnel TLV in the TEA of the Replication State | with, we overload a tunnel TLV in the TEA of the Replication State | |||
| route sent to D - if the tunnel TLV has an RPF sub-TLV | route sent to D - if the tunnel TLV has an RPF sub-TLV | |||
| (Section 3.1.4), then it indicates that this is actually for | (Section 3.1.5), then it indicates that this is actually for | |||
| receiving traffic from the upstream. | receiving traffic from the upstream. | |||
| 1.5. Determining Root/Leaves | 1.5. Determining Root/Leaves | |||
| For the controller to calculate a tree, it needs to determine the | For the controller to calculate a tree, it needs to determine the | |||
| root and leaves of the tree. This may be based on provisioning | root and leaves of the tree. This may be based on provisioning | |||
| (static or dynamically programmed), or based on BGP signaling as | (static or dynamically programmed), or based on BGP signaling as | |||
| described in the following two sections. | described in the following two sections. | |||
| In both of the following cases, the BGP updates are targeted at the | In both of the following cases, the BGP updates are targeted at the | |||
| skipping to change at page 13, line 20 | skipping to change at page 13, line 29 | |||
| data rate is needed, the ingress PE can advertise/withdraw S-PMSI | data rate is needed, the ingress PE can advertise/withdraw S-PMSI | |||
| routes targeted only at the controllers, without PMSI Tunnel | routes targeted only at the controllers, without PMSI Tunnel | |||
| Attribute attached. The controller then updates relevant MCAST-TREE | Attribute attached. The controller then updates relevant MCAST-TREE | |||
| Replication State routes to update C-multicast forwarding states on | Replication State routes to update C-multicast forwarding states on | |||
| PEs to switch to a new tunnel. | PEs to switch to a new tunnel. | |||
| 3. Specification | 3. Specification | |||
| 3.1. Enhancements to TEA | 3.1. Enhancements to TEA | |||
| This document specifies two new Tunnel Types and four new sub-TLVs. | A TEA may encode a list of tunnels. A TEA attached to an MCAST-TREE | |||
| NLRI encodes replication information for a <tree, node> that is | ||||
| identified by the NLRI. Each tunnel in the TEA identifies a branch - | ||||
| either an upstream branch towards the tree root (Section 3.1.5) or a | ||||
| downstream branch towards some leaves. A tunnel in the TEA could | ||||
| have an outer encapsulation (e.g. MPLS label stack) or it could just | ||||
| be a one-hop direct connection for native IP multicast forwarding | ||||
| without any outer encapsulation. | ||||
| This document specifies three new Tunnel Types and four new sub-TLVs. | ||||
| The type codes will be assigned by IANA from the "BGP Tunnel | The type codes will be assigned by IANA from the "BGP Tunnel | |||
| Encapsulation Attribute Tunnel Types". | Encapsulation Attribute Tunnel Types". | |||
| 3.1.1. Any-Encapsulation Tunnel | 3.1.1. Any-Encapsulation Tunnel | |||
| When a multicast packet needs to be sent from an upstream node to a | When a multicast packet needs to be sent from an upstream node to a | |||
| downstream node, it may not matter how it is sent - natively when the | downstream node, it may not matter how it is sent - natively when the | |||
| two nodes are directly connected or tunneled otherwise. In case of | two nodes are directly connected or tunneled otherwise. In case of | |||
| tunneling, it may not matter what kind of tunnel is used - MPLS, GRE, | tunneling, it may not matter what kind of tunnel is used - MPLS, GRE, | |||
| IPinIP, or whatever. | IPinIP, or whatever. | |||
| To support this, an "Any-Encapsulation" tunnel type of value 20 is | To support this, an "Any-Encapsulation" tunnel type of value 20 is | |||
| defined. This tunnel MAY have a Tunnel Endpoint and other Sub-TLVs. | defined. This tunnel MAY have a Tunnel Egress Endpoint and other | |||
| The Tunnel Endpoint Sub-TLV specifies an IP address, which could be | Sub-TLVs. The Tunnel Egress Endpoint Sub-TLV specifies an IP | |||
| any of the following: | address, which could be any of the following: | |||
| o An interface's local address - when a packet needs to be sent out of | o An interface's local address - when a packet needs to be sent out of | |||
| the corresponding interface natively. On a LAN, a multicast MAC | the corresponding interface natively. On a LAN, a multicast MAC | |||
| address MUST be used. | address MUST be used. | |||
| o A directly connected neighbor's interface address - when a packet | o A directly connected neighbor's interface address - when a packet | |||
| needs to be unicast to the address natively. | needs to be unicast to the address natively. | |||
| o An address that is not directly connected - when a packet needs to | o An address that is not directly connected - when a packet needs to | |||
| be tunneled to the address (any tunnel type/instance can be used). | be tunneled to the address (any tunnel type/instance can be used). | |||
| 3.1.2. Load-balancing Tunnel | 3.1.2. Load-balancing Tunnel | |||
| Consider that a multicast packet needs to be sent to a downstream | Consider that a multicast packet needs to be sent to a downstream | |||
| node, which could be reached via four paths P1~P4. If it does not | node, which could be reached via four paths P1~P4. If it does not | |||
| matter which path is taken, an "Any-Encapsulation" tunnel with the | matter which path is taken, an "Any-Encapsulation" tunnel with the | |||
| Tunnel Endpoint Sub-TLV specifying the downstream node's loopback | Tunnel Egress Endpoint Sub-TLV specifying the downstream node's | |||
| address works well. If the controller wants to specify that only | loopback address works well. If the controller wants to specify that | |||
| P1~P2 should be used, then a "Load-balancing" tunnel needs to be | only P1~P2 should be used, then a "Load-balancing" tunnel needs to be | |||
| used, listing P1 and P2 as member tunnels of the "Load-balancing" | used, listing P1 and P2 as member tunnels of the "Load-balancing" | |||
| tunnel. | tunnel. | |||
| A load-balancing tunnel has one "Member Tunnels" Sub-TLV defined in | A load-balancing tunnel has one "Member Tunnels" Sub-TLV defined in | |||
| this document. The Sub-TLV is a list of tunnels, each specifying a | this document. The Sub-TLV is a list of tunnels, each specifying a | |||
| way to reach the downstream. A packet will be sent out of one of the | way to reach the downstream. A packet will be sent out of one of the | |||
| tunnels listed in the Member Tunnels Sub-TLV of the load-balancing | tunnels listed in the Member Tunnels Sub-TLV of the load-balancing | |||
| tunnel. | tunnel. | |||
| 3.1.3. Receiving MPLS Label Stack | 3.1.3. Segment List Tunnel | |||
| A Segment List tunnel has a Segment List sub-TLV. The encoding of | ||||
| the sub-TLV is as specified in Section 2.4.4 of | ||||
| [I-D.ietf-idr-segment-routing-te-policy]. An example use of a | ||||
| Segment List tunnel is provided in Section 3.4.3. | ||||
| 3.1.4. Receiving MPLS Label Stack | ||||
| While [I-D.ietf-bess-bgp-multicast] uses S-PMSI A-D routes to signal | While [I-D.ietf-bess-bgp-multicast] uses S-PMSI A-D routes to signal | |||
| forwarding information for MP2MP upstream traffic, when controller | forwarding information for MP2MP upstream traffic, when controller | |||
| signaling is used, a single Replication State route is used for both | signaling is used, a single Replication State route is used for both | |||
| upstream and downstream traffic. Since different upstream and | upstream and downstream traffic. Since different upstream and | |||
| downstream labels need to be used, a new "Receiving MPLS Label Stack" | downstream labels need to be used, a new "Receiving MPLS Label Stack" | |||
| of type TBD is added as a tunnel sub-TLV in addition to the existing | of type TBD is added as a tunnel sub-TLV in addition to the existing | |||
| MPLS Label Stack sub-TLV. Other than the type difference, the two are | MPLS Label Stack sub-TLV. Other than the type difference, the two are | |||
| encoded the same way. | encoded the same way. | |||
| skipping to change at page 14, line 37 | skipping to change at page 15, line 16 | |||
| tunnel in the TEA of Replication State route for an MP2MP tunnel to | tunnel in the TEA of Replication State route for an MP2MP tunnel to | |||
| specify the forwarding information for upstream traffic from the | specify the forwarding information for upstream traffic from the | |||
| corresponding downstream node. A label stack instead of a single | corresponding downstream node. A label stack instead of a single | |||
| label is used because of the need for neighbor based RPF check, as | label is used because of the need for neighbor based RPF check, as | |||
| further explained in the following section. | further explained in the following section. | |||
| The Receiving MPLS Label Stack sub-TLV is also used for downstream | The Receiving MPLS Label Stack sub-TLV is also used for downstream | |||
| traffic from the upstream for both P2MP and MP2MP, as specified | traffic from the upstream for both P2MP and MP2MP, as specified | |||
| below. | below. | |||
| 3.1.4. RPF Sub-TLV | 3.1.5. RPF Sub-TLV | |||
| The RPF sub-TLV is of type 124 allocated by IANA and has a one-octet | The RPF sub-TLV is of type 124 allocated by IANA and has a one-octet | |||
| length. The length is 0 currently, but if necessary in the future, | length. The length is 0 currently, but if necessary in the future, | |||
| sub-sub-TLVs could be placed in its value part. If the RPF sub-TLV | sub-sub-TLVs could be placed in its value part. If the RPF sub-TLV | |||
| appears in a tunnel, it indicates that the "tunnel" is for the | appears in a tunnel, it indicates that the "tunnel" is for the | |||
| upstream node instead of a downstream node. | upstream node instead of a downstream node. | |||
| In case of MPLS, the tunnel contains a Receiving MPLS Label Stack | In case of MPLS, the tunnel contains a Receiving MPLS Label Stack | |||
| sub-TLV for downstream traffic from the upstream node, and in case of | sub-TLV for downstream traffic from the upstream node, and in case of | |||
| MP2MP it also contains a regular MPLS Label Stack sub-TLV for | MP2MP it also contains a regular MPLS Label Stack sub-TLV for | |||
| upstream traffic to the upstream node. | upstream traffic to the upstream node. | |||
| The innermost label in the Receiving MPLS Label Stack is the | The innermost label in the Receiving MPLS Label Stack is the | |||
| incoming label identifying the tree (for comparison the innermost | incoming label identifying the tree (for comparison the innermost | |||
| label for a regular MPLS Label Stack is the outgoing label). If the | label for a regular MPLS Label Stack is the outgoing label). If the | |||
| Receiving MPLS Label Stack sub-TLV has more than one label, the | Receiving MPLS Label Stack sub-TLV has more than one label, the | |||
| second innermost label in the stack identifies the expected upstream | second innermost label in the stack identifies the expected upstream | |||
| neighbor and explicit RPF checking needs to be set up for the tree | neighbor and explicit RPF checking needs to be set up for the tree | |||
| label accordingly. | label accordingly. | |||
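
The reading of a Receiving MPLS Label Stack described in the RPF Sub-TLV section can be sketched as follows. This is a minimal illustration, not text from either draft revision: the function name and the convention that the list is ordered outermost label first are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def interpret_receiving_stack(labels: List[int]) -> Tuple[int, Optional[int]]:
    """Split a Receiving MPLS Label Stack into (tree label, neighbor label).

    Assumption: `labels` is ordered outermost first, so the innermost
    label is the last element of the list.
    """
    if not labels:
        raise ValueError("a Receiving MPLS Label Stack needs at least one label")
    # The innermost label is the incoming label that identifies the tree.
    tree_label = labels[-1]
    # With more than one label, the second innermost label identifies the
    # expected upstream neighbor, which calls for an explicit RPF check.
    neighbor_label = labels[-2] if len(labels) > 1 else None
    return tree_label, neighbor_label
```

A result whose second element is not None indicates that explicit RPF checking needs to be set up for the tree label.
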
| 3.1.5. Tree Label Stack sub-TLV | 3.1.6. Tree Label Stack sub-TLV | |||
| The MPLS Label Stack sub-TLV can be used to specify the complete | The MPLS Label Stack sub-TLV can be used to specify the complete | |||
| label stack used to send traffic, with the stack including both a | label stack used to send traffic, with the stack including both a | |||
| transport label (stack) and label(s) that identify the (tree, | transport label (stack) and label(s) that identify the (tree, | |||
| neighbor) to the downstream node. There are cases where the | neighbor) to the downstream node. There are cases where the | |||
| controller only wants to specify the tree-identifying labels but | controller only wants to specify the tree-identifying labels but | |||
| leave the transport details to the router itself. For example, the | leave the transport details to the router itself. For example, the | |||
| router could locally determine a transport label (stack) and combine | router could locally determine a transport label (stack) and combine | |||
| with the tree-identifying labels signaled from the controller to get | with the tree-identifying labels signaled from the controller to get | |||
| the complete outgoing label stack. | the complete outgoing label stack. | |||
| For that purpose, a new Tree Label Stack sub-TLV of type 125 is | For that purpose, a new Tree Label Stack sub-TLV of type 125 is | |||
| defined, with a one-octet length field. The value field contains a | defined, with a one-octet length field. It MAY appear in an Any- | |||
| label stack with the same encoding as the value part of the MPLS Label | Encapsulation tunnel. The value field contains a label stack with | |||
| Stack sub-TLV, but with a different type. A stack is specified | the same encoding as the value part of the MPLS Label Stack sub-TLV, but | |||
| because it may take up to three labels (see Section 1.4): | with a different type. A stack is specified because it may take up | |||
| to three labels (see Section 1.4): | ||||
| o If different nodes use different labels (allocated from the common | o If different nodes use different labels (allocated from the common | |||
| SRGB or the node's SRLB) for a (tree, neighbor) tuple, only a | SRGB or the node's SRLB) for a (tree, neighbor) tuple, only a | |||
| single label is in the stack. This is similar to the current mLDP | single label is in the stack. This is similar to the current mLDP | |||
| hop-by-hop signaling case. | hop-by-hop signaling case. | |||
| o If different nodes use the same tree label, then an additional | o If different nodes use the same tree label, then an additional | |||
| neighbor-identifying label is needed in front of the tree label. | neighbor-identifying label is needed in front of the tree label. | |||
| o For the previous bullet, if the neighbor-identifying label is | o For the previous bullet, if the neighbor-identifying label is | |||
| allocated from the controller's local label space, then an | allocated from the controller's local label space, then an | |||
| additional context label is needed in front of the neighbor label. | additional context label is needed in front of the neighbor label. | |||
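
The three Tree Label Stack cases listed in the Tree Label Stack sub-TLV section, and the way a router might combine them with a locally chosen transport stack, can be sketched as below. The function names, the outermost-first list ordering, and the label values in the usage note are hypothetical; the drafts define only the sub-TLV, not this API.

```python
from typing import List, Optional, Sequence


def tree_label_stack(tree_label: int,
                     neighbor_label: Optional[int] = None,
                     context_label: Optional[int] = None) -> List[int]:
    """Build the one-, two-, or three-label tree-identifying stack.

    Assumption: the returned list is ordered outermost label first.
    """
    if context_label is not None and neighbor_label is None:
        # A context label only qualifies a neighbor-identifying label.
        raise ValueError("context label requires a neighbor label")
    stack: List[int] = []
    if context_label is not None:
        stack.append(context_label)    # in front of the neighbor label
    if neighbor_label is not None:
        stack.append(neighbor_label)   # in front of the tree label
    stack.append(tree_label)
    return stack


def outgoing_label_stack(transport: Sequence[int],
                         tree_stack: Sequence[int]) -> List[int]:
    """Prepend the router's locally determined transport label (stack)."""
    return list(transport) + list(tree_stack)
```

For example, `outgoing_label_stack([16005], tree_label_stack(1000, neighbor_label=2000))` yields a stack with the transport label outermost and the tree label innermost.
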
| 3.1.6. Backup Tunnel sub-TLV | 3.1.7. Backup Tunnel sub-TLV | |||
| The Backup Tunnel sub-TLV is used to specify the backup paths for the | The Backup Tunnel sub-TLV is used to specify the backup paths for an | |||
| tunnel. The length is two-octet. The value part encodes a one-octet | Any-Encapsulation or Segment List tunnel. The length is two-octet. | |||
| flags field and a variable length Tunnel Encapsulation Attribute. If | The value part encodes a one-octet flags field and a variable length | |||
| the tunnel goes down, traffic that is normally sent out of the tunnel | Tunnel Encapsulation Attribute. If the tunnel goes down, traffic | |||
| is fast rerouted to the tunnels listed in the encoded TEA. | that is normally sent out of the tunnel is fast rerouted to the | |||
| tunnels listed in the encoded TEA. | ||||
| +--------------------------------+ | +--------------------------------+ | |||
| | Sub-TLV Type (1 Octet, TBD) | | | Sub-TLV Type (1 Octet, TBD) | | |||
| +--------------------------------+ | +--------------------------------+ | |||
| | Sub-TLV Length (2 Octets) | | | Sub-TLV Length (2 Octets) | | |||
| +--------------------------------+ | +--------------------------------+ | |||
| | P | rest of 1 Octet Flags | | | P | rest of 1 Octet Flags | | |||
| +--------------------------------+ | +--------------------------------+ | |||
| | Backup TEA (variable length) | | | Backup TEA (variable length) | | |||
| +--------------------------------+ | +--------------------------------+ | |||
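
The Backup Tunnel sub-TLV layout in the diagram for that section can be illustrated with a small encoder and decoder. This is a sketch under stated assumptions: the sub-TLV type is still TBD in the draft, so the caller supplies a placeholder value; the P flag is assumed to be the most significant bit of the flags octet; and the two-octet length is assumed to cover the flags octet plus the backup TEA. The backup TEA is treated as opaque bytes.

```python
import struct
from typing import Tuple

# Assumption: P is the most significant bit of the one-octet flags field.
P_FLAG = 0x80


def encode_backup_tunnel(sub_tlv_type: int, p_flag: bool, backup_tea: bytes) -> bytes:
    """Encode type (1 octet), length (2 octets), flags (1 octet), backup TEA."""
    flags = P_FLAG if p_flag else 0
    value = bytes([flags]) + backup_tea
    # "!BH" packs a 1-octet type and a 2-octet length in network byte order.
    return struct.pack("!BH", sub_tlv_type, len(value)) + value


def decode_backup_tunnel(data: bytes) -> Tuple[int, bool, bytes]:
    """Return (sub-TLV type, P flag, backup TEA bytes)."""
    if len(data) < 3:
        raise ValueError("truncated Backup Tunnel sub-TLV header")
    sub_tlv_type, length = struct.unpack_from("!BH", data, 0)
    value = data[3:3 + length]
    if length < 1 or len(value) != length:
        raise ValueError("truncated Backup Tunnel sub-TLV value")
    return sub_tlv_type, bool(value[0] & P_FLAG), value[1:]
```

The type value passed in any call is a stand-in until IANA assigns the real code point.
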
| skipping to change at page 19, line 36 ¶ | skipping to change at page 20, line 36 ¶ | |||
| The root receives one Replication State route for each Candidate Path | The root receives one Replication State route for each Candidate Path | |||
| of the policy. Only one of the routes needs to, though more than one | of the policy. Only one of the routes needs to, though more than one | |||
| MAY include the above listed optional Atom TLVs in the SR P2MP Policy | MAY include the above listed optional Atom TLVs in the SR P2MP Policy | |||
| BCC. | BCC. | |||
| Alternatively, an additional route type can be used to carry policy | Alternatively, an additional route type can be used to carry policy | |||
| information instead. Details/decision to be specified in a future | information instead. Details/decision to be specified in a future | |||
| revision. | revision. | |||
| 3.4.3. Tunnel Encapsulation Attribute for SR-P2MP | 3.4.3. Tunnel Encapsulation Attribute | |||
| For SR-P2MP, there are two methods of encoding forwarding information | ||||
| in the TEA, as described below. | ||||
| 3.4.3.1. TEA with Tunnel TLVs Being Replication Branches | ||||
| In this method, a TEA with tunnels being replication branches as | ||||
| specified in earlier sections can be used just as in non SR-P2MP | ||||
| cases. | ||||
| Additionally, a replication branch can also be encoded as a segment | ||||
| list, with a "Segment List" tunnel type. The tunnel has a Segment | ||||
| List sub-TLV as specified in Section 2.4.4 of | ||||
| [I-D.ietf-idr-segment-routing-te-policy]. | ||||
| For a "Segment List" tunnel, the last segment in the segment list | ||||
| represents the SID of the tree. When it is without the RPF sub-TLV, | ||||
| the previous segments in the list steer traffic to the downstream | ||||
| node, and the segment before the last one MAY also be a binding SID | ||||
| for another P2MP tunnel, meaning that the replication branch | ||||
| represented by this "Segment List" is actually a P2MP tunnel to a set | ||||
| of downstream nodes. | ||||
| 3.4.3.2. TEA with a Single SR-P2MP Policy Tunnel | The TEA attached to a Replication State route for SR-P2MP encodes | |||
| tunnels as specified in earlier sections. A tunnel could be an Any- | ||||
| Encapsulation tunnel with MPLS Label Stack sub-TLV or Receiving MPLS | ||||
| Label Stack sub-TLV (in case of SR-MPLS), a Segment List tunnel, or a | ||||
| Load-balancing tunnel. | ||||
| Alternatively, a TEA with a single SR-P2MP Policy tunnel type similar | For a Segment List tunnel in this context, the last segment in the | |||
| to the SR Policy tunnel type can be used. The details are specified | segment list represents the SID of the tree. When it is without the | |||
| in [I-D.hb-idr-sr-p2mp-policy] but may be moved here depending on WG | RPF sub-TLV, the previous segments in the list steer traffic to the | |||
| consensus. | downstream node, and the segment before the last one MAY also be a | |||
| binding SID for another P2MP tunnel, meaning that the replication | ||||
| branch represented by this "Segment List" is actually a P2MP tunnel | ||||
| to a set of downstream nodes. | ||||
| 3.5. Replication State Route with Label Stack for Tree Identification | 3.5. Replication State Route with Label Stack for Tree Identification | |||
| As described in Section 1.3, tree label instead of tree | As described in Section 1.3, tree label instead of tree | |||
| identification could be encoded in the NLRI to identify the tree in | identification could be encoded in the NLRI to identify the tree in | |||
| the control plane as well as in the forwarding plane. For that a new | the control plane as well as in the forwarding plane. For that a new | |||
| Tree Type of 2 is used and the Replication State route has the | Tree Type of 2 is used and the Replication State route has the | |||
| following format: | following format: | |||
| +-------------------------------------+ | +-------------------------------------+ | |||
| skipping to change at page 23, line 32 ¶ | skipping to change at page 24, line 27 ¶ | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder, | [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder, | |||
| "The BGP Tunnel Encapsulation Attribute", RFC 9012, | "The BGP Tunnel Encapsulation Attribute", RFC 9012, | |||
| DOI 10.17487/RFC9012, April 2021, | DOI 10.17487/RFC9012, April 2021, | |||
| <https://www.rfc-editor.org/info/rfc9012>. | <https://www.rfc-editor.org/info/rfc9012>. | |||
| 8.2. Informative References | 8.2. Informative References | |||
| [I-D.hb-idr-sr-p2mp-policy] | ||||
| Bidgoli, H., Voyer, D., Stone, A., Parekh, R., Krier, S., | ||||
| and A. Venkateswaran, "Advertising p2mp policies in BGP", | ||||
| draft-hb-idr-sr-p2mp-policy-04 (work in progress), October | ||||
| 2021. | ||||
| [RFC6388] Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B. | [RFC6388] Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B. | |||
| Thomas, "Label Distribution Protocol Extensions for Point- | Thomas, "Label Distribution Protocol Extensions for Point- | |||
| to-Multipoint and Multipoint-to-Multipoint Label Switched | to-Multipoint and Multipoint-to-Multipoint Label Switched | |||
| Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011, | Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011, | |||
| <https://www.rfc-editor.org/info/rfc6388>. | <https://www.rfc-editor.org/info/rfc6388>. | |||
| [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ | [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ | |||
| BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February | BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February | |||
| 2012, <https://www.rfc-editor.org/info/rfc6513>. | 2012, <https://www.rfc-editor.org/info/rfc6513>. | |||
| End of changes. 30 change blocks. | ||||
| 90 lines changed or deleted | 96 lines changed or added | |||