< draft-ietf-bess-bgp-multicast-controller-08.txt   draft-ietf-bess-bgp-multicast-controller-09.txt >
BESS Z. Zhang BESS Z. Zhang
Internet-Draft Juniper Networks Internet-Draft Juniper Networks
Intended status: Standards Track R. Raszuk Intended status: Standards Track R. Raszuk
Expires: October 8, 2022 NTT Network Innovations Expires: October 13, 2022 NTT Network Innovations
D. Pacella D. Pacella
Verizon Verizon
A. Gulko A. Gulko
Edward Jones Wealth Management Edward Jones Wealth Management
April 6, 2022 April 11, 2022
Controller Based BGP Multicast Signaling Controller Based BGP Multicast Signaling
draft-ietf-bess-bgp-multicast-controller-08 draft-ietf-bess-bgp-multicast-controller-09
Abstract Abstract
This document specifies a way that one or more centralized This document specifies a way that one or more centralized
controllers can use BGP to set up multicast distribution trees controllers can use BGP to set up multicast distribution trees
(identified by either IP source/destination address pair, mLDP FEC, (identified by either IP source/destination address pair, mLDP FEC,
or SR-P2MP Tree-ID) in a network. Since the controllers calculate or SR-P2MP Tree-ID) in a network. Since the controllers calculate
the trees, they can use sophisticated algorithms and constraints to the trees, they can use sophisticated algorithms and constraints to
achieve traffic engineering. The controllers directly signal dynamic achieve traffic engineering. The controllers directly signal dynamic
replication state to tree nodes, leading to very simple multicast replication state to tree nodes, leading to very simple multicast
skipping to change at page 2, line 7 skipping to change at page 2, line 7
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 8, 2022. This Internet-Draft will expire on October 13, 2022.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 37 skipping to change at page 2, line 37
1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 3
1.2. Resilience . . . . . . . . . . . . . . . . . . . . . . . 4 1.2. Resilience . . . . . . . . . . . . . . . . . . . . . . . 4
1.3. Signaling . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3. Signaling . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4. Label Allocation . . . . . . . . . . . . . . . . . . . . 6 1.4. Label Allocation . . . . . . . . . . . . . . . . . . . . 6
1.4.1. Using a Common per-tree Label for All Routers . . . . 7 1.4.1. Using a Common per-tree Label for All Routers . . . . 7
1.4.2. Upstream-assignment from Controller's Local Label 1.4.2. Upstream-assignment from Controller's Local Label
Space . . . . . . . . . . . . . . . . . . . . . . . . 8 Space . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5. Determining Root/Leaves . . . . . . . . . . . . . . . . . 9 1.5. Determining Root/Leaves . . . . . . . . . . . . . . . . . 9
1.5.1. PIM-SSM/Bidir or mLDP . . . . . . . . . . . . . . . . 9 1.5.1. PIM-SSM/Bidir or mLDP . . . . . . . . . . . . . . . . 9
1.5.2. PIM ASM . . . . . . . . . . . . . . . . . . . . . . . 9 1.5.2. PIM ASM . . . . . . . . . . . . . . . . . . . . . . . 9
1.6. Multiple Domains . . . . . . . . . . . . . . . . . . . . 9 1.6. Multiple Domains . . . . . . . . . . . . . . . . . . . . 10
1.7. SR-P2MP . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.7. SR-P2MP . . . . . . . . . . . . . . . . . . . . . . . . . 11
2. Alternative to BGP-MVPN . . . . . . . . . . . . . . . . . . . 11 2. Alternative to BGP-MVPN . . . . . . . . . . . . . . . . . . . 11
3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 13 3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1. Enhancements to TEA . . . . . . . . . . . . . . . . . . . 13 3.1. Enhancements to TEA . . . . . . . . . . . . . . . . . . . 13
3.1.1. Any-Encapsulation Tunnel . . . . . . . . . . . . . . 13 3.1.1. Any-Encapsulation Tunnel . . . . . . . . . . . . . . 13
3.1.2. Load-balancing Tunnel . . . . . . . . . . . . . . . . 13 3.1.2. Load-balancing Tunnel . . . . . . . . . . . . . . . . 14
3.1.3. Receiving MPLS Label Stack . . . . . . . . . . . . . 14 3.1.3. Segment List Tunnel . . . . . . . . . . . . . . . . . 14
3.1.4. RPF Sub-TLV . . . . . . . . . . . . . . . . . . . . . 14 3.1.4. Receiving MPLS Label Stack . . . . . . . . . . . . . 14
3.1.5. Tree Label Stack sub-TLV . . . . . . . . . . . . . . 15 3.1.5. RPF Sub-TLV . . . . . . . . . . . . . . . . . . . . . 15
3.1.6. Backup Tunnel sub-TLV . . . . . . . . . . . . . . . . 15 3.1.6. Tree Label Stack sub-TLV . . . . . . . . . . . . . . 15
3.2. Context Label TLV in BGP-LS Node Attribute . . . . . . . 16 3.1.7. Backup Tunnel sub-TLV . . . . . . . . . . . . . . . . 16
3.2. Context Label TLV in BGP-LS Node Attribute . . . . . . . 17
3.3. Replicate State Route Type . . . . . . . . . . . . . . . 17 3.3. Replicate State Route Type . . . . . . . . . . . . . . . 17
3.4. SR P2MP Signaling . . . . . . . . . . . . . . . . . . . . 17 3.4. SR P2MP Signaling . . . . . . . . . . . . . . . . . . . . 18
3.4.1. Replication State Route for SR P2MP . . . . . . . . . 18 3.4.1. Replication State Route for SR P2MP . . . . . . . . . 18
3.4.2. BGP Community Container for SR P2MP Policy . . . . . 18 3.4.2. BGP Community Container for SR P2MP Policy . . . . . 19
3.4.3. Tunnel Encapsulation Attribute for SR-P2MP . . . . . 19 3.4.3. Tunnel Encapsulation Attribute . . . . . . . . . . . 20
3.4.3.1. TEA with Tunnel TLVs Being Replication Branches . 19
3.4.3.2. TEA with a Single SR-P2MP Policy Tunnel . . . . . 20
3.5. Replication State Route with Label Stack for Tree 3.5. Replication State Route with Label Stack for Tree
Identification . . . . . . . . . . . . . . . . . . . . . 20 Identification . . . . . . . . . . . . . . . . . . . . . 21
4. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 21 4. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 21
5. Security Considerations . . . . . . . . . . . . . . . . . . . 21 5. Security Considerations . . . . . . . . . . . . . . . . . . . 22
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 22 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 23
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 23
8.1. Normative References . . . . . . . . . . . . . . . . . . 22 8.1. Normative References . . . . . . . . . . . . . . . . . . 23
8.2. Informative References . . . . . . . . . . . . . . . . . 23 8.2. Informative References . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 25
1. Overview 1. Overview
1.1. Introduction 1.1. Introduction
[I-D.ietf-bess-bgp-multicast] describes a way to use BGP as a [I-D.ietf-bess-bgp-multicast] describes a way to use BGP as a
replacement signaling for PIM [RFC7761] or mLDP [RFC6388]. The BGP- replacement signaling for PIM [RFC7761] or mLDP [RFC6388]. The BGP-
based multicast signaling described there provides a mechanism for based multicast signaling described there provides a mechanism for
setting up both (s,g)/(*,g) multicast trees (as PIM does, but setting up both (s,g)/(*,g) multicast trees (as PIM does, but
optionally with labels) and labeled (MPLS) multicast tunnels (as mLDP optionally with labels) and labeled (MPLS) multicast tunnels (as mLDP
skipping to change at page 3, line 45 skipping to change at page 3, line 44
for more than one flow. In the latter case, the trees are often for more than one flow. In the latter case, the trees are often
referred to as "multicast tunnels" or "multipoint tunnels", and referred to as "multicast tunnels" or "multipoint tunnels", and
specifically in this document they are mLDP tunnels (except that they specifically in this document they are mLDP tunnels (except that they
are set up with BGP signaling). While it actually does not have to are set up with BGP signaling). While it actually does not have to
be restricted to mLDP tunnels, mLDP FEC is conveniently borrowed to be restricted to mLDP tunnels, mLDP FEC is conveniently borrowed to
identify the tunnel. In the rest of the document, the terms tree and identify the tunnel. In the rest of the document, the terms tree and
tunnel are used interchangeably. tunnel are used interchangeably.
The trees/tunnels are set up using the "receiver-initiated join" The trees/tunnels are set up using the "receiver-initiated join"
technique of PIM/mLDP, hop by hop from downstream routers towards the technique of PIM/mLDP, hop by hop from downstream routers towards the
root. The BGP messages are either sent hop by hop between downstream root. The BGP messages of MCAST-TREE SAFI are either sent hop by hop
routers and their upstream neighbors, or can be reflected by Route between downstream routers and their upstream neighbors, or can be
Reflectors (RRs). reflected by Route Reflectors (RRs).
As an alternative to each hop independently determining its upstream As an alternative to each hop independently determining its upstream
router and signaling upstream towards the root (following PIM/mLDP router and signaling upstream towards the root (following PIM/mLDP
model), the entire tree can be calculated by a centralized model), the entire tree can be calculated by a centralized
controller, and the signaling can be entirely done from the controller, and the signaling can be entirely done from the
controller. For that, some additional procedures and optimizations controller using the same MCAST-TREE SAFI. For that, some additional
are specified in this document. procedures and optimizations are specified in this document.
[I-D.ietf-bess-bgp-multicast] uses S-PMSI, Leaf, and Source Active [I-D.ietf-bess-bgp-multicast] uses S-PMSI, Leaf, and Source Active
Auto-Discovery (A-D) routes because the main procedures and concepts Auto-Discovery (A-D) routes because the main procedures and concepts
are borrowed from the BGP-MVPN [RFC6514]. While the same Leaf A-D are borrowed from the BGP-MVPN [RFC6514]. While the same Leaf A-D
routes can be used to signal replication state to tree nodes from routes can be used to signal replication state to tree nodes from
controllers, this document introduces a new route type "Replication controllers, this document introduces a new route type "Replication
State" for the same functionality, so that familiarity with the BGP- State" for the same functionality, so that familiarity with the BGP-
MVPN concepts is not required. MVPN concepts is not required.
While it is outside the scope of this document, signaling from the While it is outside the scope of this document, signaling from the
skipping to change at page 5, line 32 skipping to change at page 5, line 31
the route, which indicates that this router is the target and the route, which indicates that this router is the target and
consumer of the route hence it should not be re-advertised further. consumer of the route hence it should not be re-advertised further.
The route includes the forwarding information in the form of Tunnel The route includes the forwarding information in the form of Tunnel
Encapsulation Attributes (TEA) [RFC9012], with enhancements specified Encapsulation Attributes (TEA) [RFC9012], with enhancements specified
in this document. in this document.
Suppose that for a particular tree, there are two downstream routers Suppose that for a particular tree, there are two downstream routers
D1 and D2 for a particular upstream router U. A controller C sends D1 and D2 for a particular upstream router U. A controller C sends
one Replication State route to U, with the Tree Node's IP Address one Replication State route to U, with the Tree Node's IP Address
field (see Section 3.3) set to U's IP address and the TEA specifying field (see Section 3.3) set to U's IP address and the TEA specifying
both the two downstreams and its upstream (see Section 3.1.4). In both the two downstreams and its upstream (see Section 3.1.5). In
this case, the Originating Router's Address field of the Replication this case, the Originating Router's Address field of the Replication
State route is set to the controller's address. Note that for a TEA State route is set to the controller's address. Note that for a TEA
attached to a unicast NLRI, only one of the tunnels in a TEA is used attached to a unicast NLRI, only one of the tunnels in a TEA is used
for forwarding a particular packet, while all the tunnels in a TEA for forwarding a particular packet, while all the tunnels in a TEA
are used to reach multiple endpoints when it is attached to a are used to reach multiple endpoints when it is attached to a
multicast NLRI. multicast NLRI.
It could be that U may need to replicate to many downstream routers,
say D1 through D1000. In that case, it may not be possible to encode
all those branches in a single TEA, or may not be optimal to update a
large TEA when a branch is added/removed. In that case, C may send
multiple Replication State routes, each with a different Originating
Router's Address field and a different TEA that encodes a subset of
the branches. This provides a flexible way to optimize the encoding
of a large number of branches and incremental updates of branches.
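The splitting of branches across several Replication State routes can be sketched as follows. This is only an illustration: the `ReplicationStateRoute` class, the `split_branches` helper, the per-route limit, and the list of Originating Router's Address values supplied by the caller are all hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicationStateRoute:
    tree_node: str            # Tree Node's IP Address field (router U)
    originating_router: str   # Originating Router's Address field
    branches: List[str]       # subset of branches encoded in this route's TEA


def split_branches(tree_node: str, originators: List[str],
                   branches: List[str], max_per_route: int) -> List[ReplicationStateRoute]:
    """Spread branches over several routes, one distinct originator per route."""
    if max_per_route <= 0:
        raise ValueError("max_per_route must be positive")
    chunks = [branches[i:i + max_per_route]
              for i in range(0, len(branches), max_per_route)]
    if len(chunks) > len(originators):
        raise ValueError("not enough distinct Originating Router's Addresses")
    return [ReplicationStateRoute(tree_node, orig, chunk)
            for orig, chunk in zip(originators, chunks)]


# Hypothetical example: U replicates to D1 through D1000, at most 400 per route.
routes = split_branches(
    "192.0.2.1",
    ["198.51.100.1", "198.51.100.2", "198.51.100.3"],
    [f"D{i}" for i in range(1, 1001)],
    400,
)
```

With this arrangement, adding or removing one branch only requires re-advertising the route whose subset changed.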
Notice that, in case of labeled trees, the (x,g), mLDP FEC, or SR- Notice that, in case of labeled trees, the (x,g), mLDP FEC, or SR-
P2MP tree identification (Section 1.7) signaling is actually not P2MP tree identification (Section 1.7) signaling is actually not
needed by transit routers but only by the tunnel root/leaves. needed by transit routers but only by the tunnel root/leaves.
However, for consistency among the root/leaf/transit nodes, and for However, for consistency among the root/leaf/transit nodes, and for
consistency with the hop-by-hop signaling, the same signaling (with consistency with the hop-by-hop signaling, the same signaling (with
tree identification encoded in the NLRI) is used for all routers. tree identification encoded in the NLRI) is used for all routers.
Nonetheless, a new NLRI route type is defined to encode label/SID Nonetheless, a new NLRI route type of the MCAST-TREE SAFI is defined
instead of tree identification in the NLRI, for scenarios where there to encode label/SID instead of tree identification in the NLRI, for
is really no need to signal tree identification, e.g. as described in scenarios where there is really no need to signal tree
Section 2. On a tunnel root, the tree's binding SID can be encoded identification, e.g. as described in Section 2. On a tunnel root,
in the NLRI. the tree's binding SID can be encoded in the NLRI.
For a tree node to acknowledge to the controller that it has received For a tree node to acknowledge to the controller that it has received
the signaling and installed corresponding forwarding state, it the signaling and installed corresponding forwarding state, it
advertises a corresponding Replication State route, with the advertises a corresponding Replication State route, with the
Originating Router's IP Address set to itself and with a Route Target Originating Router's IP Address set to itself and with a Route Target
to match the controller. For comparison, the tree signaling to match the controller. For comparison, the tree signaling
Replication State route from the controller has the Originating Replication State route from the controller has the Originating
Router's IP Address set to the controller and the Route Target Router's IP Address set to the controller and the Route Target
matching the tree node. The two Replication State routes (for matching the tree node. The two Replication State routes (for
controller to signal to a tree node and for a tree node to controller to signal to a tree node and for a tree node to
skipping to change at page 8, line 41 skipping to change at page 9, line 4
corresponding to each controller C, a context label that identifies corresponding to each controller C, a context label that identifies
the upstream-assigned label space used by that controller. This the upstream-assigned label space used by that controller. This
label, call it Lc-D, is communicated by D to C via BGP-LS [RFC7752]. label, call it Lc-D, is communicated by D to C via BGP-LS [RFC7752].
Suppose a controller is setting up unidirectional tree T. It assigns Suppose a controller is setting up unidirectional tree T. It assigns
that tree the label Lt, and assigns label Lu to identify router U that tree the label Lt, and assigns label Lu to identify router U
which is the upstream of router D on tree T. C needs to tell U: "to which is the upstream of router D on tree T. C needs to tell U: "to
send a packet on the given tree/tunnel, one of the things you have to send a packet on the given tree/tunnel, one of the things you have to
do is push Lt onto the packet's label stack, then push Lu, then push do is push Lt onto the packet's label stack, then push Lu, then push
Lc-D onto the packet's label stack, then unicast the packet to D". Lc-D onto the packet's label stack, then unicast the packet to D".
Controller C also needs to inform router D of the correspondence Controller C also needs to inform router D of the correspondence
between <Lc-D, Lu, Lt> and tree T. between <Lc-D, Lu, Lt> and tree T.
To achieve that, when C sends a Replication State route, for each To achieve that, when C sends a Replication State route, for each
tunnel in the TEA, it may include a label stack Sub-TLV [RFC9012], tunnel in the TEA, it may include a label stack Sub-TLV [RFC9012],
with the outer label being the context label Lc-D (received by the with the outer label being the context label Lc-D (received by the
controller from the corresponding downstream), the next label being controller from the corresponding downstream), the next label being
the upstream neighbor label Lu, and the inner label being the label the upstream neighbor label Lu, and the inner label being the label
Lt assigned by the controller for the tree. The router receiving the Lt assigned by the controller for the tree. The router receiving the
route will use the label stacks to send traffic to its downstreams. route will use the label stacks to send traffic to its downstreams.
For C to signal the expected label stack for D to receive traffic For C to signal the expected label stack for D to receive traffic
with, we overload a tunnel TLV in the TEA of the Replication State with, we overload a tunnel TLV in the TEA of the Replication State
route sent to D - if the tunnel TLV has a RPF sub-TLV route sent to D - if the tunnel TLV has a RPF sub-TLV
(Section 3.1.4), then it indicates that this is actually for (Section 3.1.5), then it indicates that this is actually for
receiving traffic from the upstream. receiving traffic from the upstream.
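The push order on U and the matching lookup on D can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the list representation of a label stack (index 0 is the outermost label), and the dictionary used on D are hypothetical and not part of this specification.

```python
from typing import Dict, List, Tuple


def push_tree_labels(stack: List[int], lt: int, lu: int, lc_d: int) -> List[int]:
    """On U: push Lt, then Lu, then Lc-D, so that Lc-D ends up outermost."""
    result = list(stack)
    for label in (lt, lu, lc_d):
        result.insert(0, label)   # index 0 is the top of the stack
    return result


def lookup_tree(bindings: Dict[Tuple[int, int, int], str], stack: List[int]) -> str:
    """On D: map the received <Lc-D, Lu, Lt> triple back to tree T."""
    lc_d, lu, lt = stack[:3]
    return bindings[(lc_d, lu, lt)]


# Hypothetical label values chosen only for the example.
LT, LU, LC_D = 1001, 2001, 3001
outgoing = push_tree_labels([], LT, LU, LC_D)
tree = lookup_tree({(LC_D, LU, LT): "T"}, outgoing)
```

The resulting order, context label first and tree label last, matches the outer-to-inner order of the label stack Sub-TLV described above.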
1.5. Determining Root/Leaves 1.5. Determining Root/Leaves
For the controller to calculate a tree, it needs to determine the For the controller to calculate a tree, it needs to determine the
root and leaves of the tree. This may be based on provisioning root and leaves of the tree. This may be based on provisioning
(static or dynamically programmed), or based on BGP signaling as (static or dynamically programmed), or based on BGP signaling as
described in the following two sections. described in the following two sections.
In both of the following cases, the BGP updates are targeted at the In both of the following cases, the BGP updates are targeted at the
skipping to change at page 13, line 20 skipping to change at page 13, line 29
data rate is needed, the ingress PE can advertise/withdraw S-PMSI data rate is needed, the ingress PE can advertise/withdraw S-PMSI
routes targeted only at the controllers, without PMSI Tunnel routes targeted only at the controllers, without PMSI Tunnel
Attribute attached. The controller then updates relevant MCAST-TREE Attribute attached. The controller then updates relevant MCAST-TREE
Replication State routes to update C-multicast forwarding states on Replication State routes to update C-multicast forwarding states on
PEs to switch to a new tunnel. PEs to switch to a new tunnel.
3. Specification 3. Specification
3.1. Enhancements to TEA 3.1. Enhancements to TEA
This document specifies two new Tunnel Types and four new sub-TLVs. A TEA may encode a list of tunnels. A TEA attached to an MCAST-TREE
NLRI encodes replication information for a <tree, node> that is
identified by the NLRI. Each tunnel in the TEA identifies a branch -
either an upstream branch towards the tree root (Section 3.1.5) or a
downstream branch towards some leaves. A tunnel in the TEA could
have an outer encapsulation (e.g. MPLS label stack) or it could just
be a one-hop direct connection for native IP multicast forwarding
without any outer encapsulation.
This document specifies three new Tunnel Types and four new sub-TLVs.
The type codes will be assigned by IANA from the "BGP Tunnel The type codes will be assigned by IANA from the "BGP Tunnel
Encapsulation Attribute Tunnel Types". Encapsulation Attribute Tunnel Types".
3.1.1. Any-Encapsulation Tunnel 3.1.1. Any-Encapsulation Tunnel
When a multicast packet needs to be sent from an upstream node to a When a multicast packet needs to be sent from an upstream node to a
downstream node, it may not matter how it is sent - natively when the downstream node, it may not matter how it is sent - natively when the
two nodes are directly connected or tunneled otherwise. In case of two nodes are directly connected or tunneled otherwise. In case of
tunneling, it may not matter what kind of tunnel is used - MPLS, GRE, tunneling, it may not matter what kind of tunnel is used - MPLS, GRE,
IPinIP, or whatever. IPinIP, or whatever.
To support this, an "Any-Encapsulation" tunnel type of value 20 is To support this, an "Any-Encapsulation" tunnel type of value 20 is
defined. This tunnel MAY have a Tunnel Endpoint and other Sub-TLVs. defined. This tunnel MAY have a Tunnel Egress Endpoint and other
The Tunnel Endpoint Sub-TLV specifies an IP address, which could be Sub-TLVs. The Tunnel Egress Endpoint Sub-TLV specifies an IP
any of the following: address, which could be any of the following:
o An interface's local address - when a packet needs to be sent out of o An interface's local address - when a packet needs to be sent out of
the corresponding interface natively. On a LAN, a multicast MAC the corresponding interface natively. On a LAN, a multicast MAC
address MUST be used. address MUST be used.
o A directly connected neighbor's interface address - when a packet o A directly connected neighbor's interface address - when a packet
needs to be unicast to the address natively. needs to be unicast to the address natively.
o An address that is not directly connected - when a packet needs to o An address that is not directly connected - when a packet needs to
be tunneled to the address (any tunnel type/instance can be used). be tunneled to the address (any tunnel type/instance can be used).
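The three endpoint cases listed above can be sketched as a simple classification on the receiving router. The function name, the returned strings, and the way local and neighbor addresses are supplied are assumptions made for illustration; a real implementation would consult its interface and adjacency tables.

```python
import ipaddress
from typing import Iterable


def classify_endpoint(endpoint: str, local_addrs: Iterable[str],
                      neighbor_addrs: Iterable[str]) -> str:
    """Decide how to forward towards an Any-Encapsulation tunnel endpoint."""
    addr = ipaddress.ip_address(endpoint)
    local = {ipaddress.ip_address(a) for a in local_addrs}
    neighbors = {ipaddress.ip_address(a) for a in neighbor_addrs}
    if addr in local:
        # Local interface address: send natively out of that interface.
        return "native-out-of-interface"
    if addr in neighbors:
        # Directly connected neighbor: unicast natively to that address.
        return "native-unicast-to-neighbor"
    # Not directly connected: any tunnel type/instance may be used.
    return "tunnel-to-address"


# Hypothetical addresses from documentation ranges.
local = ["192.0.2.1"]
neighbors = ["192.0.2.2"]
```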
3.1.2. Load-balancing Tunnel 3.1.2. Load-balancing Tunnel
Consider that a multicast packet needs to be sent to a downstream Consider that a multicast packet needs to be sent to a downstream
node, which could be reached via four paths P1~P4. If it does not node, which could be reached via four paths P1~P4. If it does not
matter which path is taken, an "Any-Encapsulation" tunnel with the matter which path is taken, an "Any-Encapsulation" tunnel with the
Tunnel Endpoint Sub-TLV specifying the downstream node's loopback Tunnel Egress Endpoint Sub-TLV specifying the downstream node's
address works well. If the controller wants to specify that only loopback address works well. If the controller wants to specify that
P1~P2 should be used, then a "Load-balancing" tunnel needs to be only P1~P2 should be used, then a "Load-balancing" tunnel needs to be
used, listing P1 and P2 as member tunnels of the "Load-balancing" used, listing P1 and P2 as member tunnels of the "Load-balancing"
tunnel. tunnel.
A load-balancing tunnel has one "Member Tunnels" Sub-TLV defined in A load-balancing tunnel has one "Member Tunnels" Sub-TLV defined in
this document. The Sub-TLV is a list of tunnels, each specifying a this document. The Sub-TLV is a list of tunnels, each specifying a
way to reach the downstream. A packet will be sent out of one of the way to reach the downstream. A packet will be sent out of one of the
tunnels listed in the Member Tunnels Sub-TLV of the load-balancing tunnels listed in the Member Tunnels Sub-TLV of the load-balancing
tunnel. tunnel.
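One possible way to choose a member tunnel is sketched below. This document only says that a packet is sent out of one of the listed member tunnels; the per-flow hashing with CRC32, the `pick_member` name, and the flow key format are assumptions used purely for illustration.

```python
import zlib
from typing import List


def pick_member(members: List[str], flow_key: bytes) -> str:
    """Pick one member tunnel for a flow (assumed policy: hash of the flow key)."""
    if not members:
        raise ValueError("a load-balancing tunnel needs at least one member")
    # CRC32 is deterministic, so packets of the same flow use the same member.
    return members[zlib.crc32(flow_key) % len(members)]


# The controller restricted the choice to P1 and P2 out of P1~P4.
members = ["P1", "P2"]
chosen = pick_member(members, b"10.0.0.1->232.1.1.1")
```

Hashing on a flow key keeps packets of one flow on the same path; other selection policies would satisfy the text equally well.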
3.1.3. Receiving MPLS Label Stack 3.1.3. Segment List Tunnel
A Segment List tunnel has a Segment List sub-TLV. The encoding of
the sub-TLV is as specified in Section 2.4.4 of
[I-D.ietf-idr-segment-routing-te-policy]. An example use of a
Segment List tunnel is provided in Section 3.4.3.
3.1.4. Receiving MPLS Label Stack
While [I-D.ietf-bess-bgp-multicast] uses S-PMSI A-D routes to signal While [I-D.ietf-bess-bgp-multicast] uses S-PMSI A-D routes to signal
forwarding information for MP2MP upstream traffic, when controller forwarding information for MP2MP upstream traffic, when controller
signaling is used, a single Replication State route is used for both signaling is used, a single Replication State route is used for both
upstream and downstream traffic. Since different upstream and upstream and downstream traffic. Since different upstream and
downstream labels need to be used, a new "Receiving MPLS Label Stack" downstream labels need to be used, a new "Receiving MPLS Label Stack"
of type TBD is added as a tunnel sub-TLV in addition to the existing of type TBD is added as a tunnel sub-TLV in addition to the existing
MPLS Label Stack sub-TLV. Other than the type difference, the two are MPLS Label Stack sub-TLV. Other than the type difference, the two are
encoded the same way. encoded the same way.
skipping to change at page 14, line 37 skipping to change at page 15, line 16
tunnel in the TEA of Replication State route for an MP2MP tunnel to tunnel in the TEA of Replication State route for an MP2MP tunnel to
specify the forwarding information for upstream traffic from the specify the forwarding information for upstream traffic from the
corresponding downstream node. A label stack instead of a single corresponding downstream node. A label stack instead of a single
label is used because of the need for neighbor based RPF check, as label is used because of the need for neighbor based RPF check, as
further explained in the following section. further explained in the following section.
The Receiving MPLS Label Stack sub-TLV is also used for downstream The Receiving MPLS Label Stack sub-TLV is also used for downstream
traffic from the upstream for both P2MP and MP2MP, as specified traffic from the upstream for both P2MP and MP2MP, as specified
below. below.
3.1.4. RPF Sub-TLV 3.1.5. RPF Sub-TLV
The RPF sub-TLV is of type 124 allocated by IANA and has a one-octet The RPF sub-TLV is of type 124 allocated by IANA and has a one-octet
length. The length is 0 currently, but if necessary in the future, length. The length is 0 currently, but if necessary in the future,
sub-sub-TLVs could be placed in its value part. If the RPF sub-TLV sub-sub-TLVs could be placed in its value part. If the RPF sub-TLV
appears in a tunnel, it indicates that the "tunnel" is for the appears in a tunnel, it indicates that the "tunnel" is for the
upstream node instead of a downstream node. upstream node instead of a downstream node.
In case of MPLS, the tunnel contains a Receiving MPLS Label Stack In case of MPLS, the tunnel contains a Receiving MPLS Label Stack
sub-TLV for downstream traffic from the upstream node, and in case of sub-TLV for downstream traffic from the upstream node, and in case of
MP2MP it also contains a regular MPLS Label Stack sub-TLV for MP2MP it also contains a regular MPLS Label Stack sub-TLV for
upstream traffic to the upstream node. upstream traffic to the upstream node.
The innermost label in the Receiving MPLS Label Stack is the The innermost label in the Receiving MPLS Label Stack is the
incoming label identifying the tree (for comparison the innermost incoming label identifying the tree (for comparison the innermost
label for a regular MPLS Label Stack is the outgoing label). If the label for a regular MPLS Label Stack is the outgoing label). If the
Receiving MPLS Label Stack sub-TLV has more than one label, the Receiving MPLS Label Stack sub-TLV has more than one label, the
second innermost label in the stack identifies the expected upstream second innermost label in the stack identifies the expected upstream
neighbor and explicit RPF checking needs to be set up for the tree neighbor and explicit RPF checking needs to be set up for the tree
label accordingly. label accordingly.
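The interpretation of the RPF sub-TLV and the Receiving MPLS Label Stack can be sketched as follows. The helper names and the list representation of the stack (outermost label first) are hypothetical; only the type value 124, the zero length, and the meaning of the innermost and second innermost labels are taken from the text above.

```python
from typing import List, Optional, Tuple

# RPF sub-TLV: type 124, one-octet length, currently no value.
RPF_SUB_TLV = bytes([124, 0])


def parse_receiving_stack(labels: List[int]) -> Tuple[int, Optional[int]]:
    """Return (incoming tree label, expected upstream neighbor label or None)."""
    if not labels:
        raise ValueError("Receiving MPLS Label Stack must not be empty")
    tree_label = labels[-1]                               # innermost label
    neighbor = labels[-2] if len(labels) > 1 else None    # second innermost
    return tree_label, neighbor


def rpf_accept(expected_neighbor: Optional[int], received_neighbor: Optional[int]) -> bool:
    """Explicit RPF check: applied only when a neighbor label was signaled."""
    return expected_neighbor is None or expected_neighbor == received_neighbor


# Hypothetical stack: context label, neighbor label, tree label.
tree_label, neighbor = parse_receiving_stack([3001, 2001, 1001])
```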
3.1.5. Tree Label Stack sub-TLV 3.1.6. Tree Label Stack sub-TLV
The MPLS Label Stack sub-TLV can be used to specify the complete The MPLS Label Stack sub-TLV can be used to specify the complete
label stack used to send traffic, with the stack including both a label stack used to send traffic, with the stack including both a
transport label (stack) and label(s) that identify the (tree, transport label (stack) and label(s) that identify the (tree,
neighbor) to the downstream node. There are cases where the neighbor) to the downstream node. There are cases where the
controller only wants to specify the tree-identifying labels but controller only wants to specify the tree-identifying labels but
leave the transport details to the router itself. For example, the leave the transport details to the router itself. For example, the
router could locally determine a transport label (stack) and combine router could locally determine a transport label (stack) and combine
with the tree-identifying labels signaled from the controller to get with the tree-identifying labels signaled from the controller to get
the complete outgoing label stack. the complete outgoing label stack.
For that purpose, a new Tree Label Stack sub-TLV of type 125 is For that purpose, a new Tree Label Stack sub-TLV of type 125 is
defined, with a one-octet length field. The value field contains a defined, with a one-octet length field. It MAY appear in an Any-
label stack with the same encoding as the value part of the MPLS Label Encapsulation tunnel. The value field contains a label stack with
Stack sub-TLV, but with a different type. A stack is specified the same encoding as the value part of the MPLS Label Stack sub-TLV, but
because it may take up to three labels (see Section 1.4): with a different type. A stack is specified because it may take up
to three labels (see Section 1.4):
o If different nodes use different labels (allocated from the common o If different nodes use different labels (allocated from the common
SRGB or the node's SRLB) for a (tree, neighbor) tuple, only a SRGB or the node's SRLB) for a (tree, neighbor) tuple, only a
single label is in the stack. This is similar to the current mLDP single label is in the stack. This is similar to the current mLDP
hop-by-hop signaling case. hop-by-hop signaling case.
o If different nodes use the same tree label, then an additional o If different nodes use the same tree label, then an additional
neighbor-identifying label is needed in front of the tree label. neighbor-identifying label is needed in front of the tree label.
o For the previous bullet, if the neighbor-identifying label is o For the previous bullet, if the neighbor-identifying label is
allocated from the controller's local label space, then an allocated from the controller's local label space, then an
additional context label is needed in front of the neighbor label. additional context label is needed in front of the neighbor label.
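The three cases above can be sketched as a small encoder. This is an
illustration only, not part of either draft revision. The function
and parameter names are hypothetical. The 4-octet label stack entry
layout (20-bit label, 3-bit TC, S bit, 8-bit TTL) is the usual MPLS
one; leaving TC and TTL at zero and setting the S bit only on the
last entry are assumptions made for this sketch.

```python
import struct

TREE_LABEL_STACK_TYPE = 125  # sub-TLV type stated in the text


def encode_label_entry(label: int, bottom: bool) -> bytes:
    """Pack one 4-octet entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    # Assumption: TC and TTL stay zero; S marks the last entry only.
    return struct.pack("!I", (label << 12) | (int(bottom) << 8))


def encode_tree_label_stack(tree, neighbor=None, context=None) -> bytes:
    """Build the sub-TLV: one-octet type, one-octet length, label stack.

    tree only              -> per-node label for a (tree, neighbor) tuple
    neighbor + tree        -> same tree label shared by different nodes
    context + neighbor + tree -> neighbor label taken from the
                              controller's local label space
    """
    if context is not None and neighbor is None:
        raise ValueError("a context label requires a neighbor label")
    # Outermost label first: context, then neighbor, then tree.
    labels = [l for l in (context, neighbor, tree) if l is not None]
    value = b"".join(
        encode_label_entry(l, i == len(labels) - 1)
        for i, l in enumerate(labels)
    )
    return struct.pack("!BB", TREE_LABEL_STACK_TYPE, len(value)) + value
```

A router that picks its own transport label (stack) would place it in
front of the labels decoded from this value to obtain the complete
outgoing label stack.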
3.1.6. Backup Tunnel sub-TLV 3.1.7. Backup Tunnel sub-TLV
The Backup Tunnel sub-TLV is used to specify the backup paths for the The Backup Tunnel sub-TLV is used to specify the backup paths for an
tunnel. The length is two-octet. The value part encodes a one-octet Any-Encapsulation or Segment List tunnel. The length is two-octet.
flags field and a variable length Tunnel Encapsulation Attribute. If The value part encodes a one-octet flags field and a variable length
the tunnel goes down, traffic that is normally sent out of the tunnel Tunnel Encapsulation Attribute. If the tunnel goes down, traffic
is fast rerouted to the tunnels listed in the encoded TEA. that is normally sent out of the tunnel is fast rerouted to the
tunnels listed in the encoded TEA.
+--------------------------------+ +--------------------------------+
| Sub-TLV Type (1 Octet, TBD) | | Sub-TLV Type (1 Octet, TBD) |
+--------------------------------+ +--------------------------------+
| Sub-TLV Length (2 Octets) | | Sub-TLV Length (2 Octets) |
+--------------------------------+ +--------------------------------+
| P | rest of 1 Octet Flags | | P | rest of 1 Octet Flags |
+--------------------------------+ +--------------------------------+
| Backup TEA (variable length) | | Backup TEA (variable length) |
+--------------------------------+ +--------------------------------+
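The layout in the figure can be sketched as an encoder/decoder pair.
This is an illustration only. The type value below is a hypothetical
placeholder, since the draft lists the type as TBD. Two further
assumptions: P is taken to be the most significant bit of the flags
octet, and the two-octet length is taken to cover the flags octet
plus the backup TEA. The backup TEA is handled as opaque bytes.

```python
import struct

BACKUP_TUNNEL_TYPE = 0xFE  # hypothetical placeholder; the draft says TBD
P_FLAG = 0x80              # assumption: P is the first (most significant) bit


def encode_backup_tunnel(backup_tea: bytes, p_flag: bool = False) -> bytes:
    """Pack type (1 octet), length (2 octets), flags (1 octet), backup TEA."""
    value = bytes([P_FLAG if p_flag else 0]) + backup_tea
    if len(value) > 0xFFFF:
        raise ValueError("value does not fit a two-octet length")
    return struct.pack("!BH", BACKUP_TUNNEL_TYPE, len(value)) + value


def decode_backup_tunnel(data: bytes):
    """Return (type, p_flag, backup_tea) from an encoded sub-TLV."""
    if len(data) < 3:
        raise ValueError("truncated sub-TLV header")
    sub_type, length = struct.unpack_from("!BH", data)
    value = data[3:3 + length]
    if length < 1 or len(value) != length:
        raise ValueError("truncated or empty sub-TLV value")
    return sub_type, bool(value[0] & P_FLAG), value[1:]
```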
skipping to change at page 19, line 36 skipping to change at page 20, line 36
The root receives one Replication State route for each Candidate Path The root receives one Replication State route for each Candidate Path
of the policy. Only one of the routes needs to, though more than one of the policy. Only one of the routes needs to, though more than one
MAY include the above listed optional Atom TLVs in the SR P2MP Policy MAY include the above listed optional Atom TLVs in the SR P2MP Policy
BCC. BCC.
Alternatively, an additional route type can be used to carry policy Alternatively, an additional route type can be used to carry policy
information instead. Details/decision to be specified in a future information instead. Details/decision to be specified in a future
revision. revision.
3.4.3. Tunnel Encapsulation Attribute for SR-P2MP 3.4.3. Tunnel Encapsulation Attribute
For SR-P2MP, there are two methods of encoding forwarding information
in the TEA, as described below.
3.4.3.1. TEA with Tunnel TLVs Being Replication Branches
In this method, a TEA with tunnels being replication branches as
specified in earlier sections can be used just as in non SR-P2MP
cases.
Additionally, a replication branch can also be encoded as a segment
list, with a "Segment List" tunnel type. The tunnel has a Segment
List sub-TLV as specified in Section 2.4.4 of
[I-D.ietf-idr-segment-routing-te-policy].
For a "Segment List" tunnel, the last segment in the segment list
represents the SID of the tree. When it is without the RPF sub-TLV,
the previous segments in the list steer traffic to the downstream
node, and the segment before the last one MAY also be a binding SID
for another P2MP tunnel, meaning that the replication branch
represented by this "Segment List" is actually a P2MP tunnel to a set
of downstream nodes.
3.4.3.2. TEA with a Single SR-P2MP Policy Tunnel The TEA attached to a Replication State route for SR-P2MP encodes
tunnels as specified in earlier sections. A tunnel could be an Any-
Encapsulation tunnel with MPLS Label Stack sub-TLV or Receiving MPLS
Label Stack sub-TLV (in case of SR-MPLS), a Segment List tunnel, or a
Load-balancing tunnel.
Alternatively, a TEA with a single SR-P2MP Policy tunnel type similar For a Segment List tunnel in this context, the last segment in the
to the SR Policy tunnel type can be used. The details are specified segment list represents the SID of the tree. When it is without the
in [I-D.hb-idr-sr-p2mp-policy] but may be moved here depending on WG RPF sub-TLV, the previous segments in the list steer traffic to the
consensus. downstream node, and the segment before the last one MAY also be a
binding SID for another P2MP tunnel, meaning that the replication
branch represented by this "Segment List" is actually a P2MP tunnel
to a set of downstream nodes.
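The reading of a Segment List tunnel described above can be sketched
as follows. This is an illustration only; the type and function names
are hypothetical, SIDs are modeled as plain integers, and the sketch
covers only the case without the RPF sub-TLV.

```python
from typing import NamedTuple, Sequence, Tuple


class Branch(NamedTuple):
    steering_sids: Tuple[int, ...]  # steer traffic to the downstream node
    tree_sid: int                   # last segment: the SID of the tree


def split_segment_list(segments: Sequence[int]) -> Branch:
    """Split a segment list into steering segments and the tree SID."""
    if not segments:
        raise ValueError("a segment list needs at least the tree SID")
    *steering, tree_sid = segments
    return Branch(tuple(steering), tree_sid)
```

If the last steering segment is a binding SID for another P2MP
tunnel, the branch reaches a set of downstream nodes rather than a
single one; the sketch does not try to detect that case.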
3.5. Replication State Route with Label Stack for Tree Identification 3.5. Replication State Route with Label Stack for Tree Identification
As described in Section 1.3, tree label instead of tree As described in Section 1.3, tree label instead of tree
identification could be encoded in the NLRI to identify the tree in identification could be encoded in the NLRI to identify the tree in
the control plane as well as in the forwarding plane. For that a new the control plane as well as in the forwarding plane. For that a new
Tree Type of 2 is used and the Replication State route has the Tree Type of 2 is used and the Replication State route has the
following format: following format:
+-------------------------------------+ +-------------------------------------+
skipping to change at page 23, line 32 skipping to change at page 24, line 27
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder, [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder,
"The BGP Tunnel Encapsulation Attribute", RFC 9012, "The BGP Tunnel Encapsulation Attribute", RFC 9012,
DOI 10.17487/RFC9012, April 2021, DOI 10.17487/RFC9012, April 2021,
<https://www.rfc-editor.org/info/rfc9012>. <https://www.rfc-editor.org/info/rfc9012>.
8.2. Informative References 8.2. Informative References
[I-D.hb-idr-sr-p2mp-policy]
Bidgoli, H., Voyer, D., Stone, A., Parekh, R., Krier, S.,
and A. Venkateswaran, "Advertising p2mp policies in BGP",
draft-hb-idr-sr-p2mp-policy-04 (work in progress), October
2021.
[RFC6388] Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B. [RFC6388] Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B.
Thomas, "Label Distribution Protocol Extensions for Point- Thomas, "Label Distribution Protocol Extensions for Point-
to-Multipoint and Multipoint-to-Multipoint Label Switched to-Multipoint and Multipoint-to-Multipoint Label Switched
Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011, Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011,
<https://www.rfc-editor.org/info/rfc6388>. <https://www.rfc-editor.org/info/rfc6388>.
[RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
2012, <https://www.rfc-editor.org/info/rfc6513>. 2012, <https://www.rfc-editor.org/info/rfc6513>.
 End of changes. 30 change blocks. 
90 lines changed or deleted 96 lines changed or added
