Network Working Group                                     T. Eckert, Ed.
Internet-Draft                                                 Futurewei
Intended status: Standards Track                              G. Cauchie
Expires: May 3, 2021                                    Bouygues Telecom
                                                                M. Menth
                                                 University of Tuebingen
                                                            Oct 30, 2020

      Tree Engineering for Bit Index Explicit Replication (BIER-TE)
                       draft-ietf-bier-te-arch-09

Abstract

   This memo introduces per-packet stateless strict and loose path
   steered replication and forwarding for Bit Index Explicit Replication
   packets (RFC8279). This is called BIER Tree Engineering (BIER-TE).
   BIER-TE can be used as a path steering mechanism in future Traffic
   Engineering solutions for BIER (BIER-TE).

   BIER-TE leverages RFC8279 and extends it with a new semantic for bits
   in the bitstring. BIER-TE can leverage BIER forwarding engines with
   little or no changes.

   In BIER, the BitPositions (BP) of the packet's bitstring indicate
   BIER Forwarding Egress Routers (BFER), and hop-by-hop forwarding uses
   a Routing Underlay such as an IGP.

   In BIER-TE, BitPositions indicate adjacencies. The BIFT of each BFR
   is only populated with BPs that are adjacent to the BFR in the
   BIER-TE topology. The BIER-TE topology can consist of layer 2 or
   remote (routed) adjacencies. The BFR then replicates and forwards
   BIER packets to those adjacencies. This results in the aforementioned
   strict and loose path steering and replications.

   BIER-TE can co-exist with BIER forwarding in the same domain, for
   example by using separate BIER sub-domains. In the absence of routed
   adjacencies, BIER-TE does not require a BIER routing underlay, and
   can then be operated without requiring an Interior Gateway Protocol
   (IGP).

   BIER-TE operates without explicit in-network tree-state and carries
   the multicast distribution tree in the packet header. It can
   therefore be a good fit to support multicast path steering in Segment
   Routing (SR) networks.

Name explanation

   [RFC-editor: This section to be removed before publication.]

   Explanation of the change, made in WG last call, of the expansion of
   BIER-TE from "Traffic Engineering" to "Tree Engineering" (to benefit
   IETF/IESG reviewers):

   This document started by calling itself BIER-TE, "Traffic
   Engineering", as it is a mode of BIER specifically beneficial for
   Traffic Engineering. It supports per-packet bitstring based policy
   steering and replication. BIER-TE technology itself does not provide
   a complete traffic engineering solution for BIER but would require
   combination with other technologies for a full BIER based TE
   solution, such as a PCE and queuing mechanisms to provide bandwidth
   and latency reservations. It is also not the only option to build a
   traffic engineering solution utilizing BIER: for example, BIER trees
   could be steered through IGP metric engineering, such as through
   Flex-Topologies. The architecture for Traffic Engineering with
   either mode of BIER (BIER-TE/BIER) is intended to be defined in a
   separate document, most likely in the TEAS WG.

   Because the name of such an overall solution is intended to be
   BIER-TE, the expansion of BIER-TE was changed to name this BIER mode
   "Tree Engineering", so the overall solution can be distinguished
   better from its tree building/engineering method without having to
   change the long-established abbreviation BIER-TE.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on May 3, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Basic Examples
     1.2.  BIER-TE Topology and adjacencies
     1.3.  Comparison with BIER
     1.4.  Requirements Language
   2.  Components
     2.1.  The Multicast Flow Overlay
     2.2.  The BIER-TE Controller
       2.2.1.  Assignment of BitPositions to adjacencies of the
               network topology
       2.2.2.  Changes in the network topology
       2.2.3.  Set up per-multicast flow BIER-TE state
       2.2.4.  Link/Node Failures and Recovery
     2.3.  The BIER-TE Forwarding Layer
     2.4.  The Routing Underlay
     2.5.  Traffic Engineering Considerations
   3.  BIER-TE Forwarding
     3.1.  The Bit Index Forwarding Table (BIFT)
     3.2.  Adjacency Types
       3.2.1.  Forward Connected
       3.2.2.  Forward Routed
       3.2.3.  ECMP
       3.2.4.  Local Decap
     3.3.  Encapsulation considerations
     3.4.  Basic BIER-TE Forwarding Example
     3.5.  Forwarding comparison with BIER
     3.6.  Requirements
   4.  BIER-TE Controller BitPosition Assignments
     4.1.  P2P Links
     4.2.  BFER
     4.3.  Leaf BFERs
     4.4.  LANs
     4.5.  Hub and Spoke
     4.6.  Rings
     4.7.  Equal Cost MultiPath (ECMP)
     4.8.  Routed adjacencies
       4.8.1.  Reducing BitPositions
       4.8.2.  Supporting nodes without BIER-TE
     4.9.  Reuse of BitPositions (without DNR)
     4.10. Summary of BP optimizations
   5.  Avoiding duplicates and loops
     5.1.  Loops
     5.2.  Duplicates
   6.  BIER-TE Forwarding Pseudocode
   7.  Managing SI, subdomains and BFR-ids
     7.1.  Why SI and sub-domains
     7.2.  Bit assignment comparison BIER and BIER-TE
     7.3.  Using BFR-id with BIER-TE
     7.4.  Assigning BFR-ids for BIER-TE
     7.5.  Example bit allocations
       7.5.1.  With BIER
       7.5.2.  With BIER-TE
     7.6.  Summary
   8.  BIER-TE and Segment Routing
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. Change log [RFC Editor: Please remove]
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   BIER-TE shares architecture, terminology and packet formats with BIER
   as described in [RFC8279] and [RFC8296]. This document describes
   BIER-TE in the expectation that the reader is familiar with these two
   documents.

   In BIER-TE, BitPositions (BP) indicate adjacencies. The BIFT of each
   BFR is only populated with BPs that are adjacent to the BFR in the
   BIER-TE Topology. Other BPs are left without adjacency. The BFR
   replicates and forwards BIER packets to adjacent BPs that are set in
   the packet. BPs are normally also reset upon forwarding to avoid
   duplicates and loops. This is detailed further below.

   Note that related work [I-D.ietf-roll-ccast] uses Bloom filters
   [Bloom70] to represent leaves or edges of the intended delivery tree.
   Bloom filters in general can support larger trees/topologies with
   fewer addressing bits than explicit bitstrings, but they introduce
   the heuristic risk of false positives and cannot reset bits in the
   bitstring during forwarding to avoid loops. For these reasons,
   BIER-TE uses explicit bitstrings like BIER. The explicit bitstrings
   of BIER-TE can also be seen as a special type of Bloom filter, and
   this is how related work [ICC] describes it.

1.1.  Basic Examples

   BIER-TE forwarding is best introduced with simple examples.

   BIER-TE Topology:

      Diagram:

                       p5    p6
                     --- BFR3 ---
                  p3/    p13     \p7
      BFR1 ---- BFR2              BFR5 ----- BFR6
         p1   p2  p4\    p14     /p10 p11   p12
                     --- BFR4 ---
                       p8    p9

   (simplified) BIER-TE Bit Index Forwarding Tables (BIFT):

      BFR1:   p1  -> local_decap
              p2  -> forward_connected to BFR2

      BFR2:   p1  -> forward_connected to BFR1
              p5  -> forward_connected to BFR3
              p8  -> forward_connected to BFR4

      BFR3:   p3  -> forward_connected to BFR2
              p7  -> forward_connected to BFR5
              p13 -> local_decap

      BFR4:   p4  -> forward_connected to BFR2
              p10 -> forward_connected to BFR5
              p14 -> local_decap

      BFR5:   p6  -> forward_connected to BFR3
              p9  -> forward_connected to BFR4
              p12 -> forward_connected to BFR6

      BFR6:   p11 -> forward_connected to BFR5
              p12 -> local_decap

                    Figure 1: BIER-TE basic example

   Consider the simple network with 6 BFRs shown in Figure 1. p1...p14
   are the BitPositions (BP) used. All BFRs can act as ingress BFR
   (BFIR); BFR1, BFR3, BFR4 and BFR6 can also be egress BFR (BFER).
   Forward_connected is the name for adjacencies that represent subnet
   adjacencies of the network. Local_decap is the name of the adjacency
   that decapsulates BIER-TE packets and passes their payload to higher
   layer processing.

   Assume a packet from BFR1 should be sent via BFR4 to BFR6. This
   requires a bitstring (p2,p8,p10,p12).
   When this packet is examined by BIER-TE on BFR1, the only BitPosition
   from the bitstring that is also set in the BIFT is p2. This will
   cause BFR1 to send the only copy of the packet to BFR2. Similarly,
   BFR2 will forward to BFR4 because of p8, BFR4 to BFR5 because of p10
   and BFR5 to BFR6 because of p12. p12 also makes BFR6 receive and
   decapsulate the packet.

   To also send a copy to BFR3, in addition to the copy sent via BFR4 to
   BFR6, the bitstring needs to be (p2,p5,p8,p10,p12,p13). When this
   packet is examined by BFR2, p5 causes one copy to be sent to BFR3 and
   p8 one copy to BFR4. When BFR3 receives the packet, p13 will cause
   it to receive and decapsulate the packet.

   If instead the bitstring was (p2,p6,p8,p10,p12,p13), the packet would
   be copied by BFR5 towards BFR3 because of p6 instead of being copied
   by BFR2 to BFR3 because of p5 as in the prior case. This shows the
   ability of the BIER-TE Topology in Figure 1 to make the traffic pass
   across any possible path and be replicated where desired.

   BIER-TE has various options to minimize BP assignments, many of which
   are based on assumptions about the required multicast traffic paths
   and bandwidth consumption in the network.

   The following picture shows a modified example, in which Rtr2 and
   Rtr5 are assumed not to support BIER-TE, so traffic has to be unicast
   encapsulated across them. Unicast tunneling of BIER-TE packets can
   leverage any feasible mechanism such as MPLS or IP; these
   encapsulations are out of scope of this document. To emphasize
   non-native forwarding of BIER-TE packets, these adjacencies are
   called "forward_routed", but otherwise there is no difference in
   their processing over the aforementioned "forward_connected"
   adjacencies.

   In addition, bits are saved in the following example by assuming that
   BFR1 only needs to be BFIR but not BFER or transit BFR.
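
   Before turning to the overlay example, the hop-by-hop walk-through of
   Figure 1 can be reproduced with a small sketch. It is an illustration
   only, not the forwarding pseudocode of this document: the names BIFT
   and trace_hops are hypothetical, BitPositions are modeled as a set of
   integers, and every BFR is assumed to clear all BitPositions it has
   in its own BIFT before sending a copy. The sketch records only the
   forward_connected hops; local_decap delivery is not modeled.

```python
from collections import deque

# Figure 1 BIFTs: BitPosition -> neighbor BFR; None marks a local_decap entry.
BIFT = {
    "BFR1": {1: None, 2: "BFR2"},
    "BFR2": {1: "BFR1", 5: "BFR3", 8: "BFR4"},
    "BFR3": {3: "BFR2", 7: "BFR5", 13: None},
    "BFR4": {4: "BFR2", 10: "BFR5", 14: None},
    "BFR5": {6: "BFR3", 9: "BFR4", 12: "BFR6"},
    "BFR6": {11: "BFR5", 12: None},
}


def trace_hops(ingress, bitstring):
    """Return the (from, to) hops taken by copies of a packet.

    Assumption: each BFR resets every BP present in its own BIFT
    before sending copies (no DoNotReset adjacencies).
    """
    hops = []
    queue = deque([(ingress, frozenset(bitstring))])
    while queue:
        node, bits = queue.popleft()
        local_bps = set(BIFT[node])
        remaining = bits - local_bps          # bits carried in every copy
        for bp in sorted(bits & local_bps):   # BPs set in packet and in BIFT
            neighbor = BIFT[node][bp]
            if neighbor is not None:          # skip local_decap entries
                hops.append((node, neighbor))
                queue.append((neighbor, remaining))
    return hops


if __name__ == "__main__":
    for bs in ({2, 8, 10, 12}, {2, 5, 8, 10, 12, 13}, {2, 6, 8, 10, 12, 13}):
        print(sorted(bs), trace_hops("BFR1", bs))
```

   With (p2,p5,p8,p10,p12,p13) the copy towards BFR3 is made at BFR2,
   while with (p2,p6,p8,p10,p12,p13) it is made at BFR5, matching the
   two cases described above.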

   BIER-TE Topology:

      Diagram:

                   p1  p3  p7
                ....> BFR3 <....       p5
            ........           ........>
      BFR1       (Rtr2)     (Rtr5)      BFR6
            ........           ........>
                ....> BFR4 <....       p6
                   p2  p4  p8

   (simplified) BIER-TE Bit Index Forwarding Tables (BIFT):

      BFR1:   p1  -> forward_routed to BFR3
              p2  -> forward_routed to BFR4

      BFR3:   p3  -> local_decap
              p5  -> forward_routed to BFR6

      BFR4:   p4  -> local_decap
              p6  -> forward_routed to BFR6

      BFR6:   p5  -> local_decap
              p6  -> local_decap
              p7  -> forward_routed to BFR3
              p8  -> forward_routed to BFR4

                Figure 2: BIER-TE basic overlay example

   To send a BIER-TE packet from BFR1 via BFR3 to BFR6, the bitstring is
   (p1,p5). From BFR1 via BFR4 to BFR6 it is (p2,p6). A packet from
   BFR1 to BFR3,BFR4 and from BFR3 to BFR6 uses (p1,p2,p3,p4,p5). A
   packet from BFR1 to BFR3,BFR4 and from BFR4 to BFR6 uses
   (p1,p2,p3,p4,p6). A packet from BFR1 to BFR4, and from BFR4 to BFR6
   and from BFR6 to BFR3 uses (p2,p3,p4,p6,p7). A packet from BFR1 to
   BFR3, and from BFR3 to BFR6 and from BFR6 to BFR4 uses
   (p1,p3,p4,p5,p8).

1.2.  BIER-TE Topology and adjacencies

   The key new component in BIER-TE compared to BIER is the BIER-TE
   topology as introduced through the two examples in Section 1.1. It
   is used to control where replication can or should happen and how to
   minimize the required number of BPs for adjacencies.

   The BIER-TE Topology consists of the BIFTs of all the BFRs and can
   also be expressed as a directed graph where the edges are the
   adjacencies between the BFRs, labelled with the BP used for the
   adjacency. Adjacencies are naturally unidirectional. A BP can be
   reused across multiple adjacencies as long as this does not lead to
   undesired duplicates or loops as explained further down in the text.

   If the BIER-TE topology represents the underlying (layer 2) topology
   of the network, this is called "native" BIER-TE as shown in the first
   example.
   This can be freely mixed with "overlay" BIER-TE, in which
   "forward_routed" adjacencies are used.

1.3.  Comparison with BIER

   The key differences over BIER are:

   o  BIER-TE replaces in-network autonomous path calculation by
      explicit paths calculated by the BIER-TE Controller.

   o  In BIER-TE every BitPosition of the BitString of a BIER-TE packet
      indicates one or more adjacencies - instead of a BFER as in BIER.

   o  BIER-TE in each BFR has no routing table but only a BIER-TE
      Forwarding Table (BIFT) indexed by SI:BitPosition and populated
      with only those adjacencies to which the BFR should replicate
      packets.

   BIER-TE headers use the same format as BIER headers.

   BIER-TE forwarding does not require/use the BFIR-ID. The BFIR-ID can
   still be useful though for coordinated BFIR/BFER functions, such as
   the context for upstream assigned labels for MPLS payloads in MVPN
   over BIER-TE.

   If the BIER-TE domain is also running BIER, then the BFIR-ID in
   BIER-TE packets can be set to the same BFIR-ID as used with BIER
   packets.

   If the BIER-TE domain is not running full BIER or does not want to
   reduce the need to allocate bits in BIER bitstrings for BFIR-ID
   values, then the allocation of BFIR-ID values in BIER-TE packets can
   be done through other mechanisms outside the scope of this document,
   as long as this is appropriately agreed upon between all BFIR/BFER.

1.4.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Components

   End-to-end BIER-TE operation consists of four major components: the
   "Multicast Flow Overlay", the "BIER-TE control plane" consisting of
   the "BIER-TE Controller" and its signaling channels to the BFR, the
   "Routing Underlay" and the "BIER-TE forwarding layer".
   The BIER-TE Controller is the new architectural component in BIER-TE
   compared to BIER.

   Picture 2: Components of BIER-TE

                        <------BGP/PIM----->
         |<-IGMP/PIM->  multicast flow   <-PIM/IGMP->|
                           overlay

                  [BIER-TE Controller] <=> [BIER-TE Topology]
                       BIER-TE control plane
                        ^      ^     ^
                       /       |      \   BIER-TE control protocol
                      |        |       |  e.g. Netconf/Restconf/Yang
                      v        v       v
       Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr

                      |<----------------->|
                      BIER-TE forwarding layer

                      |<- BIER-TE domain->|

                 |<--------------------->|
                      Routing underlay

                    Figure 3: BIER-TE architecture

2.1.  The Multicast Flow Overlay

   The Multicast Flow Overlay operates as in BIER. See [RFC8279].
   Instead of interacting with the BIER forwarding layer (as in BIER),
   it interacts with the BIER-TE Controller.

2.2.  The BIER-TE Controller

   The BIER-TE Controller represents the control plane of BIER-TE. It
   communicates two sets of information with BFRs:

   During initial provisioning or modifications of the network topology,
   the BIER-TE Controller discovers the network topology and creates the
   BIER-TE topology from it: it determines which adjacencies are
   required/desired and assigns BitPositions to them. Then it signals
   the resulting BitPositions and their adjacencies to each BFR to set
   up their BIER-TE BIFTs.

   During day-to-day operations of the network, the BIER-TE Controller
   signals to BFIRs what multicast flows are mapped to what BitStrings.

   Communication between the BIER-TE Controller and BFRs is ideally via
   standardized protocols and data-models such as Netconf/Restconf/Yang.
   This is currently outside the scope of this document. Vendor-
   specific CLI on the BFRs is also a possible stopgap option (as in
   many other SDN solutions lacking definition of a standardized data
   model).
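
   The two sets of information described above (per-BFR BIFT state and
   per-flow BitStrings) can be pictured with a minimal sketch. All
   names here (assign_bitpositions, build_bifts, flow_bitstring) are
   hypothetical, and the one-BP-per-directed-adjacency policy is an
   assumption chosen for simplicity; this document does not define a
   controller API or data model.

```python
from collections import defaultdict


def assign_bitpositions(adjacencies):
    """Give each directed adjacency (from_bfr, to_bfr) its own BP, from 1."""
    return {adj: bp for bp, adj in enumerate(adjacencies, start=1)}


def build_bifts(bp_of):
    """Derive the per-BFR BIFT state: BFR -> {BP: neighbor}."""
    bifts = defaultdict(dict)
    for (src, dst), bp in bp_of.items():
        bifts[src][bp] = dst  # only the sending BFR is populated with the BP
    return dict(bifts)


def flow_bitstring(path, bp_of):
    """Compose the set of BPs that steers a packet along a list of BFRs."""
    return {bp_of[(a, b)] for a, b in zip(path, path[1:])}


if __name__ == "__main__":
    bp_of = assign_bitpositions([("A", "B"), ("B", "A"), ("B", "C")])
    print(build_bifts(bp_of))                     # provisioning state per BFR
    print(flow_bitstring(["A", "B", "C"], bp_of)) # flow state for a BFIR
```

   The first result corresponds to what would be signaled to each BFR
   during provisioning, the second to what would be signaled to a BFIR
   for one multicast flow.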

   For simplicity, the procedures of the BIER-TE Controller are
   described in this document as if it is a single, centralized
   automated entity, such as an SDN controller. It could equally be an
   operator setting up CLI on the BFRs. Distribution of the functions
   of the BIER-TE Controller is currently outside the scope of this
   document.

2.2.1.  Assignment of BitPositions to adjacencies of the network
        topology

   The BIER-TE Controller tracks the BFR topology of the BIER-TE domain.
   It determines what adjacencies require BitPositions so that BIER-TE
   explicit paths can be built through them as desired by operator
   policy.

   The BIER-TE Controller then pushes the BitPositions/adjacencies to
   the BIFT of the BFRs. The BIFT of each BFR is populated only with
   those SI:BitPositions to which that BFR should be able to send
   packets, that is, the adjacencies connecting to this BFR.

2.2.2.  Changes in the network topology

   If the network topology changes (not failure based) so that
   adjacencies that are assigned to BitPositions are no longer needed,
   the BIER-TE Controller can re-use those BitPositions for new
   adjacencies. First, these BitPositions need to be removed from any
   BFIR flow state and BFR BIFT state, then they can be repopulated,
   first into BIFT and then into the BFIR.

2.2.3.  Set up per-multicast flow BIER-TE state

   The BIER-TE Controller interacts with the multicast flow overlay to
   determine what multicast flow needs to be sent by a BFIR to which set
   of BFER. It calculates the desired distribution tree across the
   BIER-TE domain based on algorithms outside the scope of this document
   (e.g. CSPF, Steiner Tree, ...). It then pushes the calculated
   BitString into the BFIR.

   See [I-D.ietf-bier-multicast-http-response] for a solution describing
   this interaction.

2.2.4.  Link/Node Failures and Recovery

   When links or nodes fail or recover in the topology, BIER-TE can
   quickly respond with the optional FRR procedures described in
   [I-D.eckert-bier-te-frr]. It can also more slowly react by
   recalculating the BitStrings of affected multicast flows. This
   reaction is slower than the FRR procedure because the BIER-TE
   Controller needs to receive link/node up/down indications,
   recalculate the desired BitStrings and push them down into the BFIRs.
   With FRR, this is all performed locally on a BFR receiving the
   adjacency up/down notification.

2.3.  The BIER-TE Forwarding Layer

   When the BIER-TE Forwarding Layer receives a packet, it simply looks
   up the BitPositions that are set in the BitString of the packet in
   the Bit Index Forwarding Table (BIFT) that was populated by the
   BIER-TE Controller. For every BP that is set in the BitString, and
   that has one or more adjacencies in the BIFT, a copy is made
   according to the type of adjacencies for that BP in the BIFT. Before
   sending any copy, the BFR resets all BPs in the BitString of the
   packet for which the BFR has one or more adjacencies in the BIFT,
   except when the adjacency indicates "DoNotReset" (DNR, see
   Section 3.2.1). This is done to prevent packets from looping.

2.4.  The Routing Underlay

   For forward_connected adjacencies, BIER-TE sends BIER packets to
   directly connected BIER-TE neighbors as L2 (unicasted) BIER packets
   without requiring a routing underlay. For forward_routed
   adjacencies, BIER-TE forwarding encapsulates a copy of the BIER
   packet so that it can be delivered by the forwarding plane of the
   routing underlay to the routable destination address indicated in the
   adjacency. See Section 3.2.2 for the adjacency definition.

   BIER relies on the routing underlay to calculate paths towards BFER
   and derive next-hop BFR adjacencies for those paths.
   This commonly relies on BIER specific extensions to the routing
   protocols of the routing underlay but may also be established by a
   controller. In BIER-TE, the next-hops of a packet are determined by
   the bitstring through the adjacencies that the BIER-TE Controller
   established on the BFR for the BPs of the bitstring. There is thus
   no need for BFER specific routing underlay extensions to forward BIER
   packets with BIER-TE semantics.

   BIER encapsulations may have BFER independent extensions in the
   routing underlay, such as the label range for BIER packets in the
   BIER over MPLS encapsulation ([RFC8296]). These BIER specific
   functions of the routing underlay are equally useable by BIER-TE.
   Alternatively, these encapsulation parameters can be provisioned by
   the BIER-TE controller into the forward_connected or forward_routed
   adjacencies directly without relying on a routing underlay.

   If the BFR intends to support FRR for BIER-TE, then the BIER-TE
   forwarding plane needs to receive fast adjacency up/down
   notifications: Link up/down or neighbor up/down, e.g. from BFD.
   Providing these notifications is considered to be part of the routing
   underlay in this document.

2.5.  Traffic Engineering Considerations

   Traffic Engineering ([I-D.ietf-teas-rfc3272bis]) provides performance
   optimization of operational IP networks while utilizing network
   resources economically and reliably. The key elements needed to
   effect TE are policy, path steering and resource management. These
   elements require support at the control/controller level and within
   the forwarding plane.

   Policy decisions are made within the BIER-TE control plane, i.e.,
   within BIER-TE Controllers. Controllers use policy when composing
   BitStrings (BFR flow state) and BFR BIFT state. The mapping of user/
   IP traffic to specific BitStrings/BIER-TE flows is made based on
   policy.
   The specific details of BIER-TE policies and how a controller uses
   them are out of scope of this document.

   Path steering is supported via the definition of a BitString.
   BitStrings used in BIER-TE are composed based on policy and resource
   management considerations. When composing BIER-TE BitStrings, a
   Controller MUST take into account the resources available at each BFR
   and for each BP when it is providing congestion loss free services
   such as Rate Controlled Service Disciplines [RCSD94]. Resource
   availability could be provided for example via routing protocol
   information, but may also be obtained via a BIER-TE control protocol
   such as Netconf or any other protocol commonly used by a PCE to
   understand the resources of the network it operates on. The resource
   usage of the BIER-TE traffic admitted by the BIER-TE controller can
   be solely tracked on the BIER-TE Controller based on local accounting
   as long as no forward_routed adjacencies are used (see Section 3.2.2
   for the definition of forward_routed adjacencies). When
   forward_routed adjacencies are used, the paths selected by the
   underlying routing protocol need to be tracked as well.

   Resource management has implications on the forwarding plane beyond
   the BIER-TE defined steering of packets. This includes allocation of
   buffers to guarantee the worst case requirements of admitted RCSD
   traffic and potential policing and/or rate-shaping mechanisms,
   typically done via various forms of queuing. This level of resource
   control, while optional, is important in networks that wish to
   support congestion management policies to control or regulate the
   offered traffic to deliver different levels of service and alleviate
   congestion problems, or those networks that wish to control latencies
   experienced by specific traffic flows.

3.  BIER-TE Forwarding

3.1.  The Bit Index Forwarding Table (BIFT)

   The Bit Index Forwarding Table (BIFT) exists in every BFR. For every
   subdomain in use, it is a table indexed by SI:BitPosition and is
   populated by the BIER-TE control plane. Each index can be empty or
   contain a list of one or more adjacencies.

   BIER-TE can support multiple subdomains like BIER, each one with a
   separate BIFT.

   In the BIER architecture, indices into the BIFT are explained to be
   both BFR-id and SI:BitString (BitPosition). This is because there is
   a 1:1 relationship between BFR-id and SI:BitString - every bit in
   every SI is/can be assigned to a BFIR/BFER. In BIER-TE there are
   more bits used in each BitString than there are BFIR/BFER assigned to
   the bitstring. This is because of the bits required to express the
   engineered path through the topology. The BIER-TE forwarding
   definitions therefore do not use the term BFR-id at all. Instead,
   BFR-ids are only used as required by routing underlay, flow overlay
   or BIER headers. Please refer to Section 7 for explanations of how
   to deal with SI, subdomains and BFR-id in BIER-TE.
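
   The indexing described above can be pictured with a minimal data
   structure sketch. The class and method names (Bift, add, lookup) are
   hypothetical, adjacencies are represented as plain strings, and the
   BitStringLength of 256 is only an example value; none of this is a
   data model defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class Bift:
    """BIFT of one subdomain, indexed by (SI, BitPosition)."""
    bitstring_length: int
    entries: dict = field(default_factory=dict)

    def add(self, si, bp, adjacency):
        """Append an adjacency to the list stored at index SI:BP."""
        if not 1 <= bp <= self.bitstring_length:
            raise ValueError(f"BitPosition {bp} outside 1..{self.bitstring_length}")
        self.entries.setdefault((si, bp), []).append(adjacency)

    def lookup(self, si, bp):
        """Return the adjacency list for SI:BP; an empty index gives []."""
        return self.entries.get((si, bp), [])


# One separate BIFT per subdomain in use (subdomain ids 0 and 1 are examples).
bifts = {0: Bift(256), 1: Bift(256)}
bifts[0].add(0, 2, "forward_connected(if1, BFR2)")
bifts[0].add(0, 2, "forward_connected(if2, BFR3)")

if __name__ == "__main__":
    print(bifts[0].lookup(0, 2))  # two adjacencies on one index
    print(bifts[0].lookup(0, 5))  # empty index
```

   Keeping an empty list for unpopulated indices mirrors the statement
   that each index can be empty or hold one or more adjacencies.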

   ------------------------------------------------------------------
   | Index:          |  Adjacencies:                                |
   | SI:BitPosition  |  <empty> or one or more per entry            |
   ==================================================================
   | 0:1             |  forward_connected(interface,neighbor{,DNR}) |
   ------------------------------------------------------------------
   | 0:2             |  forward_connected(interface,neighbor{,DNR}) |
   |                 |  forward_connected(interface,neighbor{,DNR}) |
   ------------------------------------------------------------------
   | 0:3             |  local_decap({VRF})                          |
   ------------------------------------------------------------------
   | 0:4             |  forward_routed({VRF,}l3-neighbor)           |
   ------------------------------------------------------------------
   | 0:5             |  <empty>                                     |
   ------------------------------------------------------------------
   | 0:6             |  ECMP({adjacency1,...adjacencyN}, seed)      |
   ------------------------------------------------------------------
   ...
   | BitStringLength |  ...                                         |
   ------------------------------------------------------------------
                      Bit Index Forwarding Table

                      Figure 4: BIFT adjacencies

   The BIFT is programmed into the data plane of BFRs by the BIER-TE
   Controller and used to forward packets, according to the rules
   specified in the BIER-TE Forwarding Procedures.

   A BP that the BIER-TE Controller populates in more than one BFR does
   not have to have the same adjacencies on each of them. This is up to
   the BIER-TE Controller. BPs for p2p links are one case (see below).

   {VRF} indicates the Virtual Routing and Forwarding context into which
   the BIER payload is to be delivered. This is optional and depends on
   the multicast flow overlay.

3.2.  Adjacency Types

3.2.1.  Forward Connected

   A "forward_connected" adjacency is towards a directly connected BFR
   neighbor using an interface address of that BFR on the connecting
   interface.
   A forward_connected adjacency does not route packets but
   only L2 forwards them to the neighbor.

   Packets sent to an adjacency with "DoNotReset" (DNR) set in the BIFT
   will not have the BitPosition for that adjacency reset when the BFR
   creates a copy for it. The BitPosition will still be reset for
   copies of the packet made towards other adjacencies. This can be
   used for example in ring topologies as explained below.

3.2.2. Forward Routed

   A "forward_routed" adjacency is an adjacency towards a BFR that is
   not a forward_connected adjacency: towards a loopback address of a
   BFR or towards an interface address that is not directly connected.
   Forward_routed packets are forwarded via the Routing Underlay.

   If the Routing Underlay has multiple paths for a forward_routed
   adjacency, it will perform ECMP independent of BIER-TE for packets
   forwarded across a forward_routed adjacency. This is independent of
   BIER-TE ECMP described in Section 3.2.3.

   If the Routing Underlay has FRR, it will perform FRR independent of
   BIER-TE for packets forwarded across a forward_routed adjacency.

3.2.3. ECMP

   The ECMP mechanisms in BIER are tied to the BIER BIFT and are
   therefore not directly usable with BIER-TE. The following
   procedures describe ECMP for BIER-TE that we consider to be
   lightweight but also well manageable. They leverage the existing
   entropy parameter in the BIER header to keep packets of the same
   flow on the same path and introduce a "seed" parameter to allow
   traffic flows to be polarized or randomized across multiple hops.

   An "Equal Cost Multipath" (ECMP) adjacency has a list of two or more
   adjacencies included in it. It copies the BIER-TE packet to one of
   those adjacencies based on the ECMP hash calculation.
   The BIER-TE ECMP
   hash algorithm must select the same adjacency from that list for all
   packets with the same "entropy" value in the BIER-TE header if the
   same number of adjacencies and same seed are given as parameters.
   Further use of the seed parameter is explained below.

3.2.4. Local Decap

   A "local_decap" adjacency passes a copy of the payload of the BIER-TE
   packet to the packet's NextProto within the BFR (IPv4/IPv6,
   Ethernet,...). A local_decap adjacency turns the BFR into a BFER for
   matching packets. Local_decap adjacencies require the BFER to
   support routing or switching for NextProto to determine how to
   further process the packet.

3.3. Encapsulation considerations

   Specifications for BIER-TE encapsulation are outside the scope of
   this document. This section gives explanations and guidelines.

   Because a BFR needs to interpret the BitString of a BIER-TE packet
   differently from a BIER packet, it is necessary to distinguish BIER
   from BIER-TE packets. This is subject to definitions in BIER
   encapsulation specifications.

   MPLS encapsulation [RFC8296] for example assigns one label by which
   BFRs recognize BIER packets for every (SI,subdomain) combination.
   If it is desirable that every subdomain can forward only BIER or
   BIER-TE packets, then the label allocation could stay the same, and
   only the forwarding model (BIER/BIER-TE) would have to be defined per
   subdomain. If it is desirable to support both BIER and BIER-TE
   forwarding in the same subdomain, then additional labels would need
   to be assigned for BIER-TE forwarding.

   "forward_routed" requires an encapsulation that permits unicasting
   BIER-TE packets to a specific interface address on a target BFR.
   With MPLS encapsulation, this can simply be done via a label stack
   with that address's label as the top label - followed by the label
   assigned to (SI,subdomain) - and if necessary (see above) BIER-TE.
   With non-MPLS encapsulation, some form of IP encapsulation would be
   required (for example IP/GRE).

   The encapsulation used for "forward_routed" adjacencies can equally
   support existing advanced adjacency information such as "loose source
   routes" via e.g. MPLS label stacks or appropriate header extensions
   (e.g. for IPv6).

3.4. Basic BIER-TE Forwarding Example

   [RFC Editor: remove this section.]

   THIS SECTION TO BE REMOVED IN RFC BECAUSE IT WAS SUPERSEDED BY
   SECTION 1.1 EXAMPLE - UNLESS REVIEWERS CHIME IN AND EXPRESS DESIRE TO
   KEEP THIS ADDITIONAL EXAMPLE SECTION.

   Step by step example of basic BIER-TE forwarding. This does not use
   ECMP or forward_routed adjacencies nor does it try to minimize the
   number of required BitPositions for the topology.

                 [BIER-TE Controller]
                     /   |    \
                    v    v     v

         | p13   p1 |
         +- BFIR2 --+          |
         |          |   p2   p6 |           LAN2
         |          +-- BFR3 --+           |
         |          |          |  p7   p11 |
    Src -+                     +-- BFER1 --+
         |          |   p3   p8 |           |
         |          +-- BFR4 --+           +-- Rcv1
         |          |          |           |
         |          |
         | p14   p4 |
         +- BFIR1 --+          |
         |          +-- BFR5 --+  p10  p12 |
       LAN1         |   p5   p9 +-- BFER2 --+
                    |                      +-- Rcv2
                                           |
                                          LAN3

         IP  |..... BIER-TE network......| IP

                 Figure 5: BIER-TE Forwarding Example

   pXX indicates the BitPosition number assigned by the BIER-TE
   Controller to adjacencies in the BIER-TE topology. For example, p9
   is the adjacency towards BFR5 on the LAN connecting to BFER2.

      BIFT BFIR2:
        p13: local_decap()
         p2: forward_connected(BFR3)

      BIFT BFR3:
         p1: forward_connected(BFIR2)
         p7: forward_connected(BFER1)
         p8: forward_connected(BFR4)

      BIFT BFER1:
        p11: local_decap()
         p6: forward_connected(BFR3)
         p8: forward_connected(BFR4)

            Figure 6: BIER-TE Forwarding Example Adjacencies

   ...and so on.

   For example, we assume that some multicast traffic seen on LAN1 needs
   to be sent via BIER-TE by BFIR2 towards Rcv1 and Rcv2.
   The BIER-TE
   Controller determines that it wants to pass this traffic across the
   following paths:

                     -> BFER1 ---------------> Rcv1
      BFIR2 -> BFR3
                     -> BFR4 -> BFR5 -> BFER2 -> Rcv2

              Figure 7: BIER-TE Forwarding Example Paths

   These paths correspond to the following BitString: p2, p5, p7, p8,
   p10, p11, p12.

   This BitString is assigned by BFIR2 to the example multicast traffic
   received from LAN1.

   Then BFIR2 forwards this multicast traffic with BIER-TE based on that
   BitString. The BIFT of BFIR2 has only p2 and p13 populated. Only p2
   is in the BitString and this is an adjacency towards BFR3. BFIR2
   therefore resets p2 in the BitString and sends a copy towards BFR3.

   BFR3 sees a BitString of p5,p7,p8,p10,p11,p12. It is only interested
   in p1,p7,p8. It creates a copy of the packet to BFER1 (due to p7)
   and one to BFR4 (due to p8). It resets p7, p8 before sending.

   BFER1 sees a BitString of p5,p10,p11,p12. It is only interested in
   p6,p7,p8,p11 and therefore considers only p11. p11 is a "local_decap"
   adjacency installed by the BIER-TE Controller because BFER1 should
   pass packets to IP multicast. The local_decap adjacency instructs
   BFER1 to create a copy, decapsulate it from the BIER header and pass
   it on to the NextProtocol, in this example IP multicast. IP
   multicast will then forward the packet out to LAN2 because it did
   receive PIM or IGMP joins on LAN2 for the traffic.

   Further processing of the packet in BFR4, BFR5 and BFER2 happens
   accordingly.

3.5. Forwarding comparison with BIER

   Forwarding of BIER-TE is designed to allow common forwarding hardware
   with BIER. In fact, one of the main goals of this document is to
   encourage the building of forwarding hardware that can not only
   support BIER, but also BIER-TE - to allow experimentation with
   BIER-TE and support building of BIER-TE control plane code.
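
   As a software illustration of the forwarding behavior walked through
   in Section 3.4, the following minimal sketch replays the example
   BitString against the three BIFTs listed in Figure 6. The tuple
   encoding of adjacencies and the names BIFTS and forward are
   assumptions made for this sketch. The BIFTs of BFR4, BFR5 and BFER2
   are not listed in Figure 6, so the sketch records the copy arriving
   at BFR4 and stops there.

```python
from typing import Dict, List, Optional, Set, Tuple

# BIFTs exactly as listed in Figure 6: BitPosition -> adjacency.
# ("decap", None) stands for local_decap(), ("fwd", X) for
# forward_connected(X). This encoding is hypothetical.
BIFTS: Dict[str, Dict[int, Tuple[str, Optional[str]]]] = {
    "BFIR2": {13: ("decap", None), 2: ("fwd", "BFR3")},
    "BFR3": {1: ("fwd", "BFIR2"), 7: ("fwd", "BFER1"), 8: ("fwd", "BFR4")},
    "BFER1": {11: ("decap", None), 6: ("fwd", "BFR3"), 8: ("fwd", "BFR4")},
}


def forward(bfr: str, bits: Set[int], trace: List[tuple]) -> None:
    """Process one packet at `bfr`; `bits` are the BPs set in its BitString."""
    trace.append((bfr, frozenset(bits)))
    bift = BIFTS.get(bfr)
    if bift is None:
        return  # BIFT not given in Figure 6; the sketch stops here
    matched = sorted(bp for bp in bits if bp in bift)
    # Every copy has all BPs with an adjacency on this BFR reset (no DNR here).
    remaining = set(bits) - set(bift)
    for bp in matched:
        kind, neighbor = bift[bp]
        if kind == "decap":
            trace.append((bfr + ":decap", frozenset(remaining)))
        else:
            forward(neighbor, remaining, trace)


trace: List[tuple] = []
forward("BFIR2", {2, 5, 7, 8, 10, 11, 12}, trace)
for node, bitstring in trace:
    print(node, sorted(bitstring))
```

   The trace shows BFR3 receiving p5,p7,p8,p10,p11,p12 and both BFER1
   and BFR4 receiving p5,p10,p11,p12, matching the step-by-step
   description in Section 3.4.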

   The pseudocode in Section 6 shows how existing BIER/BIFT forwarding
   can be amended to support basic BIER-TE forwarding, by using BIER
   BIFT's F-BM. Only the masking of bits to avoid duplicates must
   be skipped when forwarding is for BIER-TE.

   Whether to use BIER or BIER-TE forwarding can simply be a configured
   choice per subdomain and accordingly be set up by a BIER-TE
   Controller. The BIER packet encapsulation [RFC8296] too can be
   reused without changes except that the currently defined BIER-TE ECMP
   adjacency does not leverage the entropy field so that field would be
   unused when BIER-TE forwarding is used.

3.6. Requirements

   Basic BIER-TE forwarding MUST support configuring subdomains to use
   basic BIER-TE forwarding rules (instead of BIER). With basic BIER-TE
   forwarding, every bit MUST support having zero or one adjacency. It
   MUST support the adjacency types forward_connected without DNR flag,
   forward_routed and local_decap. All other BIER-TE forwarding
   features are optional. These basic BIER-TE requirements make BIER-TE
   forwarding exactly the same as BIER forwarding with the exception of
   skipping the aforementioned F-BM masking on egress.

   BIER-TE forwarding SHOULD support the DNR flag, as this is highly
   useful to save bits in rings (see Section 4.6).

   BIER-TE forwarding MAY support more than one adjacency on a bit and
   ECMP adjacencies. The importance of ECMP adjacencies is unclear when
   traffic steering is used because it may be more desirable to
   explicitly steer traffic across non-ECMP paths to make per-path
   traffic calculation easier for BIER-TE Controllers. Having more than
   one adjacency for a bit allows further savings of bits in hub&spoke
   scenarios, but unlike rings it is less "natural" to flood traffic
   across multiple links unconditionally.
   Both ECMP and multiple
   adjacencies are forwarding plane features that should be possible to
   support later when needed as they do not impact the basic BIER-TE
   replication loop. This is true because there is no inter-copy
   dependency through resetting of F-BM as in BIER.

4. BIER-TE Controller BitPosition Assignments

   This section describes how the BIER-TE Controller can use the
   different BIER-TE adjacency types to define the BitPositions of a
   BIER-TE domain.

   Because the size of the BitString limits the size of the BIER-TE
   domain, many of the options described exist to support larger
   topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7,
   4.8).

4.1. P2P Links

   Each p2p link in the BIER-TE domain is assigned one unique
   BitPosition with a forward_connected adjacency pointing to the
   neighbor on the p2p link.

4.2. BFER

   Every non-Leaf BFER is given a unique BitPosition with a local_decap
   adjacency.

4.3. Leaf BFERs

        BFR1(P)      BFR2(P)            BFR1(P)      BFR2(P)
          |  \      /  |                  |            |
          |    X       |                  |            |
          |  /      \  |                  |            |
        BFER1(PE)  BFER2(PE)            BFER1(PE)----BFER2(PE)

          Leaf BFER /                    Non-Leaf BFER /
          PE-router                      PE-router

               Figure 8: Leaf vs. non-Leaf BFER Example

   Leaf BFERs are BFERs where incoming BIER-TE packets never need to be
   forwarded to another BFR but are only sent to the BFER to exit the
   BIER-TE domain. For example, in networks where PEs are spokes
   connected to P routers, those PEs are Leaf BFERs unless there is a
   U-turn between two PEs. Consider how redundant disjoint traffic can
   reach BFER1/BFER2 in the above picture: When BFER1/BFER2 are Non-Leaf
   BFER as shown on the right hand side, one traffic copy would be
   forwarded to BFER1 from BFR1, but the other one could only reach
   BFER1 via BFER2, which makes BFER2 a non-Leaf BFER. Likewise BFER1
   is a non-Leaf BFER when forwarding traffic to BFER2.

   Note that the BFERs in the left hand picture are only guaranteed to
   be leaf-BFER by a fitting routing configuration that prohibits
   transit traffic from passing through the BFERs, which is commonly
   applied in these topologies.

   All leaf-BFER in a BIER-TE domain can share a single BitPosition.
   This is possible because the BitPosition for the adjacency to reach
   the BFER can be used to distinguish whether or not packets should
   reach the BFER.

   This optimization will not work if an upstream interface of the BFER
   is using a BitPosition optimized as described in the following two
   sections (LAN, Hub and Spoke).

4.4. LANs

   In a LAN, the adjacency to each neighboring BFR on the LAN is given a
   unique BitPosition. The adjacency of this BitPosition is a
   forward_connected adjacency towards the BFR and this BitPosition is
   populated into the BIFT of all the other BFRs on that LAN.

              BFR1
               |p1
        LAN1-+-+---+-----+
          p3|   p4|   p2|
           BFR3  BFR4  BFR7

                        Figure 9: LAN Example

   If bandwidth on the LAN is not an issue and most BIER-TE traffic
   should be copied to all neighbors on a LAN, then BitPositions can be
   saved by assigning just a single BitPosition to the LAN and
   populating the BitPosition of the BIFT of each BFR on the LAN with
   a list of forward_connected adjacencies to all other neighbors on the
   LAN.

   This optimization does not work in the case of BFRs redundantly
   connected to more than one LAN with this optimization because these
   BFRs would receive duplicates and forward those duplicates into the
   opposite LANs. Adjacencies of such BFRs into their LANs still need a
   separate BitPosition.

4.5. Hub and Spoke

   In a setup with a hub and multiple spokes connected via separate p2p
   links to the hub, all p2p links can share the same BitPosition. The
   BitPosition on the hub's BIFT is set up with a list of
   forward_connected adjacencies, one for each Spoke.

   This option is similar to the BitPosition optimization in LANs:
   Redundantly connected spokes need their own BitPositions.

   This type of optimized BP could be used for example when all traffic
   is "broadcast" traffic (very dense receiver set) such as live-TV or
   situation-awareness (SA). This BP optimization can then be used to
   explicitly steer different traffic flows across different ECMP paths
   in Data-Center or broadband-aggregation networks with minimal use of
   BPs.

4.6. Rings

   In L3 rings, instead of assigning a single BitPosition for every p2p
   link in the ring, it is possible to save BitPositions by setting the
   "Do Not Reset" (DNR) flag on forward_connected adjacencies.

   For the rings shown in the following picture, a single BitPosition
   will suffice to forward traffic entering the ring at BFRa or BFRb all
   the way up to BFR1:

   On BFRa, BFRb, BFR30,... BFR3, the BitPosition is populated with a
   forward_connected adjacency pointing to the clockwise neighbor on the
   ring and with DNR set. On BFR2, the adjacency also points to the
   clockwise neighbor BFR1, but without DNR set.

   Handling DNR this way ensures that copies forwarded from any BFR in
   the ring to a BFR outside the ring will not have the ring BitPosition
   set, therefore minimizing the chance to create loops.

                    v          v
                    |          |
              L1    |    L2    |   L3
          /-------- BFRa ---- BFRb --------------------\
          |                                            |
          \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/
              |      |    L4               |      |
           p33|                         p15|
            BFRd                         BFRc

                       Figure 10: Ring Example

   Note that this example only permits packets to enter the ring at
   BFRa and BFRb, and that packets will always travel clockwise. If
   packets should be allowed to enter the ring at any ring BFR, then one
   would have to use two ring BitPositions, one for clockwise and one
   for counterclockwise.

   Both would be set up to stop rotating on the same link, e.g. L1.
   When the ingress ring BFR creates the clockwise copy, it will reset
   the counterclockwise BitPosition because the DNR bit only applies to
   the bit for which the replication is done. Likewise for the
   clockwise BitPosition for the counterclockwise copy. As a result,
   the ring ingress BFR will send a copy in both directions, serving
   BFRs on either side of the ring up to L1.

4.7. Equal Cost MultiPath (ECMP)

   The ECMP adjacency allows using just one BP per link bundle between
   two BFRs instead of one BP for each p2p member link of that link
   bundle. In the following picture, one BP is used across L1,L2,L3.

                 --L1-----
            BFR1 --L2----- BFR2
                 --L3-----

   BIFT entry in BFR1:
   ------------------------------------------------------------------
   | Index |  Adjacencies                                           |
   ==================================================================
   | 0:6   |  ECMP({forward_connected(L1, BFR2),                    |
   |       |        forward_connected(L2, BFR2),                    |
   |       |        forward_connected(L3, BFR2)}, seed)             |
   ------------------------------------------------------------------

   BIFT entry in BFR2:
   ------------------------------------------------------------------
   | Index |  Adjacencies                                           |
   ==================================================================
   | 0:6   |  ECMP({forward_connected(L1, BFR1),                    |
   |       |        forward_connected(L2, BFR1),                    |
   |       |        forward_connected(L3, BFR1)}, seed)             |
   ------------------------------------------------------------------

                       Figure 11: ECMP Example

   This document does not standardize any ECMP algorithm because it is
   sufficient for implementations to document their freely chosen ECMP
   algorithm. This allows the BIER-TE Controller to calculate ECMP
   paths and seeds. The following picture shows an example ECMP
   algorithm:

      forward(packet, ECMP(adj(0), adj(1),... adj(N-1), seed)):
         i = (packet(bier-header-entropy) XOR seed) % N
         forward packet to adj(i)

                  Figure 12: ECMP algorithm Example

   In the following example, all traffic from BFR1 towards BFR10 is
   intended to be ECMP load split equally across the topology. This
   example is not meant as a likely setup, but to illustrate that ECMP
   can be used to share BPs not only across link bundles, and it
   explains the use of the seed parameter.

                      BFR1 (BFIR)
                   /L11      \L12
                  /           \
              BFR2             BFR3
             /L21 \L22        /L31 \L32
            /      \         /      \
         BFR4     BFR5     BFR6     BFR7
            \      /         \      /
             \    /           \    /
              BFR8             BFR9
                 \             /
                  \           /
                   BFR10 (BFER)

   BIFT entry in BFR1:
   ------------------------------------------------------------------
   | 0:6   |  ECMP({forward_connected(L11, BFR2),                   |
   |       |        forward_connected(L12, BFR3)}, seed1)           |
   ------------------------------------------------------------------

   BIFT entry in BFR2:
   ------------------------------------------------------------------
   | 0:7   |  ECMP({forward_connected(L21, BFR4),                   |
   |       |        forward_connected(L22, BFR5)}, seed1)           |
   ------------------------------------------------------------------

   BIFT entry in BFR3:
   ------------------------------------------------------------------
   | 0:7   |  ECMP({forward_connected(L31, BFR6),                   |
   |       |        forward_connected(L32, BFR7)}, seed1)           |
   ------------------------------------------------------------------

   BIFT entry in BFR4, BFR5:
   ------------------------------------------------------------------
   | 0:8   |  forward_connected(Lxx, BFR8)  |xx differs on BFR4/BFR5|
   ------------------------------------------------------------------

   BIFT entry in BFR6, BFR7:
   ------------------------------------------------------------------
   | 0:8   |  forward_connected(Lxx, BFR9)  |xx differs on BFR6/BFR7|
   ------------------------------------------------------------------

   BIFT entry in BFR8, BFR9:
   ------------------------------------------------------------------
   | 0:9   |  forward_connected(Lxx, BFR10) |xx differs on BFR8/BFR9|
   ------------------------------------------------------------------

                   Figure 13: Polarization Example

   Note that for the following discussion of ECMP, only the BIFT ECMP
   adjacencies on BFR1, BFR2, BFR3 are relevant. The re-use of BP
   across BFR in this example is further explained in Section 4.9 below.

   With the setup of ECMP in the above topology, traffic would not be
   equally load-split. Instead, links L22 and L31 would see no traffic
   at all: BFR2 will only see traffic from BFR1 for which the ECMP hash
   in BFR1 selected the first adjacency in the list of 2 adjacencies
   given as parameters to the ECMP. It is link L11-to-BFR2. BFR2
   again performs ECMP with two adjacencies on that subset of traffic
   using the same seed1, and will therefore again select the first of
   its two adjacencies: L21-to-BFR4. Therefore L22 and BFR5 see no
   traffic. Likewise for L31 and BFR6.

   This issue in BFR2/BFR3 is called polarization. It results from the
   re-use of the same hash function across multiple consecutive hops in
   topologies like these. To resolve this issue, the ECMP adjacency on
   BFR1 can be set up with a different seed2 than the ECMP adjacencies
   on BFR2/BFR3. BFR2/BFR3 can use the same hash because packets will
   not sequentially pass across both of them. Therefore, they can also
   use the same BP 0:7.

   Note that ECMP solutions outside of BIER often hide the seed by
   auto-selecting it from local entropy such as unique local or next-hop
   identifiers.
   The solution chosen for BIER-TE, which allows the BIER-TE
   Controller to explicitly set the seed, maximizes the ability of the
   BIER-TE Controller to choose the seed independent of such seed
   sources that the BIER-TE Controller may not be able to control well,
   and even to calculate optimized seeds for multi-hop cases.

4.8. Routed adjacencies

4.8.1. Reducing BitPositions

   Routed adjacencies can reduce the number of BitPositions required
   when the path steering requirement is not hop-by-hop explicit path
   selection, but loose-hop selection. Routed adjacencies also make it
   possible to operate BIER-TE across intermediate hop routers that do
   not support BIER-TE.

                       ...............
              ...BFR1--...           ...--L1-- BFR2...
                      ...  .Routers. ...--L2--/
              ...BFR4--...           ...------ BFR3...
                       ...............         |
                                               LO
                       Network Area 1

                Figure 14: Routed Adjacencies Example

   Assume the requirement in the above picture is to explicitly steer
   traffic flows that have arrived at BFR1 or BFR4 via a shortest path
   in the routing underlay "Network Area 1" to one of the following
   three next segments: (1) BFR2 via link L1, (2) BFR2 via link L2, (3)
   via BFR3.

   To enable this, both BFR1 and BFR4 are set up with a forward_routed
   adjacency BitPosition towards an address of BFR2 on link L1, another
   forward_routed BitPosition towards an address of BFR2 on link L2 and
   a third forward_routed BitPosition towards a node address LO of BFR3.

4.8.2. Supporting nodes without BIER-TE

   Routed adjacencies also enable incremental deployment of BIER-TE.
   Only the nodes through which BIER-TE traffic needs to be steered -
   with or without replication - need to support BIER-TE. Where they
   are not directly connected to each other, forward_routed adjacencies
   are used to pass over non BIER-TE enabled nodes.

4.9. Reuse of BitPositions (without DNR)

   BitPositions can be re-used across multiple BFR to minimize the
   number of BP needed. This happens when adjacencies on multiple BFR
   use the DNR flag as described above, but it can also be done for
   non-DNR adjacencies. This section only discusses this non-DNR case.

   Because BP are reset after passing a BFR with an adjacency for that
   BP, reuse of BP across multiple BFR does not introduce any problems
   with duplicates or loops that do not also exist when every adjacency
   has a unique BP: Instead of setting one BP in a BitString that is
   reused in N adjacencies, one would get the same or worse results if
   each of these adjacencies had a unique BP and all of them were set
   in the BitString. Depending on the case, BPs can be reused
   without limitation, or they introduce fewer path steering choices, or
   they do not work.

   BP cannot be reused across two BFR that would need to be passed
   sequentially for some path: The first BFR will reset the BP, so those
   paths cannot be built. BP can be reused across BFR that would (A)
   only occur across different paths or (B) across different branches of
   the same tree.

   An example of (A) was given in Figure 13, where BP 0:7, BP 0:8 and BP
   0:9 are each reused across multiple BFR because a single packet/path
   would never be able to reach more than one BFR sharing the same BP.

   Assume the example was changed: BFR1 has no ECMP adjacency for BP
   0:6, but instead BP 0:5 with forward_connected to BFR2 and BP 0:6
   with forward_connected to BFR3.
   Packets with both BP 0:5 and BP 0:6
   would now be able to reach both BFR2 and BFR3 and the still existing
   re-use of BP 0:7 between BFR2 and BFR3 is a case of (B) where reuse
   of BP is perfect because it does not limit the set of useful path
   choices:

   If instead of reusing BP 0:7, BFR3 used a separate BP 0:10 for its
   ECMP adjacency, no useful additional path steering options would be
   enabled. If duplicates at BFR10 were undesirable, this would be
   done by not setting BP 0:5 and BP 0:6 for the same packet. If the
   duplicates were desirable (e.g.: resilient transmission), the
   additional BP 0:10 would also not render additional value.

   Reuse may also save BPs in larger topologies. Consider the topology
   shown in Figure 17 with the following assumptions: A BFIR/sender
   (e.g.: video headend) is attached to area 1, and areas 2...6
   contain receivers/BFER. Assume each area had a distribution ring,
   each with two BPs to indicate the direction (as explained in
   Section 4.6). These two BPs could be reused across the 5 areas.
   Packets would be replicated through other BPs to the desired subset
   of areas, and once a packet copy reaches the ring of the area, the
   two ring BPs come into play. This reuse is a case of (B), but it
   limits the topology choices: Packets can only flow around the same
   direction in the rings of all areas. This may or may not be
   acceptable based on the desired path steering options: If resilient
   transmission is the path engineering goal, then it is likely a good
   optimization. If the bandwidth of each ring was to be optimized
   separately, it would not be a good limitation.

4.10. Summary of BP optimizations

   This section reviewed a range of techniques by which a BIER-TE
   Controller can create a BIER-TE topology in a way that minimizes the
   number of necessary BPs.
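
   As one concrete illustration of the techniques reviewed here, the
   seed handling for multihop ECMP (Section 4.7, Figure 13) can be
   sketched as follows. This is a hedged sketch, not an algorithm
   defined by this document: the function names are hypothetical, and
   it uses a SHA-256 based hash as a stand-in rather than the example
   algorithm of Figure 12, because with only two adjacencies XOR-ing a
   seed into the entropy before the modulo merely swaps the two
   choices. The sketch counts flows per link on the BFR1 -> BFR2
   branch, once with the same seed on both hops and once with
   different seeds.

```python
import hashlib
from collections import Counter


def ecmp_index(entropy: int, seed: int, n: int) -> int:
    """Hypothetical ECMP hash: same (entropy, seed, n) always gives the same index."""
    digest = hashlib.sha256(f"{entropy}:{seed}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n


def link_load(seed_bfr1: int, seed_bfr2: int, flows=range(1000)) -> Counter:
    """Count flows per link on the BFR1 -> BFR2 -> {BFR4, BFR5} branch."""
    load = Counter()
    for entropy in flows:
        first = ("L11", "L12")[ecmp_index(entropy, seed_bfr1, 2)]
        load[first] += 1
        if first == "L11":  # only these packets reach BFR2
            second = ("L21", "L22")[ecmp_index(entropy, seed_bfr2, 2)]
            load[second] += 1
    return load


# Same seed on BFR1 and BFR2: BFR2 repeats BFR1's choice (polarization).
polarized = link_load(seed_bfr1=1, seed_bfr2=1)
# Different seeds: BFR2's choice is decoupled from BFR1's.
balanced = link_load(seed_bfr1=2, seed_bfr2=1)

print("same seed:      ", dict(polarized))
print("different seeds:", dict(balanced))
```

   With the same seed, link L22 carries no traffic at all, which is the
   polarization effect described in Section 4.7. With different seeds,
   both L21 and L22 carry a share of the traffic that arrives at BFR2.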

   Without any optimization, a BIER-TE Controller would attempt to map
   the network subnet topology 1:1 into the BIER-TE topology and every
   subnet adjacent neighbor requires a forward_connected BP and every
   BFER requires a local_decap BP.

   The optimizations described are then as follows:

   o  P2p links require only one BP (Section 4.1).

   o  All leaf-BFER can share a single local_decap BP (Section 4.3).

   o  A LAN with N BFR needs at most N BP (one for each BFR). It only
      needs one BP for all those BFR that are not redundantly connected
      to multiple LANs (Section 4.4).

   o  A hub with p2p connections to multiple non-leaf-BFER spokes can
      share one BP to all spokes if traffic can be flooded to all
      spokes, e.g.: because of no bandwidth concerns or dense receiver
      sets (Section 4.5).

   o  Rings of BFR can be built with just two BP (one for each
      direction) except for BFR with multiple ring connections - similar
      to LANs (Section 4.6).

   o  ECMP adjacencies to N neighbors can replace N BP with 1 BP.
      Multihop ECMP can avoid polarization through different seeds of
      the ECMP algorithm (Section 4.7).

   o  Routed adjacencies allow "tunneling" across non-BIER-TE capable
      routers and across BIER-TE capable routers where no traffic
      steering or replications are required (Section 4.8).

   o  BP can generally be reused across nodes that do not need to be
      consecutive in paths, but depending on scenario, this may limit
      the feasible path steering options (Section 4.9).

   Note that the described list of optimizations is not exhaustive.
   Especially when the set of required path steering choices is limited
   and the set of possible subsets of BFER that should be able to
   receive traffic is limited, further optimizations of BP are possible.
   The hub & spoke optimization is a simple example of such traffic
   pattern dependent optimizations.

5. Avoiding duplicates and loops

5.1. Loops

   Whenever BIER-TE creates a copy of a packet, the BitString of that
   copy will have all BitPositions cleared that are associated with
   adjacencies on the BFR. This inhibits looping of packets. The only
   exceptions are adjacencies with DNR set.

   With DNR set, looping can happen. Consider in the ring picture that
   link L4 from BFR3 is plugged into the L1 interface of BFRa. This
   creates a loop where the ring's clockwise BitPosition is never reset
   for copies of the packets traveling clockwise around the ring.

   To inhibit looping in the face of such physical misconfiguration,
   only forward_connected adjacencies are permitted to have DNR set, and
   the link layer port unique unicast destination address of the
   adjacency (e.g. MAC address) protects against closing the loop.
   Link layers without port unique link layer addresses should not be
   used with the DNR flag set.

5.2. Duplicates

   Duplicates happen when the topology of the BitString is not a tree
   but redundantly connects BFRs with each other. The BIER-TE
   Controller must therefore ensure that it only creates BitStrings
   that are trees in the topology.

   When links are incorrectly physically re-connected before the BIER-TE
   Controller updates BitStrings in BFIRs, duplicates can happen. Like
   loops, these can be inhibited by link layer addressing in
   forward_connected adjacencies.

   If interface or loopback addresses used in forward_routed adjacencies
   are moved from one BFR to another, duplicates can equally happen.
   Such re-addressing operations must be coordinated with the BIER-TE
   Controller.

6. BIER-TE Forwarding Pseudocode

   The following simplified pseudocode for BIER-TE forwarding is using
   BIER forwarding pseudocode of [RFC8279], section 6.5 with the one
   modification necessary to support basic BIER-TE forwarding.
   Like the
   BIER forwarding pseudocode, for simplicity it hides the details
   of the adjacency processing inside PacketSend(), which can be
   forward_connected, forward_routed or local_decap.

   void ForwardBitMaskPacket_withTE (Packet)
   {
       SI=GetPacketSI(Packet);
       Offset=SI*BitStringLength;
       for (Index = GetFirstBitPosition(Packet->BitString); Index ;
            Index = GetNextBitPosition(Packet->BitString, Index)) {
           F-BM = BIFT[Index+Offset]->F-BM;
           if (!F-BM) continue;
           BFR-NBR = BIFT[Index+Offset]->BFR-NBR;
           PacketCopy = Copy(Packet);
           PacketCopy->BitString &= F-BM;                  [2]
           PacketSend(PacketCopy, BFR-NBR);
           // The following must not be done for BIER-TE:
           // Packet->BitString &= ~F-BM;                  [1]
       }
   }

          Figure 15: Simplified BIER-TE Forwarding Pseudocode

   The difference is that in BIER-TE, step [1] must not be performed,
   but is replaced with [2] (when the forwarding plane algorithm is
   implemented verbatim as shown above).

   In BIER, the F-BM of a BP has all BP set that are meant to be
   forwarded via the same neighbor. It is used to reset those BP in the
   packet after the first copy to this neighbor has been made to inhibit
   multiple copies to the same neighbor.

   In BIER-TE, the F-BM of a particular BP with an adjacency is the list
   of all BPs with an adjacency on this BFR except the particular BP
   itself if it has an adjacency with the DNR bit set. The F-BM is used
   to reset the F-BM BPs before creating copies.

   In BIER, the order of BPs impacts the result of forwarding because of
   [1]. In BIER-TE, forwarding is not impacted by the order of BPs. It
   is therefore possible to optimize forwarding further than in BIER.
   For example, BIER-TE forwarding can be parallelized such that a
   parallel instance (such as an egress linecard) can process any subset
   of BPs without any considerations for the other BPs - and without any
   prior, cross-BP shared processing.

   The above simplified pseudocode is elaborated further as follows:

   o  This pseudocode eliminates per-bit F-BM, therefore reducing state
      by BitStringLength^2*SI and eliminating the need for a masking
      operation per packet copy, except for adjacencies with the DNR
      flag set:

      *  AdjacentBits[SI] are bits with a non-empty list of adjacencies.
         This can be computed whenever the BIER-TE Controller updates
         the adjacencies.

      *  Only the AdjacentBits need to be examined in the loop for
         packet copies.

      *  The AdjacentBits are cleared from the packet's BitString on
         ingress, before copies are made, to avoid packets looping.

   o  The code loops over the adjacencies because there may be more than
      one adjacency for a bit.

   o  When an adjacency has the DNR bit, the bit is set in the packet
      copy (to save bits in rings for example).

   o  The ECMP adjacency is shown.  Its parameters are a
      ListOfAdjacencies from which one is picked.

   o  The forward_connected, forward_routed and local_decap adjacencies
      are shown with their parameters.
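
   The points in the list above can be illustrated with a small
   runnable sketch.  It is not part of the specification: the names
   Adjacency, adjacent_bits and forward, the use of Python integers as
   BitStrings, and the dictionary standing in for one SI block of the
   BIFT are assumptions made for this illustration only.  ECMP
   adjacencies are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjacency:
    """Hypothetical stand-in for one BIFT adjacency."""
    kind: str           # "forward_connected", "forward_routed", "local_decap"
    neighbor: str = ""
    dnr: bool = False   # Do Not Reset, only used with forward_connected


def bit(bp):
    """Mask for a 1-based BitPosition."""
    return 1 << (bp - 1)


def adjacent_bits(bift):
    """Mask of all BPs that have a non-empty adjacency list on this BFR."""
    mask = 0
    for bp, adjacencies in bift.items():
        if adjacencies:
            mask |= bit(bp)
    return mask


def forward(bitstring, bift):
    """Return (adjacency, bitstring_of_copy) pairs for one received packet."""
    local = adjacent_bits(bift)      # would be precomputed on BIFT updates
    to_process = bitstring & local   # BPs of the packet this BFR acts on
    remaining = bitstring & ~local   # local BPs cleared to inhibit loops
    copies = []
    for bp, adjacencies in bift.items():
        if not to_process & bit(bp):
            continue
        for adjacency in adjacencies:   # a BP may have several adjacencies
            copy = remaining
            if adjacency.kind == "forward_connected" and adjacency.dnr:
                copy |= bit(bp)         # DNR: keep this BP set in the copy
            copies.append((adjacency, copy))
    return copies
```

   Each BP is handled without looking at the result for any other BP,
   which is the property that allows the per-BP work to be split across
   parallel instances.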

   void ForwardBitMaskPacket_withTE (Packet)
   {
       SI=GetPacketSI(Packet);
       Offset=SI*BitStringLength;
       // BPs of the packet that have adjacencies on this BFR
       AdjacentBitstring = Packet->BitString & AdjacentBits[SI];
       // Clear those BPs so that no copy carries them onward
       Packet->BitString &= ~AdjacentBits[SI];
       for (Index = GetFirstBitPosition(AdjacentBitstring); Index ;
            Index = GetNextBitPosition(AdjacentBitstring, Index)) {
           foreach adjacency BIFT[Index+Offset] {
               if(adjacency == ECMP(ListOfAdjacencies, seed) ) {
                   I = ECMP_hash(sizeof(ListOfAdjacencies),
                                 Packet->Entropy, seed);
                   adjacency = ListOfAdjacencies[I];
               }
               PacketCopy = Copy(Packet);
               switch(adjacency) {
                   case forward_connected(interface,neighbor,DNR):
                       // Index is 1-based: set this BP again if DNR
                       if(DNR)
                           PacketCopy->BitString |= 1<<(Index-1);
                       SendToL2Unicast(PacketCopy,interface,neighbor);
                       break;

                   case forward_routed({VRF},l3-neighbor):
                       SendToL3(PacketCopy,{VRF,}l3-neighbor);
                       break;

                   case local_decap({VRF},neighbor):
                       DecapBierHeader(PacketCopy);
                       PassTo(PacketCopy,{VRF,}Packet->NextProto);
                       break;
               }
           }
       }
   }

                Figure 16: BIER-TE Forwarding Pseudocode

7.  Managing SI, subdomains and BFR-ids

   When the number of bits required to represent the necessary hops in
   the topology and BFER exceeds the supported bitstring length,
   multiple SI and/or subdomains must be used.  This section discusses
   how.

   BIER-TE forwarding does not require the concept of BFR-id, but
   routing underlay, flow overlay and BIER headers may.  This section
   also discusses how BFR-ids can be assigned to BFIR/BFER for BIER-TE.

7.1.  Why SI and sub-domains

   For BIER and BIER-TE forwarding, the most important result of using
   multiple SI and/or subdomains is the same: Packets that need to be
   sent to BFER in different SI or subdomains require different BIER
   packets: each one with a bitstring for a different (SI,subdomain)
   combination.  Each such bitstring uses one bitstring length sized SI
   block in the BIFT of the subdomain.
   We call this a BIFT:SI (block).

   For BIER and BIER-TE forwarding itself there is also no difference
   whether different SI and/or sub-domains are chosen, but SI and
   subdomain have different purposes in the BIER architecture shared by
   BIER-TE.  This impacts how operators manage them and especially how
   flow overlays will likely use them.

   By default, every possible BFIR/BFER in a BIER network would likely
   be given a BFR-id in subdomain 0 (unless there are > 64k BFIR/BFER).

   If there are different flow services (or service instances) requiring
   replication to different subsets of BFER, then it will likely not be
   possible to achieve the best replication efficiency for all of these
   service instances via subdomain 0.  Ideal replication efficiency for
   N BFER exists in a subdomain if they are split over not more than
   ceiling(N/bitstring-length) SI.

   If service instances justify additional BIFT:SI state in the network,
   additional subdomains will be used: BFIR/BFER are assigned BFR-ids in
   those subdomains and each service instance is configured to use the
   most appropriate subdomain.  This results in improved replication
   efficiency for different services.

   Even if creation of subdomains and assignment of BFR-id to BFIR/BFER
   in those subdomains is automated, it is not expected that individual
   service instances can deal with BFER in different subdomains.  A
   service instance may only support configuration of a single subdomain
   it should rely on.

   To be able to easily reuse (and modify as little as possible)
   existing BIER procedures including flow-overlay and routing underlay,
   when BIER-TE forwarding is added, we therefore reuse SI and subdomain
   logically in the same way as they are used in BIER: All necessary
   BFIR/BFER for a service use a single BIER-TE BIFT and are split
   across as many SI as necessary (see below).
   Different services may use different subdomains that primarily exist
   to provide more efficient replication (and for BIER-TE desirable
   path steering) for different subsets of BFIR/BFER.

7.2.  Bit assignment comparison BIER and BIER-TE

   In BIER, bitstrings only need to carry bits for BFER, which leads to
   the model that BFR-ids map 1:1 to each bit in a bitstring.

   In BIER-TE, bitstrings need to carry bits to indicate not only the
   receiving BFER but also the intermediate hops/links across which the
   packet must be sent.  The maximum number of BFER that can be
   supported in a single bitstring or BIFT:SI depends on the number of
   bits necessary to represent the desired topology between them.

   "Desired" topology because it depends on the physical topology, and
   on the desire of the operator to allow for explicit path steering
   across every single hop (which requires more bits), or reducing the
   number of required bits by exploiting optimizations such as unicast
   (forward_routed), ECMP or flood (DNR) over "uninteresting" sub-parts
   of the topology - e.g. parts where different trees do not need to
   take different paths due to path steering reasons.

   The share of bits in a BIFT:SI that describe the topology, as
   opposed to BFER, can range widely based on the size of the topology
   and the number of alternative paths in it.  The higher that share,
   the higher the likelihood that those topology bits are not just
   BIER-TE overhead without additional benefit, but instead allow
   desirable path steering alternatives to be expressed.

7.3.  Using BFR-id with BIER-TE

   Because there is no 1:1 mapping between bits in the bitstring and
   BFER, BIER-TE cannot simply rely on the BIER 1:1 mapping between bits
   in a bitstring and BFR-id.

   In BIER, automatic schemes could assign all possible BFR-ids
   sequentially to BFERs.  This will not work in BIER-TE.
   In BIER-TE, the operator or BIER-TE Controller has to determine a
   BFR-id for each BFER in each required subdomain.  The BFR-id may or
   may not have a relationship with a bit in the bitstring.
   Suggestions are detailed below.  Once determined, the BFR-id can
   then be configured on the BFER and used by flow overlay, routing
   underlay and the BIER header almost the same as the BFR-id in BIER.

   The one exception is application/flow-overlays that automatically
   calculate the bitstring(s) of BIER packets by converting BFR-id to
   bits.  In BIER-TE, this operation can be done in two ways:

   "Independent branches":  For a given application or (set of) trees,
      the branches from a BFIR to every BFER are independent of the
      branches to any other BFER.  For example, shortest path trees have
      independent branches.

   "Interdependent branches":  When a BFER is added or deleted from a
      particular distribution tree, branches to other BFER still in the
      tree may need to change.  Steiner trees are examples of trees
      with interdependent branches.

   If "independent branches" are sufficient, the BIER-TE Controller can
   provide to such applications for every BFR-id a SI:bitstring with the
   BIER-TE bits for the branch towards that BFER.  The application can
   then independently calculate the SI:bitstring for all desired BFER by
   OR'ing their bitstrings.

   If "interdependent branches" are required, the application could call
   a BIER-TE Controller API with the list of required BFR-ids and get
   the required bitstring back.  Whenever the set of BFR-ids changes,
   this is repeated.

   Note that in either case (unlike in BIER), the bits in BIER-TE may
   need to change upon link/node failure/recovery, network expansion and
   network resource consumption by other traffic as part of traffic
   engineering goals (e.g.: re-optimization of lower priority traffic
   flows).
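
   As a minimal sketch of the "independent branches" case described
   above, the following shows how an application could OR per-BFER
   branches into one bitstring per SI.  The table BRANCHES, its BFR-ids
   and bit values, and the function names are hypothetical and chosen
   only for illustration; in practice the table would come from the
   BIER-TE Controller and would have to be refreshed whenever the
   Controller changes the bits.

```python
def bits(*bps):
    """Bitstring (as an integer) with the given 1-based BitPositions set."""
    mask = 0
    for bp in bps:
        mask |= 1 << (bp - 1)
    return mask


# Hypothetical table: BFR-id -> (SI, bitstring of the branch to that BFER)
BRANCHES = {
    11: (0, bits(1, 4, 9)),
    12: (0, bits(1, 4, 10)),
    27: (1, bits(2, 6)),
}


def bitstrings_for(bfr_ids, branches):
    """OR the branches of all desired BFER; returns {SI: bitstring}."""
    result = {}
    for bfr_id in bfr_ids:
        si, branch = branches[bfr_id]
        result[si] = result.get(si, 0) | branch
    return result
```

   Here bitstrings_for([11, 12, 27], BRANCHES) yields one bitstring for
   SI 0 that shares bits 1 and 4 between the two branches, and a second
   bitstring for SI 1, so two BIER-TE packets would be sent.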
   Interactions between such BFIR applications and the BIER-TE
   Controller do therefore need to support dynamic updates to the
   bitstrings.

7.4.  Assigning BFR-ids for BIER-TE

   For a non-leaf BFER, there is usually a single bit k for that BFER
   with a local_decap() adjacency on the BFER.  The BFR-id for such a
   BFER is therefore most easily the one it would have in BIER: SI *
   bitstring-length + k.

   As explained earlier in the document, leaf BFERs do not need such a
   separate bit because the fact alone that the BIER-TE packet is
   forwarded to the leaf BFER indicates that the BFER should decapsulate
   it.  Such a BFER will have one or more bits for the links leading
   only to it.  The BFR-id could therefore most easily be the BFR-id
   derived from the lowest bit for those links.

   These two rules are only recommendations for the operator or BIER-TE
   Controller assigning the BFR-ids.  Any allocation scheme can be used;
   the BFR-ids just need to be unique across BFRs in each subdomain.

   It is not currently determined if a single subdomain could or should
   be allowed to forward both BIER and BIER-TE packets.  If this should
   be supported, there are two options:

   A.  BIER and BIER-TE have different BFR-ids in the same subdomain.
   This allows higher replication efficiency for BIER because its BFR-
   ids can be assigned sequentially, while the bitstrings for BIER-TE
   will also have the additional bits for the topology.  There is no
   relationship between a BFR's BIER BFR-id and its BIER-TE BFR-id.

   B.  BIER and BIER-TE share the same BFR-id.  The BFR-ids are assigned
   as explained above for BIER-TE and simply reused for BIER.  The
   replication efficiency for BIER will be as low as that for BIER-TE in
   this approach.  Depending on topology, only the same 20%..80% of bits
   as possible for BIER-TE can be used for BIER.

7.5.  Example bit allocations

7.5.1.  With BIER

   Consider a network setup with a bitstring length of 256 for a network
   topology as shown in the picture below.  The network has 6 areas,
   each with ca. 170 BFR, connecting via a core with some larger (core)
   BFR.  To address all BFER with BIER, 4 SI are required.  To send a
   BIER packet to all BFER in the network, 4 copies need to be sent by
   the BFIR.  On the BFIR it does not make a difference how the BFR-ids
   are allocated to BFER in the network, but for efficiency further down
   in the network it does make a difference.

             area1           area2        area3
            BFR1a BFR1b  BFR2a BFR2b   BFR3a BFR3b
              |  \         /    \        /  |
              ................................
              .                Core          .
              ................................
              |    /       \    /        \  |
            BFR4a BFR4b  BFR5a BFR5b   BFR6a BFR6b
              area4          area5       area6

                Figure 17: Scaling BIER-TE bits by reuse

   With random allocation of BFR-ids to BFER, each receiving area would
   (most likely) have to receive all 4 copies of the BIER packet because
   there would be BFR-ids for each of the 4 SI in each of the areas.
   Only further towards each BFER would this duplication subside - when
   each of the 4 trees runs out of branches.

   If BFR-ids are allocated intelligently, then all the BFER in an area
   would be given BFR-ids with as few different SI as possible.  Each
   area would only have to forward one or two packets instead of 4.

   Given how networks can grow over time, replication efficiency in an
   area will also easily go down over time when BFR-ids are allocated
   sequentially network wide.  An area that initially only has BFR-ids
   in one SI might end up with many SI over a longer period of growth.
   Allocating SIs to areas with initially sufficiently many spare bits
   for growth can help to alleviate this issue.  Alternatively, BFR-ids
   can be renumbered after network expansion.  In this example one may
   consider using 6 SI and assigning one to each area.
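
   The effect of the allocation strategy can be checked with a short
   calculation.  The sketch below assumes exactly 170 BFER in each of
   the 6 areas and the BIER mapping of a BFR-id to its SI,
   (BFR-id - 1) / BitStringLength rounded down.  The function names and
   the three allocation strategies are illustrative assumptions, not
   part of this specification.

```python
import random

BITSTRING_LENGTH = 256
AREAS = 6
BFER_PER_AREA = 170


def si_of(bfr_id, bsl=BITSTRING_LENGTH):
    """SI that a 1-based BFR-id falls into."""
    return (bfr_id - 1) // bsl


def sis_per_area(allocation):
    """Number of distinct SI (packet copies) each area has to receive."""
    return [len({si_of(i) for i in ids}) for ids in allocation]


def sequential():
    """BFR-ids 1..1020 handed out area by area, using 4 SI in total."""
    return [list(range(a * BFER_PER_AREA + 1, (a + 1) * BFER_PER_AREA + 1))
            for a in range(AREAS)]


def one_si_per_area():
    """One SI reserved per area, leaving spare bits for growth."""
    return [list(range(a * BITSTRING_LENGTH + 1,
                       a * BITSTRING_LENGTH + BFER_PER_AREA + 1))
            for a in range(AREAS)]


def shuffled(seed=0):
    """BFR-ids 1..1020 spread over the areas at random."""
    ids = list(range(1, AREAS * BFER_PER_AREA + 1))
    random.Random(seed).shuffle(ids)
    return [ids[a * BFER_PER_AREA:(a + 1) * BFER_PER_AREA]
            for a in range(AREAS)]
```

   With these assumptions, sis_per_area(sequential()) gives one or two
   SI per area and sis_per_area(one_si_per_area()) gives exactly one,
   while a random allocation will almost certainly need all 4 SI in
   every area.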

   This example shows that intelligent BFR-id allocation within at least
   subdomain 0 can be helpful or even necessary in BIER.

7.5.2.  With BIER-TE

   In BIER-TE one needs to determine a subset of the physical topology
   and attached BFER so that the "desired" representation of this
   topology and the BFER fit into a single bitstring.  This process
   needs to be repeated until the whole topology is covered.

   Once bits/SIs are assigned to topology and BFER, BFR-id is just a
   derived set of identifiers from the operator/BIER-TE Controller as
   explained above.

   Every time that different sub-topologies have overlap, bits need to
   be repeated across the bitstrings, increasing the overall number of
   bits required across all bitstring/SIs.  In the worst case, random
   subsets of BFER are assigned to different SI.  This is much worse
   than in BIER because it not only reduces replication efficiency with
   the same number of overall bits, but even further - because more bits
   are required due to duplication of bits for topology across multiple
   SI.  Intelligent BFER to SI assignment and selecting specific
   "desired" subtopologies can minimize this problem.

   To set up BIER-TE efficiently for the above topology, the following
   bit allocation method can be used.  This method can easily be
   expanded to other, similarly structured larger topologies.

   Each area is allocated one or more SI depending on the number of
   future expected BFER and number of bits required for the topology in
   the area.  In this example, 6 SI, one per area.

   In addition, we use 4 bits in each SI: bia, bib, bea, beb: bit
   ingress a, bit ingress b, bit egress a, bit egress b.  These bits
   will be used to pass BIER packets from any BFIR via any combination
   of ingress area a/b BFR and egress area a/b BFR into a specific
   target area.
   These bits are then set up with the right forward_routed
   adjacencies on the BFIR and area edge BFR:

   On all BFIR in an area j, bia in each BIFT:SI is populated with the
   same forward_routed(BFRja), and bib with forward_routed(BFRjb).  On
   all area edge BFR, bea in BIFT:SI=k is populated with
   forward_routed(BFRka) and beb in BIFT:SI=k with
   forward_routed(BFRkb).

   For BIER-TE forwarding of a packet to some subset of BFER across all
   areas, a BFIR would create at most 6 copies, with SI=1...SI=6.  In
   each packet, the bits indicate bits for topology and BFER in that
   topology plus the four bits to indicate whether to pass this packet
   via the ingress area a or b border BFR and the egress area a or b
   border BFR, therefore allowing path steering for those two "unicast"
   legs: 1) BFIR to ingress area edge and 2) core to egress area edge.
   Replication only happens inside the egress areas.  For BFER in the
   same area as the BFIR, these four bits are not used.

7.6.  Summary

   Like BIER, BIER-TE can support multiple SI within a sub-domain to
   allow re-using the concept of BFR-id and therefore minimize BIER-TE
   specific functions in underlay routing, flow overlay methods and BIER
   headers.

   The number of BFIR/BFER possible in a subdomain is smaller than in
   BIER because BIER-TE uses additional bits for topology.

   Subdomains can be used in BIER-TE like in BIER to create more
   efficient replication to known subsets of BFER.

   Assigning bits for BFER intelligently into the right SI is more
   important in BIER-TE than in BIER because of replication efficiency
   and the overall number of bits required.

8.  BIER-TE and Segment Routing

   SR aims to enable lightweight path steering via loose source routing.
   Compared to its more heavy-weight predecessor RSVP-TE, SR does for
   example not require per-path signaling to each of these hops.

   BIER-TE supports the same design philosophy for multicast.  Like
   SR, it relies on source-routing - via the definition of a BitString.
   Like SR, it only requires consideration of the "hops" on which
   either replication has to happen, or across which the traffic should
   be steered (even without replication).  Any other hops can be skipped
   via the use of routed adjacencies.

   BIER-TE BitPositions (BPs) can be understood as the BIER-TE
   equivalent of "forwarding segments" in SR, but they have a different
   scope than SR forwarding segments.  Whereas forwarding segments in SR
   are global or local, BPs in BIER-TE have a scope that is the group of
   BFR(s) that have adjacencies for this BP in their BIFT.  This can be
   called "adjacency" scoped forwarding segments.

   Adjacency scope could be global, but then every BFR would need an
   adjacency for this BP, for example a forward_routed adjacency with
   encapsulation to the global SR SID of the destination.  Such a BP
   would always result in ingress replication though.  The first BFR
   encountering this BP would directly replicate to it.  Only by using
   non-global adjacency scope for BPs can traffic be steered and
   replicated on non-ingress BFR.

   SR can naturally be combined with BIER-TE and help to optimize it.
   For example, instead of defining BitPositions for non-replicating
   hops, it is equally possible to use segment routing encapsulations
   (e.g. MPLS label stacks) for the encapsulation of "forward_routed"
   adjacencies.

   Note that BIER itself can also be seen to be similar to SR.  BIER BPs
   act as global destination Node-SIDs and the BIER bitstring is simply
   a highly optimized mechanism to indicate multiple such SIDs and let
   the network take care of effectively replicating the packet hop-by-
   hop to each destination Node-SID.
   What BIER does not allow is to indicate intermediate hops, or, in
   terms of SR, the ability to indicate a sequence of SIDs to reach the
   destination.  This is what BIER-TE and its adjacency scoped BPs
   enable.

   Both BIER and BIER-TE allow BFIR to "opportunistically" copy packets
   to a set of desired BFER on a packet-by-packet basis.  In BIER, this
   is done by OR'ing the BP for the desired BFER.  In BIER-TE this can
   be done by OR'ing for each desired BFER a bitstring using the
   "independent branches" approach described in Section 7.3 and
   therefore also indicating the engineered path towards each desired
   BFER.  This is the approach that
   [I-D.ietf-bier-multicast-http-response] relies on.

9.  Security Considerations

   The security considerations are the same as for BIER with the
   following differences:

   BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures
   for their distribution, so these are not attack vectors against BIER-
   TE.

10.  IANA Considerations

   This document requests no action by IANA.

11.  Acknowledgements

   The authors would like to thank Greg Shepherd, Ijsbrand Wijnands,
   Neale Ranns, Dirk Trossen, Sandy Zheng, Lou Berger and Jeffrey Zhang
   for their reviews and suggestions.

12.  Change log [RFC Editor: Please remove]

   draft-ietf-bier-te-arch:

      09: Incorporated fixes for feedback from Shepherd (Xuesong Geng).

      Added references for Bloom Filters and Rate Controlled Service
      Disciplines.

      1.1 Fixed numbering of example 1 topology explanation.  Improved
      language on second example (less abbreviating to avoid confusion
      about meaning).

      1.2 Improved explanation of BIER-TE topology, fixed terminology of
      graphs (BIER-TE topology is a directed graph where the edges are
      the adjacencies).

      2.4 Fixed and amended routing underlay explanations: detailed why
      no need for BFER routing underlay routing protocol extensions, but
      potential to re-use BIER routing underlay routing protocol
      extensions for non-BFER related extensions.

      3.1 Added explanation for VRF and its use in adjacencies.

      08: Incorporated (with hopefully acceptable fixes) for Lou
      suggested section 2.5, TE considerations.

      Fixes are primarily to the point to a) emphasize that BIER-TE does
      not depend on the routing underlay unless forward_routed
      adjacencies are used, and b) that the allocation and tracking of
      resources does not explicitly have to be tied to BPs, because they
      are just steering labels.  Instead, it would ideally come from
      per-hop resource management that can be maintained only via local
      accounting in the controller.

      07: Further reworking text for Lou.

      Renamed BIER-PE to BIER-TE standing for "Tree Engineering" after
      votes from BIER WG.

      Removed section 1.1 (introduced by version 06) because not
      considered necessary in this doc by Lou (for framework doc).

      Added [RFC editor pls. remove] Section to explain name change to
      future reviewers.

      06: Concern by Lou Berger re. BIER-TE as full traffic engineering
      solution.

      Changed title "Traffic Engineering" to "Path Engineering".

      Added intro section of relationship BIER-PE to traffic
      engineering.

      Changed "traffic engineering" term in text to "path engineering",
      where appropriate.

      Other:

      Shortened "BIER-TE Controller Host" to "BIER-TE Controller".
      Fixed up all instances of controller to do this.

      05: Review Jeffrey Zhang.

      Part 2:

      4.3 added note about leaf-BFER being also a property of routing
      setup.

      4.7 Added missing details from example to avoid confusion with
      routed adjacencies, also compressed explanatory text and better
      justification why seed is explicitly configured by controller.

      4.9 added section discussing generic reuse of BP methods.

      4.10 added section summarizing BP optimizations of section 4.

      6.  Rewrote/compressed explanation of comparison BIER/BIER-TE
      forwarding difference.  Explained benefit of BIER-TE per-BP
      forwarding being independent of forwarding for other BPs.

      Part 1:

      Explicitly use forward_connected adjacency in ECMP adjacency
      examples to avoid confusion.

      4.3 Add picture as example for leaf vs. non-leaf BFR in topology.
      Improved description.

      4.5 Example for traffic that can be broadcast -> for single BP in
      hub&spoke.

      4.8.1 Simplified example picture for routed adjacency, explanatory
      text.

      Review from Dirk Trossen:

      Fixed up explanation of ICC paper vs. bloom filter.

      04: spell check run.

      Added remaining fixes for Sandy's (Zhang Zheng) review:

      4.7 Enhance ECMP explanations:

      example ECMP algorithm, highlight that doc does not standardize
      ECMP algorithm.

      Review from Dirk Trossen:

      1.  Added mentioning of prior work for traffic engineered paths
      with bloom filters.

      2.  Changed title from layers to components and added "BIER-TE
      control plane" to "BIER-TE Controller" to make it clearer, what it
      does.

      2.2.3.  Added reference to I-D.ietf-bier-multicast-http-response
      as an example solution.

      2.3. clarified sentence about resetting BPs before sending copies
      (also forgot to mention DNR here).

      3.4.  Added text saying this section will be removed unless IESG
      review finds enough redeeming value in this example given how -03
      introduced section 1.1 with basic examples.

      7.2.
      Removed explicit numbers 20%/80% for number of topology bits
      in BIER-TE, replaced with more vague (high/low) description,
      because we do not have good reference material.  Added text saying
      this section will be removed unless IESG review finds enough
      redeeming value in this example given how -03 introduced section
      1.1 with basic examples.

      many typos fixed.  Thanks a lot.

      03: Last call textual changes by authors to improve readability:

      removed Wolfgang Braun as co-author (as requested).

      Improved abstract to be more explanatory.  Removed mentioning of
      FRR (not concluded on so far).

      Added new text into Introduction section because the text was too
      difficult to jump into (too many forward pointers).  This
      primarily consists of examples and the early introduction of the
      BIER-TE Topology concept enabled by these examples.

      Amended comparison to SR.

      Changed syntax from [VRF] to {VRF} to indicate it is optional and
      to make idnits happy.

      Split references into normative / informative, added references.

      02: Refresh after IETF104 discussion: changed intended status back
      to standard.  Reasoning:

      Tighter review of standards document == ensures arch will be
      better prepared for possible adoption by other WGs (e.g.  DetNet)
      or std. bodies.

      Requirement against the degree of existing implementations is self
      defined by the WG.  BIER WG seems to think it is not necessary to
      apply multiple interoperating implementations against an
      architecture level document at this time to make it qualify to go
      to standards track.  Also, the levels of support introduced in -01
      rev. should allow all BIER forwarding engines to also be able to
      support the base level BIER-TE forwarding.

      01: Added note comparing BIER and SR to also hopefully clarify
      BIER-TE vs. BIER comparison re. SR.

      - added requirements section mandating only most basic BIER-TE
      forwarding features as MUST.

      - reworked comparison with BIER forwarding section to only
      summarize and point to pseudocode section.

      - reworked pseudocode section to have one pseudocode that mirrors
      the BIER forwarding pseudocode to make comparison easier and a
      second pseudocode that shows the complete set of BIER-TE
      forwarding options and simplification/optimization possible vs.
      BIER forwarding.  Removed MyBitsOfInterest (was pure
      optimization).

      - Added captions to pictures.

      - Part of review feedback from Sandy (Zhang Zheng) integrated.

      00: Changed target state to experimental (WG conclusion), updated
      references, mod auth association.

      - Source now on http://www.github.com/toerless/bier-te-arch

      - Please open issues on the github for change/improvement requests
      to the document - in addition to posting them on the list
      (bier@ietf.).  Thanks!

   draft-eckert-bier-te-arch:

      06: Added overview of forwarding differences between BIER, BIER-
      TE.

      05: Author affiliation change only.

      04: Added comparison to Live-Live and BFIR to FRR section
      (Eckert).

      04: Removed FRR content into the new FRR draft [I-D.eckert-bier-
      te-frr] (Braun).

      - Linked FRR information to new draft in Overview/Introduction

      - Removed BTAFT/FRR from "Changes in the network topology"

      - Linked new draft in "Link/Node Failures and Recovery"

      - Removed FRR from "The BIER-TE Forwarding Layer"

      - Moved FRR section to new draft

      - Moved FRR parts of Pseudocode into new draft

      - Left only non FRR parts

      - removed FrrUpDown(..) and //FRR operations in
      ForwardBierTePacket(..)

      - New draft contains FrrUpDown(..)
      and ForwardBierTePacket(Packet) from bier-arch-03

      - Moved "BIER-TE and existing FRR" to new draft

      - Moved "BIER-TE and Segment Routing" section one level up

      - Thus, removed "Further considerations" that only contained this
      section

      - Added Changes for version 04

      03: Updated the FRR section.  Added examples for FRR key concepts.
      Added BIER-in-BIER tunneling as option for tunnels in backup
      paths.  BIFT structure is expanded and contains an additional
      match field to support full node protection with BIER-TE FRR.

      03: Updated FRR section.  Explanation how BIER-in-BIER
      encapsulation provides P2MP protection for node failures even
      though the routing underlay does not provide P2MP.

      02: Changed the definition of BIFT to be more in line with BIER.
      In revs. up to -01, the idea was that a BIFT has only entries for
      a single bitstring, and every SI and subdomain would be a separate
      BIFT.  In BIER, each BIFT covers all SI.  This is now also how we
      define it in BIER-TE.

      02: Added Section 7 to explain the use of SI, subdomains and BFR-
      id in BIER-TE and to give an example how to efficiently assign
      bits for a large topology requiring multiple SI.

      02: Added further details for rings - how to support input from
      all ring nodes.

      01: Fixed BFIR -> BFER for section 4.3.

      01: Added explanation of SI, difference to BIER ECMP,
      consideration for Segment Routing, unicast FRR, considerations for
      encapsulation, explanations of BIER-TE Controller and CLI.

      00: Initial version.

13.  References

13.1.  Normative References

   [RFC8279]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Przygienda, T., and S. Aldrin, "Multicast Using Bit Index
              Explicit Replication (BIER)", RFC 8279,
              DOI 10.17487/RFC8279, November 2017.

   [RFC8296]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Tantsura, J., Aldrin, S., and I.
              Meilik, "Encapsulation for Bit Index Explicit Replication
              (BIER) in MPLS and Non-MPLS Networks", RFC 8296,
              DOI 10.17487/RFC8296, January 2018.

13.2.  Informative References

   [Bloom70]  Bloom, B., "Space/time trade-offs in hash coding with
              allowable errors", Comm. ACM 13(7):422-6, July 1970.

   [I-D.ietf-bier-multicast-http-response]
              Trossen, D., Rahman, A., Wang, C., and T. Eckert,
              "Applicability of BIER Multicast Overlay for Adaptive
              Streaming Services", draft-ietf-bier-multicast-http-
              response-04 (work in progress), July 2020.

   [I-D.ietf-roll-ccast]
              Bergmann, O., Bormann, C., Gerdes, S., and H. Chen,
              "Constrained-Cast: Source-Routed Multicast for RPL",
              draft-ietf-roll-ccast-01 (work in progress), October 2017.

   [I-D.ietf-teas-rfc3272bis]
              Farrel, A., "Overview and Principles of Internet Traffic
              Engineering", draft-ietf-teas-rfc3272bis-01 (work in
              progress), July 2020.

   [ICC]      Reed, M., Al-Naday, M., Thomos, N., Trossen, D.,
              Petropoulos, G., and S. Spirou, "Stateless multicast
              switching in software defined networks", IEEE
              International Conference on Communications (ICC), Kuala
              Lumpur, Malaysia, 2016, May 2016.

   [RCSD94]   Zhang, H. and D. Domenico, "Rate-Controlled Service
              Disciplines", Journal of High-Speed Networks, 1994, May
              1994.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

Authors' Addresses

   Toerless Eckert (editor)
   Futurewei Technologies Inc.
   2330 Central Expy
   Santa Clara 95050
   USA

   Email: tte+ietf@cs.fau.de


   Gregory Cauchie
   Bouygues Telecom

   Email: GCAUCHIE@bouyguestelecom.fr


   Michael Menth
   University of Tuebingen

   Email: menth@uni-tuebingen.de