Network Working Group                                     T. Eckert, Ed.
Internet-Draft                                                 Futurewei
Intended status: Standards Track                              G. Cauchie
Expires: May 4, 2020                                    Bouygues Telecom
                                                                M. Menth
                                                 University of Tuebingen
                                                        November 1, 2019


    Traffic Engineering for Bit Index Explicit Replication (BIER-TE)
                       draft-ietf-bier-te-arch-05

Abstract

   This memo introduces per-packet stateless strict and loose path
   engineered replication and forwarding for Bit Index Explicit
   Replication packets ([RFC8279]).  This is called BIER-TE.

   BIER-TE leverages the BIER architecture ([RFC8279]) and extends it
   with a new semantic for bits in the bitstring.  BIER-TE can leverage
   BIER forwarding engines with little or no changes.

   In BIER, the BitPositions (BP) of the packet's bitstring indicate
   BIER Forwarding Egress Routers (BFER), and hop-by-hop forwarding uses
   a Routing Underlay such as an IGP.

   In BIER-TE, BitPositions indicate adjacencies.  The BIFT of each BFR
   is only populated with BPs that are adjacent to the BFR in the
   BIER-TE topology.  The BIER-TE topology can consist of layer 2 or
   remote (routed) adjacencies.  The BFR then replicates and forwards
   BIER packets to those adjacencies.  This results in the
   aforementioned strict and loose path forwarding.

   BIER-TE can co-exist with BIER forwarding in the same domain, for
   example by using separate sub-domains.  In the absence of routed
   adjacencies, BIER-TE does not require a BIER routing underlay, and
   can then be operated without requiring an IGP routing protocol.

   BIER-TE operates without explicit in-network tree-building and
   carries the multicast distribution tree in the packet header.  It can
   therefore be a good fit to support multicast path steering in Segment
   Routing (SR) networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Basic Examples
     1.2.  BIER-TE Topology and adjacencies
     1.3.  Comparison with BIER
     1.4.  Requirements Language
   2.  Components
     2.1.  The Multicast Flow Overlay
     2.2.  The BIER-TE Controller Host
       2.2.1.  Assignment of BitPositions to adjacencies of the
               network topology
       2.2.2.  Changes in the network topology
       2.2.3.  Set up per-multicast flow BIER-TE state
       2.2.4.  Link/Node Failures and Recovery
     2.3.  The BIER-TE Forwarding Layer
     2.4.  The Routing Underlay
   3.  BIER-TE Forwarding
     3.1.  The Bit Index Forwarding Table (BIFT)
     3.2.  Adjacency Types
       3.2.1.  Forward Connected
       3.2.2.  Forward Routed
       3.2.3.  ECMP
       3.2.4.  Local Decap
     3.3.  Encapsulation considerations
     3.4.  Basic BIER-TE Forwarding Example
     3.5.  Forwarding comparison with BIER
     3.6.  Requirements
   4.  BIER-TE Controller Host BitPosition Assignments
     4.1.  P2P Links
     4.2.  BFER
     4.3.  Leaf BFERs
     4.4.  LANs
     4.5.  Hub and Spoke
     4.6.  Rings
     4.7.  Equal Cost MultiPath (ECMP)
     4.8.  Routed adjacencies
       4.8.1.  Reducing BitPositions
       4.8.2.  Supporting nodes without BIER-TE
     4.9.  Reuse of BitPositions (without DNR)
     4.10. Summary of BP optimizations
   5.  Avoiding loops and duplicates
     5.1.  Loops
     5.2.  Duplicates
   6.  BIER-TE Forwarding Pseudocode
   7.  Managing SI, subdomains and BFR-ids
     7.1.  Why SI and sub-domains
     7.2.  Bit assignment comparison BIER and BIER-TE
     7.3.  Using BFR-id with BIER-TE
     7.4.  Assigning BFR-ids for BIER-TE
     7.5.  Example bit allocations
       7.5.1.  With BIER
       7.5.2.  With BIER-TE
     7.6.  Summary
   8.  BIER-TE and Segment Routing (SR)
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. Change log [RFC Editor: Please remove]
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   BIER-TE shares architecture, terminology and packet formats with BIER
   as described in [RFC8279] and [RFC8296].  This document describes
   BIER-TE in the expectation that the reader is familiar with these two
   documents.

   In BIER-TE, BitPositions (BP) indicate adjacencies.  The BIFT of each
   BFR is only populated with BPs that are adjacent to the BFR in the
   BIER-TE Topology.  Other BPs are left without adjacency.  The BFR
   replicates and forwards BIER packets to adjacent BPs that are set in
   the packet.
   BPs are normally also reset upon forwarding to avoid duplicates and
   loops.  This is detailed further below.

   Note that related work, [I-D.ietf-roll-ccast], uses bloom filters to
   represent leaves or edges of the intended delivery tree.  Bloom
   filters in general can support larger trees/topologies with fewer
   addressing bits than explicit bitstrings, but they introduce the
   heuristic risk of false positives and cannot reset bits in the
   bitstring during forwarding to avoid loops.  For these reasons,
   BIER-TE uses explicit bitstrings like BIER.  The explicit bitstrings
   of BIER-TE can also be seen as a special type of bloom filter, and
   this is how related work [ICC] describes it.

1.1.  Basic Examples

   BIER-TE forwarding is best introduced with simple examples.

   BIER-TE Topology:

      Diagram:

                       p5    p6
                     --- BFR3 ---
                  p3/    p13     \p7
      BFR1 ---- BFR2              BFR5 ----- BFR6
         p1   p2  p4\    p14     /p10 p11   p12
                     --- BFR4 ---
                       p8    p9

      (simplified) BIER-TE Bit Index Forwarding Tables (BIFT):

      BFR1:   p1  -> local_decap
              p2  -> forward_connected to BFR2

      BFR2:   p1  -> forward_connected to BFR1
              p5  -> forward_connected to BFR3
              p8  -> forward_connected to BFR4

      BFR3:   p3  -> forward_connected to BFR2
              p7  -> forward_connected to BFR5
              p13 -> local_decap

      BFR4:   p4  -> forward_connected to BFR2
              p10 -> forward_connected to BFR5
              p14 -> local_decap

      BFR5:   p6  -> forward_connected to BFR3
              p9  -> forward_connected to BFR4
              p12 -> forward_connected to BFR6

      BFR6:   p11 -> forward_connected to BFR5
              p12 -> local_decap

                     Figure 1: BIER-TE basic example

   Consider the simple network in the above BIER-TE overview example
   picture with 6 BFRs.  p1...p14 are the BitPositions (BP) used.  All
   BFRs can act as ingress BFR (BFIR); BFR1, BFR3, BFR4 and BFR6 can
   also be egress BFR (BFER).  Forward_connected is the name for
   adjacencies that represent subnet adjacencies of the network.
   Local_decap is the name of the adjacency to decapsulate BIER-TE
   packets and pass their payload to higher layer processing.

   Assume a packet from BFR1 should be sent via BFR4 to BFR6.  This
   requires a bitstring (p2,p8,p10,p12).  When this packet is examined
   by BIER-TE on BFR1, the only BitPosition from the bitstring that is
   also set in the BIFT is p2.  This will cause BFR1 to send the only
   copy of the packet to BFR2.  Similarly, BFR2 will forward to BFR4
   because of p8, BFR4 to BFR5 because of p10 and BFR5 to BFR6 because
   of p12.  p12 also makes BFR6 receive and decapsulate the packet.

   To also send a copy to BFR3, in addition to the copy sent to BFR6 via
   BFR4, the bitstring needs to be (p2,p5,p8,p10,p12,p13).  When this
   packet is examined by BFR2, p5 causes one copy to be sent to BFR3 and
   p8 one copy to BFR4.  When BFR3 receives the packet, p13 will cause
   it to receive and decapsulate the packet.

   If instead the bitstring was (p2,p6,p8,p10,p12,p13), the packet would
   be copied by BFR5 towards BFR3 because of p6, instead of by BFR2
   towards BFR3 because of p5 as in the prior case.  This shows the
   ability of the shown BIER-TE Topology to make the traffic pass across
   any possible path and be replicated where desired.

   BIER-TE has various options to minimize BP assignments, many of which
   are based on assumptions about the required multicast traffic paths
   and bandwidth consumption in the network.

   The following picture shows a modified example, in which Rtr2 and
   Rtr5 are assumed not to support BIER-TE, so traffic has to be unicast
   encapsulated across them.  Unicast tunneling of BIER-TE packets can
   leverage any feasible mechanism such as MPLS or IP; these
   encapsulations are out of scope of this document.
   To emphasize non-native forwarding of BIER-TE packets, these
   adjacencies are called "forward_routed", but otherwise there is no
   difference in their processing over the aforementioned
   "forward_connected" adjacencies.

   In addition, bits are saved in the following example by assuming that
   BFR1 only needs to be BFIR but not BFER or transit BFR.

   BIER-TE Topology:

      Diagram:

                    p1  p3  p7
                 ....> BFR3 <....       p5
             ........           ........>
      BFR1          (Rtr2)   (Rtr5)      BFR6
             ........           ........>
                 ....> BFR4 <....       p6
                    p2  p4  p8

      (simplified) BIER-TE Bit Index Forwarding Tables (BIFT):

      BFR1:   p1  -> forward_routed to BFR3
              p2  -> forward_routed to BFR4

      BFR3:   p3  -> local_decap
              p5  -> forward_routed to BFR6

      BFR4:   p4  -> local_decap
              p6  -> forward_routed to BFR6

      BFR6:   p5  -> local_decap
              p6  -> local_decap
              p7  -> forward_routed to BFR3
              p8  -> forward_routed to BFR4

                 Figure 2: BIER-TE basic overlay example

   To send a BIER-TE packet from BFR1 via BFR3 to BFR6, the bitstring is
   (p1,p5).  From BFR1 via BFR4 to BFR6 it is (p2,p6).  A packet from
   BFR1 to BFR3, BFR4 and BFR6 can use (p1,p2,p3,p4,p5) or
   (p1,p2,p3,p4,p6), or via BFR6 (p2,p3,p4,p6,p7) or (p1,p3,p4,p5,p8).

1.2.  BIER-TE Topology and adjacencies

   The key new component in BIER-TE to control where replication can or
   should happen and how to minimize the required BPs for segments is,
   as shown in these two examples, the BIER-TE topology.

   The BIER-TE Topology effectively consists of the BIFTs of all the
   BFRs and can also be expressed in a diagram as a graph where the
   edges are the adjacencies between the BFRs.  Adjacencies are
   naturally unidirectional.  BPs can be reused across multiple
   adjacencies as long as this does not lead to undesired duplicates or
   loops as explained further down in the text.
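
   As an illustration of the statement that the BIER-TE Topology is
   just the collection of all BIFTs, the following Python sketch
   transcribes the simplified BIFTs of Figure 2 into a dictionary and
   derives the directed edges of the topology graph from it.  The
   dictionary layout and the names BIFTS, topology_edges and
   reused_bitpositions are hypothetical and only serve this example.

```python
# Simplified BIFTs of Figure 2: BFR -> {BitPosition: (adjacency type, target)}.
# The dictionary layout is an assumption made for this sketch only.
BIFTS = {
    "BFR1": {1: ("forward_routed", "BFR3"), 2: ("forward_routed", "BFR4")},
    "BFR3": {3: ("local_decap", None), 5: ("forward_routed", "BFR6")},
    "BFR4": {4: ("local_decap", None), 6: ("forward_routed", "BFR6")},
    "BFR6": {
        5: ("local_decap", None),
        6: ("local_decap", None),
        7: ("forward_routed", "BFR3"),
        8: ("forward_routed", "BFR4"),
    },
}


def topology_edges(bifts):
    """Return the directed edges (from BFR, BitPosition, to BFR)."""
    edges = []
    for bfr, bift in sorted(bifts.items()):
        for bp, (kind, target) in sorted(bift.items()):
            if kind != "local_decap":  # local_decap stays inside the BFR
                edges.append((bfr, bp, target))
    return edges


def reused_bitpositions(bifts):
    """Return the BitPositions that are populated on more than one BFR."""
    seen = {}
    for bfr, bift in bifts.items():
        for bp in bift:
            seen.setdefault(bp, set()).add(bfr)
    return {bp: sorted(bfrs) for bp, bfrs in seen.items() if len(bfrs) > 1}


for edge in topology_edges(BIFTS):
    print(edge)
print(reused_bitpositions(BIFTS))
```

   In this transcription there is an edge from BFR1 to BFR3 but none
   from BFR3 back to BFR1, which reflects that adjacencies are
   unidirectional, and p5 and p6 each appear in the BIFTs of two BFRs.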

   If the BIER-TE topology represents the underlying (layer 2) topology
   of the network, this is called "native" BIER-TE as shown in the first
   example.  This can be freely mixed with "overlay" BIER-TE, in which
   "forward_routed" adjacencies are used.

1.3.  Comparison with BIER

   The key differences over BIER are:

   o  BIER-TE replaces in-network autonomous path calculation by
      explicit paths calculated off-path by the BIER-TE controller host.

   o  In BIER-TE every BitPosition of the BitString of a BIER-TE packet
      indicates one or more adjacencies - instead of a BFER as in BIER.

   o  BIER-TE in each BFR has no routing table but only a BIER-TE
      Forwarding Table (BIFT) indexed by SI:BitPosition and populated
      with only those adjacencies to which the BFR should replicate
      packets.

   BIER-TE headers use the same format as BIER headers.

   BIER-TE forwarding does not require/use the BFIR-ID.  The BFIR-ID can
   still be useful though for coordinated BFIR/BFER functions, such as
   the context for upstream assigned labels for MPLS payloads in MVPN
   over BIER-TE.

   If the BIER-TE domain is also running BIER, then the BFIR-ID in
   BIER-TE packets can be set to the same BFIR-ID as used with BIER
   packets.

   If the BIER-TE domain is not running full BIER or does not want to
   reduce the need to allocate bits in BIER bitstrings for BFIR-ID
   values, then the allocation of BFIR-ID values in BIER-TE packets can
   be done through other mechanisms outside the scope of this document,
   as long as this is appropriately agreed upon between all BFIR/BFER.

1.4.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Components

   End to end BIER-TE operations consist of four major components: the
   "Multicast Flow Overlay", the "BIER-TE control plane" consisting of
   the "BIER-TE Controller Host" and its signaling channels to the BFR,
   the "Routing Underlay" and the "BIER-TE forwarding layer".  The
   BIER-TE Controller Host is the new architectural component in BIER-TE
   compared to BIER.

   Components of BIER-TE:

                    <------BGP/PIM----->
     |<-IGMP/PIM->  multicast flow   <-PIM/IGMP->|
                       overlay

              [BIER-TE Controller Host] <=> [BIER-TE Topology]
                  BIER-TE control plane
                     ^      ^     ^
                    /       |      \   BIER-TE control protocol
                   |        |       |  e.g. Netconf/Restconf/Yang
                   v        v       v
   Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr

                  |<----------------->|
                  BIER-TE forwarding layer

                  |<- BIER-TE domain->|

                |<--------------------->|
                    Routing underlay

                     Figure 3: BIER-TE architecture

2.1.  The Multicast Flow Overlay

   The Multicast Flow Overlay operates as in BIER.  See [RFC8279].
   Instead of interacting with the BIER forwarding layer (as in BIER),
   it interacts with the BIER-TE Controller Host.

2.2.  The BIER-TE Controller Host

   The BIER-TE controller host represents the control plane of BIER-TE.
   It communicates two sets of information with BFRs:

   During initial provisioning or modifications of the network topology,
   the controller discovers the network topology and creates the BIER-TE
   topology from it: it determines which adjacencies are
   required/desired and assigns BitPositions to them.  Then it signals
   the resulting BitPositions and their adjacencies to each BFR to set
   up their BIER-TE BIFTs.

   During day-to-day operations of the network, the controller signals
   to BFIRs what multicast flows are mapped to what BitStrings.

   Communication between the BIER-TE controller host and BFRs is ideally
   via standardized protocols and data-models such as
   Netconf/Restconf/Yang.
   This is currently outside the scope of this document.
   Vendor-specific CLI on the BFRs is also a possible stopgap option (as
   in many other SDN solutions lacking definition of a standardized data
   model).

   For simplicity, the procedures of the BIER-TE controller host are
   described in this document as if it is a single, centralized
   automated entity, such as an SDN controller.  It could equally be an
   operator setting up CLI on the BFRs.  Distribution of the functions
   of the BIER-TE controller host is currently outside the scope of this
   document.

2.2.1.  Assignment of BitPositions to adjacencies of the network
        topology

   The BIER-TE controller host tracks the BFR topology of the BIER-TE
   domain.  It determines what adjacencies require BitPositions so that
   BIER-TE explicit paths can be built through them as desired by
   operator policy.

   The controller then pushes the BitPositions/adjacencies to the BIFT
   of the BFRs, populating in the BIFT of each BFR only those
   SI:BitPositions to which that BFR should be able to send packets,
   i.e. the adjacencies connecting to this BFR.

2.2.2.  Changes in the network topology

   If the network topology changes (not failure based) so that
   adjacencies that are assigned to BitPositions are no longer needed,
   the controller can re-use those BitPositions for new adjacencies.
   First, these BitPositions need to be removed from any BFIR flow state
   and BFR BIFT state, then they can be repopulated, first into BIFT and
   then into the BFIR.

2.2.3.  Set up per-multicast flow BIER-TE state

   The BIER-TE controller host interacts with the multicast flow overlay
   to determine what multicast flow needs to be sent by a BFIR to which
   set of BFER.  It calculates the desired distribution tree across the
   BIER-TE domain based on algorithms outside the scope of this document
   (e.g. CSPF, Steiner Tree, ...).  It then pushes the calculated
   BitString into the BFIR.
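
   The last step, turning an already calculated distribution tree into
   a BitString, can be sketched in Python using the BitPositions of
   Figure 1.  The tree calculation itself (CSPF, Steiner tree, ...) is
   not shown.  The table layout and the names ADJACENCY_BP, DECAP_BP
   and bitstring_for_tree are hypothetical and chosen only for this
   illustration.

```python
# Forwarding adjacencies of Figure 1: (from BFR, to BFR) -> BitPosition.
ADJACENCY_BP = {
    ("BFR1", "BFR2"): 2, ("BFR2", "BFR1"): 1,
    ("BFR2", "BFR3"): 5, ("BFR3", "BFR2"): 3,
    ("BFR2", "BFR4"): 8, ("BFR4", "BFR2"): 4,
    ("BFR3", "BFR5"): 7, ("BFR5", "BFR3"): 6,
    ("BFR4", "BFR5"): 10, ("BFR5", "BFR4"): 9,
    ("BFR5", "BFR6"): 12, ("BFR6", "BFR5"): 11,
}
# local_decap BitPositions of the BFERs in Figure 1.
DECAP_BP = {"BFR1": 1, "BFR3": 13, "BFR4": 14, "BFR6": 12}


def bitstring_for_tree(tree_edges, receivers):
    """Collect the BitPositions of all tree edges and receiving BFERs."""
    bps = {ADJACENCY_BP[edge] for edge in tree_edges}
    bps.update(DECAP_BP[bfer] for bfer in receivers)
    return sorted(bps)


# Path BFR1 -> BFR2 -> BFR4 -> BFR5 -> BFR6 from Section 1.1.
path = [("BFR1", "BFR2"), ("BFR2", "BFR4"), ("BFR4", "BFR5"), ("BFR5", "BFR6")]
print(bitstring_for_tree(path, ["BFR6"]))
# Same path with an additional branch from BFR2 to receiver BFR3.
print(bitstring_for_tree(path + [("BFR2", "BFR3")], ["BFR6", "BFR3"]))
```

   The two calls yield the BitPositions (p2,p8,p10,p12) and
   (p2,p5,p8,p10,p12,p13), the same BitStrings that Section 1.1 gives
   for these two trees.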

   See [I-D.ietf-bier-multicast-http-response] for a solution describing
   the interaction between the controller and the multicast flow
   overlay.

2.2.4.  Link/Node Failures and Recovery

   When links or nodes fail or recover in the topology, BIER-TE can
   quickly respond with the optional FRR procedures described in
   [I-D.eckert-bier-te-frr].  It can also react more slowly by
   recalculating the BitStrings of affected multicast flows.  This
   reaction is slower than the FRR procedure because the controller
   needs to receive link/node up/down indications, recalculate the
   desired BitStrings and push them down into the BFIRs.  With FRR, this
   is all performed locally on a BFR receiving the adjacency up/down
   notification.

2.3.  The BIER-TE Forwarding Layer

   When the BIER-TE Forwarding Layer receives a packet, it simply looks
   up the BitPositions that are set in the BitString of the packet in
   the Bit Index Forwarding Table (BIFT) that was populated by the
   BIER-TE controller host.  For every BP that is set in the BitString,
   and that has one or more adjacencies in the BIFT, a copy is made
   according to the type of adjacencies for that BP in the BIFT.  Before
   sending any copy, the BFR resets all BPs in the BitString of the
   packet for which the BFR has one or more adjacencies in the BIFT,
   except when the adjacency indicates "DoNotReset" (DNR, see
   Section 3.2.1).  This is done to prevent packets from looping.

2.4.  The Routing Underlay

   BIER-TE sends BIER packets to directly connected BIER-TE neighbors as
   L2 (unicast) BIER packets without requiring a routing underlay.
   BIER-TE forwarding uses the Routing underlay for forward_routed
   adjacencies, which copy BIER-TE packets to BFRs that are not directly
   connected (see below for adjacency definitions).

   If the BFR intends to support FRR for BIER-TE, then the BIER-TE
   forwarding plane needs to receive fast adjacency up/down
   notifications: Link up/down or neighbor up/down, e.g. from BFD.
   Providing these notifications is considered to be part of the routing
   underlay in this document.

3.  BIER-TE Forwarding

3.1.  The Bit Index Forwarding Table (BIFT)

   The Bit Index Forwarding Table (BIFT) exists in every BFR.  For every
   subdomain in use, it is a table indexed by SI:BitPosition and is
   populated by the BIER-TE control plane.  Each index can be empty or
   contain a list of one or more adjacencies.

   BIER-TE can support multiple subdomains like BIER, each one with a
   separate BIFT.

   In the BIER architecture, indices into the BIFT are explained to be
   both BFR-id and SI:BitString (BitPosition).  This is because there is
   a 1:1 relationship between BFR-id and SI:BitString - every bit in
   every SI is/can be assigned to a BFIR/BFER.  In BIER-TE there are
   more bits used in each BitString than there are BFIR/BFER assigned to
   the bitstring.  This is because of the bits required to express the
   (traffic engineered) path through the topology.  The BIER-TE
   forwarding definitions therefore do not use the term BFR-id at all.
   Instead, BFR-ids are only used as required by routing underlay, flow
   overlay or BIER headers.  Please refer to Section 7 for explanations
   of how to deal with SI, subdomains and BFR-id in BIER-TE.
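
   The 1:1 relationship between BFR-id and SI:BitPosition in BIER can
   be written down as a small Python sketch.  It assumes the mapping
   described in [RFC8279], in which BFR-ids and BitPositions both start
   at 1.  The function names are hypothetical.

```python
def bfr_id_to_si_bp(bfr_id, bitstring_length):
    """Map a BFR-id (>= 1) to (SI, BitPosition) as assumed for BIER."""
    if bfr_id < 1:
        raise ValueError("BFR-ids start at 1")
    si, offset = divmod(bfr_id - 1, bitstring_length)
    return si, offset + 1


def si_bp_to_bfr_id(si, bp, bitstring_length):
    """Inverse mapping: (SI, BitPosition) back to the BFR-id."""
    if not 1 <= bp <= bitstring_length:
        raise ValueError("BitPosition out of range")
    return si * bitstring_length + bp


print(bfr_id_to_si_bp(1, 256))
print(bfr_id_to_si_bp(257, 256))
print(si_bp_to_bfr_id(1, 1, 256))
```

   In BIER-TE such a mapping would only be meaningful for the bits that
   are assigned to a BFIR/BFER.  Bits that denote other adjacencies of
   the traffic engineered path have no corresponding BFR-id.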

     ------------------------------------------------------------------
     | Index:          |  Adjacencies:                                |
     | SI:BitPosition  |  empty, or one or more per entry             |
     ==================================================================
     | 0:1             |  forward_connected(interface,neighbor{,DNR}) |
     ------------------------------------------------------------------
     | 0:2             |  forward_connected(interface,neighbor{,DNR}) |
     |                 |  forward_connected(interface,neighbor{,DNR}) |
     ------------------------------------------------------------------
     | 0:3             |  local_decap({VRF})                          |
     ------------------------------------------------------------------
     | 0:4             |  forward_routed({VRF,}l3-neighbor)           |
     ------------------------------------------------------------------
     | 0:5             |                                              |
     ------------------------------------------------------------------
     | 0:6             |  ECMP({adjacency1,...adjacencyN}, seed)      |
     ------------------------------------------------------------------
     ...
     | BitStringLength |  ...                                         |
     ------------------------------------------------------------------
                        Bit Index Forwarding Table

                       Figure 4: BIFT adjacencies

   The BIFT is programmed into the data plane of BFRs by the BIER-TE
   controller host and used to forward packets, according to the rules
   specified in the BIER-TE Forwarding Procedures.

   The adjacencies for the same BP, when populated in more than one BFR
   by the controller, do not have to be the same.  This is up to the
   controller.  BPs for p2p links are one case (see below).

3.2.  Adjacency Types

3.2.1.  Forward Connected

   A "forward_connected" adjacency is towards a directly connected BFR
   neighbor using an interface address of that BFR on the connecting
   interface.  A forward_connected adjacency does not route packets but
   only L2 forwards them to the neighbor.

   Packets sent to an adjacency with "DoNotReset" (DNR) set in the BIFT
   will not have the BitPosition for that adjacency reset when the BFR
   creates a copy for it.  The BitPosition will still be reset for
   copies of the packet made towards other adjacencies.  This can be
   used for example in ring topologies as explained below.

3.2.2.  Forward Routed

   A "forward_routed" adjacency is an adjacency towards a BFR that is
   not a forward_connected adjacency: towards a loopback address of a
   BFR or towards an interface address that is not directly connected.
   Forward_routed packets are forwarded via the Routing Underlay.

   If the Routing Underlay has multiple paths for a forward_routed
   adjacency, it will perform ECMP for packets forwarded across that
   adjacency.  This is independent of the BIER-TE ECMP described in
   Section 3.2.3.

   If the Routing Underlay has FRR, it will perform FRR independent of
   BIER-TE for packets forwarded across a forward_routed adjacency.

3.2.3.  ECMP

   The ECMP mechanisms in BIER are tied to the BIER BIFT and are
   therefore not directly usable with BIER-TE.  The following procedures
   describe ECMP for BIER-TE that we consider to be lightweight but also
   well manageable.  They leverage the existing entropy parameter in the
   BIER header to keep packets of the same flow on the same path and
   introduce a "seed" parameter to allow engineering traffic to be
   polarized or randomized across multiple hops.

   An "Equal Cost Multipath" (ECMP) adjacency has a list of two or more
   adjacencies included in it.  It copies the BIER-TE packet to one of
   those adjacencies based on the ECMP hash calculation.  The BIER-TE
   ECMP hash algorithm must select the same adjacency from that list for
   all packets with the same "entropy" value in the BIER-TE header if
   the same number of adjacencies and same seed are given as parameters.
   Further use of the seed parameter is explained below.

3.2.4.  Local Decap

   A "local_decap" adjacency passes a copy of the payload of the BIER-TE
   packet to the packet's NextProto within the BFR (IPv4/IPv6,
   Ethernet,...).  A local_decap adjacency turns the BFR into a BFER for
   matching packets.  Local_decap adjacencies require the BFER to
   support routing or switching for NextProto to determine how to
   further process the packet.

3.3.  Encapsulation considerations

   Specifications for BIER-TE encapsulation are outside the scope of
   this document.  This section gives explanations and guidelines.

   Because a BFR needs to interpret the BitString of a BIER-TE packet
   differently from a BIER packet, it is necessary to distinguish BIER
   from BIER-TE packets.  This is subject to definitions in BIER
   encapsulation specifications.

   MPLS encapsulation [RFC8296] for example assigns one label by which
   BFRs recognize BIER packets for every (SI,subdomain) combination.
   If it is desirable that every subdomain can forward only BIER or
   BIER-TE packets, then the label allocation could stay the same, and
   only the forwarding model (BIER/BIER-TE) would have to be defined per
   subdomain.  If it is desirable to support both BIER and BIER-TE
   forwarding in the same subdomain, then additional labels would need
   to be assigned for BIER-TE forwarding.

   "forward_routed" requires an encapsulation that permits unicasting
   BIER-TE packets to a specific interface address on a target BFR.
   With MPLS encapsulation, this can simply be done via a label stack
   with that address's label as the top label, followed by the label
   assigned to (SI,subdomain) and, if necessary (see above), BIER-TE.
   With non-MPLS encapsulation, some form of IP tunneling (IP in IP,
   LISP, GRE) would be required.
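
   The MPLS label stack described above can be sketched in Python as
   follows.  All label values, the example address and the names
   ADDRESS_LABELS, BIER_LABELS and forward_routed_label_stack are
   invented for this illustration.  The sketch assumes the case in
   which BIER and BIER-TE share a subdomain and therefore use separate
   labels.

```python
# Hypothetical label allocations, invented for this sketch only.
ADDRESS_LABELS = {"192.0.2.6": 16006}  # unicast label per target BFR address
BIER_LABELS = {                         # (subdomain, SI, mode) -> label
    (0, 0, "BIER"): 20000,
    (0, 0, "BIER-TE"): 20100,
}


def forward_routed_label_stack(target_address, subdomain, si, mode="BIER-TE"):
    """Return the label stack, top label first, for a forward_routed copy."""
    top = ADDRESS_LABELS[target_address]          # reaches the target BFR
    inner = BIER_LABELS[(subdomain, si, mode)]    # selects the BIFT there
    return [top, inner]


print(forward_routed_label_stack("192.0.2.6", subdomain=0, si=0))
```

   If every subdomain carried only BIER or only BIER-TE packets, the
   mode element of the key would not be needed.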

   The encapsulation used for "forward_routed" adjacencies can equally
   support existing advanced adjacency information such as "loose source
   routes" via e.g. MPLS label stacks or appropriate header extensions
   (e.g. for IPv6).

3.4.  Basic BIER-TE Forwarding Example

   [RFC Editor: remove this section.]

   THIS SECTION TO BE REMOVED IN RFC BECAUSE IT WAS SUPERSEDED BY
   SECTION 1.1 EXAMPLE - UNLESS REVIEWERS CHIME IN AND EXPRESS DESIRE TO
   KEEP THIS ADDITIONAL EXAMPLE SECTION.

   Step by step example of basic BIER-TE forwarding.  This does not use
   ECMP or forward_routed adjacencies nor does it try to minimize the
   number of required BitPositions for the topology.

                [BIER-TE Controller Host]
                     /   |    \
                    v    v     v

               |   p13   p1 |
               +- BFIR2 --+ |
               |          | p2   p6    | LAN2
               |          +-- BFR3 --+ |
               |          |          | p7  p11 |
          Src -+                     +-- BFER1 --+
               |          | p3   p8  |           |
               |          +-- BFR4 --+           +-- Rcv1
               |          |          |           |
               |          |
               |   p14   p4 |
               +- BFIR1 --+ |
               |          +-- BFR5 --+ p10  p12 |
             LAN1         | p5   p9  +-- BFER2 --+
                                     |           +-- Rcv2
                                                 |
                                                LAN3

               IP  |..... BIER-TE network......| IP

                  Figure 5: BIER-TE Forwarding Example

   pXX indicates the BitPosition number assigned by the BIER-TE
   controller host to adjacencies in the BIER-TE topology.  For example,
   p9 is the adjacency towards BFR5 on the LAN connecting to BFER2.

      BIFT BFIR2:
        p13: local_decap()
         p2: forward_connected(BFR3)

      BIFT BFR3:
         p1: forward_connected(BFIR2)
         p7: forward_connected(BFER1)
         p8: forward_connected(BFR4)

      BIFT BFER1:
        p11: local_decap()
         p6: forward_connected(BFR3)
         p8: forward_connected(BFR4)

            Figure 6: BIER-TE Forwarding Example Adjacencies

   ...and so on.

   For example, we assume that some multicast traffic seen on LAN1 needs
   to be sent via BIER-TE by BFIR2 towards Rcv1 and Rcv2.
The 689 controller determines it wants to pass this traffic across the 690 following paths: 692 -> BFER1 ---------------> Rcv1 693 BFIR2 -> BFR3 694 -> BFR4 -> BFR5 -> BFER2 -> Rcv2 696 Figure 7: BIER-TE Forwarding Example Paths 698 These paths correspond to the following BitString: p2, p5, p7, p8, p10, 699 p11, p12. 701 This BitString is assigned by BFIR2 to the example multicast traffic 702 received from LAN1. 704 Then BFIR2 forwards this multicast traffic with BIER-TE based on that 705 BitString. The BIFT of BFIR2 has only p2 and p13 populated. Only p2 706 is in the BitString and this is an adjacency towards BFR3. BFIR2 707 therefore resets p2 in the BitString and sends a copy towards BFR3. 709 BFR3 sees a BitString of p5,p7,p8,p10,p11,p12. It is only interested 710 in p1,p7,p8. It creates a copy of the packet to BFER1 (due to p7) 711 and one to BFR4 (due to p8). It resets p7, p8 before sending. 713 BFER1 sees a BitString of p5,p10,p11,p12. It is only interested in 714 p6,p7,p8,p11 and therefore considers only p11. p11 is a "local_decap" 715 adjacency installed by the BIER-TE controller host because BFER1 716 should pass packets to IP multicast. The local_decap adjacency 717 instructs BFER1 to create a copy, decapsulate it from the BIER header 718 and pass it on to the NextProtocol, in this example IP multicast. IP 719 multicast will then forward the packet out to LAN2 because it did 720 receive PIM or IGMP joins on LAN2 for the traffic. 722 Further processing of the packet in BFR4, BFR5 and BFER2 proceeds accordingly. 724 3.5. Forwarding comparison with BIER 726 Forwarding of BIER-TE is designed to allow common forwarding hardware 727 with BIER. In fact, one of the main goals of this document is to 728 encourage the building of forwarding hardware that can not only 729 support BIER, but also BIER-TE - to allow experimentation with BIER- 730 TE and support building of BIER-TE control plane code.
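The step-by-step walk-through of Section 3.4 can be replayed with a small simulation. This is only a sketch under stated assumptions. The BIFTs of BFIR2, BFR3 and BFER1 are taken from Figure 6. The BIFTs of BFR4, BFR5 and BFER2 are not given in the text ("...and so on") and are assumed here from the BitPosition labels in Figure 5. BitStrings are modelled as Python sets of BitPosition numbers, and the function and variable names are illustrative.

```python
BIFT = {
    # From Figure 6.
    "BFIR2": {13: ("local_decap",), 2: ("forward_connected", "BFR3")},
    "BFR3": {1: ("forward_connected", "BFIR2"),
             7: ("forward_connected", "BFER1"),
             8: ("forward_connected", "BFR4")},
    "BFER1": {11: ("local_decap",),
              6: ("forward_connected", "BFR3"),
              8: ("forward_connected", "BFR4")},
    # Assumed for illustration; not listed in the text.
    "BFR4": {6: ("forward_connected", "BFR3"),
             7: ("forward_connected", "BFER1"),
             5: ("forward_connected", "BFR5")},
    "BFR5": {4: ("forward_connected", "BFR4"),
             10: ("forward_connected", "BFER2")},
    "BFER2": {12: ("local_decap",), 9: ("forward_connected", "BFR5")},
}

def forward(bfr, bits, delivered, trace):
    """Basic BIER-TE forwarding of one packet arriving at bfr."""
    local = set(BIFT[bfr])
    # Every copy has all BitPositions with a local adjacency reset.
    copy_bits = bits - local
    for bp in sorted(bits & local):
        adjacency = BIFT[bfr][bp]
        if adjacency[0] == "local_decap":
            delivered.append(bfr)
        else:
            neighbor = adjacency[1]
            trace.append((bfr, neighbor, bp))
            forward(neighbor, copy_bits, delivered, trace)

delivered, trace = [], []
forward("BFIR2", {2, 5, 7, 8, 10, 11, 12}, delivered, trace)
print(delivered)
print(trace)
```

Under these assumptions, the packet is decapsulated on BFER1 and BFER2 only, and each hop in the trace matches one branch of Figure 7.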
732 The pseudocode in Section 6 shows how existing BIER/BIFT forwarding 733 can be amended to support basic BIER-TE forwarding, by using BIER 734 BIFT's F-BM. Only the masking of bits done to avoid duplicates must 735 be skipped when forwarding is for BIER-TE. 737 Whether to use BIER or BIER-TE forwarding can simply be a configured 738 choice per subdomain and accordingly be set up by a BIER-TE 739 controller host. The BIER packet encapsulation [RFC8296] too can be 740 reused without changes except that the currently defined BIER-TE ECMP 741 adjacency does not leverage the entropy field so that field would be 742 unused when BIER-TE forwarding is used. 744 3.6. Requirements 746 Basic BIER-TE forwarding MUST support configuring subdomains to use 747 basic BIER-TE forwarding rules (instead of BIER). With basic BIER-TE 748 forwarding, every bit MUST support having zero or one adjacency. It 749 MUST support the adjacency types forward_connected without DNR flag, 750 forward_routed and local_decap. All other BIER-TE forwarding 751 features are optional. These basic BIER-TE requirements make BIER-TE 752 forwarding exactly the same as BIER forwarding with the exception of 753 skipping the aforementioned F-BM masking on egress. 755 BIER-TE forwarding SHOULD support the DNR flag, as this is highly 756 useful to save bits in rings (see Section 4.6). 758 BIER-TE forwarding MAY support more than one adjacency on a bit and 759 ECMP adjacencies. The importance of ECMP adjacencies is unclear when 760 traffic engineering is used because it may be more desirable to 761 explicitly steer traffic across non-ECMP paths to make per-path 762 traffic calculation easier for controllers. Having more than one 763 adjacency for a bit allows further savings of bits in hub&spoke 764 scenarios, but unlike rings it is less "natural" to flood traffic 765 across multiple links unconditionally.
Both ECMP and multiple 766 adjacencies are forwarding plane features that should be possible to 767 support later when needed as they do not impact the basic BIER-TE 768 replication loop. This is true because there is no inter-copy 769 dependency through resetting of F-BM as in BIER. 771 4. BIER-TE Controller Host BitPosition Assignments 773 This section describes how the BIER-TE controller host can use the 774 different BIER-TE adjacency types to define the BitPositions of a 775 BIER-TE domain. 777 Because the size of the BitString limits the size of the BIER-TE 778 domain, many of the options described exist to support larger 779 topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7, 780 4.8). 782 4.1. P2P Links 784 Each p2p link in the BIER-TE domain is assigned one unique 785 BitPosition with a forward_connected adjacency pointing to the 786 neighbor on the p2p link. 788 4.2. BFER 790 Every non-Leaf BFER is given a unique BitPosition with a local_decap 791 adjacency. 793 4.3. Leaf BFERs 795 BFR1(P) BFR2(P) BFR1(P) BFR2(P) 796 | \ / | | | 797 | X | | | 798 | / \ | | | 799 BFER1(PE) BFER2(PE) BFER1(PE)----BFER2(PE) 801 Leaf BFER / Non-Leaf BFER / 802 PE-router PE-router 804 Figure 8: Leaf vs. non-Leaf BFER Example 806 Leaf BFERs are BFERs where incoming BIER-TE packets never need to be 807 forwarded to another BFR but are only sent to the BFER to exit the 808 BIER-TE domain. For example, in networks where PEs are spokes 809 connected to P routers, those PEs are Leaf BFERs unless there is a 810 U-turn between two PEs. Consider how redundant disjoint traffic can 811 reach BFER1/BFER2 in the above picture: When BFER1/BFER2 are Non-Leaf 812 BFER as shown on the right hand side, one traffic copy would be 813 forwarded to BFER1 from BFR1, but the other one could only reach 814 BFER1 via BFER2, which makes BFER2 a non-Leaf BFER. Likewise BFER1 815 is a non-Leaf BFER when forwarding traffic to BFER2.
817 Note that the BFERs in the left hand picture are only guaranteed to 818 be leaf-BFER by a fitting routing configuration that prohibits transit 819 traffic to pass through the BFERs, which is commonly applied in these 820 topologies. 822 All leaf-BFER in a BIER-TE domain can share a single BitPosition. 823 This is possible because the BitPosition for the adjacency to reach 824 the BFER can be used to distinguish whether or not packets should 825 reach the BFER. 827 This optimization will not work if an upstream interface of the BFER 828 is using a BitPosition optimized as described in the following two 829 sections (LAN, Hub and Spoke). 831 4.4. LANs 833 In a LAN, the adjacency to each neighboring BFR on the LAN is given a 834 unique BitPosition. The adjacency of this BitPosition is a 835 forward_connected adjacency towards the BFR and this BitPosition is 836 populated into the BIFT of all the other BFRs on that LAN. 838 BFR1 839 |p1 840 LAN1-+-+---+-----+ 841 p3| p4| p2| 842 BFR3 BFR4 BFR7 844 Figure 9: LAN Example 846 If bandwidth on the LAN is not an issue and most BIER-TE traffic 847 should be copied to all neighbors on a LAN, then BitPositions can be 848 saved by assigning just a single BitPosition to the LAN and 849 populating that BitPosition in the BIFT of each BFR on the LAN with 850 a list of forward_connected adjacencies to all other neighbors on the 851 LAN. 853 This optimization does not work in the case of BFRs redundantly 854 connected to more than one LAN with this optimization because these 855 BFRs would receive duplicates and forward those duplicates into the 856 opposite LANs. Adjacencies of such BFRs into their LANs still need a 857 separate BitPosition. 859 4.5. Hub and Spoke 861 In a setup with a hub and multiple spokes connected via separate p2p 862 links to the hub, all p2p links can share the same BitPosition. The 863 BitPosition on the hub's BIFT is set up with a list of 864 forward_connected adjacencies, one for each Spoke.
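The shared-BitPosition optimizations for LANs and for hub and spoke both amount to a BIFT entry that carries a list of forward_connected adjacencies. The following minimal sketch shows the resulting tables. It is not a normative data model: the function names are hypothetical, and the shared BitPosition numbers and spoke names are arbitrary.

```python
def shared_bp_lan_bifts(shared_bp, bfrs):
    """BIFT of every BFR on a LAN that uses one shared BitPosition.

    Each BFR lists forward_connected adjacencies to all other
    neighbors on the LAN under the same BitPosition.
    """
    return {
        bfr: {shared_bp: [("forward_connected", other)
                          for other in bfrs if other != bfr]}
        for bfr in bfrs
    }

def hub_bift(shared_bp, spokes):
    """BIFT entry on the hub: one adjacency per spoke, one BitPosition."""
    return {shared_bp: [("forward_connected", spoke) for spoke in spokes]}

def replicate(bift, bits):
    """Neighbors that receive a copy for the BitPositions set in bits."""
    return [neighbor
            for bp in sorted(bits)
            for _kind, neighbor in bift.get(bp, [])]

lan = shared_bp_lan_bifts(1, ["BFR1", "BFR3", "BFR4", "BFR7"])
print(replicate(lan["BFR1"], {1}))
print(replicate(hub_bift(5, ["Spoke1", "Spoke2"]), {5, 9}))
```

With the LAN of Figure 9, the per-neighbor scheme uses four BitPositions, while this sketch uses one at the cost of always flooding to all neighbors.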
866 This option is similar to the BitPosition optimization in LANs: 867 Redundantly connected spokes need their own BitPositions. 869 This type of optimized BP could be used for example when all traffic 870 is "broadcast" traffic (very dense receiver set) such as live-TV or 871 situation-awareness (SA). This BP optimization can then be used to 872 explicitly steer different traffic flows across different ECMP paths 873 in Data-Center or broadband-aggregation networks with minimal use of 874 BPs. 876 4.6. Rings 878 In L3 rings, instead of assigning a single BitPosition for every p2p 879 link in the ring, it is possible to save BitPositions by setting the 880 "Do Not Reset" (DNR) flag on forward_connected adjacencies. 882 For the ring shown in the following picture, a single BitPosition 883 will suffice to forward traffic entering the ring at BFRa or BFRb all 884 the way up to BFR1: 886 On BFRa, BFRb, BFR30,... BFR3, the BitPosition is populated with a 887 forward_connected adjacency pointing to the clockwise neighbor on the 888 ring and with DNR set. On BFR2, the adjacency also points to the 889 clockwise neighbor BFR1, but without DNR set. 891 Handling DNR this way ensures that copies forwarded from any BFR in 892 the ring to a BFR outside the ring will not have the ring BitPosition 893 set, therefore minimizing the chance to create loops. 895 v v 896 | | 897 L1 | L2 | L3 898 /-------- BFRa ---- BFRb --------------------\ 899 | | 900 \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/ 901 | | L4 | | 902 p33| p15| 903 BFRd BFRc 905 Figure 10: Ring Example 907 Note that this example only permits packets to enter the ring at 908 BFRa and BFRb, and that packets will always travel clockwise. If 909 packets should be allowed to enter the ring at any ring BFR, then one 910 would have to use two ring BitPositions: one for clockwise, one for 911 counterclockwise. 913 Both would be set up to stop rotating on the same link, e.g. L1.
914 When the ingress ring BFR creates the clockwise copy, it will reset 915 the counterclockwise BitPosition because the DNR bit only applies to 916 the bit for which the replication is done. Likewise for the 917 clockwise BitPosition for the counterclockwise copy. As a result, the 918 ring ingress BFR will send a copy in both directions, serving BFRs on 919 either side of the ring up to L1. 921 4.7. Equal Cost MultiPath (ECMP) 923 The ECMP adjacency allows the use of just one BP per link bundle between 924 two BFRs instead of one BP for each p2p member link of that link 925 bundle. In the following picture, one BP is used across L1,L2,L3. 927 --L1----- 928 BFR1 --L2----- BFR2 929 --L3----- 931 BIFT entry in BFR1: 932 ------------------------------------------------------------------ 933 | Index | Adjacencies | 934 ================================================================== 935 | 0:6 | ECMP({forward_connected(L1, BFR2), | 936 | | forward_connected(L2, BFR2), | 937 | | forward_connected(L3, BFR2)}, seed) | 938 ------------------------------------------------------------------ 940 BIFT entry in BFR2: 941 ------------------------------------------------------------------ 942 | Index | Adjacencies | 943 ================================================================== 944 | 0:6 | ECMP({forward_connected(L1, BFR1), | 945 | | forward_connected(L2, BFR1), | 946 | | forward_connected(L3, BFR1)}, seed) | 947 ------------------------------------------------------------------ 949 Figure 11: ECMP Example 951 This document does not standardize any ECMP algorithm because it is 952 sufficient for implementations to document their freely chosen ECMP 953 algorithm. This allows the BIER-TE controller host to calculate ECMP 954 paths and seeds. The following picture shows an example ECMP 955 algorithm: 957 forward(packet, ECMP(adj(0), adj(1),...
adj(N-1), seed)): 958 i = (packet(bier-header-entropy) XOR seed) % N 959 forward packet to adj(i) 961 Figure 12: ECMP algorithm Example 963 In the following example, all traffic from BFR1 towards BFR10 is 964 intended to be ECMP load split equally across the topology. This 965 example is not meant as a likely setup, but to illustrate that ECMP 966 can be used to share BPs not only across link bundles, and it 967 explains the use of the seed parameter. 969 BFR1 (BFIR) 970 /L11 \L12 971 / \ 972 BFR2 BFR3 973 /L21 \L22 /L31 \L32 974 / \ / \ 975 BFR4 BFR5 BFR6 BFR7 976 \ / \ / 977 \ / \ / 978 BFR8 BFR9 979 \ / 980 \ / 981 BFR10 (BFER) 983 BIFT entry in BFR1: 984 ------------------------------------------------------------------ 985 | 0:6 | ECMP({forward_connected(L11, BFR2), | 986 | | forward_connected(L12, BFR3)}, seed1) | 987 ------------------------------------------------------------------ 989 BIFT entry in BFR2: 990 ------------------------------------------------------------------ 991 | 0:7 | ECMP({forward_connected(L21, BFR4), | 992 | | forward_connected(L22, BFR5)}, seed1) | 993 ------------------------------------------------------------------ 995 BIFT entry in BFR3: 996 ------------------------------------------------------------------ 997 | 0:7 | ECMP({forward_connected(L31, BFR6), | 998 | | forward_connected(L32, BFR7)}, seed1) | 999 ------------------------------------------------------------------ 1000 BIFT entry in BFR4, BFR5: 1001 ------------------------------------------------------------------ 1002 | 0:8 | forward_connected(Lxx, BFR8) |xx differs on BFR4/BFR5| 1003 ------------------------------------------------------------------ 1005 BIFT entry in BFR6, BFR7: 1006 ------------------------------------------------------------------ 1007 | 0:8 | forward_connected(Lxx, BFR9) |xx differs on BFR6/BFR7| 1008 ------------------------------------------------------------------ 1010 BIFT entry in BFR8, BFR9: 1011 
------------------------------------------------------------------ 1012 | 0:9 | forward_connected(Lxx, BFR10) |xx differs on BFR8/BFR9| 1013 ------------------------------------------------------------------ 1015 Figure 13: Polarization Example 1017 Note that for the following discussion of ECMP, only the BIFT ECMP 1018 adjacencies on BFR1, BFR2, BFR3 are relevant. The re-use of BP 1019 across BFR in this example is further explained in Section 4.9 below. 1021 With the setup of ECMP in the above topology, traffic would not be 1022 equally load-split. Instead, links L22 and L31 would see no traffic 1023 at all: BFR2 will only see traffic from BFR1 for which the ECMP hash 1024 in BFR1 selected the first adjacency in the list of 2 adjacencies 1025 given as parameters to the ECMP. It is link L11-to-BFR2. BFR2 1026 again performs ECMP with two adjacencies on that subset of traffic 1027 using the same seed1, and will therefore again select the first of 1028 its two adjacencies: L21-to-BFR4. Therefore L22 and BFR5 see no 1029 traffic. Likewise for L31 and BFR6. 1031 This issue in BFR2/BFR3 is called polarization. It results from the 1032 re-use of the same hash function across multiple consecutive hops in 1033 topologies like these. To resolve this issue, the ECMP adjacency on 1034 BFR1 can be set up with a different seed2 than the ECMP adjacencies 1035 on BFR2/BFR3. BFR2/BFR3 can use the same hash because packets will 1036 not sequentially pass across both of them. Therefore, they can also 1037 use the same BP 0:7. 1039 Note that ECMP solutions outside of BIER often hide the seed by auto- 1040 selecting it from local entropy such as unique local or next-hop 1041 identifiers.
The solution chosen for BIER-TE, which allows the 1042 controller to explicitly set the seed, maximizes the ability of the 1043 controller to choose the seed, independent of such seed source that 1044 the controller may not be able to control well, and even calculate 1045 optimized seeds for multi-hop cases. 1047 4.8. Routed adjacencies 1049 4.8.1. Reducing BitPositions 1051 Routed adjacencies can reduce the number of BitPositions required 1052 when the traffic engineering requirement is not hop-by-hop explicit 1053 path selection, but loose-hop selection. Routed adjacencies can also 1054 allow BIER-TE to operate across intermediate hop routers that do not 1055 support BIER-TE. 1057 ............... 1058 ...BFR1--... ...--L1-- BFR2... 1059 ... .Routers. ...--L2--/ 1060 ...BFR4--... ...------ BFR3... 1061 ............... | 1062 LO 1063 Network Area 1 1065 Figure 14: Routed Adjacencies Example 1067 Assume the requirement in the above picture is to explicitly steer 1068 traffic flows that have arrived at BFR1 or BFR4 via a shortest path 1069 in the routing underlay "Network Area 1" to one of the following 1070 three next segments: (1) BFR2 via link L1, (2) BFR2 via link L2, (3) 1071 via BFR3. 1073 To enable this, both BFR1 and BFR4 are set up with a forward_routed 1074 adjacency BitPosition towards an address of BFR2 on link L1, another 1075 forward_routed BitPosition towards an address of BFR2 on link L2 and 1076 a third forward_routed BitPosition towards a node address LO of BFR3. 1078 4.8.2. Supporting nodes without BIER-TE 1080 Routed adjacencies also enable incremental deployment of BIER-TE. 1081 Only the nodes through which BIER-TE traffic needs to be steered - 1082 with or without replication - need to support BIER-TE. Where they 1083 are not directly connected to each other, forward_routed adjacencies 1084 are used to pass over non BIER-TE enabled nodes. 1086 4.9.
Reuse of BitPositions (without DNR) 1088 BitPositions can be re-used across multiple BFR to minimize the 1089 number of BP needed. This happens when adjacencies on multiple BFR 1090 use the DNR flag as described above, but it can also be done for non- 1091 DNR adjacencies. This section only discusses this non-DNR case. 1093 Because BP are reset after passing a BFR with an adjacency for that 1094 BP, reuse of BP across multiple BFR does not introduce any problems 1095 with duplicates or loops that do not also exist when every adjacency 1096 has a unique BP: Instead of setting one BP in a BitString that is 1097 reused in N-adjacencies, one would get the same or worse results if 1098 each of these adjacencies had a unique BP and all of them were set 1099 in the BitString. Instead, based on the case, BPs can be reused 1100 without limitation, or they introduce fewer path engineering choices, 1101 or they do not work. 1103 BP cannot be reused across two BFR that would need to be passed 1104 sequentially for some path: The first BFR will reset the BP, so those 1105 paths cannot be built. BP can be set across BFR that would (A) only 1106 occur across different paths or (B) across different branches of the 1107 same tree. 1109 An example of (A) was given in Figure 13, where BP 0:7, BP 0:8 and BP 1110 0:9 are each reused across multiple BFR because a single packet/path 1111 would never be able to reach more than one BFR sharing the same BP. 1113 Assume the example was changed: BFR1 has no ECMP adjacency for BP 1114 0:6, but instead BP 0:5 with forward_connected to BFR2 and BP 0:6 1115 with forward_connected to BFR3.
Packets with both BP 0:5 and BP 0:6 1116 would now be able to reach both BFR2 and BFR3 and the still existing 1117 re-use of BP 0:7 between BFR2 and BFR3 is a case of (B) where reuse 1118 of BP is perfect because it does not limit the set of useful path 1119 choices: 1121 If instead of reusing BP 0:7, BFR3 used a separate BP 0:10 for its 1122 ECMP adjacency, no useful additional path engineering would be 1123 enabled. If duplicates at BFR10 were undesirable, they would be 1124 avoided by not setting BP 0:5 and BP 0:6 for the same packet. If the 1125 duplicates were desirable (e.g.: resilient transmission), the 1126 additional BP 0:10 would also not render additional value. 1128 Reuse may also save BPs in larger topologies. Consider the topology 1129 shown in Figure 17, but only the following explanations: A BFIR/ 1130 sender (e.g.: video headend) is attached to area 1, and area 2...6 1131 contain receivers/BFER. Assume each area had a distribution ring, 1132 each with two BPs to indicate the direction (as explained before). 1133 These two BPs could be reused across the 5 areas. Packets would be 1134 replicated through other BPs to the desired subset of areas, and once 1135 a packet copy reaches the ring of the area, the two ring BPs come 1136 into play. This reuse is a case of (B), but it limits the topology 1137 choices: Packets can only flow around the same direction in the rings 1138 of all areas. This may or may not be acceptable based on the desired 1139 traffic engineering: If resilient transmission is the traffic 1140 engineering goal, then it is likely a good optimization, if the 1141 bandwidth of each ring was to be optimized separately, it would not 1142 be a good limitation. 1144 4.10. Summary of BP optimizations 1146 This section reviewed a range of techniques by which a controller can 1147 create a BIER-TE topology in a way that minimizes the number of 1148 necessary BPs.
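As an illustration of one of the reviewed techniques, the polarization effect of Section 4.7 can be reproduced with a short sketch of the first two ECMP stages of Figure 13. It is not a recommended algorithm. The function xor_mod follows the example algorithm of Figure 12. The function sha_mod is a swapped-in assumption, not taken from this document, that stands in for any hash mixing seed and entropy non-linearly. All names and seed values are illustrative.

```python
import hashlib
from collections import Counter

def xor_mod(entropy, seed, n):
    """Example ECMP choice as in Figure 12."""
    return (entropy ^ seed) % n

def sha_mod(entropy, seed, n):
    """Assumed alternative: hash seed and entropy together."""
    digest = hashlib.sha256(f"{seed}:{entropy}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n

def link_load(pick, seed_bfr1, seed_bfr23, entropies=range(256)):
    """Count packets per link for BFR1 -> BFR2/BFR3 -> next hop."""
    load = Counter()
    for entropy in entropies:
        if pick(entropy, seed_bfr1, 2) == 0:
            load["L11"] += 1
            second = pick(entropy, seed_bfr23, 2)
            load["L21" if second == 0 else "L22"] += 1
        else:
            load["L12"] += 1
            second = pick(entropy, seed_bfr23, 2)
            load["L31" if second == 0 else "L32"] += 1
    return load

# Same seed on all three BFRs: L22 and L31 carry no traffic.
print(link_load(xor_mod, 7, 7))
# Different seeds with the assumed hash: all four links carry traffic.
print(link_load(sha_mod, 1, 2))
```

Note that with only two adjacencies per hop, xor_mod depends solely on the lowest bit of entropy and seed, so a different seed merely changes which link is starved. This is why the sketch uses a different hash to show the effect of distinct seeds.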
1150 Without any optimization, a controller would attempt to map the 1151 network subnet topology 1:1 into the BIER-TE topology and every 1152 subnet adjacent neighbor requires a forward_connected BP and every 1153 BFER requires a local_decap BP. 1155 The optimizations described are then as follows: 1157 o P2p links require only one BP (Section 4.1). 1159 o All leaf-BFER can share a single local_decap BP (Section 4.3). 1161 o A LAN with N BFR needs at most N BP (one for each BFR). It only 1162 needs one BP for all those BFR that are not redundantly connected to 1163 multiple LANs (Section 4.4). 1165 o A hub with p2p connections to multiple non-leaf-BFER spokes can 1166 share one BP to all spokes if traffic can be flooded to all 1167 spokes, e.g.: because of no bandwidth concerns or dense receiver 1168 sets (Section 4.5). 1170 o Rings of BFR can be built with just two BP (one for each 1171 direction) except for BFR with multiple ring connections - similar 1172 to LANs (Section 4.6). 1174 o ECMP adjacencies to N neighbors can replace N BP with 1 BP. 1175 Multihop ECMP can avoid polarization through different seeds of 1176 the ECMP algorithm (Section 4.7). 1178 o Routed adjacencies allow to "tunnel" across non-BIER-TE capable 1179 routers and across BIER-TE capable routers where no traffic- 1180 steering or replications are required (Section 4.8). 1182 o BP can generally be reused across nodes that do not need to be 1183 consecutive in paths, but depending on scenario, this may limit 1184 the feasible traffic engineering options (Section 4.9). 1186 Note that the described list of optimizations is not exhaustive. 1187 Especially when the set of required path engineering choices is 1188 limited and the set of possible subsets of BFER that should be able 1189 to receive traffic is limited, further optimizations of BP are 1190 possible. The hub & spoke optimization is a simple example of such 1191 traffic pattern dependent optimizations. 1193 5.
Avoiding loops and duplicates 1195 5.1. Loops 1197 Whenever BIER-TE creates a copy of a packet, the BitString of that 1198 copy will have all BitPositions cleared that are associated with 1199 adjacencies on the BFR. This inhibits looping of packets. The only 1200 exceptions are adjacencies with DNR set. 1202 With DNR set, looping can happen. Consider in the ring picture that 1203 link L4 from BFR3 is plugged into the L1 interface of BFRa. This 1204 creates a loop where the ring's clockwise BitPosition is never reset 1205 for copies of the packets traveling clockwise around the ring. 1207 To inhibit looping in the face of such physical misconfiguration, 1208 only forward_connected adjacencies are permitted to have DNR set, and 1209 the link layer port unique unicast destination address of the 1210 adjacency (e.g. MAC address) protects against closing the loop. 1211 Link layers without port unique link layer addresses should not be 1212 used with the DNR flag set. 1214 5.2. Duplicates 1216 Duplicates happen when the topology of the BitString is not a tree 1217 but redundantly connects BFRs with each other. The controller must 1218 therefore ensure that it only creates BitStrings that are trees in the 1219 topology. 1221 When links are incorrectly physically re-connected before the 1222 controller updates BitStrings in BFIRs, duplicates can happen. Like 1223 loops, these can be inhibited by link layer addressing in 1224 forward_connected adjacencies. 1226 If interface or loopback addresses used in forward_routed adjacencies 1227 are moved from one BFR to another, duplicates can equally happen. 1228 Such re-addressing operations must be coordinated with the 1229 controller. 1231 6. BIER-TE Forwarding Pseudocode 1233 The following simplified pseudocode for BIER-TE forwarding is using 1234 BIER forwarding pseudocode of [RFC8279], section 6.5 with the one 1235 modification necessary to support basic BIER-TE forwarding.
Like the 1236 BIER pseudo forwarding code, for simplicity it does hide the details 1237 of the adjacency processing inside PacketSend() which can be 1238 forward_connected, forward_routed or local_decap. 1240 void ForwardBitMaskPacket_withTE (Packet) 1241 { 1242 SI=GetPacketSI(Packet); 1243 Offset=SI*BitStringLength; 1244 for (Index = GetFirstBitPosition(Packet->BitString); Index ; 1245 Index = GetNextBitPosition(Packet->BitString, Index)) { 1246 F-BM = BIFT[Index+Offset]->F-BM; 1247 if (!F-BM) continue; 1248 BFR-NBR = BIFT[Index+Offset]->BFR-NBR; 1249 PacketCopy = Copy(Packet); 1250 PacketCopy->BitString &= F-BM; [2] 1251 PacketSend(PacketCopy, BFR-NBR); 1252 // The following must not be done for BIER-TE: 1253 // Packet->BitString &= ~F-BM; [1] 1254 } 1255 } 1257 Figure 15: Simplified BIER-TE Forwarding Pseudocode 1259 The difference is that in BIER-TE, step [1] must not be performed, 1260 but is replaced with [2] (when the forwarding plane algorithm is 1261 implemented verbatim as shown above). 1263 In BIER, the F-BM of a BP has all BP set that are meant to be 1264 forwarded via the same neighbor. It is used to reset those BP in the 1265 packet after the first copy to this neighbor has been made to inhibit 1266 multiple copies to the same neighbor. 1268 In BIER-TE, the F-BM of a particular BP with an adjacency is the list 1269 of all BPs with an adjacency on this BFR except the particular BP 1270 itself if it has an adjacency with the DNR bit set. The F-BM is used 1271 to reset the F-BM BPs before creating copies. 1273 In BIER, the order of BPs impacts the result of forwarding because of 1274 [1]. In BIER-TE, forwarding is not impacted by the order of BPs. It 1275 is therefore possible to further optimize forwarding than in BIER. 
1276 For example, BIER-TE forwarding can be parallelized such that a 1277 parallel instance (such as an egress linecard) can process any subset 1278 of BPs without any considerations for the other BPs - and without any 1279 prior, cross-BP shared processing. 1281 The above simplified pseudocode is elaborated further as follows: 1283 o This pseudocode eliminates per-bit F-BM, therefore reducing state 1284 by BitStringLength^2*SI and eliminating the need for per-packet- 1285 copy masking operation except for adjacencies with DNR flag set: 1287 * AdjacentBits[SI] are bits with a non-empty list of adjacencies. 1288 This can be computed whenever the BIER-TE controller host 1289 updates the adjacencies. 1291 * Only the AdjacentBits need to be examined in the loop for 1292 packet copies. 1294 * The packet's BitString is masked with those AdjacentBits on 1295 ingress to avoid packets looping. 1297 o The code loops over the adjacencies because there may be more than 1298 one adjacency for a bit. 1300 o When an adjacency has the DNR bit, the bit is set in the packet 1301 copy (to save bits in rings for example). 1303 o The ECMP adjacency is shown. Its parameters are a 1304 ListOfAdjacencies from which one is picked. 1306 o The forward_connected, forward_routed, local_decap adjacencies are 1307 shown with their parameters.
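The elaborations listed above can also be sketched as runnable code. The following is one possible reading of them under stated assumptions and is not the reference pseudocode of this document. BitStrings are modelled as Python integers with BitPosition 1 as the least significant bit. A single SI is assumed. Sending is replaced by returning a list of (adjacency, BitString of the copy) pairs. The Adjacency and Ecmp types, the forward_te name and the use of the XOR-modulo choice from Figure 12 are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Adjacency:
    kind: str            # "forward_connected", "forward_routed", "local_decap"
    neighbor: str = ""
    dnr: bool = False    # only meaningful for forward_connected

@dataclass(frozen=True)
class Ecmp:
    members: tuple       # tuple of Adjacency
    seed: int = 0

def forward_te(bitstring, bift, entropy=0):
    """Return the copies a BFR would create for one BIER-TE packet.

    bift maps a 1-based BitPosition to a list of Adjacency or Ecmp.
    """
    # AdjacentBits: bits with a non-empty list of adjacencies.
    adjacent_bits = 0
    for bp, adjacencies in bift.items():
        if adjacencies:
            adjacent_bits |= 1 << (bp - 1)
    to_process = bitstring & adjacent_bits
    # Mask the packet once on ingress; every copy starts from this.
    masked = bitstring & ~adjacent_bits
    copies = []
    for bp in sorted(bift):
        bit = 1 << (bp - 1)
        if not to_process & bit:
            continue
        for adjacency in bift[bp]:
            if isinstance(adjacency, Ecmp):
                index = (entropy ^ adjacency.seed) % len(adjacency.members)
                adjacency = adjacency.members[index]
            copy_bits = masked
            if adjacency.kind == "forward_connected" and adjacency.dnr:
                # DNR: keep only the bit being replicated for.
                copy_bits |= bit
            copies.append((adjacency, copy_bits))
    return copies

ring_bift = {
    1: [Adjacency("forward_connected", "BFR2", dnr=True)],
    3: [Adjacency("local_decap")],
}
for adjacency, bits in forward_te(0b1101, ring_bift):
    print(adjacency.kind, adjacency.neighbor, bin(bits))
```

In this sketch only the copy for the DNR adjacency keeps BitPosition 1 set, while BitPosition 4, which has no local adjacency, is preserved in every copy.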
1309 void ForwardBitMaskPacket_withTE (Packet) 1310 { 1311 SI=GetPacketSI(Packet); 1312 Offset=SI*BitStringLength; 1313 AdjacentBitstring = Packet->BitString & AdjacentBits[SI]; 1314 Packet->BitString &= ~AdjacentBits[SI]; 1315 for (Index = GetFirstBitPosition(AdjacentBitstring); Index ; 1316 Index = GetNextBitPosition(AdjacentBitstring, Index)) { 1317 foreach adjacency BIFT[Index+Offset] { 1318 if(adjacency == ECMP(ListOfAdjacencies, seed) ) { 1319 I = ECMP_hash(sizeof(ListOfAdjacencies), 1320 Packet->Entropy, seed); 1321 adjacency = ListOfAdjacencies[I]; 1322 } 1323 PacketCopy = Copy(Packet); 1324 switch(adjacency) { 1325 case forward_connected(interface,neighbor,DNR): 1326 if(DNR) 1327 PacketCopy->BitString |= 1<<(Index-1); 1328 SendToL2Unicast(PacketCopy,interface,neighbor); break; 1330 case forward_routed({VRF},neighbor): 1331 SendToL3(PacketCopy,{VRF,}neighbor); break; 1333 case local_decap({VRF},neighbor): 1334 DecapBierHeader(PacketCopy); 1335 PassTo(PacketCopy,{VRF,}Packet->NextProto); break; 1336 } 1337 } 1338 } 1339 } 1341 Figure 16: BIER-TE Forwarding Pseudocode 1343 7. Managing SI, subdomains and BFR-ids 1345 When the number of bits required to represent the necessary hops in 1346 the topology and BFER exceeds the supported bitstring length, 1347 multiple SI and/or subdomains must be used. This section discusses 1348 how. 1350 BIER-TE forwarding does not require the concept of BFR-id, but 1351 routing underlay, flow overlay and BIER headers may. This section 1352 also discusses how BFR-ids can be assigned to BFIR/BFER for BIER-TE. 1354 7.1. Why SI and sub-domains 1356 For BIER and BIER-TE forwarding, the most important result of using 1357 multiple SI and/or subdomains is the same: Packets that need to be 1358 sent to BFER in different SI or subdomains require different BIER 1359 packets: each one with a bitstring for a different (SI,subdomain) 1360 combination. Each such bitstring uses one bitstring length sized SI 1361 block in the BIFT of the subdomain.
We call this a BIFT:SI (block). 1363 For BIER and BIER-TE forwarding itself there is also no difference 1364 whether different SI and/or sub-domains are chosen, but SI and 1365 subdomain have different purposes in the BIER architecture shared by 1366 BIER-TE. This impacts how operators are managing them and how 1367 especially flow overlays will likely use them. 1369 By default, every possible BFIR/BFER in a BIER network would likely 1370 be given a BFR-id in subdomain 0 (unless there are > 64k BFIR/BFER). 1372 If there are different flow services (or service instances) requiring 1373 replication to different subsets of BFER, then it will likely not be 1374 possible to achieve the best replication efficiency for all of these 1375 service instances via subdomain 0. Ideal replication efficiency for 1376 N BFER exists in a subdomain if they are split over not more than 1377 ceiling(N/bitstring-length) SI. 1379 If service instances justify additional BIER:SI state in the network, 1380 additional subdomains will be used: BFIR/BFER are assigned BFR-id in 1381 those subdomains and each service instance is configured to use the 1382 most appropriate subdomain. This results in improved replication 1383 efficiency for different services. 1385 Even if creation of subdomains and assignment of BFR-id to BFIR/BFER 1386 in those subdomains is automated, it is not expected that individual 1387 service instances can deal with BFER in different subdomains. A 1388 service instance may only support configuration of a single subdomain 1389 it should rely on. 1391 To be able to easily reuse (and modify as little as possible) 1392 existing BIER procedures including flow-overlay and routing underlay, 1393 when BIER-TE forwarding is added, we therefore reuse SI and subdomain 1394 logically in the same way as they are used in BIER: All necessary 1395 BFIR/BFER for a service use a single BIER-TE BIFT and are split 1396 across as many SI as necessary (see below).
Different services may 1397 use different subdomains that primarily exist to provide more 1398 efficient replication (and for BIER-TE desirable traffic engineering) 1399 for different subsets of BFIR/BFER. 1401 7.2. Bit assignment comparison BIER and BIER-TE 1403 In BIER, bitstrings only need to carry bits for BFER, which leads to 1404 the model that BFR-ids map 1:1 to each bit in a bitstring. 1406 In BIER-TE, bitstrings need to carry bits to indicate not only the 1407 receiving BFER but also the intermediate hops/links across which the 1408 packet must be sent. The maximum number of BFER that can be 1409 supported in a single bitstring or BIFT:SI depends on the number of 1410 bits necessary to represent the desired topology between them. 1412 "Desired" topology because it depends on the physical topology, and 1413 on the desire of the operator to allow for explicit traffic 1414 engineering across every single hop (which requires more bits), or 1415 reducing the number of required bits by exploiting optimizations such 1416 as unicast (forward_routed), ECMP or flood (DNR) over "uninteresting" 1417 sub-parts of the topology - e.g. parts where different trees do not 1418 need to take different paths due to traffic-engineering reasons. 1420 The total number of bits to describe the topology vs. the BFER in a 1421 BIFT:SI can range widely based on the size of the topology and the 1422 number of alternative paths in it. The higher the percentage, the 1423 higher the likelihood that those topology bits are not just BIER-TE 1424 overhead without additional benefit, but instead that they will allow 1425 to express desirable traffic-engineering path alternatives. 1427 7.3. Using BFR-id with BIER-TE 1429 Because there is no 1:1 mapping between bits in the bitstring and 1430 BFER, BIER-TE cannot simply rely on the BIER 1:1 mapping between bits 1431 in a bitstring and BFR-id. 1433 In BIER, automatic schemes could assign all possible BFR-ids 1434 sequentially to BFERs.
This will not work in BIER-TE. In BIER-TE, 1435 the operator or BIER-TE controller host has to determine a BFR-id for 1436 each BFER in each required subdomain. The BFR-id may or may not have 1437 a relationship with a bit in the bitstring. Suggestions are detailed 1438 below. Once determined, the BFR-id can then be configured on the 1439 BFER and used by flow overlay, routing underlay and the BIER header 1440 almost the same as the BFR-id in BIER. 1442 The one exception concerns application/flow-overlays that automatically 1443 calculate the bitstring(s) of BIER packets by converting BFR-id to 1444 bits. In BIER-TE, this operation can be done in two ways: 1446 "Independent branches": For a given application or (set of) trees, 1447 the branches from a BFIR to every BFER are independent of the 1448 branches to any other BFER. For example, shortest path trees have 1449 independent branches. 1451 "Interdependent branches": When a BFER is added to or deleted from a 1452 particular distribution tree, branches to other BFER still in the 1453 tree may need to change. Steiner trees are examples of interdependent 1454 branch trees. 1456 If "independent branches" are sufficient, the BIER-TE controller host 1457 can provide to such applications for every BFR-id a SI:bitstring with 1458 the BIER-TE bits for the branch towards that BFER. The application 1459 can then independently calculate the SI:bitstring for all desired 1460 BFER by OR'ing their bitstrings. 1462 If "interdependent branches" are required, the application could call 1463 a BIER-TE controller host API with the list of required BFR-id and 1464 get the required bitstring back. Whenever the set of BFR-id 1465 changes, this is repeated. 1467 Note that in either case (unlike in BIER), the bits in BIER-TE may 1468 need to change upon link/node failure/recovery, network expansion and 1469 network load by other traffic (as part of traffic engineering goals).
1470 Interactions between such BFIR applications and the BIER-TE 1471 controller host therefore need to support dynamic updates to the 1472 bitstrings. 1474 7.4. Assigning BFR-ids for BIER-TE 1476 For a non-leaf BFER, there is usually a single bit k for that BFER 1477 with a local_decap() adjacency on the BFER. The BFR-id for such a 1478 BFER is therefore most easily the one it would have in BIER: SI * 1479 bitstring-length + k. 1481 As explained earlier in the document, leaf BFERs do not need such a 1482 separate bit because the fact alone that the BIER-TE packet is 1483 forwarded to the leaf BFER indicates that the BFER should decapsulate 1484 it. Such a BFER will have one or more bits for the links leading 1485 only to it. The BFR-id could therefore most easily be the BFR-id 1486 derived from the lowest bit for those links. 1488 These two rules are only recommendations for the operator or BIER-TE 1489 controller assigning the BFR-ids. Any allocation scheme can be used; 1490 the BFR-ids just need to be unique across BFRs in each subdomain. 1492 It is not currently determined if a single subdomain could or should 1493 be allowed to forward both BIER and BIER-TE packets. If this should 1494 be supported, there are two options: 1496 A. BIER and BIER-TE have different BFR-id in the same subdomain. 1497 This allows higher replication efficiency for BIER because its BFR- 1498 id can be assigned sequentially, while the bitstrings for BIER-TE 1499 will also have the additional bits for the topology. There is no 1500 relationship between a BFR's BIER BFR-id and its BIER-TE BFR-id. 1502 B. BIER and BIER-TE share the same BFR-id. The BFR-id are assigned 1503 as explained above for BIER-TE and simply reused for BIER. The 1504 replication efficiency for BIER will be as low as that for BIER-TE in 1505 this approach. Depending on topology, only the same 20%..80% of bits 1506 as possible for BIER-TE can be used for BIER. 1508 7.5. Example bit allocations 1510 7.5.1.
With BIER 1512 Consider a network setup with a bitstring length of 256 for a network 1513 topology as shown in the picture below. The network has 6 areas, 1514 each with ca. 170 BFR, connecting via a core with some larger (core) 1515 BFR. To address all BFER with BIER, 4 SI are required. To send a 1516 BIER packet to all BFER in the network, 4 copies need to be sent by 1517 the BFIR. On the BFIR it does not make a difference how the BFR-id 1518 are allocated to BFER in the network, but for efficiency further down 1519 in the network it does make a difference. 1521 area1 area2 area3 1522 BFR1a BFR1b BFR2a BFR2b BFR3a BFR3b 1523 | \ / \ / | 1524 ................................ 1525 . Core . 1526 ................................ 1527 | / \ / \ | 1528 BFR4a BFR4b BFR5a BFR5b BFR6a BFR6b 1529 area4 area5 area6 1531 Figure 17: Scaling BIER-TE bits by reuse 1533 With random allocation of BFR-id to BFER, each receiving area would 1534 (most likely) have to receive all 4 copies of the BIER packet because 1535 there would be BFR-id for each of the 4 SI in each of the areas. 1536 Only further towards each BFER would this duplication subside - when 1537 each of the 4 trees runs out of branches. 1539 If BFR-id are allocated intelligently, then all the BFER in an area 1540 would be given BFR-id with as few different SI as possible. Each 1541 area would only have to forward one or two packets instead of 4. 1543 Given how networks can grow over time, replication efficiency in an 1544 area will also easily go down over time when BFR-id are allocated 1545 sequentially network-wide. An area that initially only has 1546 BFR-id in one SI might end up with many SI over a longer period of 1547 growth. Allocating SIs to areas with initially sufficiently many 1548 spare bits for growth can help to alleviate this issue. Alternatively, 1549 BFR-ids can be renumbered after network expansion. In this example one may consider 1550 using 6 SI and assigning one to each area.
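As a rough illustration of the allocation trade-off described above, the following Python sketch counts how many distinct SI each area has to receive under three allocation schemes. It is not part of the BIER specifications. It assumes exactly 170 BFER per area (the text says "ca. 170"), uses the BFR-id to SI mapping of [RFC8279] (SI = (BFR-id - 1) div bitstring-length), and replaces random allocation with a deterministic round-robin stand-in. All function names are hypothetical.

```python
from math import ceil

BSL = 256            # bitstring length used in the example
AREAS = 6
BFER_PER_AREA = 170  # assumption: "ca. 170" taken as exactly 170


def si_of(bfr_id, bsl=BSL):
    """SI containing a BFR-id, per RFC 8279: SI = (BFR-id - 1) div BSL."""
    return (bfr_id - 1) // bsl


def sis_per_area(alloc):
    """For each area, count the distinct SI its BFR-ids fall into."""
    return [len({si_of(i) for i in ids}) for ids in alloc]


def interleaved():
    """Deterministic stand-in for random allocation: ids dealt round-robin."""
    total = AREAS * BFER_PER_AREA
    return [list(range(a + 1, total + 1, AREAS)) for a in range(AREAS)]


def sequential_by_area():
    """Area-aware allocation: consecutive BFR-ids, one area after another."""
    return [list(range(a * BFER_PER_AREA + 1, (a + 1) * BFER_PER_AREA + 1))
            for a in range(AREAS)]


def one_si_per_area():
    """Each area gets its own SI, leaving BSL - 170 spare ids for growth."""
    return [list(range(a * BSL + 1, a * BSL + BFER_PER_AREA + 1))
            for a in range(AREAS)]


# ceiling(N / bitstring-length): fewest SI that can hold all BFER
min_si = ceil(AREAS * BFER_PER_AREA / BSL)

if __name__ == "__main__":
    print("minimum SI:", min_si)
    for name, scheme in [("interleaved", interleaved),
                         ("sequential by area", sequential_by_area),
                         ("one SI per area", one_si_per_area)]:
        print(name, sis_per_area(scheme()))
```

Under these assumptions, the interleaved scheme puts BFR-ids of all 4 SI into every area, the sequential scheme needs one or two SI per area, and the per-area scheme needs exactly one SI per area at the cost of 6 rather than 4 SI overall.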
1552 This example shows that intelligent BFR-id allocation within at least 1553 subdomain 0 can be helpful or even necessary in BIER. 1555 7.5.2. With BIER-TE 1557 In BIER-TE one needs to determine a subset of the physical topology 1558 and attached BFER so that the "desired" representation of this 1559 topology and the BFER fit into a single bitstring. This process 1560 needs to be repeated until the whole topology is covered. 1562 Once bits/SIs are assigned to topology and BFER, BFR-id is just a 1563 derived set of identifiers from the operator/BIER-TE controller as 1564 explained above. 1566 Every time that different sub-topologies overlap, bits need to 1567 be repeated across the bitstrings, increasing the overall number of 1568 bits required across all bitstring/SIs. In the worst case, random 1569 subsets of BFER are assigned to different SI. This is much worse 1570 than in BIER because it not only reduces replication efficiency with 1571 the same number of overall bits, but also increases the number of bits 1572 required, since bits for topology are duplicated across multiple 1573 SI. Intelligent BFER to SI assignment and selecting specific 1574 "desired" subtopologies can minimize this problem. 1576 To set up BIER-TE efficiently for the above topology, the following bit 1577 allocation method can be used. This method can easily be expanded 1578 to other, similarly structured larger topologies. 1580 Each area is allocated one or more SI depending on the number of 1581 future expected BFER and number of bits required for the topology in 1582 the area. In this example, 6 SI are used, one per area. 1584 In addition, we use 4 bits in each SI: bia, bib, bea, beb: bit 1585 ingress a, bit ingress b, bit egress a, bit egress b. These bits 1586 will be used to pass BIER packets from any BFIR via any combination 1587 of ingress area a/b BFR and egress area a/b BFR into a specific 1588 target area.
These bits are then set up with the right 1589 forward_routed adjacencies on the BFIR and area edge BFR: 1591 On all BFIR in an area j, bia in each BIFT:SI is populated with the 1592 same forward_routed(BFRja), and bib with forward_routed(BFRjb). On 1593 all area edge BFR, bea in BIFT:SI=k is populated with 1594 forward_routed(BFRka) and beb in BIFT:SI=k with 1595 forward_routed(BFRkb). 1597 For BIER-TE forwarding of a packet to some subset of BFER across all 1598 areas, a BFIR would create at most 6 copies, with SI=1...SI=6. In 1599 each packet, the bitstring carries the bits for topology and BFER in that 1600 topology plus the four bits to indicate whether to pass this packet 1601 via the ingress area a or b border BFR and the egress area a or b 1602 border BFR, therefore allowing path engineering for those two 1603 "unicast" legs: 1) BFIR to ingress area edge and 2) core to egress 1604 area edge. Replication only happens inside the egress areas. For 1605 BFER in the same area as the BFIR, these four bits are not used. 1607 7.6. Summary 1609 BIER-TE can, like BIER, support multiple SI within a sub-domain to 1610 allow re-using the concept of BFR-id and therefore minimize BIER-TE 1611 specific functions in underlay routing, flow overlay methods and BIER 1612 headers. 1614 The number of BFIR/BFER possible in a subdomain is smaller than in 1615 BIER because BIER-TE uses additional bits for topology. 1617 Subdomains can be used in BIER-TE as in BIER to create more 1618 efficient replication to known subsets of BFER. 1620 Assigning bits for BFER intelligently into the right SI is more 1621 important in BIER-TE than in BIER because of replication efficiency 1622 and the overall number of bits required. 1624 8. BIER-TE and Segment Routing (SR) 1626 Segment Routing (SR ([RFC8402])) aims to enable lightweight path 1627 engineering via loose source routing.
Compared to its more heavy- 1628 weight predecessor RSVP-TE ([RFC3209]), SR does not, for example, 1629 require per-path signaling to each hop of the path. 1631 BIER-TE supports the same design philosophy for multicast. As in 1632 SR, it relies on source-routing - via the definition of a BitString. 1633 Like SR, it only requires considering the "hops" on which either 1634 replication has to happen, or across which the traffic should be 1635 steered (even without replication). Any other hops can be skipped 1636 via the use of routed adjacencies. 1638 A BIER-TE BitPosition (BP) can be understood as the BIER-TE equivalent 1639 of a "forwarding segment" in SR, but BPs have a different scope than 1640 SR forwarding segments. Whereas forwarding segments in SR are global 1641 or local, BPs in BIER-TE have a scope that is the group of BFR(s) 1642 that have adjacencies for this BP in their BIFT. This can be called 1643 "adjacency" scoped forwarding segments. 1645 Adjacency scope could be global, but then every BFR would need an 1646 adjacency for this BP, for example a forward_routed adjacency with 1647 encapsulation to the global SR SID of the destination. Such a BP 1648 would always result in ingress replication though. The first BFR 1649 encountering this BP would directly replicate to it. Only by using 1650 non-global adjacency scope for BPs can traffic be steered and 1651 replicated on non-ingress BFR. 1653 SR can naturally be combined with BIER-TE and help to optimize it. 1654 For example, instead of defining BitPositions for non-replicating 1655 hops, it is equally possible to use segment routing encapsulations 1656 (e.g., MPLS label stacks) for the encapsulation of "forward_routed" 1657 adjacencies. 1659 Note that BIER itself can also be seen to be similar to SR.
BIER BPs 1660 act as global destination Node-SIDs and the BIER bitstring is simply 1661 a highly optimized mechanism to indicate multiple such SIDs and let 1662 the network take care of effectively replicating the packet hop-by- 1663 hop to each destination Node-SID. What BIER does not allow is to 1664 indicate intermediate hops, or, in terms of SR, to indicate a 1665 sequence of SIDs to reach the destination. This is what BIER-TE and 1666 its adjacency scoped BPs enable. 1668 Both BIER and BIER-TE allow BFIR to "opportunistically" copy packets 1669 to a set of desired BFER on a packet-by-packet basis. In BIER, this 1670 is done by OR'ing the BP for the desired BFER. In BIER-TE this can 1671 be done by OR'ing for each desired BFER a bitstring using the 1672 "independent branches" approach described in Section 7.3 and 1673 therefore also indicating the engineered path towards each desired 1674 BFER. This is the approach that 1675 [I-D.ietf-bier-multicast-http-response] relies on. 1677 9. Security Considerations 1679 The security considerations are the same as for BIER with the 1680 following differences: 1682 BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures 1683 for their distribution, so these are not attack vectors against BIER- 1684 TE. 1686 10. IANA Considerations 1688 This document requests no action by IANA. 1690 11. Acknowledgements 1692 The authors would like to thank Greg Shepherd, Ijsbrand Wijnands, 1693 Neale Ranns, Dirk Trossen, Sandy Zheng and Jeffrey Zhang for their 1694 extensive review and suggestions. 1696 12. Change log [RFC Editor: Please remove] 1698 draft-ietf-bier-te-arch: 1700 05: Review Jeffrey Zhang. 1702 Part 2: 1704 4.3 added note about leaf-BFER being also a property of routing 1705 setup. 1707 4.7 Added missing details from example to avoid confusion with 1708 routed adjacencies, also compressed explanatory text and better 1709 justification why seed is explicitly configured by controller.
1711 4.9 added section discussing generic reuse of BP methods. 1713 4.10 added section summarizing BP optimizations of section 4. 1715 6. Rewrote/compressed explanation of comparison BIER/BIER-TE 1716 forwarding difference. Explained benefit of BIER-TE per-BP 1717 forwarding being independent of forwarding for other BPs. 1719 Part 1: 1721 Explicitly use forward_connected adjacency in ECMP adjacency 1722 examples to avoid confusion. 1724 4.3 Add picture as example for leaf vs. non-leaf BFR in topology. 1725 Improved description. 1727 4.5 Example for traffic that can be broadcast -> for single BP in 1728 hub&spoke. 1730 4.8.1 Simplified example picture for routed adjacency, explanatory 1731 text. 1733 Review from Dirk Trossen: 1735 Fixed up explanation of ICC paper vs. bloom filter. 1737 04: spell check run. 1739 Added remaining fixes for Sandy's (Zhang Zheng) review: 1741 4.7 Enhance ECMP explanations: 1743 example ECMP algorithm, highlight that doc does not standardize 1744 ECMP algorithm. 1746 Review from Dirk Trossen: 1748 1. Added mentioning of prior work for traffic engineered paths 1749 with bloom filters. 1751 2. Changed title from layers to components and added "BIER-TE 1752 control plane" to "BIER-TE controller host" to make it clearer 1753 what it does. 1755 2.2.3. Added reference to I-D.ietf-bier-multicast-http-response 1756 as an example solution. 1758 2.3. clarified sentence about resetting BPs before sending copies 1759 (also forgot to mention DNR here). 1761 3.4. Added text saying this section will be removed unless IESG 1762 review finds enough redeeming value in this example given how -03 1763 introduced section 1.1 with basic examples. 1765 7.2.
Removed explicit numbers 20%/80% for number of topology bits 1766 in BIER-TE, replaced with more vague (high/low) description, 1767 because we do not have good reference material Added text saying 1768 this section will be removed unless IESG review finds enough 1769 redeeming value in this example given how -03 introduced section 1770 1.1 with basic examples. 1772 many typos fixed. Thanks a lot. 1774 03: Last call textual changes by authors to improve readability: 1776 removed Wolfgang Braun as co-authors (as requested). 1778 Improved abstract to be more explanatory. Removed mentioning of 1779 FRR (not concluded on so far). 1781 Added new text into Introduction section because the text was too 1782 difficult to jump into (too many forward pointers). This 1783 primarily consists of examples and the early introduction of the 1784 BIER-TE Topology concept enabled by these examples. 1786 Amended comparison to SR. 1788 Changed syntax from [VRF] to {VRF} to indicate its optional and to 1789 make idnits happy. 1791 Split references into normative / informative, added references. 1793 02: Refresh after IETF104 discussion: changed intended status back 1794 to standard. Reasoning: 1796 Tighter review of standards document == ensures arch will be 1797 better prepared for possible adoption by other WGs (e.g. DetNet) 1798 or std. bodies. 1800 Requirement against the degree of existing implementations is self 1801 defined by the WG. BIER WG seems to think it is not necessary to 1802 apply multiple interoperating implementations against an 1803 architecture level document at this time to make it qualify to go 1804 to standards track. Also, the levels of support introduced in -01 1805 rev. should allow all BIER forwarding engines to also be able to 1806 support the base level BIER-TE forwarding. 1808 01: Added note comparing BIER and SR to also hopefully clarify 1809 BIER-TE vs. BIER comparison re. SR. 
1811 - added requirements section mandating only most basic BIER-TE 1812 forwarding features as MUST. 1814 - reworked comparison with BIER forwarding section to only 1815 summarize and point to pseudocode section. 1817 - reworked pseudocode section to have one pseudocode that mirrors 1818 the BIER forwarding pseudocode to make comparison easier and a 1819 second pseudocode that shows the complete set of BIER-TE 1820 forwarding options and simplification/optimization possible vs. 1821 BIER forwarding. Removed MyBitsOfInterest (was pure 1822 optimization). 1824 - Added captions to pictures. 1826 - Part of review feedback from Sandy (Zhang Zheng) integrated. 1828 00: Changed target state to experimental (WG conclusion), updated 1829 references, mod auth association. 1831 - Source now on http://www.github.com/toerless/bier-te-arch 1833 - Please open issues on the github for change/improvement requests 1834 to the document - in addition to posting them on the list 1835 (bier@ietf.). Thanks!. 1837 draft-eckert-bier-te-arch: 1839 06: Added overview of forwarding differences between BIER, BIER- 1840 TE. 1842 05: Author affiliation change only. 1844 04: Added comparison to Live-Live and BFIR to FRR section 1845 (Eckert). 1847 04: Removed FRR content into the new FRR draft [I-D.eckert-bier- 1848 te-frr] (Braun). 1850 - Linked FRR information to new draft in Overview/Introduction 1852 - Removed BTAFT/FRR from "Changes in the network topology" 1854 - Linked new draft in "Link/Node Failures and Recovery" 1856 - Removed FRR from "The BIER-TE Forwarding Layer" 1858 - Moved FRR section to new draft 1860 - Moved FRR parts of Pseudocode into new draft 1862 - Left only non FRR parts 1864 - removed FrrUpDown(..) and //FRR operations in 1865 ForwardBierTePacket(..) 1867 - New draft contains FrrUpDown(..) 
and ForwardBierTePacket(Packet) 1868 from bier-arch-03 1870 - Moved "BIER-TE and existing FRR to new draft 1872 - Moved "BIER-TE and Segment Routing" section one level up 1874 - Thus, removed "Further considerations" that only contained this 1875 section 1877 - Added Changes for version 04 1878 03: Updated the FRR section. Added examples for FRR key concepts. 1879 Added BIER-in-BIER tunneling as option for tunnels in backup 1880 paths. BIFT structure is expanded and contains an additional 1881 match field to support full node protection with BIER-TE FRR. 1883 03: Updated FRR section. Explanation how BIER-in-BIER 1884 encapsulation provides P2MP protection for node failures even 1885 though the routing underlay does not provide P2MP. 1887 02: Changed the definition of BIFT to be more inline with BIER. 1888 In revs. up to -01, the idea was that a BIFT has only entries for 1889 a single bitstring, and every SI and subdomain would be a separate 1890 BIFT. In BIER, each BIFT covers all SI. This is now also how we 1891 define it in BIER-TE. 1893 02: Added Section 7 to explain the use of SI, subdomains and BFR- 1894 id in BIER-TE and to give an example how to efficiently assign 1895 bits for a large topology requiring multiple SI. 1897 02: Added further detailed for rings - how to support input from 1898 all ring nodes. 1900 01: Fixed BFIR -> BFER for section 4.3. 1902 01: Added explanation of SI, difference to BIER ECMP, 1903 consideration for Segment Routing, unicast FRR, considerations for 1904 encapsulation, explanations of BIER-TE controller host and CLI. 1906 00: Initial version. 1908 13. References 1910 13.1. Normative References 1912 [RFC8279] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., 1913 Przygienda, T., and S. Aldrin, "Multicast Using Bit Index 1914 Explicit Replication (BIER)", RFC 8279, 1915 DOI 10.17487/RFC8279, November 2017, 1916 . 1918 [RFC8296] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., 1919 Tantsura, J., Aldrin, S., and I. 
Meilik, "Encapsulation 1920 for Bit Index Explicit Replication (BIER) in MPLS and Non- 1921 MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January 1922 2018, . 1924 13.2. Informative References 1926 [I-D.ietf-bier-multicast-http-response] 1927 Trossen, D., Rahman, A., Wang, C., and T. Eckert, 1928 "Applicability of BIER Multicast Overlay for Adaptive 1929 Streaming Services", draft-ietf-bier-multicast-http- 1930 response-01 (work in progress), June 2019. 1932 [I-D.ietf-roll-ccast] 1933 Bergmann, O., Bormann, C., Gerdes, S., and H. Chen, 1934 "Constrained-Cast: Source-Routed Multicast for RPL", 1935 draft-ietf-roll-ccast-01 (work in progress), October 2017. 1937 [ICC] Reed, M., Al-Naday, M., Thomos, N., Trossen, D., 1938 Petropoulos, G., and S. Spirou, "Stateless multicast 1939 switching in software defined networks", IEEE 1940 International Conference on Communications (ICC), Kuala 1941 Lumpur, Malaysia, 2016, May 2016, 1942 . 1944 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1945 Requirement Levels", BCP 14, RFC 2119, 1946 DOI 10.17487/RFC2119, March 1997, 1947 . 1949 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 1950 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 1951 Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, 1952 . 1954 [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., 1955 Decraene, B., Litkowski, S., and R. Shakir, "Segment 1956 Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, 1957 July 2018, . 1959 Authors' Addresses 1961 Toerless Eckert (editor) 1962 Futurewei Technologies Inc. 1963 2330 Central Expy 1964 Santa Clara 95050 1965 USA 1967 Email: tte+ietf@cs.fau.de 1968 Gregory Cauchie 1969 Bouygues Telecom 1971 Email: GCAUCHIE@bouyguestelecom.fr 1973 Michael Menth 1974 University of Tuebingen 1976 Email: menth@uni-tuebingen.de