idnits 2.17.1 

draft-eckert-bier-te-arch-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([I-D.ietf-bier-architecture]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (July 8, 2016) is 2841 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'I-D.eckert-bier-te-frr' is mentioned on line 135,
     but not defined

  == Missing Reference: 'VRF' is mentioned on line 856, but not defined

  == Missing Reference: 'Index' is mentioned on line 838, but not defined

  == Missing Reference: 'I' is mentioned on line 843, but not defined

  == Outdated reference: A later version (-08) exists of
     draft-ietf-bier-architecture-03

  == Outdated reference: A later version (-12) exists of
     draft-ietf-bier-mpls-encapsulation-04


     Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          T. Eckert
3	Internet-Draft                                       Cisco Systems, Inc.
4	Intended status: Standards Track                              G. Cauchie
5	Expires: January 9, 2017                                Bouygues Telecom
6	                                                                W. Braun
7	                                                                M. Menth
8	                                                 University of Tuebingen
9	                                                            July 8, 2016

11	     Traffic Engineering for Bit Index Explicit Replication BIER-TE
12	                      draft-eckert-bier-te-arch-04

14	Abstract

16	   This document proposes an architecture for BIER-TE: Traffic
17	   Engineering for Bit Index Explicit Replication (BIER).

19	   BIER-TE shares part of its architecture with BIER as described in
20	   [I-D.ietf-bier-architecture].  It also proposes to share the packet
21	   format with BIER.

23	   BIER-TE forwards and replicates packets like BIER based on a
24	   BitString in the packet header but it does not require an IGP.  It
25	   does support traffic engineering by explicit hop-by-hop forwarding
26	   and loose hop forwarding of packets.  It does support Fast ReRoute
27	   (FRR) for link and node protection and incremental deployment.
28	   Because BIER-TE like BIER operates without explicit in-network tree-
29	   building but also supports traffic engineering, it is more similar to
30	   SR than RSVP-TE.

32	Status of This Memo

34	   This Internet-Draft is submitted in full conformance with the
35	   provisions of BCP 78 and BCP 79.

37	   Internet-Drafts are working documents of the Internet Engineering
38	   Task Force (IETF).  Note that other groups may also distribute
39	   working documents as Internet-Drafts.  The list of current Internet-
40	   Drafts is at http://datatracker.ietf.org/drafts/current/.

42	   Internet-Drafts are draft documents valid for a maximum of six months
43	   and may be updated, replaced, or obsoleted by other documents at any
44	   time.  It is inappropriate to use Internet-Drafts as reference
45	   material or to cite them other than as "work in progress."

47	   This Internet-Draft will expire on January 9, 2017.

49	Copyright Notice

51	   Copyright (c) 2016 IETF Trust and the persons identified as the
52	   document authors.  All rights reserved.

54	   This document is subject to BCP 78 and the IETF Trust's Legal
55	   Provisions Relating to IETF Documents
56	   (http://trustee.ietf.org/license-info) in effect on the date of
57	   publication of this document.  Please review these documents
58	   carefully, as they describe your rights and restrictions with respect
59	   to this document.  Code Components extracted from this document must
60	   include Simplified BSD License text as described in Section 4.e of
61	   the Trust Legal Provisions and are provided without warranty as
62	   described in the Simplified BSD License.

64	Table of Contents

66	   1.  Introduction
67	     1.1.  Overview
68	     1.2.  Requirements Language
69	   2.  Layering
70	     2.1.  The Multicast Flow Overlay
71	     2.2.  The BIER-TE Controller Host
72	       2.2.1.  Assignment of BitPositions to adjacencies of the
73	               network topology
74	       2.2.2.  Changes in the network topology
75	       2.2.3.  Set up per-multicast flow BIER-TE state
76	       2.2.4.  Link/Node Failures and Recovery
77	     2.3.  The BIER-TE Forwarding Layer
78	     2.4.  The Routing Underlay
79	   3.  BIER-TE Forwarding
80	     3.1.  The Bit Index Forwarding Table (BIFT)
81	     3.2.  Adjacency Types
82	       3.2.1.  Forward Connected
83	       3.2.2.  Forward Routed
84	       3.2.3.  ECMP
85	       3.2.4.  Local Decap
86	     3.3.  Encapsulation considerations
87	     3.4.  Basic BIER-TE Forwarding Example
88	   4.  BIER-TE Controller Host BitPosition Assignments
89	     4.1.  P2P Links
90	     4.2.  BFER
91	     4.3.  Leaf BFERs
92	     4.4.  LANs
93	     4.5.  Hub and Spoke
94	     4.6.  Rings
95	     4.7.  Equal Cost MultiPath (ECMP)
96	     4.8.  Routed adjacencies
97	       4.8.1.  Reducing BitPositions
98	       4.8.2.  Supporting nodes without BIER-TE
99	   5.  Avoiding loops and duplicates
100	     5.1.  Loops
101	     5.2.  Duplicates
102	   6.  BIER-TE Forwarding Pseudocode
103	   7.  Managing SI, subdomains and BFR-ids
104	     7.1.  Why SI and sub-domains
105	     7.2.  Bit assignment comparison BIER and BIER-TE
106	     7.3.  Using BFR-id with BIER-TE
107	     7.4.  Assigning BFR-ids for BIER-TE
108	     7.5.  Example bit allocations
109	       7.5.1.  With BIER
110	       7.5.2.  With BIER-TE
111	     7.6.  Summary
112	   8.  BIER-TE and Segment Routing
113	   9.  Security Considerations
114	   10. IANA Considerations
115	   11. Acknowledgements
116	   12. Change log [RFC Editor: Please remove]
117	   13. References
118	   Authors' Addresses

120	1.  Introduction

122	1.1.  Overview

124	   This document specifies the architecture for BIER-TE: traffic
125	   engineering for Bit Index Explicit Replication (BIER).

127	   BIER-TE shares architecture and packet formats with BIER as described
128	   in [I-D.ietf-bier-architecture].

130	   BIER-TE forwards and replicates packets like BIER based on a
131	   BitString in the packet header but it does not require an IGP.  It
132	   does support traffic engineering by explicit hop-by-hop forwarding
133	   and loose hop forwarding of packets.  It does support incremental
134	   deployment and a Fast ReRoute (FRR) extension for link and node
135	   protection is given in [I-D.eckert-bier-te-frr].  Because BIER-TE
136	   like BIER operates without explicit in-network tree-building but also
137	   supports traffic engineering, it is more similar to Segment Routing
138	   (SR) than RSVP-TE.

140	   The key differences over BIER are:

142	   o  BIER-TE replaces in-network autonomous path calculation by
143	      explicit paths calculated off-path by the BIER-TE controller host.

145	   o  In BIER-TE every BitPosition of the BitString of a BIER-TE packet
146	      indicates one or more adjacencies - instead of a BFER as in BIER.

148	   o  BIER-TE in each BFR has no routing table but only a Bit Index
149	      Forwarding Table (BIFT) indexed by SI:BitPosition and populated
150	      with only those adjacencies to which the BFR should replicate
151	      packets.

153	   BIER-TE headers use the same format as BIER headers.

155	   BIER-TE forwarding does not require/use the BFIR-ID.  The BFIR-ID can
156	   still be useful though for coordinated BFIR/BFER functions, such as
157	   the context for upstream assigned labels for MPLS payloads in MVPN
158	   over BIER-TE.

160	   If the BIER-TE domain is also running BIER, then the BFIR-ID in BIER-
161	   TE packets can be set to the same BFIR-ID as used with BIER packets.

163	   If the BIER-TE domain is not running full BIER or wants to reduce
164	   the need to allocate bits in BIER bitstrings for BFIR-ID values,
165	   then the allocation of BFIR-ID values in BIER-TE packets can be
166	   done through other mechanisms outside the scope of this document,
167	   as long as this is appropriately agreed upon between all BFIR/BFER.

169	1.2.  Requirements Language

171	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
172	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
173	   document are to be interpreted as described in RFC 2119 [RFC2119].

175	2.  Layering

177	   End-to-end BIER-TE operation consists of four components: the
178	   "Multicast Flow Overlay", the "BIER-TE Controller Host", the "Routing
179	   Underlay" and the "BIER-TE forwarding layer".

181	      Picture 2: Layers of BIER-TE

183	                   <------BGP/PIM----->
184	      |<-IGMP/PIM->  multicast flow   <-PIM/IGMP->|
185	                        overlay

187	                   [Bier-TE Controller Host]
188	                      ^      ^     ^
189	                     /       |      \   BIER-TE control protocol
190	                    |        |       |  eg.: Netconf/Restconf/Yang
191	                    v        v       v
192	    Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr

194	                   |--------------------->|
195	                   BIER-TE forwarding layer

197	                   |<- BIER-TE domain-->|

199	                  |<--------------------->|
200	                      Routing underlay

202	2.1.  The Multicast Flow Overlay

204	   The Multicast Flow Overlay operates as in BIER.  See
205	   [I-D.ietf-bier-architecture].  Instead of interacting with the BIER
206	   layer, it interacts with the BIER-TE Controller Host.

208	2.2.  The BIER-TE Controller Host

210	   The BIER-TE controller host represents the control plane of
211	   BIER-TE.  It communicates two sets of information with BFRs:

213	   During bring-up or modifications of the network topology, the
214	   controller discovers the network topology, assigns BitPositions to
215	   adjacencies and signals the resulting mapping of BitPositions to
216	   adjacencies to each BFR connecting to the adjacency.

218	   During day-to-day operations of the network, the controller signals
219	   to BFIRs what multicast flows are mapped to what BitStrings.

221	   Communication between the BIER-TE controller host and BFRs is ideally
222	   via standardized protocols and data models such as Netconf/Restconf/
223	   Yang.  This is currently outside the scope of this document.  Vendor-
224	   specific CLI on the BFRs is also a possible stopgap option (as in
225	   many other SDN solutions lacking definition of a standardized data
226	   model).

228	   For simplicity, the procedures of the BIER-TE controller host are
229	   described in this document as if it is a single, centralized
230	   automated entity, such as an SDN controller.  It could equally be an
231	   operator setting up CLI on the BFRs.  Distribution of the functions
232	   of the BIER-TE controller host is currently outside the scope of this
233	   document.

235	2.2.1.  Assignment of BitPositions to adjacencies of the network
236	        topology

238	   The BIER-TE controller host tracks the BFR topology of the BIER-TE
239	   domain.  It determines what adjacencies require BitPositions so that
240	   BIER-TE explicit paths can be built through them as desired by
241	   operator policy.

243	   The controller then pushes the BitPositions/adjacencies to the BIFT
244	   of the BFRs, populating the BIFT of each BFR with only those
245	   SI:BitPositions to which that BFR should be able to send packets -
246	   adjacencies connecting to this BFR.

248	2.2.2.  Changes in the network topology

250	   If the network topology changes (not failure based) so that
251	   adjacencies that are assigned to BitPositions are no longer needed,
252	   the controller can re-use those BitPositions for new adjacencies.
253	   First, these BitPositions need to be removed from any BFIR flow state
254	   and BFR BIFT state, then they can be repopulated, first into BIFT and
255	   then into the BFIR.
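
   The ordering constraint above can be sketched as a small planning
   function.  This is only an illustration under stated assumptions: the
   function name, the step names and the argument shapes are
   hypothetical and are not defined by this document or by any control
   protocol.

```python
def reuse_plan(bp, bfirs_using_bp, old_bfrs, new_adjacencies, bfirs_to_update):
    """Return the ordered controller steps for re-using one BitPosition.

    All names here are hypothetical. The only property taken from the
    text is the order: remove from BFIR flow state, remove from BIFTs,
    repopulate BIFTs, and only then repopulate BFIR flow state.
    """
    plan = []
    # 1. No BFIR may keep sending packets with the old meaning of bp.
    plan += [("remove_from_bfir_flow_state", bfir, bp) for bfir in bfirs_using_bp]
    # 2. Remove the old adjacencies from the BIFT of every BFR that had them.
    plan += [("remove_from_bift", bfr, bp) for bfr in old_bfrs]
    # 3. Install the new adjacencies into the BIFTs first ...
    plan += [("add_to_bift", bfr, bp, adj) for bfr, adj in new_adjacencies]
    # 4. ... and only then let BFIRs use bp in BitStrings again.
    plan += [("add_to_bfir_flow_state", bfir, bp) for bfir in bfirs_to_update]
    return plan


plan = reuse_plan(
    bp=7,
    bfirs_using_bp=["BFIR-x"],
    old_bfrs=["BFR-a"],
    new_adjacencies=[("BFR-b", "forward_connected(BFR-c)")],
    bfirs_to_update=["BFIR-y"],
)
```

   Doing step 3 before step 4 avoids a window in which a BFIR sends a
   BitString containing a BitPosition that no BFR has yet installed.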

257	2.2.3.  Set up per-multicast flow BIER-TE state

259	   The BIER-TE controller host tracks the multicast flow overlay to
260	   determine what multicast flow needs to be sent by a BFIR to which set
261	   of BFER.  It calculates the desired distribution tree across the
262	   BIER-TE domain based on algorithms outside the scope of this document
263	   (eg.: CSPF, Steiner Tree,...).  It then pushes the calculated
264	   BitString into the BFIR.

266	2.2.4.  Link/Node Failures and Recovery

268	   When links or nodes fail or recover in the topology, BIER-TE can
269	   quickly respond with the optional FRR procedures described in
270	   [I-D.eckert-bier-te-frr].  It can also more slowly react by
271	   recalculating the BitStrings of affected multicast flows.  This
272	   reaction is slower than the FRR procedure because the controller
273	   needs to receive link/node up/down indications, recalculate the
274	   desired BitStrings and push them down into the BFIRs.  With FRR, this
275	   is all performed locally on a BFR receiving the adjacency up/down
276	   notification.

278	2.3.  The BIER-TE Forwarding Layer

280	   When the BIER-TE Forwarding Layer receives a packet, it simply looks
281	   up the BitPositions that are set in the BitString of the packet in
282	   the Bit Index Forwarding Table (BIFT) that was populated by the BIER-
283	   TE controller host.  For every BP that is set in the BitString, and
284	   that has one or more adjacencies in the BIFT, a copy is made
285	   according to the type of adjacencies for that BP in the BIFT.  Before
286	   sending any copy, the BFR resets all BitPositions in the BitString of
287	   the packet for which it can create a copy.  This is done to prevent
288	   packets from looping.

290	2.4.  The Routing Underlay

292	   BIER-TE sends BIER packets to directly connected BIER-TE neighbors
293	   as L2 (unicasted) BIER packets without requiring a routing
294	   underlay.  BIER-TE forwarding uses the Routing Underlay for
295	   forward_routed adjacencies, which copy BIER-TE packets to not-
296	   directly-connected BFRs (see below for adjacency definitions).

298	   If the BFR intends to support FRR for BIER-TE, then the BIER-TE
299	   forwarding plane needs to receive fast adjacency up/down
300	   notifications: Link up/down or neighbor up/down, eg.: from BFD.
301	   Providing these notifications is considered to be part of the routing
302	   underlay in this document.

304	3.  BIER-TE Forwarding

306	3.1.  The Bit Index Forwarding Table (BIFT)

308	   The Bit Index Forwarding Table (BIFT) exists in every BFR.  For every
309	   subdomain in use, it is a table indexed by SI:BitPosition and is
310	   populated by the BIER-TE control plane.  Each index can be empty or
311	   contain a list of one or more adjacencies.

313	   BIER-TE can support multiple subdomains like BIER, each one with a
314	   separate BIFT.

316	   In the BIER architecture, indices into the BIFT are explained to be
317	   both BFR-id and SI:BitString (BitPosition).  This is because there is
318	   a 1:1 relationship between BFR-id and SI:BitString - every bit in
319	   every SI is/can be assigned to a BFIR/BFER.  In BIER-TE there are
320	   more bits used in each BitString than there are BFIR/BFER assigned to
321	   the bitstring.  This is because of the bits required to express the
322	   (traffic engineered) path through the topology.  The BIER-TE
323	   forwarding definitions therefore do not use the term BFR-id at all.
324	   Instead, BFR-ids are only used as required by routing underlay, flow
325	   overlay, or BIER headers.  Please refer to Section 7 for explanations
326	   of how to deal with SI, subdomains and BFR-id in BIER-TE.

328	     ------------------------------------------------------------------
329	     | Index:          |  Adjacencies:                                |
330	     | SI:BitPosition  |  <empty> or one or more per entry            |
331	     ==================================================================
332	     | 0:1             |  forward_connected(interface,neighbor,DNR)   |
333	     ------------------------------------------------------------------
334	     | 0:2             |  forward_connected(interface,neighbor,DNR)   |
335	     |                 |  forward_connected(interface,neighbor,DNR)   |
336	     ------------------------------------------------------------------
337	     | 0:3             |  local_decap([VRF])                          |
338	     ------------------------------------------------------------------
339	     | 0:4             |  forward_routed([VRF,]l3-neighbor)           |
340	     ------------------------------------------------------------------
341	     | 0:5             |  <empty>                                     |
342	     ------------------------------------------------------------------
343	     | 0:6             |  ECMP({adjacency1,...adjacencyN}, seed)      |
344	     ------------------------------------------------------------------
345	     ...
346	     | BitStringLength |  ...                                         |
347	     ------------------------------------------------------------------
348	                      Bit Index Forwarding Table

350	   The BIFT is programmed into the data plane of BFRs by the BIER-TE
351	   controller host and used to forward packets, according to the rules
352	   specified in the BIER-TE Forwarding Procedures.

354	   The adjacencies populated for the same BP in more than one BFR by
355	   the controller do not have to be the same.  This is up to the
356	   controller.  BPs for p2p links are one case (see below).
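
   As an illustration only, the table above could be modelled in memory
   roughly as follows.  The class names, field names, interface names,
   neighbor names and the 192.0.2.1 address are assumptions chosen for
   this sketch; the document only defines the adjacency types and their
   parameters.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ForwardConnected:
    interface: str
    neighbor: str
    dnr: bool = False  # DoNotReset flag


@dataclass(frozen=True)
class ForwardRouted:
    l3_neighbor: str
    vrf: Optional[str] = None


@dataclass(frozen=True)
class LocalDecap:
    vrf: Optional[str] = None


@dataclass(frozen=True)
class Ecmp:
    adjacencies: Tuple[ForwardConnected, ...]
    seed: int


# One BIFT per subdomain, indexed by (SI, BitPosition).
# Each entry is a possibly empty list of adjacencies.
bift = {
    (0, 1): [ForwardConnected("if1", "nbr1")],
    (0, 2): [ForwardConnected("if2", "nbr2"), ForwardConnected("if3", "nbr3")],
    (0, 3): [LocalDecap()],
    (0, 4): [ForwardRouted("192.0.2.1")],
    (0, 5): [],  # <empty>
    (0, 6): [Ecmp((ForwardConnected("if4", "nbr4"),
                   ForwardConnected("if5", "nbr5")), seed=1)],
}
```

   Keeping every entry a list, including the empty one, lets a
   forwarding loop treat "no adjacency" and "several adjacencies"
   uniformly.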

358	3.2.  Adjacency Types

360	3.2.1.  Forward Connected

362	   A "forward_connected" adjacency is towards a directly connected BFR
363	   neighbor using an interface address of that BFR on the connecting
364	   interface.  A forward_connected adjacency does not route packets but
365	   only L2 forwards them to the neighbor.

367	   Packets sent to an adjacency with "DoNotReset" (DNR) set in the BIFT
368	   will not have the BitPosition for that adjacency reset when the BFR
369	   creates a copy for it.  The BitPosition will still be reset for
370	   copies of the packet made towards other adjacencies.  This can be used
371	   for example in ring topologies as explained below.

373	3.2.2.  Forward Routed

375	   A "forward_routed" adjacency is an adjacency towards a BFR that is
376	   not a forward_connected adjacency: towards a loopback address of a
377	   BFR or towards an interface address that is non-directly connected.
378	   Forward_routed packets are forwarded via the Routing Underlay.

380	   If the Routing Underlay has multiple paths for a forward_routed
381	   adjacency, it will perform ECMP independent of BIER-TE for packets
382	   forwarded across a forward_routed adjacency.

384	   If the Routing Underlay has FRR, it will perform FRR independent of
385	   BIER-TE for packets forwarded across a forward_routed adjacency.

387	3.2.3.  ECMP

389	   The ECMP mechanisms in BIER are tied to the BIER BIFT and are
390	   therefore not directly usable with BIER-TE.  The following
391	   procedures describe ECMP for BIER-TE that we consider to be
392	   lightweight but also well manageable.  It leverages the existing
393	   entropy parameter in the BIER header to keep packets of the flows on
394	   the same path and it introduces a "seed" parameter to allow
395	   engineering traffic to be polarized or randomized across multiple
396	   hops.

398	   An "Equal Cost Multipath" (ECMP) adjacency has a list of two or more
399	   adjacencies included in it.  It copies the BIER-TE packet to one of
400	   those adjacencies based on the ECMP hash calculation.  The BIER-TE ECMP
401	   hash algorithm must select the same adjacency from that list for all
402	   packets with the same "entropy" value in the BIER-TE header if the
403	   same number of adjacencies and same seed are given as parameters.
404	   Further use of the seed parameter is explained below.
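
   A minimal sketch of the selection property stated above follows.  The
   text here only requires that the same entropy, seed and number of
   adjacencies always select the same adjacency; the use of CRC32 and
   the 4-byte encodings are assumptions made for this illustration, not
   the hash defined by BIER-TE.

```python
import zlib


def ecmp_select(adjacencies, entropy, seed):
    """Pick one adjacency deterministically from (entropy, seed, list size).

    CRC32 is only a stand-in hash chosen for this sketch.
    """
    if len(adjacencies) < 2:
        raise ValueError("an ECMP adjacency lists two or more adjacencies")
    key = entropy.to_bytes(4, "big") + seed.to_bytes(4, "big")
    return adjacencies[zlib.crc32(key) % len(adjacencies)]


members = ["adjacency1", "adjacency2", "adjacency3"]
first = ecmp_select(members, entropy=0x12345, seed=1)
again = ecmp_select(members, entropy=0x12345, seed=1)
```

   Because the seed is part of the hash input, two BFRs given the same
   seed make the same choice for a flow, while different seeds
   decorrelate their choices.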

406	3.2.4.  Local Decap

408	   A "local_decap" adjacency passes a copy of the payload of the BIER-TE
409	   packet to the packet's NextProto within the BFR (IPv4/IPv6,
410	   Ethernet,...).  A local_decap adjacency turns the BFR into a BFER for
411	   matching packets.  Local_decap adjacencies require the BFER to
412	   support routing or switching for NextProto to determine how to
413	   further process the packet.

415	3.3.  Encapsulation considerations

417	   Specifications for BIER-TE encapsulation are outside the scope of
418	   this document.  This section gives explanations and guidelines.

420	   Because a BFR needs to interpret the BitString of a BIER-TE packet
421	   differently from a BIER packet, it is necessary to distinguish BIER
422	   from BIER-TE packets.  This is subject to definitions in BIER
423	   encapsulation specifications.

425	   MPLS encapsulation [I-D.ietf-bier-mpls-encapsulation] for example
426	   assigns one label by which BFRs recognize BIER packets for every
427	   (SI,subdomain) combination.  If it is desirable that every subdomain
428	   can forward only BIER or BIER-TE packets, then the label allocation
429	   could stay the same, and only the forwarding model (BIER/BIER-TE)
430	   would have to be defined per subdomain.  If it is desirable to
431	   support both BIER and BIER-TE forwarding in the same subdomain, then
432	   additional labels would need to be assigned for BIER-TE forwarding.

434	   "forward_routed" requires an encapsulation that permits unicasting
435	   BIER-TE packets to a specific interface address on a target BFR.
436	   With MPLS encapsulation, this can simply be done via a label stack
437	   with that address's label as the top label - followed by the label
438	   assigned to (SI,subdomain) - and if necessary (see above) BIER-TE.
439	   With non-MPLS encapsulation, some form of IP tunneling (IP in IP,
440	   LISP, GRE) would be required.

442	   The encapsulation used for "forward_routed" adjacencies can equally
443	   support existing advanced adjacency information such as "loose source
444	   routes" via eg: MPLS label stacks or appropriate header extensions
445	   (eg: for IPv6).

447	3.4.  Basic BIER-TE Forwarding Example

449	   This is a step-by-step example of basic BIER-TE forwarding.  It does
450	   not use ECMP or forward_routed adjacencies, nor does it try to
451	   minimize the number of required BitPositions for the topology.

453	     Picture 1: Forwarding Example

455	               [Bier-Te Controller Host]
456	                       /   | \
457	                      v    v  v

459	           | p13   p1 |
460	           +- BFIR2 --+          |
461	           |          | p2   p6  |           LAN2
462	           |          +-- BFR3 --+           |
463	           |          |          |  p7  p11  |
464	      Src -+                     +-- BFER1 --+
465	           |          | p3   p8  |           |
466	           |          +-- BFR4 --+           +-- Rcv1
467	           |          |          |           |
468	           |          |
469	           | p14  p4  |
470	           +- BFIR1 --+          |
471	           |          +-- BFR5 --+ p10  p12  |
472	         LAN1         | p5   p9  +-- BFER2 --+
473	                                 |           +-- Rcv2
474	                                             |
475	                                             LAN3

477	          IP  |..... BIER-TE network......| IP

479	   pXX indicates the BitPosition number assigned by the BIER-TE
480	   controller host to adjacencies in the BIER-TE topology.  For example,
481	   p9 is the adjacency towards BFR5 on the LAN connecting to BFER2.

483	      BIFT BFIR2:
484	        p13: local_decap()
485	         p2: forward_connected(BFR3)

487	      BIFT BFR3:
488	         p1: forward_connected(BFIR2)
489	         p7: forward_connected(BFER1)
490	         p8: forward_connected(BFR4)

492	      BIFT BFER1:
493	        p11: local_decap()
494	         p6: forward_connected(BFR3)
495	         p8: forward_connected(BFR4)

497	   ...and so on.

499	   Traffic needs to flow from BFIR2 towards Rcv1, Rcv2.  The controller
500	   determines it wants it to pass across the following paths:

502	                 -> BFER1 ---------------> Rcv1
503	    BFIR2 -> BFR3
504	                 -> BFR4 -> BFR5 -> BFER2 -> Rcv2

506	   These paths map to the following BitString: p2, p5, p7, p8, p10,
507	   p11, p12.
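
   The mapping from paths to a BitString can be sketched as the union of
   the BitPositions of all hops plus the local_decap BitPosition of each
   BFER.  The per-hop BitPositions below are read off Picture 1; the
   dictionary and function names are hypothetical.

```python
# BitPosition of the adjacency used for each hop, as read from Picture 1.
BP_OF_HOP = {
    ("BFIR2", "BFR3"): 2,
    ("BFR3", "BFER1"): 7,
    ("BFR3", "BFR4"): 8,
    ("BFR4", "BFR5"): 5,
    ("BFR5", "BFER2"): 10,
}
# BitPosition of the local_decap adjacency on each BFER.
BP_LOCAL_DECAP = {"BFER1": 11, "BFER2": 12}


def bitstring_for(paths):
    """Return the sorted BitPositions needed to realize all given paths."""
    bps = set()
    for path in paths:
        for hop in zip(path, path[1:]):
            bps.add(BP_OF_HOP[hop])
        bps.add(BP_LOCAL_DECAP[path[-1]])  # last node of a path is a BFER
    return sorted(bps)


paths = [
    ["BFIR2", "BFR3", "BFER1"],
    ["BFIR2", "BFR3", "BFR4", "BFR5", "BFER2"],
]
bitstring = bitstring_for(paths)  # [2, 5, 7, 8, 10, 11, 12]
```

   The shared first hop BFIR2 -> BFR3 contributes p2 only once, which is
   why the BitString describes a tree rather than two separate paths.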

509	   This BitString is set up in BFIR2.  Multicast packets arriving at
510	   BFIR2 from Src are assigned this BitString.

512	   BFIR2 forwards based on that BitString.  It has p2 and p13 populated.
513	   Only p2 is in the BitString; it has an adjacency towards BFR3.  BFIR2
514	   resets p2 in the BitString and sends a copy towards BFR3.

516	   BFR3 sees a BitString of p5,p7,p8,p10,p11,p12.  It is only interested
517	   in p1,p7,p8.  It creates a copy of the packet to BFER1 (due to p7)
518	   and one to BFR4 (due to p8).  It resets p7, p8 before sending.

520	   BFER1 sees a BitString of p5,p10,p11,p12.  It is only interested in
521	   p6,p8,p11 and therefore considers only p11. p11 is a "local_decap"
522	   adjacency installed by the BIER-TE controller host because BFER1
523	   should pass packets to IP multicast.  The local_decap adjacency
524	   instructs BFER1 to create a copy, decapsulate it from the BIER header
525	   and pass it on to the NextProtocol, in this example IP multicast.  IP
526	   multicast will then forward the packet out to LAN2 because it did
527	   receive PIM or IGMP joins on LAN2 for the traffic.

529	   The packet is processed accordingly in BFR4, BFR5 and BFER2.
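
   The whole walk-through can be replayed with a short simulation.  The
   BIFTs of BFIR2, BFR3 and BFER1 are the ones listed in this section.
   The BIFTs of BFR4, BFR5 and BFER2 are not given in the text ("...and
   so on"); the entries used for them below are assumptions derived from
   Picture 1 and contain only what this flow needs.

```python
BIFT = {
    "BFIR2": {13: ["local_decap"], 2: ["BFR3"]},
    "BFR3": {1: ["BFIR2"], 7: ["BFER1"], 8: ["BFR4"]},
    "BFER1": {11: ["local_decap"], 6: ["BFR3"], 8: ["BFR4"]},
    # Assumed entries, derived from Picture 1 (not listed in the text):
    "BFR4": {5: ["BFR5"]},
    "BFR5": {10: ["BFER2"]},
    "BFER2": {12: ["local_decap"], 9: ["BFR5"]},
}


def process(node, bitstring, decapsulated, trace):
    """Process one packet at `node` and recurse into every copy it creates."""
    bift = BIFT[node]
    trace.append((node, sorted(bitstring)))
    # BitPositions set in the packet that have adjacencies on this BFR.
    local = {bp for bp in bitstring if bift.get(bp)}
    # Reset all of them before any copy is sent.
    remaining = frozenset(bitstring) - local
    for bp in sorted(local):
        for adjacency in bift[bp]:
            if adjacency == "local_decap":
                decapsulated.append(node)
            else:
                process(adjacency, remaining, decapsulated, trace)


decapsulated, trace = [], []
process("BFIR2", {2, 5, 7, 8, 10, 11, 12}, decapsulated, trace)
```

   After the run, trace holds the BitString each BFR saw on arrival (for
   BFR3: p5,p7,p8,p10,p11,p12; for BFER1: p5,p10,p11,p12) and
   decapsulated lists BFER1 and BFER2.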

531	4.  BIER-TE Controller Host BitPosition Assignments

533	   This section describes how the BIER-TE controller host can use the
534	   different BIER-TE adjacency types to define the BitPositions of a
535	   BIER-TE domain.

537	   Because the size of the BitString limits the size of the BIER-TE
538	   domain, many of the options described exist to support larger
539	   topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7,
540	   4.8).

542	4.1.  P2P Links

544	   Each p2p link in the BIER-TE domain is assigned one unique
545	   BitPosition with a forward_connected adjacency pointing to the
546	   neighbor on the p2p link.

548	4.2.  BFER

550	   Every BFER is given a unique BitPosition with a local_decap
551	   adjacency.

553	4.3.  Leaf BFERs

555	   Leaf BFERs are BFERs where incoming BIER-TE packets never need to be
556	   forwarded to another BFR but are only sent to the BFER to exit the
557	   BIER-TE domain.  For example, in networks where PEs are spokes
558	   connected to P routers, those PEs are Leaf BFERs unless there is a
559	   U-turn between two PEs.

561	   All leaf BFERs in a BIER-TE domain can share a single BitPosition.
562	   This is possible because the BitPosition for the adjacency to reach
563	   the BFER can be used to distinguish whether or not packets should
564	   reach the BFER.

566	   This optimization will not work if an upstream interface of the BFER
567	   is using a BitPosition optimized as described in the following two
568	   sections (LAN, Hub and Spoke).

570	4.4.  LANs

572	   In a LAN, the adjacency to each neighboring BFR on the LAN is given a
573	   unique BitPosition.  The adjacency of this BitPosition is a
574	   forward_connected adjacency towards the BFR and this BitPosition is
575	   populated into the BIFT of all the other BFRs on that LAN.

577	            BFR1
578	             |p1
579	      LAN1-+-+---+-----+
580	          p3|  p4|   p2|
581	          BFR3 BFR4  BFR7

583	   If bandwidth on the LAN is not an issue and most BIER-TE traffic
584	   should be copied to all neighbors on a LAN, then BitPositions can be
585	   saved by assigning just a single BitPosition to the LAN and
586	   populating that BitPosition in the BIFT of each BFR on the LAN with
587	   a list of forward_connected adjacencies to all other neighbors on the
588	   LAN.
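
   The two LAN assignment options can be compared with a small sketch
   that builds the BIFT entries for the LAN1 picture above.  The
   function names and the string form of the adjacencies are
   hypothetical, and re-using p1 as the shared BitPosition is an
   arbitrary choice for this illustration.

```python
def per_neighbor_bifts(lan):
    """lan maps each BFR on the LAN to the BitPosition of the adjacency
    towards it. Every BFR gets the BitPositions of all other BFRs."""
    return {
        bfr: {bp: [f"forward_connected({nbr})"]
              for nbr, bp in lan.items() if nbr != bfr}
        for bfr in lan
    }


def shared_bp_bifts(bfrs, bp):
    """A single BitPosition for the whole LAN: on every BFR it lists
    forward_connected adjacencies to all other BFRs on the LAN."""
    return {
        bfr: {bp: [f"forward_connected({nbr})" for nbr in bfrs if nbr != bfr]}
        for bfr in bfrs
    }


lan1 = {"BFR1": 1, "BFR7": 2, "BFR3": 3, "BFR4": 4}  # from the LAN1 picture
unique = per_neighbor_bifts(lan1)
shared = shared_bp_bifts(list(lan1), bp=1)
```

   With per-neighbor BitPositions the LAN consumes four bits and a
   sender can address any subset of neighbors; with the shared
   BitPosition it consumes one bit and every packet using it is copied
   to all other BFRs on the LAN.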

590	   This optimization does not work in the face of BFRs redundantly
591	   connected to more than one LANs with this optimization because these
592	   BFRs would receive duplicates and forward those duplicates into the
593	   opposite LANs.  Adjacencies of such BFRs into their LANs still need a
594	   separate BitPosition.
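
The two LAN allocation options can be compared with a minimal sketch.
The dictionary layout, the function names and the BitPosition numbers
are assumptions made for illustration; they are not defined by this
document.

```python
# Illustrative only: data layout and BitPosition numbers are assumptions.
LAN1 = ["BFR1", "BFR3", "BFR4", "BFR7"]


def per_neighbor_bifts(bfrs, first_bp=1):
    """One BitPosition per BFR on the LAN; all other BFRs install it."""
    bp_of = {bfr: first_bp + i for i, bfr in enumerate(bfrs)}
    return {
        me: {bp_of[n]: [("forward_connected", n)] for n in bfrs if n != me}
        for me in bfrs
    }


def shared_bp_bifts(bfrs, lan_bp=1):
    """A single BitPosition for the LAN; each BFR lists all other BFRs."""
    return {
        me: {lan_bp: [("forward_connected", n) for n in bfrs if n != me]}
        for me in bfrs
    }


def bits_used(bifts):
    """Number of distinct BitPositions consumed across all BIFTs."""
    return len({bp for bift in bifts.values() for bp in bift})


print(bits_used(per_neighbor_bifts(LAN1)))  # one bit per BFR on the LAN
print(bits_used(shared_bp_bifts(LAN1)))     # one bit for the whole LAN
```

With the shared BitPosition, a packet carrying that bit is copied to
every other BFR on the LAN, which is why the optimization only fits
traffic that should reach all neighbors.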

596	4.5.  Hub and Spoke

598	   In a setup with a hub and multiple spokes connected via separate p2p
599	   links to the hub, all p2p links can share the same BitPosition.  The
600	   BitPosition on the hub's BIFT is set up with a list of
601	   forward_connected adjacencies, one for each Spoke.

603	   This option is similar to the BitPosition optimization in LANs:
604	   Redundantly connected spokes need their own BitPositions.

606	4.6.  Rings

608	   In L3 rings, instead of assigning a single BitPosition for every p2p
609	   link in the ring, it is possible to save BitPositions by setting the
610	   "Do Not Reset" (DNR) flag on forward_connected adjacencies.

612	   For the rings shown in the following picture, a single BitPosition
613	   will suffice to forward traffic entering the ring at BFRa or BFRb all
614	   the way up to BFR1:

616	   On BFRa, BFRb, BFR30,... BFR3, the BitPosition is populated with a
617	   forward_connected adjacency pointing to the clockwise neighbor on the
618	   ring and with DNR set.  On BFR2, the adjacency also points to the
619	   clockwise neighbor BFR1, but without DNR set.

621	   Handling DNR this way ensures that copies forwarded from any BFR in
622	   the ring to a BFR outside the ring will not have the ring BitPosition
623	   set, therefore minimizing the chance to create loops.

625	                  v        v
626	                  |        |
627	           L1     |   L2   |   L3
628	       /-------- BFRa ---- BFRb --------------------\
629	       |                                            |
630	       \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/
631	           |      |    L4               |      |
632	        p33|                         p15|
633	           BFRd                       BFRc

635	   Note that this example only permits packets to enter the ring at
636	   BFRa and BFRb, and that packets will always travel clockwise.  If
637	   packets should be allowed to enter the ring at any ring BFR, then one
638	   would have to use two ring BitPositions: one for clockwise, one for
639	   counterclockwise.

641	   Both would be set up to stop rotating on the same link, e.g., L1.
642	   When the ingress ring BFR creates the clockwise copy, it will reset
643	   the counterclockwise BitPosition because the DNR bit only applies to
644	   the bit for which the replication is done.  Likewise for the
645	   clockwise BitPosition for the counterclockwise copy.  As a result,
646	   the ring ingress BFR will send a copy in both directions, serving
647	   BFRs on either side of the ring up to L1.
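
The single-BitPosition clockwise ring can be traced with a small
simulation.  This is a sketch under assumptions: the BitPosition number
(RING_BP), the table layout and the function name are hypothetical, and
only the ring bit is modeled.

```python
RING_BP = 1  # hypothetical BitPosition number assigned to the ring


def bit(index):
    """Mask for a 1-based BitPosition."""
    return 1 << (index - 1)


# Clockwise order as in the picture: BFRa, BFRb, BFR30, ..., BFR2, BFR1.
ring = ["BFRa", "BFRb"] + [f"BFR{i}" for i in range(30, 0, -1)]

# Per-BFR BIFT: {BitPosition: (clockwise neighbor, DNR flag)}.
# DNR is set everywhere except on BFR2, whose neighbor is BFR1.
bift = {here: {RING_BP: (nxt, nxt != "BFR1")}
        for here, nxt in zip(ring, ring[1:])}
bift["BFR1"] = {}  # the ring bit ends here


def walk(entry, bitstring):
    """Follow the ring bit; return [(BFR, BitString on arrival), ...]."""
    trace, here = [(entry, bitstring)], entry
    while bitstring & bit(RING_BP) and RING_BP in bift[here]:
        nxt, dnr = bift[here][RING_BP]
        bitstring &= ~bit(RING_BP)      # clear the bit of the local BIFT
        if dnr:
            bitstring |= bit(RING_BP)   # DNR: set it again on the copy
        trace.append((nxt, bitstring))
        here = nxt
    return trace


trace = walk("BFRa", bit(RING_BP))
print(trace[-2], trace[-1])
```

The copy arriving at BFR1 no longer carries the ring bit, so BFR1 does
not forward it further around the ring.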

649	4.7.  Equal Cost MultiPath (ECMP)

651	   The ECMP adjacency allows using just one BP per link bundle between
652	   two BFRs instead of one BP for each p2p member link of that link
653	   bundle.  In the following picture, one BP is used across L1,L2,L3 and
654	   BFR1/BFR2 have the following BIFT entries for that BP:

656	                --L1-----
657	           BFR1 --L2----- BFR2
658	                --L3-----

660	     BIFT entry in BFR1:
661	     ------------------------------------------------------------------
662	     | Index |  Adjacencies                                           |
663	     ==================================================================
664	     | 0:6   |  ECMP({L1-to-BFR2,L2-to-BFR2,L3-to-BFR2}, seed)        |
665	     ------------------------------------------------------------------

667	     BIFT entry in BFR2:
668	     ------------------------------------------------------------------
669	     | Index |  Adjacencies                                           |
670	     ==================================================================
671	     | 0:6   |  ECMP({L1-to-BFR1,L2-to-BFR1,L3-to-BFR1}, seed)        |
672	     ------------------------------------------------------------------

674	   In the following example, all traffic from BFR1 towards BFR10 is
675	   intended to be ECMP load split equally across the topology.  This
676	   example is not meant as a likely setup, but to illustrate that ECMP
677	   can be used to share BPs not only across link bundles, and it
678	   explains the use of the seed parameter.

680	                    BFR1
681	                  /     \
682	                 /L11    \L12
683	             BFR2         BFR3
684	            /    \       /    \
685	           /L21   \L22  /L31   \L32
686	          BFR4  BFR5   BFR6  BFR7
687	           \      /     \      /
688	            \    /       \    /
689	             BFR8         BFR9
690	                 \       /
691	                  \     /
692	                   BFR10

694	     BIFT entry in BFR1:
695	     ------------------------------------------------------------------
696	     | 0:6   |  ECMP({L11-to-BFR2,L12-to-BFR3}, seed)                 |
697	     ------------------------------------------------------------------

699	     BIFT entry in BFR2:
700	     ------------------------------------------------------------------
701	     | 0:6   |  ECMP({L21-to-BFR4,L22-to-BFR5}, seed)                 |
702	     ------------------------------------------------------------------

704	     BIFT entry in BFR3:
705	     ------------------------------------------------------------------
706	     | 0:6   |  ECMP({L31-to-BFR6,L32-to-BFR7}, seed)                 |
707	     ------------------------------------------------------------------

709	   With the setup of ECMP in the above topology, traffic would not be
710	   equally load-split.  Instead, links L22 and L31 would see no traffic
711	   at all: BFR2 will only see traffic from BFR1 for which the ECMP hash
712	   in BFR1 selected the first adjacency in a list of 2 adjacencies: link
713	   L11-to-BFR2.  When forwarding in BFR2 again performs an ECMP with two
714	   adjacencies on that subset of traffic, then it will again select the
715	   first of its two adjacencies: L21-to-BFR4.  Therefore, L22 and BFR5
716	   see no traffic.

718	   To resolve this issue, the ECMP adjacency on BFR1 simply needs to be
719	   set up with a different seed than the ECMP adjacencies on BFR2/BFR3.

721	   This issue is called polarization.  It depends on the ECMP hash.  It
722	   is possible to build ECMP that does not have polarization, for
723	   example by taking entropy from the actual adjacency members into
724	   account, but that can make it harder to achieve evenly balanced load-
725	   splitting on all BFR without making the ECMP hash algorithm
726	   potentially too complex for fast forwarding in the BFRs.
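
Polarization and the effect of the seed can be reproduced with a short
experiment.  The hash below is a stand-in chosen for illustration
(SHA-256 over seed and entropy, modulo the number of adjacencies); this
document does not specify the ECMP hash, and the seed values and flow
count are arbitrary assumptions.

```python
import hashlib
from collections import Counter


def ecmp_hash(n, entropy, seed):
    """Stand-in ECMP hash (assumption): index in range(n)."""
    digest = hashlib.sha256(f"{seed}/{entropy}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n


def link_load(seed_bfr1, seed_bfr2, flows=1000):
    """Count flows per link on the path BFR1 -> BFR2 -> {BFR4, BFR5}."""
    load = Counter()
    for entropy in range(flows):
        first = ["L11", "L12"][ecmp_hash(2, entropy, seed_bfr1)]
        load[first] += 1
        if first == "L11":  # only this subset of flows reaches BFR2
            second = ["L21", "L22"][ecmp_hash(2, entropy, seed_bfr2)]
            load[second] += 1
    return load


polarized = link_load(seed_bfr1=7, seed_bfr2=7)  # same seed on both BFRs
balanced = link_load(seed_bfr1=7, seed_bfr2=8)   # different seeds

print("same seed:      L21 =", polarized["L21"], " L22 =", polarized["L22"])
print("different seed: L21 =", balanced["L21"], " L22 =", balanced["L22"])
```

With identical seeds, BFR2 repeats the decision BFR1 already made for
every flow it receives, so L22 stays empty.  With different seeds the
second decision is independent of the first for this stand-in hash.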

728	4.8.  Routed adjacencies

730	4.8.1.  Reducing BitPositions

732	   Routed adjacencies can reduce the number of BitPositions required
733	   when the traffic engineering requirement is not hop-by-hop explicit
734	   path selection, but loose-hop selection.

736	              ...............             ...............
737	       BFR1--... Redundant ...--L1-- BFR2... Redundant ...---
738	          \--... Network   ...--L2--/    ... Network   ...---
739	       BFR4--... Segment 1 ...--L3-- BFR3... Segment 2 ...---
740	              ...............             ...............

742	   Assume the requirement in above network is to explicitly engineer
743	   paths such that specific traffic flows are passed from segment 1 to
744	   segment 2 via link L1 (or via L2 or via L3).

746	   To achieve this, BFR1 and BFR4 are set up with a forward_routed
747	   adjacency BitPosition towards an address of BFR2 on link L1 (or link
748	   L2, or of BFR3 on link L3).

750	   For paths to be engineered through a specific node BFR2 (or BFR3),
751	   BFR1 and BFR4 are set up with a forward_routed adjacency
752	   BitPosition towards a loopback address of BFR2 (or BFR3).

754	4.8.2.  Supporting nodes without BIER-TE

756	   Routed adjacencies also enable incremental deployment of BIER-TE.
757	   Only the nodes through which BIER-TE traffic needs to be steered -
758	   with or without replication - need to support BIER-TE.  Where they
759	   are not directly connected to each other, forward_routed adjacencies
760	   are used to pass over non BIER-TE enabled nodes.

762	5.  Avoiding loops and duplicates

764	5.1.  Loops

766	   Whenever BIER-TE creates a copy of a packet, the BitString of that
767	   copy will have all BitPositions cleared that are associated with
768	   adjacencies in the BFR.  This inhibits looping of packets.  The only
769	   exceptions are adjacencies with DNR set.

771	   With DNR set, looping can happen.  Consider in the ring picture that
772	   link L4 from BFR3 is plugged into the L1 interface of BFRa.  This
773	   creates a loop where the ring's clockwise BitPosition is never reset
774	   for copies of the packets traveling clockwise around the ring.

776	   To inhibit looping in the face of such physical misconfiguration,
777	   only forward_connected adjacencies are permitted to have DNR set, and
778	   the link layer destination address of the adjacency (e.g., MAC
779	   address) protects against closing the loop.  Link layers without port
780	   unique link layer addresses should not be used with the DNR flag set.

782	5.2.  Duplicates

784	   Duplicates happen when the topology of the BitString is not a tree
785	   but redundantly connects BFRs with each other.  The controller must
786	   therefore ensure that it only creates BitStrings that are trees in
787	   the topology.

789	   When links are incorrectly physically re-connected before the
790	   controller updates BitStrings in BFIRs, duplicates can happen.  Like
791	   loops, these can be inhibited by link layer addressing in
792	   forward_connected adjacencies.

794	   If interface or loopback addresses used in forward_routed adjacencies
795	   are moved from one BFR to another, duplicates can equally happen.
796	   Such re-addressing operations must be coordinated with the
797	   controller.

799	6.  BIER-TE Forwarding Pseudocode

801	   The following sections of pseudocode are meant to illustrate the
802	   BIER-TE forwarding plane.  This code is not meant to be normative but
803	   to serve both as a potentially easier to read and more precise
804	   representation of the forwarding functionality and to illustrate how
805	   simple BIER-TE forwarding is and that it can be efficiently
806	   implemented.

808	   The following procedure is executed on a BFR whenever the BIFT is
809	   changed by the BIER-TE controller host:

811	      global MyBitsOfInterest

813	      void BIFTChanged()
814	      {
815	          for (Index = 1; Index <= BitStringLength; Index++)
816	              if(BIFT[Index] != <empty>)
817	                  MyBitsOfInterest |= 1<<(Index-1)
818	      }

820	   The following procedure is executed whenever a BIER-TE packet is to
821	   be forwarded:

823	      void ForwardBierTePacket (Packet)
824	      {
825	          // We calculate in BitMask the subset of BPs of the BitString
826	          // for which we have adjacencies. This is purely an
827	          // optimization to avoid replicating for every BP
828	          // set in BitString only to discover that for most of them,
829	          // the BIFT has no adjacency.

831	          local BitMask = Packet->BitString
832	          Packet->BitString &= ~MyBitsOfInterest
833	          BitMask &= MyBitsOfInterest

835	          // Replication
836	          for (Index = GetFirstBitPosition(BitMask); Index ;
837	               Index = GetNextBitPosition(BitMask, Index))
838	              foreach adjacency BIFT[Index]

840	                  if(adjacency == ECMP(ListOfAdjacencies, seed) )
841	                      I = ECMP_hash(sizeof(ListOfAdjacencies),
842	                                    Packet->Entropy, seed)
843	                      adjacency = ListOfAdjacencies[I]

845	                  PacketCopy = Copy(Packet)

847	                  switch(adjacency)
848	                      case forward_connected(interface,neighbor,DNR):
849	                          if(DNR)
850	                              PacketCopy->BitString |= 1<<(Index-1)
851	                          SendToL2Unicast(PacketCopy,interface,neighbor)

853	                      case forward_routed([VRF],neighbor):
854	                          SendToL3(PacketCopy,[VRF,]neighbor)

856	                      case local_decap([VRF],neighbor):
857	                          DecapBierHeader(PacketCopy)
858	                          PassTo(PacketCopy,[VRF,]Packet->NextProto)
859	      }
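
For readers who want to experiment, the pseudocode above can be
approximated by the following runnable sketch.  It is not normative.
The tuple layout of adjacencies, the 1-based BitPosition convention,
the stand-in ECMP hash and the choice to return the copies instead of
sending them are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    bitstring: int
    entropy: int = 0


def bit(index):
    """Mask for a 1-based BitPosition."""
    return 1 << (index - 1)


def bits_of_interest(bift):
    """Equivalent of BIFTChanged(): mask of all populated BitPositions."""
    mask = 0
    for index in bift:
        mask |= bit(index)
    return mask


def ecmp_hash(n, entropy, seed):
    """Stand-in ECMP hash (assumption), returning an index in range(n)."""
    digest = hashlib.sha256(f"{seed}/{entropy}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n


def forward(bift, packet):
    """Return (kind, target, bitstring) for every copy that is created."""
    mine = bits_of_interest(bift)
    todo = packet.bitstring & mine         # BitMask in the pseudocode
    remaining = packet.bitstring & ~mine   # local bits cleared
    copies = []
    for index in sorted(bift):
        if not todo & bit(index):
            continue
        for adj in bift[index]:
            if adj[0] == "ecmp":           # ("ecmp", [members], seed)
                _, members, seed = adj
                adj = members[ecmp_hash(len(members), packet.entropy, seed)]
            kind, target = adj[0], adj[1]
            copy_bits = remaining
            if kind == "forward_connected" and adj[2]:  # DNR flag
                copy_bits |= bit(index)
            copies.append((kind, target, copy_bits))
    return copies


example_bift = {
    1: [("forward_connected", "BFR2", False)],
    2: [("local_decap", None)],
    3: [("forward_connected", "BFR3", True)],   # DNR set
}
for copy in forward(example_bift, Packet(bitstring=0b10111)):
    print(copy[0], copy[1], bin(copy[2]))
```

In the example, BitPosition 5 is not present in the local BIFT and is
therefore left untouched in every copy; only the copy for BitPosition 3
keeps its own bit because DNR is set on that adjacency.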

861	7.  Managing SI, subdomains and BFR-ids

863	   When the number of bits required to represent the necessary hops in
864	   the topology and BFER exceeds the supported bitstring length,
865	   multiple SI and/or subdomains must be used.  This section discusses
866	   how.

868	   BIER-TE forwarding does not require the concept of BFR-id, but
869	   routing underlay, flow overlay and BIER headers may.  This section
870	   also discusses how BFR-id can be assigned to BFIR/BFER for BIER-TE.

872	7.1.  Why SI and sub-domains

874	   For BIER and BIER-TE forwarding, the most important result of using
875	   multiple SI and/or subdomains is the same: Packets that need to be
876	   sent to BFER in different SI or subdomains require different BIER
877	   packets: each one with a bitstring for a different (SI,subdomain)
878	   combination.  Each such bitstring uses one bitstring length sized SI
879	   block in the BIFT of the subdomain.  We call this a BIFT:SI (block).

881	   For BIER and BIER-TE forwarding itself there is also no difference
882	   whether different SI and/or sub-domains are chosen, but SI and
883	   subdomain have different purposes in the BIER architecture shared by
884	   BIER-TE.  This impacts how operators manage them and, in particular,
885	   how flow overlays will likely use them.

887	   By default, every possible BFIR/BFER in a BIER network would likely
888	   be given a BFR-id in subdomain 0 (unless there are > 64k BFIR/BFER).

890	   If there are different flow services (or service instances) requiring
891	   replication to different subsets of BFER, then it will likely not be
892	   possible to achieve the best replication efficiency for all of these
893	   service instances via subdomain 0.  Ideal replication efficiency for
894	   N BFER exists in a subdomain if they are split over not more than
895	   ceiling(N/bitstring-length) SI.
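
The ceiling bound can be checked with a few lines of arithmetic.  The
sketch assumes the BIER convention that BFR-ids start at 1 and that a
BFR-id maps to SI = (BFR-id - 1) div bitstring-length; the function
names and the sample numbers are illustrative only.

```python
import math


def min_si(num_bfer, bitstring_length):
    """Fewest SI that can hold num_bfer BFER: ceiling(N/bitstring-length)."""
    return math.ceil(num_bfer / bitstring_length)


def si_used(bfr_ids, bitstring_length):
    """Number of distinct SI actually touched by a set of BFR-ids."""
    return len({(i - 1) // bitstring_length for i in bfr_ids})


compact = range(1, 101)          # 100 BFER with BFR-ids 1..100
spread = range(1, 1001, 10)      # 100 BFER with BFR-ids 1, 11, ..., 991

print(min_si(100, 256), si_used(compact, 256), si_used(spread, 256))
```

Both services have 100 BFER and could be served with one packet each,
but the spread-out numbering needs one packet per SI it touches.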

897	   If service instances justify additional BIER:SI state in the network,
898	   additional subdomains will be used: BFIR/BFER are assigned BFR-id in
899	   those subdomains and each service instance is configured to use the
900	   most appropriate subdomain.  This results in improved replication
901	   efficiency for different services.

903	   Even if creation of subdomains and assignment of BFR-id to BFIR/BFER
904	   in those subdomains is automated, it is not expected that individual
905	   service instances can deal with BFER in different subdomains.  A
906	   service instance may only support configuration of a single subdomain
907	   it should rely on.

909	   To be able to easily reuse (and modify as little as possible)
910	   existing BIER procedures including flow-overlay and routing underlay,
911	   when BIER-TE forwarding is added, we therefore reuse SI and subdomain
912	   logically in the same way as they are used in BIER: All necessary
913	   BFIR/BFER for a service use a single BIER-TE BIFT and are split
914	   across as many SI as necessary (see below).  Different services may
915	   use different subdomains that primarily exist to provide more
916	   efficient replication (and for BIER-TE desirable traffic engineering)
917	   for different subsets of BFIR/BFER.

919	7.2.  Bit assignment comparison BIER and BIER-TE

921	   In BIER, bitstrings only need to carry bits for BFER, which leads to
922	   the model that BFR-ids map 1:1 to each bit in a bitstring.

924	   In BIER-TE, bitstrings need to carry bits to indicate not only the
925	   receiving BFER but also the intermediate hops/links across which the
926	   packet must be sent.  The maximum number of BFER that can be
927	   supported in a single bitstring or BIFT:SI depends on the number of
928	   bits necessary to represent the desired topology between them.

930	   "Desired" topology because it depends on the physical topology, and
931	   on the desire of the operator to allow for explicit traffic
932	   engineering across every single hop (which requires more bits), or
933	   reducing the number of required bits by exploiting optimizations such
934	   as unicast (forward_routed), ECMP or flood (DNR) over "uninteresting"
935	   sub-parts of the topology, e.g., parts where different trees do not
936	   need to take different paths due to traffic-engineering reasons.

938	   The share of bits in a BIFT:SI that is needed to describe the
939	   topology can therefore easily be as low as 20% or as high as 80%.
940	   The higher the percentage, the higher the likelihood that those
941	   topology bits are not just BIER-TE overhead without additional
942	   benefit, but instead allow expressing the desired
943	   traffic-engineering alternatives.

945	7.3.  Using BFR-id with BIER-TE

947	   Because there is no 1:1 mapping between bits in the bitstring and
948	   BFER, BIER-TE cannot simply rely on the BIER 1:1 mapping between
949	   bits in a bitstring and BFR-id.

951	   In BIER, automatic schemes could assign all possible BFR-ids
952	   sequentially to BFERs.  This will not work in BIER-TE.  In BIER-TE,
953	   the operator or BIER-TE controller host has to determine a BFR-id for
954	   each BFER in each required subdomain.  The BFR-id may or may not have
955	   a relationship with a bit in the bitstring.  Suggestions are detailed
956	   below.  Once determined, the BFR-id can then be configured on the
957	   BFER and used by flow overlay, routing underlay and the BIER header
958	   almost the same as the BFR-id in BIER.

960	   The one exception is application/flow-overlays that automatically
961	   calculate the bitstring(s) of BIER packets by converting BFR-id to
962	   bits.  In BIER-TE, this operation can be done in two ways:

964	   "Independent branches": For a given application or (set of) trees,
965	   the branches from a BFIR to every BFER are independent of the
966	   branches to any other BFER.  For example, shortest path trees have
967	   independent branches.

969	   "Interdependent branches": When a BFER is added or deleted from a
970	   particular distribution tree, branches to other BFER still in the
971	   tree may need to change.  Steiner trees are examples of trees with
972	   interdependent branches.

974	   If "independent branches" are sufficient, the BIER-TE controller host
975	   can provide to such applications for every BFR-id a SI:bitstring with
976	   the BIER-TE bits for the branch towards that BFER.  The application
977	   can then independently calculate the SI:bitstring for all desired
978	   BFER by OR'ing their bitstrings.
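
For the "independent branches" case, the calculation in the application
reduces to a per-SI OR.  The branch table below is entirely
hypothetical: the BFR-ids, SI values and bit patterns are made up to
show the mechanics, and the table layout is an assumption about what a
BIER-TE controller host might hand to the application.

```python
# Hypothetical table from the controller:
# BFR-id -> (SI, bitstring with the BIER-TE bits of the branch to that BFER)
branch = {
    5: (0, 0b0000_0111),
    9: (0, 0b0001_0101),
    300: (1, 0b0110_0000),
}


def bitstrings_for(bfr_ids):
    """OR the branch bitstrings of all wanted BFER, grouped by SI."""
    per_si = {}
    for bfr_id in bfr_ids:
        si, bits = branch[bfr_id]
        per_si[si] = per_si.get(si, 0) | bits
    return per_si


for si, bits in sorted(bitstrings_for([5, 9, 300]).items()):
    print(f"SI={si} bitstring={bits:08b}")
```

One BIER-TE packet is needed per SI in the result.  Bits shared by
several branches, such as a common first hop, simply coincide in the
OR.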

980	   If "interdependent branches" are required, the application could call
981	   a BIER-TE controller host API with the list of required BFR-id and
982	   get the required bitstring back.  Whenever the set of BFR-id
983	   changes, this is repeated.

985	   Note that in either case (unlike in BIER), the bits in BIER-TE may
986	   need to change upon link/node failure/recovery, network expansion and
987	   network load by other traffic (as part of traffic engineering goals).
988	   Interactions between such BFIR applications and the BIER-TE
989	   controller host do therefore need to support dynamic updates to the
990	   bitstrings.

992	7.4.  Assigning BFR-ids for BIER-TE

994	   For non-leaf BFER, there is usually a single bit k for that BFER with
995	   a local_decap() adjacency on the BFER.  The BFR-id for such a BFER is
996	   therefore most easily the one it would have in BIER: SI * bitstring-
997	   length + k.

999	   As explained earlier in the document, leaf BFER do not need such a
1000	   separate bit because the fact alone that the BIER-TE packet is
1001	   forwarded to the leaf BFER indicates that the BFER should decapsulate
1002	   it.  Such a BFER will have one or more bits for the links leading
1003	   only to it.  The BFR-id could therefore most easily be the BFR-id
1004	   derived from the lowest bit for those links.
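
Both recommendations reduce to the same formula.  A minimal sketch,
with hypothetical function names and sample bit numbers:

```python
def bfr_id(si, bitstring_length, k):
    """BFR-id of a non-leaf BFER whose local_decap bit is k in the given SI."""
    return si * bitstring_length + k


def leaf_bfr_id(si, bitstring_length, link_bits):
    """BFR-id of a leaf BFER: derived from the lowest bit of its links."""
    return bfr_id(si, bitstring_length, min(link_bits))


print(bfr_id(si=2, bitstring_length=256, k=12))
print(leaf_bfr_id(si=1, bitstring_length=256, link_bits=[40, 17, 33]))
```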

1006	   These two rules are only recommendations for the operator or BIER-TE
1007	   controller assigning the BFR-ids.  Any allocation scheme can be used;
1008	   the BFR-ids just need to be unique across BFRs in each subdomain.

1010	   It is not currently determined if a single subdomain could or should
1011	   be allowed to forward both BIER and BIER-TE packets.  If this should
1012	   be supported, there are two options:

1014	   A.  BIER and BIER-TE have different BFR-id in the same subdomain.
1015	   This allows higher replication efficiency for BIER because its
1016	   BFR-id can be assigned sequentially, while the bitstrings for BIER-TE
1017	   will also have the additional bits for the topology.  There is no
1018	   relationship between a BFR's BIER BFR-id and its BIER-TE BFR-id.

1020	   B.  BIER and BIER-TE share the same BFR-id.  The BFR-id are assigned
1021	   as explained above for BIER-TE and simply reused for BIER.  The
1022	   replication efficiency for BIER will be as low as that for BIER-TE in
1023	   this approach.  Depending on topology, only the same 20%..80% of bits
1024	   as possible for BIER-TE can be used for BIER.

1026	7.5.  Example bit allocations

1028	7.5.1.  With BIER

1030	   Consider a network setup with a bitstring length of 256 for a network
1031	   topology as shown in the picture below.  The network has 6 areas,
1032	   each with ca. 180 BFR, connecting via a core with some larger (core)
1033	   BFR.  To address all BFER with BIER, 4 SI are required.  To send a
1034	   BIER packet to all BFER in the network, 4 copies need to be sent by
1035	   the BFIR.  On the BFIR it does not make a difference how the BFR-id
1036	   are allocated to BFER in the network, but for efficiency further down
1037	   in the network it does make a difference.

1039	                area1           area2        area3
1040	               BFR1a BFR1b  BFR2a BFR2b   BFR3a BFR3b
1041	                 |  \         /    \        /  |
1042	                 ................................
1043	                 .                Core          .
1044	                 ................................
1045	                 |    /       \    /        \  |
1046	               BFR4a BFR4b  BFR5a BFR5b   BFR6a BFR6b
1047	                area4          area5        area6

1049	   With random allocation of BFR-id to BFER, each receiving area would
1050	   (most likely) have to receive all 4 copies of the BIER packet because
1051	   there would be BFR-id for each of the 4 SI in each of the areas.
1052	   Only further towards each BFER would this duplication subside - when
1053	   each of the 4 trees runs out of branches.

1055	   If BFR-id are allocated intelligently, then all the BFER in an area
1056	   would be given BFR-id with as few different SI as possible.  Each
1057	   area would only have to forward one or two packets instead of 4.

1059	   Given how networks can grow over time, replication efficiency in an
1060	   area will also easily go down over time when BFR-id are allocated
1061	   sequentially network-wide.  An area that initially only has
1062	   BFR-id in one SI might end up with many SI over a longer period of
1063	   growth.  Allocating SIs to areas with initially sufficiently many
1064	   spare bits for growth can help to alleviate this issue.
1065	   Alternatively, BFR-id can be renumbered after network expansion.
1066	   In this example, one may consider using 6 SI, one for each area.
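
The effect of the allocation strategy on the number of packets each
area has to receive can be estimated with a small experiment.  The
number of BFER per area is an assumption (170, chosen so that all BFER
fit into 4 SI of 256 bits), as are the BFR-id to SI mapping
(SI = (BFR-id - 1) div 256) and the fixed random seed.

```python
import random

BSL = 256                   # bitstring length
AREAS, PER_AREA = 6, 170    # assumption: 170 BFER in each of 6 areas


def si_per_area(ids_by_area):
    """For each area, the number of distinct SI its BFR-ids fall into."""
    return [len({(i - 1) // BSL for i in ids}) for ids in ids_by_area]


def split(ids):
    """Hand out consecutive slices of the id list to areas 1..6."""
    return [ids[a * PER_AREA:(a + 1) * PER_AREA] for a in range(AREAS)]


all_ids = list(range(1, AREAS * PER_AREA + 1))

sequential = split(all_ids)               # area after area, 4 SI in total

shuffled = all_ids[:]
random.Random(0).shuffle(shuffled)
scattered = split(shuffled)               # random allocation, 4 SI in total

one_si_each = [[a * BSL + k for k in range(1, PER_AREA + 1)]
               for a in range(AREAS)]     # 6 SI, one per area

print("random:        ", si_per_area(scattered))
print("area by area:  ", si_per_area(sequential))
print("one SI per area", si_per_area(one_si_each))
```

Each number is the count of BIER packets an area has to receive when a
packet is sent to all BFER in the network.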

1068	   This example shows that intelligent BFR-id allocation within at least
1069	   subdomain 0 can be helpful or even necessary in BIER.

1071	7.5.2.  With BIER-TE

1073	   In BIER-TE one needs to determine a subset of the physical topology
1074	   and attached BFER so that the "desired" representation of this
1075	   topology and the BFER fit into a single bitstring.  This process
1076	   needs to be repeated until the whole topology is covered.

1078	   Once bits/SIs are assigned to topology and BFER, BFR-id is just a
1079	   derived set of identifiers from the operator/BIER-TE controller as
1080	   explained above.

1082	   Every time that different sub-topologies have overlap, bits need to
1083	   be repeated across the bitstrings, increasing the overall amount of
1084	   bits required across all bitstring/SIs.  In the worst case, random
1085	   subsets of BFER are assigned to different SI.  This is much worse
1086	   than in BIER because it not only reduces replication efficiency with
1087	   the same number of overall bits, but even further - because more bits
1088	   are required due to duplication of bits for topology across multiple
1089	   SI.  Intelligent BFER to SI assignment and selecting specific
1090	   "desired" subtopologies can minimize this problem.

1092	   To set up BIER-TE efficiently for the above topology, the following
1093	   bit allocation method can be used.  This method can easily be
1094	   expanded to other, similarly structured larger topologies.

1096	   Each area is allocated one or more SI depending on the number of
1097	   future expected BFER and number of bits required for the topology in
1098	   the area.  In this example, 6 SI, one per area.

1100	   In addition, we use 4 bits in each SI: bia, bib, bea, beb: bit
1101	   ingress a, bit ingress b, bit egress a, bit egress b.  These bits
1102	   will be used to pass BIER packets from any BFIR via any combination
1103	   of ingress area a/b BFR and egress area a/b BFR into a specific
1104	   target area.  These bits are then set up with the right
1105	   forward_routed adjacencies on the BFIR and area edge BFR:

1107	   On all BFIR in an area j, bia in each BIFT:SI is populated with the
1108	   same forward_routed(BFRja), and bib with forward_routed(BFRjb).  On
1109	   all area edge BFR, bea in BIFT:SI=k is populated with
1110	   forward_routed(BFRka) and beb in BIFT:SI=k with
1111	   forward_routed(BFRkb).

1113	   For BIER-TE forwarding of a packet to some subset of BFER across all
1114	   areas, a BFIR would create at most 6 copies, with SI=1...SI=6.  In
1115	   each packet, the bits indicate bits for topology and BFER in that
1116	   topology plus the four bits to indicate whether to pass this packet
1117	   via the ingress area a or b border BFR and the egress area a or b
1118	   border BFR, therefore allowing path engineering for those two
1119	   "unicast" legs: 1) BFIR to ingress area edge and 2) core to egress
1120	   area edge.  Replication only happens inside the egress areas.  For
1121	   BFER in the same area as the BFIR, these four bits are not used.
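
The population of the four steering bits can be sketched as follows.
The bit numbers chosen for bia, bib, bea and beb, the nested dictionary
layout and the function names are assumptions for illustration; only
the forward_routed targets follow the rules given above.

```python
BIA, BIB, BEA, BEB = 1, 2, 3, 4   # hypothetical bit numbers, same in each SI
AREAS = range(1, 7)               # SI k is assigned to area k


def bit(index):
    """Mask for a 1-based BitPosition."""
    return 1 << (index - 1)


def bfir_bift(area_j):
    """Steering entries on a BFIR in area j: {SI: {bit: adjacency}}."""
    return {
        si: {BIA: ("forward_routed", f"BFR{area_j}a"),
             BIB: ("forward_routed", f"BFR{area_j}b")}
        for si in AREAS
    }


def edge_bift():
    """Steering entries on every area edge BFR: {SI: {bit: adjacency}}."""
    return {
        k: {BEA: ("forward_routed", f"BFR{k}a"),
            BEB: ("forward_routed", f"BFR{k}b")}
        for k in AREAS
    }


def steering_bits(ingress, egress):
    """Bits to set for a packet leaving via edge a/b and entering via a/b."""
    return bit({"a": BIA, "b": BIB}[ingress]) | bit({"a": BEA, "b": BEB}[egress])


# A BFIR in area 2 sends the SI=5 copy via BFR2a and BFR5b:
print(bfir_bift(2)[5][BIA], edge_bift()[5][BEB], bin(steering_bits("a", "b")))
```

The bits for the topology and BFER inside the target area would be
OR'ed with the steering bits to form the bitstring of that copy.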

1123	7.6.  Summary

1125	   Like BIER, BIER-TE can support multiple SI within a sub-domain to
1126	   allow re-using the concept of BFR-id and therefore minimize BIER-TE
1127	   specific functions in underlay routing, flow overlay methods and BIER
1128	   headers.

1130	   The number of BFIR/BFER possible in a subdomain is smaller than in
1131	   BIER because BIER-TE uses additional bits for topology.

1133	   As in BIER, subdomains can be used in BIER-TE to create more
1134	   efficient replication to known subsets of BFER.

1136	   Assigning bits for BFER intelligently into the right SI is more
1137	   important in BIER-TE than in BIER because of replication efficiency
1138	   and overall amount of bits required.

1140	8.  BIER-TE and Segment Routing

1142	   Segment Routing aims to achieve lightweight path engineering via
1143	   loose source routing.  Compared for example to RSVP-TE, it does not
1144	   require per-path signaling to each hop on the path.

1146	   BIER-TE supports the same design philosophy for multicast.  Like
1147	   SR, it relies on source-routing - via the definition of a
1148	   BitString.  Like SR, it only requires considering the "hops" on which
1149	   either replication has to happen, or across which the traffic should
1150	   be steered (even without replication).  Any other hops can be skipped
1151	   via the use of routed adjacencies.

1153	   Instead of defining BitPositions for non-replicating hops, it is
1154	   equally possible to use segment routing encapsulations (e.g., MPLS
1155	   label stacks) for "forward_routed" adjacencies.

1157	9.  Security Considerations

1159	   The security considerations are the same as for BIER with the
1160	   following differences:

1162	   BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures
1163	   for their distribution, so these are not attack vectors against BIER-
1164	   TE.

1166	10.  IANA Considerations

1168	   This document requests no action by IANA.

1170	11.  Acknowledgements

1172	   The authors would like to thank Greg Shepherd, Ijsbrand Wijnands and
1173	   Neale Ranns for their extensive review and suggestions.

1175	12.  Change log [RFC Editor: Please remove]

1177	      04: Added comparison to Live-Live and BFIR to FRR section
1178	      (Eckert).

1180	      04: Removed FRR content into the new FRR draft [I-D.eckert-bier-
1181	      te-frr] (Braun).

1183	      - Linked FRR information to new draft in Overview/Introduction

1185	      - Removed BTAFT/FRR from "Changes in the network topology"

1187	      - Linked new draft in "Link/Node Failures and Recovery"

1189	      - Removed FRR from "The BIER-TE Forwarding Layer"

1191	      - Moved FRR section to new draft

1193	      - Moved FRR parts of Pseudocode into new draft

1195	      - Left only non FRR parts

1197	      - removed FrrUpDown(..) and //FRR operations in
1198	      ForwardBierTePacket(..)

1200	      - New draft contains FrrUpDown(..) and ForwardBierTePacket(Packet)
1201	      from bier-arch-03

1203	      - Moved "BIER-TE and existing FRR" to new draft
1204	      - Moved "BIER-TE and Segment Routing" section one level up

1206	      - Thus, removed "Further considerations" that only contained this
1207	      section

1209	      - Added Changes for version 04

1211	      03: Updated the FRR section.  Added examples for FRR key concepts.
1212	      Added BIER-in-BIER tunneling as option for tunnels in backup
1213	      paths.  BIFT structure is expanded and contains an additional
1214	      match field to support full node protection with BIER-TE FRR.

1216	      03: Updated FRR section.  Explanation how BIER-in-BIER
1217	      encapsulation provides P2MP protection for node failures even
1218	      though the routing underlay does not provide P2MP.

1220	      02: Changed the definition of BIFT to be more inline with BIER.
1221	      In revs. up to -01, the idea was that a BIFT has only entries for
1222	      a single bitstring, and every SI and subdomain would be a separate
1223	      BIFT.  In BIER, each BIFT covers all SI.  This is now also how we
1224	      define it in BIER-TE.

1226	      02: Added Section 7 to explain the use of SI, subdomains and BFR-
1227	      id in BIER-TE and to give an example how to efficiently assign
1228	      bits for a large topology requiring multiple SI.

1230	      02: Added further details for rings - how to support input from
1231	      all ring nodes.

1233	      01: Fixed BFIR -> BFER for section 4.3.

1235	      01: Added explanation of SI, difference to BIER ECMP,
1236	      consideration for Segment Routing, unicast FRR, considerations for
1237	      encapsulation, explanations of BIER-TE controller host and CLI.

1239	      00: Initial version.

1241	13.  References

1243	   [I-D.ietf-bier-architecture]
1244	              Wijnands, I., Rosen, E., Dolganow, A., Przygienda, T., and
1245	              S. Aldrin, "Multicast using Bit Index Explicit
1246	              Replication", draft-ietf-bier-architecture-03 (work in
1247	              progress), January 2016.

1249	   [I-D.ietf-bier-mpls-encapsulation]
1250	              Wijnands, I., Rosen, E., Dolganow, A., Tantsura, J., and
1251	              S. Aldrin, "Encapsulation for Bit Index Explicit
1252	              Replication in MPLS Networks", draft-ietf-bier-mpls-
1253	              encapsulation-04 (work in progress), April 2016.

1255	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1256	              Requirement Levels", BCP 14, RFC 2119,
1257	              DOI 10.17487/RFC2119, March 1997,
1258	              <http://www.rfc-editor.org/info/rfc2119>.

1260	Authors' Addresses

1262	   Toerless Eckert
1263	   Cisco Systems, Inc.

1265	   Email: eckert@cisco.com

1267	   Gregory Cauchie
1268	   Bouygues Telecom

1270	   Email: GCAUCHIE@bouyguestelecom.fr

1272	   Wolfgang Braun
1273	   University of Tuebingen

1275	   Email: wolfgang.braun@uni-tuebingen.de

1277	   Michael Menth
1278	   University of Tuebingen

1280	   Email: menth@uni-tuebingen.de