idnits 2.17.1 

draft-eckert-bier-te-arch-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([I-D.ietf-bier-architecture]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (October 18, 2015) is 3103 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'VRF' is mentioned on line 996, but not defined

  == Missing Reference: 'Index' is mentioned on line 978, but not defined

  == Missing Reference: 'BitStringLength' is mentioned on line 928, but not
     defined

  == Missing Reference: 'BP' is mentioned on line 972, but not defined

  == Missing Reference: 'BT' is mentioned on line 973, but not defined

  == Missing Reference: 'I' is mentioned on line 983, but not defined

  == Unused Reference: 'I-D.ietf-bier-mpls-encapsulation' is defined on line
     1361, but no explicit reference was found in the text

  == Outdated reference: A later version (-08) exists of
     draft-ietf-bier-architecture-02

  == Outdated reference: A later version (-12) exists of
     draft-ietf-bier-mpls-encapsulation-02


     Summary: 2 errors (**), 0 flaws (~~), 11 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          T. Eckert
3	Internet-Draft                                       Cisco Systems, Inc.
4	Intended status: Standards Track                              G. Cauchie
5	Expires: April 20, 2016                                 Bouygues Telecom
6	                                                        October 18, 2015

8	     Traffic Engineering for Bit Index Explicit Replication BIER-TE
9	                      draft-eckert-bier-te-arch-02

11	Abstract

13	   This document proposes an architecture for BIER-TE: Traffic
14	   Engineering for Bit Index Explicit Replication (BIER).

16	   BIER-TE shares part of its architecture with BIER as described in
17	   [I-D.ietf-bier-architecture].  It also proposes to share the packet
18	   format with BIER.

20	   BIER-TE forwards and replicates packets like BIER based on a
21	   BitString in the packet header but it does not require an IGP.  It
22	   does support traffic engineering by explicit hop-by-hop forwarding
23	   and loose hop forwarding of packets.  It does support Fast ReRoute
24	   (FRR) for link and node protection and incremental deployment.
25	   Because BIER-TE like BIER operates without explicit in-network tree-
26	   building but also supports traffic engineering, it is more similar to
27	   SR than RSVP-TE.

29	Status of This Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on April 20, 2016.

46	Copyright Notice

48	   Copyright (c) 2015 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	     1.1.  Overview
65	     1.2.  Requirements Language
66	   2.  Layering
67	     2.1.  The Multicast Flow Overlay
68	     2.2.  The BIER-TE Controller Host
69	       2.2.1.  Assignment of BitPositions to adjacencies of the
70	               network topology
71	       2.2.2.  Changes in the network topology
72	       2.2.3.  Set up per-multicast flow BIER-TE state
73	       2.2.4.  Link/Node Failures and Recovery
74	     2.3.  The BIER-TE Forwarding Layer
75	     2.4.  The Routing Underlay
76	   3.  BIER-TE Forwarding
77	     3.1.  The Bit Index Forwarding Table (BIFT)
78	     3.2.  Adjacency Types
79	       3.2.1.  Forward Connected
80	       3.2.2.  Forward Routed
81	       3.2.3.  ECMP
82	       3.2.4.  Local Decap
83	     3.3.  Encapsulation considerations
84	     3.4.  Basic BIER-TE Forwarding Example
85	   4.  BIER-TE Controller Host BitPosition Assignments
86	     4.1.  P2P Links
87	     4.2.  BFER
88	     4.3.  Leaf BFERs
89	     4.4.  LANs
90	     4.5.  Hub and Spoke
91	     4.6.  Rings
92	     4.7.  Equal Cost MultiPath (ECMP)
93	     4.8.  Routed adjacencies
94	       4.8.1.  Reducing BitPositions
95	       4.8.2.  Supporting nodes without BIER-TE
96	   5.  Avoiding loops and duplicates
97	     5.1.  Loops
98	     5.2.  Duplicates
99	   6.  BIER-TE FRR
100	     6.1.  The BIER-TE Adjacency FRR Table (BTAFT)
101	     6.2.  FRR in BIER-TE forwarding
102	     6.3.  FRR in the BIER-TE Controller Host
103	     6.4.  BIER-TE FRR Benefits
104	   7.  BIER-TE Forwarding Pseudocode
105	   8.  Managing SI, subdomains and BFR-ids
106	     8.1.  Why SI and sub-domains
107	     8.2.  Bit assignment comparison BIER and BIER-TE
108	     8.3.  Using BFR-id with BIER-TE
109	     8.4.  Assigning BFR-ids for BIER-TE
110	     8.5.  Example bit allocations
111	       8.5.1.  With BIER
112	       8.5.2.  With BIER-TE
113	     8.6.  Summary
114	   9.  Further considerations
115	     9.1.  BIER-TE and existing FRR
116	     9.2.  BIER-TE and Segment Routing
117	   10. Security Considerations
118	   11. IANA Considerations
119	   12. Acknowledgements
120	   13. Change log [RFC Editor: Please remove]
121	   14. References
122	   Authors' Addresses

124	1.  Introduction

126	1.1.  Overview

128	   This document specifies the architecture for BIER-TE: traffic
129	   engineering for Bit Index Explicit Replication BIER.

131	   BIER-TE shares architecture and packet formats with BIER as described
132	   in [I-D.ietf-bier-architecture].

134	   BIER-TE forwards and replicates packets like BIER based on a
135	   BitString in the packet header but it does not require an IGP.  It
136	   does support traffic engineering by explicit hop-by-hop forwarding
137	   and loose hop forwarding of packets.  It does support Fast ReRoute
138	   (FRR) for link and node protection and incremental deployment.
139	   Because BIER-TE like BIER operates without explicit in-network tree-
140	   building but also supports traffic engineering, it is more similar to
141	   SR than RSVP-TE.

143	   The key differences over BIER are:

145	   o  BIER-TE replaces in-network autonomous path calculation by
146	      explicit paths calculated offpath by the BIER-TE controller host.

148	   o  In BIER-TE every BitPosition of the BitString of a BIER-TE packet
149	      indicates one or more adjacencies - instead of a BFER as in BIER.

151	   o  BIER-TE in each BFR has no routing table but only a BIER-TE
152	      Forwarding Table (BIFT) indexed by SI:BitPosition and populated
153	      with only those adjacencies to which the BFR should replicate
154	      packets.

156	   BIER-TE headers use the same format as BIER headers.

158	   BIER-TE forwarding does not require/use the BFIR-ID.  The BFIR-ID can
159	   still be useful though for coordinated BFIR/BFER functions, such as
160	   the context for upstream assigned labels for MPLS payloads in MVPN
161	   over BIER-TE.

163	   If the BIER-TE domain is also running BIER, then the BFIR-ID in BIER-
164	   TE packets can be set to the same BFIR-ID as used with BIER packets.

166	   If the BIER-TE domain is not running full BIER or wants to reduce
167	   the need to allocate bits in BIER BitStrings for BFIR-ID values,
168	   then the allocation of BFIR-ID values in BIER-TE packets can be
169	   done through other mechanisms outside the scope of this document,
170	   as long as this is appropriately agreed upon between all BFIR/BFER.

172	   Currently, this specification has no considerations for BIER sub-
173	   domains.

175	1.2.  Requirements Language

177	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
178	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
179	   document are to be interpreted as described in RFC 2119 [RFC2119].

181	2.  Layering

183	   End-to-end BIER-TE operation consists of four components: the
184	   "Multicast Flow Overlay", the "BIER-TE Controller Host", the "Routing
185	   Underlay" and the "BIER-TE forwarding layer".

187	      Picture 2: Layers of BIER-TE

189	                   <------BGP/PIM----->
190	      |<-IGMP/PIM->  multicast flow   <-PIM/IGMP->|
191	                        overlay

193	                   [Bier-TE Controller Host]
194	                      ^      ^     ^
195	                     /       |      \   BIER-TE control protocol
196	                    |        |       |  eg.: Netconf/Restconf/Yang
197	                    v        v       v
198	    Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr

200	                   |--------------------->|
201	                   BIER-TE forwarding layer

203	                   |<- BIER-TE domain-->|

205	                  |<--------------------->|
206	                      Routing underlay

208	2.1.  The Multicast Flow Overlay

210	   The Multicast Flow Overlay operates as in BIER.  See
211	   [I-D.ietf-bier-architecture].  Instead of interacting with the BIER
212	   layer, it interacts with the BIER-TE Controller Host.

214	2.2.  The BIER-TE Controller Host

216	   The BIER-TE controller host represents the control plane of BIER-TE.
217	   It exchanges two sets of information with BFRs:

219	   During bring-up or modifications of the network topology, the
220	   controller discovers the network topology, assigns BitPositions to
221	   adjacencies and signals the resulting mapping of BitPositions to
222	   adjacencies to each BFR connecting to the adjacency.

224	   During day-to-day operations of the network, the controller signals
225	   to BFIRs what multicast flows are mapped to what BitStrings.

227	   Communication between the BIER-TE controller host and BFRs is ideally
228	   via standardized protocols and data models such as Netconf/Restconf/
229	   Yang.  This is currently outside the scope of this document.  Vendor-
230	   specific CLI on the BFRs is also a possible stopgap option (as in many
231	   other SDN solutions lacking definition of a standardized data model).

233	   For simplicity, the procedures of the BIER-TE controller host are
234	   described in this document as if it is a single, centralized
235	   automated entity, such as an SDN controller.  It could equally be an
236	   operator setting up CLI on the BFRs.  Distribution of the functions
237	   of the BIER-TE controller host is currently outside the scope of this
238	   document.

240	2.2.1.  Assignment of BitPositions to adjacencies of the network
241	        topology

243	   The BIER-TE controller host tracks the BFR topology of the BIER-TE
244	   domain.  It determines what adjacencies require BitPositions so that
245	   BIER-TE explicit paths can be built through them as desired by
246	   operator policy.

248	   The controller then pushes the BitPositions/adjacencies to the BIFT
249	   of the BFRs, populating into the BIFT of each BFR only those
250	   SI:BitPositions to which that BFR should be able to send packets:
251	   the adjacencies connecting to this BFR.

253	2.2.2.  Changes in the network topology

255	   If the network topology changes (not failure based) so that
256	   adjacencies that are assigned to BitPositions are no longer needed,
257	   the controller can re-use those BitPositions for new adjacencies.
258	   First, these BitPositions need to be removed from any BFIR flow state
259	   and BFR BIFT state (and BTAFT if FRR is supported, see below), then
260	   they can be repopulated, first into BIFT (and if FRR is supported
261	   BTAFT), then into BFIR.

263	2.2.3.  Set up per-multicast flow BIER-TE state

265	   The BIER-TE controller host tracks the multicast flow overlay to
266	   determine what multicast flow needs to be sent by a BFIR to which set
267	   of BFERs.  It calculates the desired distribution tree across the
268	   BIER-TE domain based on algorithms outside the scope of this document
269	   (eg.: CSPF, Steiner Tree,...).  It then pushes the calculated
270	   BitString into the BFIR.

272	2.2.4.  Link/Node Failures and Recovery

274	   When links or nodes fail or recover in the topology, BIER-TE can
275	   quickly respond with the optional FRR procedures described below.  It
276	   can also more slowly react by recalculating the BitStrings of
277	   affected multicast flows.  This reaction is slower than the FRR
278	   procedure because the controller needs to receive link/node up/down
279	   indications, recalculate the desired BitStrings and push them down
280	   into the BFIRs.  With FRR, this is all performed locally on a BFR
281	   receiving the adjacency up/down notification.

283	2.3.  The BIER-TE Forwarding Layer

285	   When the BIER-TE Forwarding Layer receives a packet, it simply looks
286	   up the BitPositions that are set in the BitString of the packet in
287	   the Bit Index Forwarding Table (BIFT) that was populated by the BIER-
288	   TE controller host.  For every BP that is set in the BitString, and
289	   that has one or more adjacencies in the BIFT, a copy is made
290	   according to the type of adjacencies for that BP in the BIFT.  Before
291	   sending any copy, the BFR resets all BitPositions in the BitString of
292	   the packet for which it can create a copy.  This is done to prevent
293	   packets from looping.

295	   If the BFR supports BIER-TE FRR operations, then the BIER-TE
296	   forwarding layer will receive fast adjacency up/down notifications
297	   and use the BIER-TE Adjacency FRR Table (BTAFT) to modify the
298	   BitString of the packet before it performs BIER-TE forwarding.  This
299	   is detailed in the FRR section.

301	2.4.  The Routing Underlay

303	   BIER-TE sends BIER packets to directly connected BIER-TE neighbors
304	   as L2 (unicast) BIER packets without requiring a routing underlay.
305	   BIER-TE forwarding uses the Routing underlay for forward_routed
306	   adjacencies which copy BIER-TE packets to not-directly-connected
307	   BFRs (see below for adjacency definitions).

309	   If the BFR intends to support FRR for BIER-TE, then the BIER-TE
310	   forwarding plane needs to receive fast adjacency up/down
311	   notifications: Link up/down or neighbor up/down, eg.: from BFD.
312	   Providing these notifications is considered to be part of the routing
313	   underlay in this document.

315	3.  BIER-TE Forwarding

317	3.1.  The Bit Index Forwarding Table (BIFT)

319	   The Bit Index Forwarding Table (BIFT) exists in every BFR.  For every
320	   subdomain in use, it is a table indexed by SI:BitPosition and is
321	   populated by the BIER-TE control plane.  Each index can be empty or
322	   contain a list of one or more adjacencies.

324	   BIER-TE can support multiple subdomains like BIER, each one with a
325	   separate BIFT.

327	   In the BIER architecture, indices into the BIFT are explained to be
328	   both BFR-id and SI:BitString (BitPosition).  This is because there is
329	   a 1:1 relationship between BFR-id and SI:BitString - every bit in
330	   every SI is/can be assigned to a BFIR/BFER.  In BIER-TE there are
331	   more bits used in each BitString than there are BFIR/BFER assigned to
332	   the BitString.  This is because of the bits required to express the
333	   (traffic engineered) path through the topology.  The BIER-TE
334	   forwarding definitions do therefore not use the term BFR-id at all.
335	   Instead, BFR-ids are only used as required by routing underlay, flow
336	   overlay or BIER headers.  Please refer to Section 8 for explanations
337	   of how to deal with SI, subdomains and BFR-id in BIER-TE.

339	     ------------------------------------------------------------------
340	     | Index:          |  Adjacencies:                                |
341	     | SI:BitPosition  |  <empty> or one or more per entry            |
342	     ==================================================================
343	     | 0:1             |  forward_connected(interface,neighbor,DNR)   |
344	     ------------------------------------------------------------------
345	     | 0:2             |  forward_connected(interface,neighbor,DNR)   |
346	     |                 |  forward_connected(interface,neighbor,DNR)   |
347	     ------------------------------------------------------------------
348	     | 0:3             |  local_decap([VRF])                          |
349	     ------------------------------------------------------------------
350	     | 0:4             |  forward_routed([VRF,]l3-neighbor)           |
351	     ------------------------------------------------------------------
352	     | 0:5             |  <empty>                                     |
353	     ------------------------------------------------------------------
354	     | 0:6             |  ECMP({adjacency1,...adjacencyN}, seed)      |
355	     ------------------------------------------------------------------
356	     ...
357	     | BitStringLength |  ...                                         |
358	     ------------------------------------------------------------------
359	                      Bit Index Forwarding Table

361	   The BIFT is programmed into the data plane of BFRs by the BIER-TE
362	   controller host and used to forward packets, according to the rules
363	   specified in the BIER-TE Forwarding Procedures.

365	   The same BP, when populated in more than one BFR by the controller,
366	   does not have to have the same adjacencies on each BFR.  This is up
367	   to the controller.  BPs for p2p links are one case (see below).
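
The BIFT layout shown in the table above could be modeled roughly as follows.  This is only an illustrative sketch: the class names, field names, interface/neighbor strings and the `lookup` helper are hypothetical and not defined by this document; only the adjacency types and their parameters are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class ForwardConnected:
    interface: str
    neighbor: str
    dnr: bool = False  # "DoNotReset" flag


@dataclass(frozen=True)
class ForwardRouted:
    l3_neighbor: str
    vrf: Optional[str] = None


@dataclass(frozen=True)
class LocalDecap:
    vrf: Optional[str] = None


@dataclass(frozen=True)
class Ecmp:
    adjacencies: Tuple[Union[ForwardConnected, ForwardRouted], ...]
    seed: int


# One BIFT per subdomain, indexed by (SI, BitPosition).
# All interface and neighbor names are placeholders.
bift = {
    (0, 1): [ForwardConnected("if1", "nbr1")],
    (0, 2): [ForwardConnected("if2", "nbr2"),
             ForwardConnected("if3", "nbr3")],
    (0, 3): [LocalDecap()],
    (0, 4): [ForwardRouted("l3-nbr4")],
    (0, 5): [],  # <empty>
    (0, 6): [Ecmp((ForwardConnected("if5", "nbr5"),
                   ForwardConnected("if6", "nbr6")), seed=0)],
}


def lookup(table, si, bp):
    """Return the adjacency list for SI:BitPosition; empty if unpopulated."""
    return table.get((si, bp), [])
```

An index that was never populated and an index populated as `<empty>` are treated the same way in this sketch: both yield an empty adjacency list.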

369	3.2.  Adjacency Types

371	3.2.1.  Forward Connected

373	   A "forward_connected" adjacency is towards a directly connected BFR
374	   neighbor using an interface address of that BFR on the connecting
375	   interface.  A forward_connected adjacency does not route packets but
376	   only L2 forwards them to the neighbor.

378	   Packets sent to an adjacency with "DoNotReset" (DNR) set in the BIFT
379	   will not have the BitPosition for that adjacency reset when the BFR
380	   creates a copy for it.  The BitPosition will still be reset for
381	   copies of the packet made towards other adjacencies.  This can be used
382	   for example in ring topologies as explained below.
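
The effect of DNR on the per-copy BitString can be sketched as follows.  This is a simplified illustration under stated assumptions, not the forwarding procedure itself: the BitString is modeled as an integer with BitPosition 1 as the least significant bit, the BIFT as a dict mapping a BitPosition to a list of (neighbor, dnr) pairs, and the function name `copies_with_dnr` is hypothetical.

```python
def copies_with_dnr(bitstring, bift):
    """Return (neighbor, outgoing BitString) for each copy a BFR creates."""
    # BitPositions set in the packet that have adjacencies in this BIFT.
    matched = [bp for bp in sorted(bift)
               if bift[bp] and (bitstring >> (bp - 1)) & 1]

    # Reset all matched BitPositions before any copy is sent.
    cleared = bitstring
    for bp in matched:
        cleared &= ~(1 << (bp - 1))

    result = []
    for bp in matched:
        for neighbor, dnr in bift[bp]:
            # With DNR, the adjacency's own BitPosition stays set in the
            # copy sent over that adjacency only.
            out = (cleared | (1 << (bp - 1))) if dnr else cleared
            result.append((neighbor, out))
    return result
```

For example, with BitPosition 1 pointing to a ring neighbor with DNR set and BitPosition 2 pointing to another neighbor without DNR, a packet carrying both bits leaves towards the ring neighbor with only bit 1 set and towards the other neighbor with neither bit set.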

384	3.2.2.  Forward Routed

386	   A "forward_routed" adjacency is an adjacency towards a BFR that is
387	   not a forward_connected adjacency: towards a loopback address of a
388	   BFR or towards an interface address that is non-directly connected.
389	   Forward_routed packets are forwarded via the Routing Underlay.

391	   If the Routing Underlay has multiple paths for a forward_routed
392	   adjacency, it will perform ECMP independent of BIER-TE for packets
393	   forwarded across a forward_routed adjacency.

395	   If the Routing Underlay has FRR, it will perform FRR independent of
396	   BIER-TE for packets forwarded across a forward_routed adjacency.

398	3.2.3.  ECMP

400	   The ECMP mechanisms in BIER are tied to the BIER BIFT and are
401	   therefore not directly usable with BIER-TE.  The following
402	   procedures describe ECMP for BIER-TE that we consider to be
403	   lightweight but also well manageable.  It leverages the existing
404	   entropy parameter in the BIER header to keep packets of the flows on
405	   the same path and it introduces a "seed" parameter to allow
406	   engineering traffic to be polarized or randomized across multiple
407	   hops.

409	   An "Equal Cost Multipath" (ECMP) adjacency has a list of two or more
410	   adjacencies included in it.  It copies the BIER-TE packet to one of
411	   those adjacencies based on the ECMP hash calculation.  The BIER-TE
412	   ECMP hash algorithm must select the same adjacency from that list
413	   for all packets with the same "entropy" value in the BIER-TE header
414	   if the same number of adjacencies and same seed are given as
415	   parameters.  Further use of the seed parameter is explained below.
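
One way to satisfy the determinism requirement above is sketched below.  The concrete hash (SHA-256 over seed and entropy) is purely an assumption made for illustration; this section does not mandate a particular hash function, and `ecmp_select` is a hypothetical name.

```python
import hashlib


def ecmp_select(adjacencies, entropy, seed):
    """Pick one adjacency; same (len, entropy, seed) gives the same index."""
    # Assumed hash: any deterministic function of (seed, entropy) would do.
    digest = hashlib.sha256(f"{seed}:{entropy}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(adjacencies)
    return adjacencies[index]
```

Because the index depends only on the number of adjacencies, the entropy value and the seed, all packets of a flow with the same entropy take the same adjacency, and two BFRs configured with the same seed and the same number of adjacencies make the same choice for that flow.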

417	3.2.4.  Local Decap

419	   A "local_decap" adjacency passes a copy of the payload of the BIER-TE
420	   packet to the packet's NextProto within the BFR (IPv4/IPv6,
421	   Ethernet,...).  A local_decap adjacency turns the BFR into a BFER for
422	   matching packets.  Local_decap adjacencies require the BFER to
423	   support routing or switching for NextProto to determine how to
424	   further process the packet.

426	3.3.  Encapsulation considerations

428	   Specifications for BIER-TE encapsulation are outside the scope of
429	   this document.  This section gives explanations and guidelines.

431	   Because a BFR needs to interpret the BitString of a BIER-TE packet
432	   differently from a BIER packet, it is necessary to distinguish BIER
433	   from BIER-TE packets.  This is subject to definitions in BIER
434	   encapsulation specifications.

436	   MPLS encapsulation for example assigns one label by which BFRs
437	   recognize BIER packets for every (SI,subdomain) combination.  If it
438	   is desirable that every subdomain can forward only BIER or BIER-TE
439	   packets, then the label allocation could stay the same, and only the
440	   forwarding model (BIER/BIER-TE) would have to be defined per
441	   subdomain.  If it is desirable to support both BIER and BIER-TE
442	   forwarding in the same subdomain, then additional labels would need
443	   to be assigned for BIER-TE forwarding.

445	   "forward_routed" requires an encapsulation that permits unicasting
446	   BIER-TE packets to a specific interface address on a target BFR.
447	   With MPLS encapsulation, this can simply be done via a label stack
448	   with that address's label as the top label - followed by the label
449	   assigned to (SI,subdomain) - and if necessary (see above) BIER-TE.
450	   With non-MPLS encapsulation, some form of IP tunneling (IP in IP,
451	   LISP, GRE) would be required.

453	   The encapsulation used for "forward_routed" adjacencies can equally
454	   support existing advanced adjacency information such as "loose source
455	   routes" via eg: MPLS label stacks or appropriate header extensions
456	   (eg: for IPv6).

458	3.4.  Basic BIER-TE Forwarding Example

460	   Step by step example of basic BIER-TE forwarding.  This does not use
461	   ECMP or forward_routed adjacencies nor does it try to minimize the
462	   number of required BitPositions for the topology.

464	     Picture 1: Forwarding Example

466	               [Bier-Te Controller Host]
467	                       /   | \
468	                      v    v  v

470	           | p13   p1 |
471	           +- BFIR2 --+          |
472	           |          | p2   p6  |           LAN2
473	           |          +-- BFR3 --+           |
474	           |          |          |  p7  p11  |
475	      Src -+                     +-- BFER1 --+
476	           |          | p3   p8  |           |
477	           |          +-- BFR4 --+           +-- Rcv1
478	           |          |          |           |
479	           |          |
480	           | p14  p4  |
481	           +- BFIR1 --+          |
482	           |          +-- BFR5 --+ p10  p12  |
483	         LAN1         | p5   p9  +-- BFER2 --+
484	                                 |           +-- Rcv2
485	                                             |
486	                                             LAN3

488	          IP  |..... BIER-TE network......| IP

490	   pXX indicates the BitPosition number assigned by the BIER-TE
491	   controller host to adjacencies in the BIER-TE topology.  For example,
492	   p9 is the adjacency towards BFR5 on the LAN connecting to BFER2.

494	      BIFT BFIR2:
495	        p13: local_decap()
496	         p2: forward_connected(BFR3)

498	      BIFT BFR3:
499	         p1: forward_connected(BFIR2)
500	         p7: forward_connected(BFER1)
501	         p8: forward_connected(BFR4)

503	      BIFT BFER1:
504	        p11: local_decap()
505	         p6: forward_connected(BFR3)
506	         p8: forward_connected(BFR4)

508	   ...and so on.

510	   Traffic needs to flow from BFIR2 towards Rcv1, Rcv2.  The controller
511	   determines it wants it to pass across the following paths:

513	                 -> BFER1 ---------------> Rcv1
514	    BFIR2 -> BFR3
515	                 -> BFR4 -> BFR5 -> BFER2 -> Rcv2

517	   These paths correspond to the following BitString: p2, p5, p7, p8,
518	   p10, p11, p12
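
As a small worked illustration, the set of BitPositions above can be turned into an integer BitString and back.  The mapping of BitPosition 1 to the least significant bit is an assumption of this sketch, and both helper names are hypothetical.

```python
def bitstring_from_positions(positions):
    """Set one bit per BitPosition (BitPosition 1 = least significant bit)."""
    bitstring = 0
    for bp in positions:
        bitstring |= 1 << (bp - 1)
    return bitstring


def positions_from_bitstring(bitstring):
    """Return the sorted list of BitPositions set in the BitString."""
    return [i + 1 for i in range(bitstring.bit_length())
            if (bitstring >> i) & 1]


example = bitstring_from_positions([2, 5, 7, 8, 10, 11, 12])
```

Under this bit ordering, `example` is the integer 3794, and `positions_from_bitstring(example)` returns the original list of BitPositions.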

520	   This BitString is set up in BFIR2.  Multicast packets arriving at
521	   BFIR2 from Src are assigned this BitString.

523	   BFIR2 forwards based on that BitString.  It has p2 and p13 populated.
524	   Only p2 is in BitString which has an adjacency towards BFR3.  BFIR2
525	   resets p2 in BitString and sends a copy towards BFR3.

527	   BFR3 sees a BitString of p5,p7,p8,p10,p11,p12.  It is only interested
528	   in p1,p7,p8.  It creates a copy of the packet to BFER1 (due to p7)
529	   and one to BFR4 (due to p8).  It resets p7, p8 before sending.

531	   BFER1 sees a BitString of p5,p10,p11,p12.  It is only interested in
532	   p6,p7,p8,p11 and therefore considers only p11. p11 is a "local_decap"
533	   adjacency installed by the BIER-TE controller host because BFER1
534	   should pass packets to IP multicast.  The local_decap adjacency
535	   instructs BFER1 to create a copy, decapsulate it from the BIER header
536	   and pass it on to the NextProtocol, in this example IP multicast.  IP
537	   multicast will then forward the packet out to LAN2 because it did
538	   receive PIM or IGMP joins on LAN2 for the traffic.

540	   BFR4, BFR5 and BFER2 process the packet accordingly.
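
The first three hops of this example can be traced with a short simulation.  It is a sketch under simplifying assumptions: only the three BIFTs listed in this section are modeled, every BitPosition has a single adjacency, the BitString is an integer with BitPosition 1 as the least significant bit, and the names `bits`, `BIFTS` and `step` are hypothetical.

```python
def bits(*positions):
    """Build an integer BitString from BitPosition numbers."""
    value = 0
    for bp in positions:
        value |= 1 << (bp - 1)
    return value


# BIFTs as listed in this section: BitPosition -> adjacency target.
BIFTS = {
    "BFIR2": {13: "local_decap", 2: "BFR3"},
    "BFR3": {1: "BFIR2", 7: "BFER1", 8: "BFR4"},
    "BFER1": {11: "local_decap", 6: "BFR3", 8: "BFR4"},
}


def step(node, bitstring):
    """Return (target, outgoing BitString) for each copy made by node."""
    bift = BIFTS[node]
    matched = sorted(bp for bp in bift if (bitstring >> (bp - 1)) & 1)
    out = bitstring
    for bp in matched:
        out &= ~(1 << (bp - 1))  # reset before sending any copy
    return [(bift[bp], out) for bp in matched]


at_bfir2 = step("BFIR2", bits(2, 5, 7, 8, 10, 11, 12))
at_bfr3 = step("BFR3", at_bfir2[0][1])
at_bfer1 = step("BFER1", at_bfr3[0][1])
```

In this trace BFIR2 produces one copy towards BFR3, BFR3 produces copies towards BFER1 and BFR4 with p5, p10, p11 and p12 still set, and BFER1 produces only the local_decap copy because p8 was already reset by BFR3.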

542	4.  BIER-TE Controller Host BitPosition Assignments

544	   This section describes how the BIER-TE controller host can use the
545	   different BIER-TE adjacency types to define the BitPositions of a
546	   BIER-TE domain.

548	   Because the size of the BitString is limiting the size of the BIER-TE
549	   domain, many of the options described exist to support larger
550	   topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7,
551	   4.8).

553	4.1.  P2P Links

555	   Each p2p link in the BIER-TE domain is assigned one unique
556	   BitPosition with a forward_connected adjacency pointing to the
557	   neighbor on the p2p link.

559	4.2.  BFER

561	   Every BFER is given a unique BitPosition with a local_decap
562	   adjacency.

564	4.3.  Leaf BFERs

566	   Leaf BFERs are BFERs where incoming BIER-TE packets never need to be
567	   forwarded to another BFR but are only sent to the BFER to exit the
568	   BIER-TE domain.  For example, in networks where PEs are spokes
569	   connected to P routers, those PEs are Leaf BFERs unless there is a
570	   U-turn between two PEs.

572	   All leaf BFERs in a BIER-TE domain can share a single BitPosition.
573	   This is possible because the BitPosition for the adjacency to reach
574	   the BFER can be used to distinguish whether or not packets should
575	   reach the BFER.

577	   This optimization will not work if an upstream interface of the BFER
578	   is using a BitPosition optimized as described in the following two
579	   sections (LAN, Hub and Spoke).

581	4.4.  LANs

583	   In a LAN, the adjacency to each neighboring BFR on the LAN is given a
584	   unique BitPosition.  The adjacency of this BitPosition is a
585	   forward_connected adjacency towards the BFR and this BitPosition is
586	   populated into the BIFT of all the other BFRs on that LAN.

588	            BFR1
589	             |p1
590	      LAN1-+-+---+-----+
591	          p3|  p4|   p2|
592	          BFR3 BFR4  BFR7

594	   If bandwidth on the LAN is not an issue and most BIER-TE traffic
595	   should be copied to all neighbors on a LAN, then BitPositions can be
596	   saved by assigning just a single BitPosition to the LAN and
597	   populating that BitPosition in the BIFT of each BFR on the LAN with
598	   a list of forward_connected adjacencies to all other neighbors on the
599	   LAN.

601	   This optimization does not work in the face of BFRs redundantly
602	   connected to more than one LAN with this optimization because these
603	   BFRs would receive duplicates and forward those duplicates into the
604	   opposite LANs.  Adjacencies of such BFRs into their LANs still need a
605	   separate BitPosition.

607	4.5.  Hub and Spoke

609	   In a setup with a hub and multiple spokes connected via separate p2p
610	   links to the hub, all p2p links can share the same BitPosition.  The
611	   BitPosition on the hub's BIFT is set up with a list of
612	   forward_connected adjacencies, one for each Spoke.

614	   This option is similar to the BitPosition optimization in LANs:
615	   Redundantly connected spokes need their own BitPositions.

617	4.6.  Rings

619	   In L3 rings, instead of assigning a single BitPosition for every p2p
620	   link in the ring, it is possible to save BitPositions by setting the
621	   "Do Not Reset" (DNR) flag on forward_connected adjacencies.

623	   For the rings shown in the following picture, a single BitPosition
624	   will suffice to forward traffic entering the ring at BFRa or BFRb all
625	   the way up to BFR1:

627	   On BFRa, BFRb, BFR30,... BFR3, the BitPosition is populated with a
628	   forward_connected adjacency pointing to the clockwise neighbor on the
629	   ring and with DNR set.  On BFR2, the adjacency also points to the
630	   clockwise neighbor BFR1, but without DNR set.

632	   Handling DNR this way ensures that copies forwarded from any BFR in
633	   the ring to a BFR outside the ring will not have the ring BitPosition
634	   set, therefore minimizing the chance to create loops.

636	                  v        v
637	                  |        |
638	           L1     |   L2   |   L3
639	       /-------- BFRa ---- BFRb --------------------\
640	       |                                            |
641	       \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/
642	           |      |    L4               |      |
643	        p33|                         p15|
644	           BFRd                       BFRc

646	   Note that this example only permits packets to enter the ring at
647	   BFRa and BFRb, and that packets will always travel clockwise.  If
648	   packets should be allowed to enter the ring at any ring BFR, then one
649	   would have to use two ring BitPositions: one for clockwise, one for
650	   counterclockwise.

652	   Both would be set up to stop rotating on the same link, e.g., L1.
653	   When the ingress ring BFR creates the clockwise copy, it will reset
654	   the counterclockwise BitPosition because the DNR bit only applies to
655	   the bit for which the replication is done.  Likewise for the
656	   clockwise BitPosition for the counterclockwise copy.  As a result,
657	   the ring ingress BFR will send a copy in both directions, serving
658	   BFRs on either side of the ring up to L1.
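
   As an illustration only, the single-BitPosition clockwise ring
   described above can be sketched in Python.  The names RING_BP, bit()
   and forward(), the choice of BitPosition 1, and the convention that
   BitPosition 1 is the least significant bit are assumptions of this
   sketch and not part of the architecture.

```python
RING_BP = 1  # hypothetical BitPosition chosen for the clockwise ring


def bit(bp):
    """Mask for a 1-based BitPosition (assumed: BP 1 is the lowest bit)."""
    return 1 << (bp - 1)


# Clockwise order of the ring BFRs, as listed in the text above.
ring = ["BFRa", "BFRb"] + [f"BFR{i}" for i in range(30, 0, -1)]

# BIFT per BFR: BitPosition -> (clockwise neighbor, DNR flag).
# DNR is set everywhere except on BFR2; BFR1 has no entry for the
# ring BitPosition, so the rotation stops before link L1.
bift = {}
for here, nxt in zip(ring, ring[1:]):
    bift[here] = {RING_BP: (nxt, here != "BFR2")}
bift["BFR1"] = {}


def forward(bfr, bitstring):
    """Follow one packet around the ring; return (BFR, BitString) pairs."""
    visited = [(bfr, bitstring)]
    while True:
        entries = bift[bfr]
        local_bits = sum(bit(bp) for bp in entries)
        copies = []
        for bp, (nxt, dnr) in entries.items():
            if bitstring & bit(bp):
                # Clear every BitPosition that has a local adjacency ...
                copy = bitstring & ~local_bits
                if dnr:
                    # ... but keep the bit being replicated when DNR is set.
                    copy |= bit(bp)
                copies.append((nxt, copy))
        if not copies:
            return visited
        bfr, bitstring = copies[0]  # one ring adjacency per BFR in this model
        visited.append((bfr, bitstring))


path = forward("BFRa", bit(RING_BP))
```

   In this model the copy keeps the ring BitPosition on every hop up to
   BFR2 and reaches BFR1 with the bit cleared, so BFR1 does not forward
   it across L1.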

660	4.7.  Equal Cost MultiPath (ECMP)

662	   The ECMP adjacency allows the use of just one BP per link bundle
663	   between two BFRs instead of one BP for each p2p member link of that
664	   link bundle.  In the following picture, one BP is used across L1, L2
665	   and L3, and BFR1/BFR2 have the BIFT entries shown for that BP:

667	                --L1-----
668	           BFR1 --L2----- BFR2
669	                --L3-----

671	     BIFT entry in BFR1:
672	     ------------------------------------------------------------------
673	     | Index |  Adjacencies                                           |
674	     ==================================================================
675	     | 0:6   |  ECMP({L1-to-BFR2,L2-to-BFR2,L3-to-BFR2}, seed)        |
676	     ------------------------------------------------------------------

678	     BIFT entry in BFR2:
679	     ------------------------------------------------------------------
680	     | Index |  Adjacencies                                           |
681	     ==================================================================
682	     | 0:6   |  ECMP({L1-to-BFR1,L2-to-BFR1,L3-to-BFR1}, seed)        |
683	     ------------------------------------------------------------------

685	   In the following example, all traffic from BFR1 towards BFR10 is
686	   intended to be ECMP load split equally across the topology.  This
687	   example is not meant as a likely setup, but to illustrate that ECMP
688	   can be used to share BPs not only across link bundles, and it
689	   explains the use of the seed parameter.

691	                    BFR1
692	                  /     \
693	                 /L11    \L12
694	             BFR2         BFR3
695	            /    \       /    \
696	           /L21   \L22  /L31   \L32
697	          BFR4  BFR5   BFR6  BFR7
698	           \      /     \      /
699	            \    /       \    /
700	             BFR8         BFR9
701	                 \       /
702	                  \     /
703	                   BFR10

705	     BIFT entry in BFR1:
706	     ------------------------------------------------------------------
707	     | 0:6   |  ECMP({L11-to-BFR2,L12-to-BFR3}, seed)                 |
708	     ------------------------------------------------------------------

710	     BIFT entry in BFR2:
711	     ------------------------------------------------------------------
712	     | 0:6   |  ECMP({L21-to-BFR4,L22-to-BFR5}, seed)                 |
713	     ------------------------------------------------------------------

715	     BIFT entry in BFR3:
716	     ------------------------------------------------------------------
717	     | 0:6   |  ECMP({L31-to-BFR6,L32-to-BFR7}, seed)                 |
718	     ------------------------------------------------------------------

720	   With the setup of ECMP in the above topology, traffic would not be
721	   equally load-split.  Instead, links L22 and L31 would see no traffic
722	   at all: BFR2 will only see traffic from BFR1 for which the ECMP hash
723	   in BFR1 selected the first adjacency in a list of 2 adjacencies: link
724	   L11-to-BFR2.  When forwarding in BFR2 again performs an ECMP with two
725	   adjacencies on that subset of traffic, then it will again select the
726	   first of its two adjacencies: L21-to-BFR4.  Therefore, L22 and BFR5
727	   see no traffic.  Likewise, BFR3 always selects L32, so L31 is unused.

729	   To resolve this issue, the ECMP adjacency on BFR1 simply needs to be
730	   set up with a different seed than the ECMP adjacencies on BFR2/BFR3.

732	   This issue is called polarization.  It depends on the ECMP hash.  It
733	   is possible to build ECMP that does not have polarization, for
734	   example by taking entropy from the actual adjacency members into
735	   account, but that can make it harder to achieve evenly balanced load-
736	   splitting on all BFR without making the ECMP hash algorithm
737	   potentially too complex for fast forwarding in the BFRs.
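
   The polarization effect and the role of the seed can be reproduced
   with a small Python sketch.  The function ecmp_hash() below is a
   hypothetical stand-in built on SHA-256; it is not the hash a BFR
   would use, and the seed values and flow count are arbitrary
   assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def ecmp_hash(n, entropy, seed):
    """Hypothetical ECMP hash: choose one of n adjacencies."""
    digest = hashlib.sha256(f"{entropy}:{seed}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n


def link_load(seed_bfr1, seed_level2, flows=1000):
    """Count flows per link for BFR1 -> BFR2/BFR3 -> BFR4..BFR7."""
    load = Counter()
    for entropy in range(flows):
        # First ECMP decision on BFR1.
        first = ["L11", "L12"][ecmp_hash(2, entropy, seed_bfr1)]
        load[first] += 1
        # Second ECMP decision on BFR2 or BFR3, same entropy value.
        members = ["L21", "L22"] if first == "L11" else ["L31", "L32"]
        load[members[ecmp_hash(2, entropy, seed_level2)]] += 1
    return load


polarized = link_load(seed_bfr1=7, seed_level2=7)  # same seed everywhere
balanced = link_load(seed_bfr1=7, seed_level2=8)   # different seed on BFR1
```

   With identical seeds the second decision repeats the first, so L22
   and L31 stay empty.  With a different seed on BFR1 the two decisions
   are no longer tied together and all four second-level links carry
   traffic.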

739	4.8.  Routed adjacencies

741	4.8.1.  Reducing BitPositions

743	   Routed adjacencies can reduce the number of BitPositions required
744	   when the traffic engineering requirement is not hop-by-hop explicit
745	   path selection, but loose-hop selection.

747	              ...............             ...............
748	       BFR1--... Redundant ...--L1-- BFR2... Redundant ...---
749	          \--... Network   ...--L2--/    ... Network   ...---
750	       BFR4--... Segment 1 ...--L3-- BFR3... Segment 2 ...---
751	              ...............             ...............

753	   Assume the requirement in above network is to explicitly engineer
754	   paths such that specific traffic flows are passed from segment 1 to
755	   segment 2 via link L1 (or via L2 or via L3).

757	   To achieve this, BFR1 and BFR4 are set up with a forward_routed
758	   adjacency BitPosition towards an address of BFR2 on link L1 (or on
759	   link L2, or of BFR3 on link L3).

761	   For paths to be engineered through a specific node BFR2 (or BFR3),
762	   BFR1 and BFR4 are set up with a forward_routed adjacency
763	   BitPosition towards a loopback address of BFR2 (or BFR3).

765	4.8.2.  Supporting nodes without BIER-TE

767	   Routed adjacencies also enable incremental deployment of BIER-TE.
768	   Only the nodes through which BIER-TE traffic needs to be steered -
769	   with or without replication - need to support BIER-TE.  Where they
770	   are not directly connected to each other, forward_routed adjacencies
771	   are used to pass over non BIER-TE enabled nodes.

773	5.  Avoiding loops and duplicates

775	5.1.  Loops

777	   Whenever BIER-TE creates a copy of a packet, the BitString of that
778	   copy will have all BitPositions cleared that are associated with
779	   adjacencies in the BFR.  This inhibits looping of packets.  The only
780	   exceptions are adjacencies with DNR set.

782	   With DNR set, looping can happen.  Consider in the ring picture that
783	   link L4 from BFR3 is plugged into the L1 interface of BFRa.  This
784	   creates a loop where the ring's clockwise BitPosition is never reset
785	   for copies of the packets traveling clockwise around the ring.

787	   To inhibit looping in the face of such physical misconfiguration,
788	   only forward_connected adjacencies are permitted to have DNR set, and
789	   the link layer destination address of the adjacency (e.g., MAC
790	   address) protects against closing the loop.  Link layers without port
791	   unique link layer addresses should not be used with the DNR flag set.

793	5.2.  Duplicates

795	   Duplicates happen when the topology of the BitString is not a tree
796	   but redundantly connects BFRs with each other.  The controller must
797	   therefore ensure that it only creates BitStrings that are trees in
798	   the topology.

800	   When links are incorrectly physically re-connected before the
801	   controller updates BitStrings in BFIRs, duplicates can happen.  Like
802	   loops, these can be inhibited by link layer addressing in
803	   forward_connected adjacencies.

805	   If interface or loopback addresses used in forward_routed adjacencies
806	   are moved from one BFR to another, duplicates can equally happen.
807	   Such re-addressing operations must be coordinated with the
808	   controller.
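
   A controller could check a candidate BitString for duplicates along
   the following lines.  This is a simplified sketch: it assumes the
   BitString has already been expanded into a list of hypothetical
   (from-BFR, to-BFR) adjacency pairs, and it ignores ECMP and DNR.

```python
def find_duplicates(root, adjacencies):
    """Return the BFRs that would receive more than one copy.

    root is the BFIR; adjacencies is a list of (from_bfr, to_bfr)
    pairs selected by the bits of one BitString.
    """
    received = {root: 1}  # the BFIR already holds the original packet
    frontier = [root]
    used = set()
    while frontier:
        bfr = frontier.pop()
        for src, dst in adjacencies:
            if src == bfr and (src, dst) not in used:
                used.add((src, dst))
                received[dst] = received.get(dst, 0) + 1
                if received[dst] == 1:
                    # Only the first copy is followed further down.
                    frontier.append(dst)
    return {bfr for bfr, count in received.items() if count > 1}


tree = [("BFR1", "BFR2"), ("BFR1", "BFR3"), ("BFR2", "BFR4")]
not_a_tree = tree + [("BFR3", "BFR4")]
```

   An empty result means that the selected adjacencies form a tree
   rooted at the BFIR.  For the second list, BFR4 is reported because
   it is reached via both BFR2 and BFR3.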

810	6.  BIER-TE FRR

812	   FRR is an optional procedure.  To leverage it, the BIER-TE controller
813	   host and BFRs need to support it.  It does not have to be supported
814	   on all BFRs, but only those that are attached to a link/adjacency for
815	   which FRR support is required.

817	   If BIER-TE FRR is supported by the BIER-TE controller host, then it
818	   needs to calculate the desired backup paths for link and/or node
819	   failures in the BIER-TE domain and download this information into the
820	   BIER-TE Adjacency FRR Table (BTAFT) of the BFRs.  The BTAFT then
821	   drives FRR operations in the BIER-TE forwarding plane of that BFR.

823	6.1.  The BIER-TE Adjacency FRR Table (BTAFT)

825	   The BTAFT exists in every BFR that is supporting BIER-TE FRR
826	   procedures.  It is indexed by FRR Adjacency Index.  Associated
827	   with each FRR Adjacency Index is a ResetBitmask, AddBitmask and
828	   BitPosition.

830	     -----------------------------------------------------------
831	     | FRR Adjacency | BitPosition | ResetBitmask | AddBitmask |
832	     | Index         |             |              |            |
833	     ===========================================================
834	     | 0:1           |   5         |  ..0010000   | ..11000000 |
835	     -----------------------------------------------------------
836	     ...

838	   An FRR Adjacency is an adjacency that is used in the BIFT of the BFR.
839	   The BFR has to be able to determine whether the adjacency is up or
840	   down in less than 50 msec.  An FRR adjacency can be a
841	   forward_connected adjacency with fast L2 link Up/Down state
842	   notifications or a forward_connected or forward_routed adjacency with
843	   a fast aliveness mechanism such as BFD.  Details of those mechanisms
844	   are outside the scope of this architecture.

846	   The FRR Adjacency Index is the index that would be indicated on the
847	   fast Up/Down notifications to the BIER-TE forwarding plane.

849	   The BitPosition is the BP in the BIFT in which the FRR Adjacency is
850	   used.

852	6.2.  FRR in BIER-TE forwarding

854	   The BIER-TE forwarding plane receives fast Up/Down notifications with
855	   the FRR Adjacency Index.  From the BitPosition in the BTAFT entry, it
856	   remembers which BPs are currently affected (have a down adjacency).

858	   When a packet is received, BIER-TE forwarding checks if it has
859	   affected BPs to which it would forward.  If it does, it will remove
860	   the ResetBitmask bits from the packet's BitString and add the
861	   AddBitmask bits to the packet's BitString.

863	   Afterwards, normal BIER-TE forwarding occurs, taking the modified
864	   BitString into account.
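
   The BitString rewrite can be sketched in Python using the values of
   the example BTAFT row in Section 6.1.  The dictionary layout, the
   function names and the reading of "..0010000" as BitPosition 5 (with
   BitPosition 1 as the least significant bit) are assumptions of this
   sketch.

```python
def bit(bp):
    """Mask for a 1-based BitPosition (assumed: BP 1 is the lowest bit)."""
    return 1 << (bp - 1)


# One BTAFT entry mirroring the example row: the FRR adjacency with
# index 1 is used in BitPosition 5.
BTAFT = {
    1: {"bp": 5, "reset": 0b00010000, "add": 0b11000000},
}


def frr_rewrite(bitstring, adjacencies_down, btaft=BTAFT):
    """Apply ResetBitmask/AddBitmask for every affected BitPosition."""
    for index in sorted(adjacencies_down):
        entry = btaft[index]
        if bitstring & bit(entry["bp"]):
            bitstring &= ~entry["reset"]  # remove the failed BitPosition
            bitstring |= entry["add"]     # add the backup path bits
    return bitstring
```

   For example, frr_rewrite(0b00010011, {1}) yields 0b11000011: bit 5
   is removed and bits 7 and 8 are added, while a packet that does not
   use BitPosition 5 is left unchanged.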

866	6.3.  FRR in the BIER-TE Controller Host

868	   The basic rules for how the BIER-TE controller host would calculate
869	   ResetBitmask and AddBitmask are as follows:

871	   1.  The BIER-TE controller host has to determine whether a failure of
872	       the adjacency should be taken to indicate link or node failure.
873	       This is a policy decision.

875	   2.  The ResetBitmask has the BitPosition of the failed adjacency.

877	   3.  In the case of link protection, the AddBitmask are the segments
878	       forming a path from the BFR over to the BFR on the other end of
879	       the failed link.

881	   4.  In the case of node protection, the AddBitmask are the segments
882	       forming a tree from the BFR over to all necessary BFR downstream
883	       of the (assumed to be failed) BFR across the failed adjacency.

885	   5.  The ResetBitmask is extended with those segments that could lead
886	       to duplicate packets if the AddBitmask is added to possible
887	       BitStrings of packets using the failing BitPosition.

889	6.4.  BIER-TE FRR Benefits

891	   Compared to other FRR solutions, such as RSVP-TE/P2MP FRR, BIER-TE
892	   FRR has two key distinctions:

894	   o  It maintains the goal of BIER-TE not to establish in-network per
895	      multicast traffic flow state.  For that reason, the backup path/
896	      trees are only tied to the topology but not to individual
897	      distribution trees.

899	   o  For the case of node failure, it allows building a path engineered
900	      backup tree (6.3, rule 4) instead of a set of p2p backup tunnels.

902	7.  BIER-TE Forwarding Pseudocode

904	   The following sections of pseudocode are meant to illustrate the
905	   BIER-TE forwarding plane.  This code is not meant to be normative but
906	   to serve both as a potentially easier to read and more precise
907	   representation of the forwarding functionality and to illustrate how
908	   simple BIER-TE forwarding is and that it can be efficiently
909	   implemented.

911	   The following procedure is executed on a BFR whenever the BIFT is
912	   changed by the BIER-TE controller host:

914	      global MyBitsOfInterest

916	      void BIFTChanged()
917	      {
918	          MyBitsOfInterest = 0
919	          for (Index = 1; Index <= BitStringLength; Index++)
920	              if(BIFT[Index] != <empty>)
921	                  MyBitsOfInterest |= 2<<(Index-1)
922	      }

924	   The following procedure is executed whenever an adjacency used for
925	   BIER-TE FRR changes state:

927	      global ResetBitMaskByBT[BitStringLength]
928	      global AddBitMaskByBT[BitStringLength]
929	      global FRRaffectedBP

931	      void FrrUpDown(FrrAdjacencyIndex, UpDown)
932	      {
933	          global FRRAdjacenciesDown
934	          local Idx = FrrAdjacencyIndex

936	          if (UpDown == Up)
937	              FRRAdjacenciesDown &= ~(2<<(Idx-1))
938	          else
939	              FRRAdjacenciesDown |=   2<<(Idx-1)
940	          FRRaffectedBP = 0  // rebuilt from the adjacencies still down
941	          for (Index = GetFirstBitPosition(FRRAdjacenciesDown); Index ;
942	              Index = GetNextBitPosition(FRRAdjacenciesDown, Index))

944	              local BP = BTAFT[Index].BitPosition
945	              FRRaffectedBP |= 2<<(BP-1)
946	              ResetBitMaskByBT[BP] |= BTAFT[Index].ResetBitMask
947	              AddBitMaskByBT[BP]   |= BTAFT[Index].AddBitMask
948	      }

950	   The following procedure is executed whenever a BIER-TE packet is to
951	   be forwarded:

953	      void ForwardBierTePacket (Packet)
954	      {
955	          // We calculate in BitMask the subset of BPs of the BitString
956	          // for which we have adjacencies. This is purely an
957	          // optimization to avoid replicating for every BP
958	          // set in BitString only to discover that for most of them,
959	          // the BIFT has no adjacency.

961	          local BitMask = Packet->BitString
962	          Packet->BitString &= ~MyBitsOfInterest
963	          BitMask &= MyBitsOfInterest

965	          // FRR Operations
966	          // Note: this algorithm is not optimal yet for ECMP cases
967	          // it performs FRR replacement for all candidate ECMP paths

969	          local MyFRRBP = BitMask & FRRaffectedBP
970	          for (BP = GetFirstBitPosition(MyFRRBP); BP ;
971	               BP = GetNextBitPosition(MyFRRBP, BP))
972	              BitMask &= ~ResetBitMaskByBT[BP]
973	              BitMask |=  AddBitMaskByBT[BP]

975	          // Replication
976	          for (Index = GetFirstBitPosition(BitMask); Index ;
977	               Index = GetNextBitPosition(BitMask, Index))
978	              foreach adjacency in BIFT[Index]

980	                  if(adjacency == ECMP(ListOfAdjacencies, seed) )
981	                      I = ECMP_hash(sizeof(ListOfAdjacencies),
982	                                    Packet->Entropy, seed)
983	                      adjacency = ListOfAdjacencies[I]

985	                  PacketCopy = Copy(Packet)

987	                  switch(adjacency)
988	                      case forward_connected(interface,neighbor,DNR):
989	                          if(DNR)
990	                              PacketCopy->BitString |= 2<<(Index-1)
991	                          SendToL2Unicast(PacketCopy,interface,neighbor)

993	                      case forward_routed([VRF],neighbor):
994	                          SendToL3(PacketCopy,[VRF,]l3-neighbor)

996	                      case local_decap([VRF],neighbor):
997	                          DecapBierHeader(PacketCopy)
998	                          PassTo(PacketCopy,[VRF,]Packet->NextProto)
999	      }

1001	8.  Managing SI, subdomains and BFR-ids

1003	   When the number of bits required to represent the necessary hops in
1004	   the topology and BFER exceeds the supported bitstring length,
1005	   multiple SI and/or subdomains must be used.  This section discusses
1006	   how.

1008	   BIER-TE forwarding does not require the concept of BFR-id, but
1009	   routing underlay, flow overlay and BIER headers may.  This section
1010	   also discusses how BFR-id can be assigned to BFIR/BFER for BIER-TE.

1012	8.1.  Why SI and sub-domains

1014	   For BIER and BIER-TE forwarding, the most important result of using
1015	   multiple SI and/or subdomains is the same: Packets that need to be
1016	   sent to BFER in different SI or subdomains require different BIER
1017	   packets, each one with a bitstring for a different (SI,subdomain)
1018	   combination.  Each such bitstring uses one bitstring length sized SI
1019	   block in the BIFT of the subdomain.  We call this a BIFT:SI (block).

1021	   For BIER and BIER-TE forwarding itself there is also no difference
1022	   whether different SI and/or sub-domains are chosen, but SI and
1023	   subdomain have different purposes in the BIER architecture shared by
1024	   BIER-TE.  This impacts how operators are managing them and how
1025	   especially flow overlays will likely use them.

1027	   By default, every possible BFIR/BFER in a BIER network would likely
1028	   be given a BFR-id in subdomain 0 (unless there are > 64k BFIR/BFER).

1030	   If there are different flow services (or service instances) requiring
1031	   replication to different subsets of BFER, then it will likely not be
1032	   possible to achieve the best replication efficiency for all of these
1033	   service instances via subdomain 0.  Ideal replication efficiency for
1034	   N BFER exists in a subdomain if they are split over not more than
1035	   ceiling(N/bitstring-length) SI.
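
   The efficiency criterion can be made concrete with a short Python
   sketch.  The function names are hypothetical, and the mapping of a
   BFR-id to its SI assumes the BIER rule BFR-id = SI * bitstring-length
   + k with k counted from 1.

```python
import math


def min_si_count(num_bfer, bitstring_length):
    """Smallest number of SI that can hold num_bfer BFER."""
    return math.ceil(num_bfer / bitstring_length)


def si_used(bfr_ids, bitstring_length):
    """Number of distinct SI actually touched by a set of BFR-ids."""
    return len({(bfr_id - 1) // bitstring_length for bfr_id in bfr_ids})
```

   Three BFER with BFR-ids 1, 257 and 513 and a bitstring length of 256
   occupy three SI although one would suffice, so a service addressing
   only these three needs three packets instead of one.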

1037	   If service instances justify additional BIFT:SI state in the network,
1038	   additional subdomains will be used: BFIR/BFER are assigned BFR-ids in
1039	   those subdomains and each service instance is configured to use the
1040	   most appropriate subdomain.  This results in improved replication
1041	   efficiency for different services.

1043	   Even if creation of subdomains and assignment of BFR-id to BFIR/BFER
1044	   in those subdomains is automated, it is not expected that individual
1045	   service instances can deal with BFER in different subdomains.  A
1046	   service instance may only support configuration of a single subdomain
1047	   it should rely on.

1049	   To be able to easily reuse (and modify as little as possible)
1050	   existing BIER procedures including flow-overlay and routing underlay,
1051	   when BIER-TE forwarding is added, we therefore reuse SI and subdomain
1052	   logically in the same way as they are used in BIER: All necessary
1053	   BFIR/BFER for a service use a single BIER-TE BIFT and are split
1054	   across as many SI as necessary (see below).  Different services may
1055	   use different subdomains that primarily exist to provide more
1056	   efficient replication (and for BIER-TE desirable traffic engineering)
1057	   for different subsets of BFIR/BFER.

1059	8.2.  Bit assignment comparison BIER and BIER-TE

1061	   In BIER, bitstrings only need to carry bits for BFER, which leads to
1062	   the model that BFR-ids map 1:1 to each bit in a bitstring.

1064	   In BIER-TE, bitstrings need to carry bits to indicate not only the
1065	   receiving BFER but also the intermediate hops/links across which the
1066	   packet must be sent.  The maximum number of BFER that can be
1067	   supported in a single bitstring or BIFT:SI depends on the number of
1068	   bits necessary to represent the desired topology between them.

1070	   "Desired" topology because it depends on the physical topology, and
1071	   on the desire of the operator to allow for explicit traffic
1072	   engineering across every single hop (which requires more bits), or
1073	   reducing the number of required bits by exploiting optimizations such
1074	   as unicast (forward_routed), ECMP or flood (DNR) over "uninteresting"
1075	   sub-parts of the topology - e.g., parts where different trees do not
1076	   need to take different paths due to traffic-engineering reasons.

1078	   The share of bits in a BIFT:SI needed to describe the topology can
1079	   therefore easily be as low as 20% or as high as 80%.  The higher the
1080	   percentage, the higher the likelihood that those topology bits are
1081	   not just BIER-TE overhead without additional benefit, but instead
1082	   they will allow expressing the desired traffic-engineering
1083	   alternatives.

1085	8.3.  Using BFR-id with BIER-TE

1087	   Because there is no 1:1 mapping between bits in the bitstring and
1088	   BFER, BIER-TE can not simply rely on the BIER 1:1 mapping between
1089	   bits in a bitstring and BFR-id.

1091	   In BIER, automatic schemes could assign all possible BFR-ids
1092	   sequentially to BFERs.  This will not work in BIER-TE.  In BIER-TE,
1093	   the operator or BIER-TE controller host has to determine a BFR-id for
1094	   each BFER in each required subdomain.  The BFR-id may or may not have
1095	   a relationship with a bit in the bitstring.  Suggestions are
1096	   detailed below.  Once determined, the BFR-id can then be configured
1097	   on the BFER and used by flow overlay, routing underlay and the BIER
1098	   header almost the same as the BFR-id in BIER.

1100	   The one exception is applications/flow overlays that automatically
1101	   calculate the bitstring(s) of BIER packets by converting BFR-id to
1102	   bits.  In BIER-TE, this operation can be done in two ways:

1104	   "Independent branches": For a given application or (set of) trees,
1105	   the branches from a BFIR to every BFER are independent of the
1106	   branches to any other BFER.  For example, shortest path trees have
1107	   independent branches.

1109	   "Interdependent branches": When a BFER is added or deleted from a
1110	   particular distribution tree, branches to other BFER still in the
1111	   tree may need to change.  Steiner trees are examples of trees with
1112	   interdependent branches.

1114	   If "independent branches" are sufficient, the BIER-TE controller host
1115	   can provide to such applications for every BFR-id a SI:bitstring with
1116	   the BIER-TE bits for the branch towards that BFER.  The application
1117	   can then independently calculate the SI:bitstring for all desired
1118	   BFER by OR'ing their bitstrings.
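
   For the "independent branches" case, the OR operation can be
   sketched in Python as follows.  The table contents, the BFR-ids and
   the function name are made up for illustration; only the OR'ing per
   SI reflects the text above.

```python
from collections import defaultdict

# Hypothetical table provided by the BIER-TE controller host:
# BFR-id -> (SI, bitstring of the branch towards that BFER).
branches = {
    11: (0, 0b00000111),
    12: (0, 0b00010101),
    40: (1, 0b00000011),
}


def bitstrings_for(bfr_ids, branches):
    """OR the branch bitstrings of all desired BFER, one result per SI."""
    per_si = defaultdict(int)
    for bfr_id in bfr_ids:
        si, bitstring = branches[bfr_id]
        per_si[si] |= bitstring
    return dict(per_si)
```

   Requesting BFR-ids 11, 12 and 40 results in two packets: one for
   SI 0 carrying 0b00010111 and one for SI 1 carrying 0b00000011.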

1120	   If "interdependent branches" are required, the application could call
1121	   a BIER-TE controller host API with the list of required BFR-ids and
1122	   get the required bitstring back.  Whenever the set of BFR-ids
1123	   changes, this is repeated.

1125	   Note that in either case (unlike in BIER), the bits in BIER-TE may
1126	   need to change upon link/node failure/recovery, network expansion and
1127	   network load by other traffic (as part of traffic engineering goals).
1128	   Interactions between such BFIR applications and the BIER-TE
1129	   controller host do therefore need to support dynamic updates to the
1130	   bitstrings.

1132	8.4.  Assigning BFR-ids for BIER-TE

1134	   For non-leaf BFER, there is usually a single bit k for that BFER with
1135	   a local_decap() adjacency on the BFER.  The BFR-id for such a BFER is
1136	   therefore most easily the one it would have in BIER: SI * bitstring-
1137	   length + k.

1139	   As explained earlier in the document, leaf BFER do not need such a
1140	   separate bit because the fact alone that the BIER-TE packet is
1141	   forwarded to the leaf BFER indicates that the BFER should decapsulate
1142	   it.  Such a BFER will have one or more bits for the links leading
1143	   only to it.  The BFR-id could therefore most easily be the BFR-id
1144	   derived from the lowest bit for those links.

1146	   These two rules are only recommendations for the operator or BIER-TE
1147	   controller assigning the BFR-ids.  Any allocation scheme can be used,
1148	   the BFR-ids just need to be unique across BFRs in each subdomain.
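
   The two recommendations can be written down as a short Python
   sketch.  The function names are hypothetical, and bits are assumed
   to be counted from 1 within each SI.

```python
def bfr_id(si, bitstring_length, k):
    """BFR-id for bit k in the given SI: SI * bitstring-length + k."""
    return si * bitstring_length + k


def leaf_bfr_id(si, bitstring_length, link_bits):
    """BFR-id of a leaf BFER: derived from the lowest bit of its links."""
    return bfr_id(si, bitstring_length, min(link_bits))


def all_unique(bfr_ids):
    """True if no BFR-id is used twice within a subdomain."""
    return len(set(bfr_ids)) == len(bfr_ids)
```

   With a bitstring length of 256, a non-leaf BFER with local_decap bit
   5 in SI 2 would get BFR-id 517, and a leaf BFER reached over links
   with bits 9 and 4 in SI 0 would get BFR-id 4.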

1150	   It is not currently determined if a single subdomain could or should
1151	   be allowed to forward both BIER and BIER-TE packets.  If this should
1152	   be supported, there are two options:

1154	   A.  BIER and BIER-TE have different BFR-id in the same subdomain.
1155	   This allows higher replication efficiency for BIER because its BFR-
1156	   ids can be assigned sequentially, while the bitstrings for BIER-TE
1157	   will also have the additional bits for the topology.  There is no
1158	   relationship between a BFR's BIER BFR-id and its BIER-TE BFR-id.

1160	   B.  BIER and BIER-TE share the same BFR-id.  The BFR-id are assigned
1161	   as explained above for BIER-TE and simply reused for BIER.  The
1162	   replication efficiency for BIER will be as low as that for BIER-TE in
1163	   this approach.  Depending on topology, only the same 20%..80% of bits
1164	   as possible for BIER-TE can be used for BIER.

1166	8.5.  Example bit allocations

1168	8.5.1.  With BIER

1170	   Consider a network setup with a bitstring length of 256 for a network
1171	   topology as shown in the picture below.  The network has 6 areas,
1172	   each with ca. 180 BFR, connecting via a core with some larger (core)
1173	   BFR.  To address all BFER with BIER, 4 SI are required.  To send a
1174	   BIER packet to all BFER in the network, 4 copies need to be sent by
1175	   the BFIR.  On the BFIR it does not make a difference how the BFR-id
1176	   are allocated to BFER in the network, but for efficiency further down
1177	   in the network it does make a difference.

1179	                area1           area2        area3
1180	               BFR1a BFR1b  BFR2a BFR2b   BFR3a BFR3b
1181	                 |  \         /    \        /  |
1182	                 ................................
1183	                 .                Core          .
1184	                 ................................
1185	                 |    /       \    /        \  |
1186	               BFR4a BFR4b  BFR5a BFR5b   BFR6a BFR6b
1187	                area4          area5        area6

1189	   With random allocation of BFR-id to BFER, each receiving area would
1190	   (most likely) have to receive all 4 copies of the BIER packet because
1191	   there would be BFR-id for each of the 4 SI in each of the areas.
1192	   Only further towards each BFER would this duplication subside - when
1193	   each of the 4 trees runs out of branches.

1195	   If BFR-id are allocated intelligently, then all the BFER in an area
1196	   would be given BFR-id with as few as possible different SI.  Each
1197	   area would only have to forward one or two packets instead of 4.

1199	   Given how networks can grow over time, replication efficiency in an
1200	   area will also easily go down over time when BFR-id are network wide
1201	   allocated sequentially over time.  An area that initially only has
1202	   BFR-id in one SI might end up with many SI over a longer period of
1203	   growth.  Allocating SIs to areas with initially sufficiently many
1204	   spare bits for growth can help to alleviate this issue, as can
1205	   renumbering BFR-id after network expansion.  In this example one may
1206	   consider using 6 SI and assigning one to each area.

1208	   This example shows that intelligent BFR-id allocation within at least
1209	   subdomain 0 can be helpful or even necessary in BIER.

1211	8.5.2.  With BIER-TE

1213	   In BIER-TE one needs to determine a subset of the physical topology
1214	   and attached BFER so that the "desired" representation of this
1215	   topology and the BFER fit into a single bitstring.  This process
1216	   needs to be repeated until the whole topology is covered.

1218	   Once bits/SIs are assigned to topology and BFER, BFR-id is just a
1219	   derived set of identifiers from the operator/BIER-TE controller as
1220	   explained above.

1222	   Every time that different sub-topologies have overlap, bits need to
1223	   be repeated across the bitstrings, increasing the overall amount of
1224	   bits required across all bitstring/SIs.  In the worst case, random
1225	   subsets of BFER are assigned to different SI.  This is much worse
1226	   than in BIER because it not only reduces replication efficiency with
1227	   the same number of overall bits, but even further - because more bits
1228	   are required due to duplication of bits for topology across multiple
1229	   SI.  Intelligent BFER to SI assignment and selecting specific
1230	   "desired" subtopologies can minimize this problem.

1232	   To set up BIER-TE efficiently for above topology, the following bit
1233	   allocation method can be used.  This method can easily be expanded
1234	   to other, similarly structured larger topologies.

1236	   Each area is allocated one or more SIs depending on the number of
1237	   BFER expected in the future and the number of bits required for the
1238	   topology in the area.  In this example, 6 SIs are used, one per area.

1240	   In addition, we use 4 bits in each SI: bia, bib, bea, beb (bit
1241	   ingress a, bit ingress b, bit egress a, bit egress b).  These bits
1242	   are used to pass BIER packets from any BFIR via any combination of
1243	   ingress area a/b BFR and egress area a/b BFR into a specific target
1244	   area.  These bits are then set up with the right forward_routed
1245	   adjacencies on the BFIR and area edge BFR:

1247	   On all BFIR in an area j, bia in each BIFT:SI is populated with the
1248	   same forward_routed(BFRja), and bib with forward_routed(BFRjb).  On
1249	   all area edge BFR, bea in BIFT:SI=k is populated with
1250	   forward_routed(BFRka) and beb in BIFT:SI=k with
1251	   forward_routed(BFRkb).
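
   The population rule above can be sketched as follows.  This is an
   illustration only: the concrete bit positions for bia/bib/bea/beb,
   the mapping SI = area number, and the representation of a BIFT as a
   dictionary keyed by (SI, BitPosition) are assumptions made for the
   sketch and are not specified by this document.

```python
from typing import Dict, Tuple

# Hypothetical bit positions reserved for the four bits in every SI.
BIA, BIB, BEA, BEB = 1, 2, 3, 4
# Assumption: 6 areas, and area k uses SI=k.
AREAS = range(1, 7)

# A BIFT modelled as (SI, BitPosition) -> adjacency description.
Bift = Dict[Tuple[int, int], str]


def bfir_bift(j: int) -> Bift:
    """Entries on a BFIR in area j: in every SI, bia and bib point to
    the a/b edge BFR of the BFIR's own area j."""
    bift: Bift = {}
    for si in AREAS:
        bift[(si, BIA)] = f"forward_routed(BFR{j}a)"
        bift[(si, BIB)] = f"forward_routed(BFR{j}b)"
    return bift


def area_edge_bift() -> Bift:
    """Entries on every area edge BFR: in SI=k, bea and beb point to
    the a/b edge BFR of target area k."""
    bift: Bift = {}
    for k in AREAS:
        bift[(k, BEA)] = f"forward_routed(BFR{k}a)"
        bift[(k, BEB)] = f"forward_routed(BFR{k}b)"
    return bift
```

   Note that the ingress entries depend only on the area of the BFIR,
   while the egress entries depend only on the SI of the packet.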

1253	   For BIER-TE forwarding of a packet to some subset of BFER across all
1254	   areas, a BFIR would create at most 6 copies, with SI=1...SI=6.  In
1255	   each packet, the BitString carries the bits for topology and BFER in
1256	   that topology plus the four bits to indicate whether to pass this
1257	   packet via the ingress area a or b border BFR and the egress area a
1258	   or b border BFR, therefore allowing path engineering for those two
1259	   "unicast" legs: 1) BFIR to ingress area edge and 2) core to egress
1260	   area edge.  Replication only happens inside the egress areas.  For
1261	   BFER in the same area as the BFIR, these four bits are not used.
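
   A minimal sketch of how a BFIR (or the BIER-TE controller on its
   behalf) could assemble the per-SI copies is given below.  The
   function name build_copies, the reserved bit positions, the mapping
   SI = area number and the encoding of a BitString as an integer are
   all hypothetical and only serve to illustrate the paragraph above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical bit positions reserved for bia/bib/bea/beb in every SI.
BIA, BIB, BEA, BEB = 1, 2, 3, 4


def bit(pos: int) -> int:
    """BitString (as an integer) with only BitPosition pos set."""
    return 1 << (pos - 1)


def build_copies(
    bfir_area: int,
    targets: Iterable[Tuple[int, int]],
    ingress: str,
    egress: Dict[int, str],
) -> Dict[int, int]:
    """Return SI -> BitString for one packet.

    targets: (SI, BitPosition) pairs for topology and BFER bits.
    ingress: "a" or "b", the ingress area edge BFR to use.
    egress:  SI -> "a" or "b", the egress area edge BFR per target
             area (defaults to "a" when not given).
    """
    copies: Dict[int, int] = defaultdict(int)
    for si, pos in targets:
        copies[si] |= bit(pos)
    for si in copies:
        if si == bfir_area:
            continue  # same area as the BFIR: the four bits are unused
        copies[si] |= bit(BIA if ingress == "a" else BIB)
        copies[si] |= bit(BEA if egress.get(si, "a") == "a" else BEB)
    return dict(copies)
```

   Only SIs that actually contain target bits produce a copy, so the
   number of copies is bounded by the number of SIs (6 in this
   example).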

1263	8.6.  Summary

1265	   Like BIER, BIER-TE can support multiple SIs within a sub-domain to
1266	   allow re-using the concept of BFR-id and therefore minimize BIER-TE
1267	   specific functions in underlay routing, flow overlay methods and BIER
1268	   headers.

1270	   The number of BFIR/BFER possible in a subdomain is smaller than in
1271	   BIER because BIER-TE uses additional bits for topology.

1273	   In BIER-TE, subdomains can be used as in BIER to create more
1274	   efficient replication to known subsets of BFER.

1276	   Assigning bits for BFER intelligently into the right SI is more
1277	   important in BIER-TE than in BIER because it affects both
1278	   replication efficiency and the overall number of bits required.

1280	9.  Further considerations

1282	9.1.  BIER-TE and existing FRR

1284	   BIER-TE as described above is an advanced method for node-protection
1285	   where the replication in a failed node is replaced on the fly by
1286	   another replication tree through bit operations on the BitString.

1288	   If this form of BIER-TE protection is not feasible or necessary, it
1289	   is also possible for BIER-TE to leverage any existing form of "link"
1290	   protection.  For example: instead of directly setting up a
1291	   forward_connected adjacency to a next-hop neighbor, this can be a
1292	   "protected" adjacency that is maintained by RSVP-TE (or another FRR
1293	   mechanism) and passes via a backup path if the link fails.

1295	9.2.  BIER-TE and Segment Routing

1297	   Segment Routing aims to achieve lightweight path engineering via
1298	   loose source routing.  Compared for example to RSVP-TE, it does not
1299	   require per-path signaling to each hop on the path.

1301	   BIER-TE supports the same design philosophy for multicast.  Like SR,
1302	   it relies on source routing, via the definition of a BitString.
1303	   Like SR, it only needs to consider the "hops" on which either
1304	   replication has to happen, or across which the traffic should be
1305	   steered (even without replication).  Any other hops can be skipped
1306	   via the use of routed adjacencies.

1308	   Instead of defining BitPositions for non-replicating hops, it is
1309	   equally possible to use segment routing encapsulations (e.g., MPLS
1310	   label stacks) for "forward_routed" adjacencies.

1312	10.  Security Considerations

1314	   The security considerations are the same as for BIER with the
1315	   following differences:

1317	   BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures
1318	   for their distribution, so these are not attack vectors against BIER-
1319	   TE.

1321	11.  IANA Considerations

1323	   This document requests no action by IANA.

1325	12.  Acknowledgements

1327	   The authors would like to thank Greg Shepherd, Ijsbrand Wijnands and
1328	   Neale Ranns for their extensive review and suggestions.

1330	13.  Change log [RFC Editor: Please remove]

1332	      02: Changed the definition of BIFT to be more in line with BIER.
1333	      In revs. up to -01, the idea was that a BIFT has only entries for
1334	      a single bitstring, and every SI and subdomain would be a separate
1335	      BIFT.  In BIER, each BIFT covers all SI.  This is now also how we
1336	      define it in BIER-TE.

1338	      02: Added Section 8 to explain the use of SI, subdomains and BFR-
1339	      id in BIER-TE and to give an example of how to efficiently assign
1340	      bits for a large topology requiring multiple SI.

1342	      02: Added further details for rings - how to support input from
1343	      all ring nodes.

1345	      01: Fixed BFIR -> BFER for section 4.3.

1347	      01: Added explanation of SI, difference to BIER ECMP,
1348	      consideration for Segment Routing, unicast FRR, considerations for
1349	      encapsulation, explanations of BIER-TE controller host and CLI.

1351	      00: Initial version.

1353	14.  References

1355	   [I-D.ietf-bier-architecture]
1356	              Wijnands, I., Rosen, E., Dolganow, A., Przygienda, T., and
1357	              S. Aldrin, "Multicast using Bit Index Explicit
1358	              Replication", draft-ietf-bier-architecture-02 (work in
1359	              progress), July 2015.

1361	   [I-D.ietf-bier-mpls-encapsulation]
1362	              Wijnands, I., Rosen, E., Dolganow, A., Tantsura, J., and
1363	              S. Aldrin, "Encapsulation for Bit Index Explicit
1364	              Replication in MPLS Networks", draft-ietf-bier-mpls-
1365	              encapsulation-02 (work in progress), August 2015.

1367	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1368	              Requirement Levels", BCP 14, RFC 2119,
1369	              DOI 10.17487/RFC2119, March 1997,
1370	              <http://www.rfc-editor.org/info/rfc2119>.

1372	Authors' Addresses

1374	   Toerless Eckert
1375	   Cisco Systems, Inc.

1377	   Email: eckert@cisco.com

1379	   Gregory Cauchie
1380	   Bouygues Telecom

1382	   Email: GCAUCHIE@bouyguestelecom.fr