< draft-ietf-bier-te-arch-00.txt   draft-ietf-bier-te-arch-01.txt >
Network Working Group T. Eckert, Ed. Network Working Group T. Eckert, Ed.
Internet-Draft Huawei Internet-Draft Huawei
Intended status: Experimental G. Cauchie Intended status: Experimental G. Cauchie
Expires: July 27, 2018 Bouygues Telecom Expires: April 26, 2019 Bouygues Telecom
W. Braun W. Braun
M. Menth M. Menth
University of Tuebingen University of Tuebingen
January 23, 2018 October 23, 2018
Traffic Engineering for Bit Index Explicit Replication (BIER-TE) Traffic Engineering for Bit Index Explicit Replication (BIER-TE)
draft-ietf-bier-te-arch-00 draft-ietf-bier-te-arch-01
Abstract Abstract
This document proposes an architecture for BIER-TE: Traffic This document proposes an architecture for BIER-TE: Traffic
Engineering for Bit Index Explicit Replication (BIER). Engineering for Bit Index Explicit Replication (BIER).
BIER-TE shares part of its architecture with BIER as described in BIER-TE shares part of its architecture with BIER as described in
[RFC8279]. It also proposes to share the packet format with BIER. [RFC8279]. It also proposes to share the packet format with BIER.
BIER-TE forwards and replicates packets like BIER based on a BIER-TE forwards and replicates packets like BIER based on a
skipping to change at page 1, line 47 skipping to change at page 1, line 47
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 27, 2018. This Internet-Draft will expire on April 26, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 45 skipping to change at page 2, line 45
3. BIER-TE Forwarding . . . . . . . . . . . . . . . . . . . . . 7 3. BIER-TE Forwarding . . . . . . . . . . . . . . . . . . . . . 7
3.1. The Bit Index Forwarding Table (BIFT) . . . . . . . . . . 7 3.1. The Bit Index Forwarding Table (BIFT) . . . . . . . . . . 7
3.2. Adjacency Types . . . . . . . . . . . . . . . . . . . . . 8 3.2. Adjacency Types . . . . . . . . . . . . . . . . . . . . . 8
3.2.1. Forward Connected . . . . . . . . . . . . . . . . . . 8 3.2.1. Forward Connected . . . . . . . . . . . . . . . . . . 8
3.2.2. Forward Routed . . . . . . . . . . . . . . . . . . . 9 3.2.2. Forward Routed . . . . . . . . . . . . . . . . . . . 9
3.2.3. ECMP . . . . . . . . . . . . . . . . . . . . . . . . 9 3.2.3. ECMP . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.4. Local Decap . . . . . . . . . . . . . . . . . . . . . 9 3.2.4. Local Decap . . . . . . . . . . . . . . . . . . . . . 9
3.3. Encapsulation considerations . . . . . . . . . . . . . . 10 3.3. Encapsulation considerations . . . . . . . . . . . . . . 10
3.4. Basic BIER-TE Forwarding Example . . . . . . . . . . . . 10 3.4. Basic BIER-TE Forwarding Example . . . . . . . . . . . . 10
3.5. Forwarding comparison with BIER . . . . . . . . . . . . . 12 3.5. Forwarding comparison with BIER . . . . . . . . . . . . . 12
3.6. Requirements . . . . . . . . . . . . . . . . . . . . . . 13
4. BIER-TE Controller Host BitPosition Assignments . . . . . . . 13 4. BIER-TE Controller Host BitPosition Assignments . . . . . . . 13
4.1. P2P Links . . . . . . . . . . . . . . . . . . . . . . . . 13 4.1. P2P Links . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2. BFER . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4.2. BFER . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.3. Leaf BFERs . . . . . . . . . . . . . . . . . . . . . . . 14 4.3. Leaf BFERs . . . . . . . . . . . . . . . . . . . . . . . 14
4.4. LANs . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4.4. LANs . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.5. Hub and Spoke . . . . . . . . . . . . . . . . . . . . . . 15 4.5. Hub and Spoke . . . . . . . . . . . . . . . . . . . . . . 15
4.6. Rings . . . . . . . . . . . . . . . . . . . . . . . . . . 15 4.6. Rings . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.7. Equal Cost MultiPath (ECMP) . . . . . . . . . . . . . . . 16 4.7. Equal Cost MultiPath (ECMP) . . . . . . . . . . . . . . . 16
4.8. Routed adjacencies . . . . . . . . . . . . . . . . . . . 18 4.8. Routed adjacencies . . . . . . . . . . . . . . . . . . . 19
4.8.1. Reducing BitPositions . . . . . . . . . . . . . . . . 18 4.8.1. Reducing BitPositions . . . . . . . . . . . . . . . . 19
4.8.2. Supporting nodes without BIER-TE . . . . . . . . . . 18 4.8.2. Supporting nodes without BIER-TE . . . . . . . . . . 19
5. Avoiding loops and duplicates . . . . . . . . . . . . . . . . 18 5. Avoiding loops and duplicates . . . . . . . . . . . . . . . . 19
5.1. Loops . . . . . . . . . . . . . . . . . . . . . . . . . . 18 5.1. Loops . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5.2. Duplicates . . . . . . . . . . . . . . . . . . . . . . . 19 5.2. Duplicates . . . . . . . . . . . . . . . . . . . . . . . 20
6. BIER-TE Forwarding Pseudocode . . . . . . . . . . . . . . . . 19 6. BIER-TE Forwarding Pseudocode . . . . . . . . . . . . . . . . 20
7. Managing SI, subdomains and BFR-ids . . . . . . . . . . . . . 20 7. Managing SI, subdomains and BFR-ids . . . . . . . . . . . . . 23
7.1. Why SI and sub-domains . . . . . . . . . . . . . . . . . 21 7.1. Why SI and sub-domains . . . . . . . . . . . . . . . . . 24
7.2. Bit assignment comparison BIER and BIER-TE . . . . . . . 22 7.2. Bit assignment comparison BIER and BIER-TE . . . . . . . 25
7.3. Using BFR-id with BIER-TE . . . . . . . . . . . . . . . . 22 7.3. Using BFR-id with BIER-TE . . . . . . . . . . . . . . . . 25
7.4. Assigning BFR-ids for BIER-TE . . . . . . . . . . . . . . 23 7.4. Assigning BFR-ids for BIER-TE . . . . . . . . . . . . . . 26
7.5. Example bit allocations . . . . . . . . . . . . . . . . . 24 7.5. Example bit allocations . . . . . . . . . . . . . . . . . 27
7.5.1. With BIER . . . . . . . . . . . . . . . . . . . . . . 24 7.5.1. With BIER . . . . . . . . . . . . . . . . . . . . . . 27
7.5.2. With BIER-TE . . . . . . . . . . . . . . . . . . . . 25 7.5.2. With BIER-TE . . . . . . . . . . . . . . . . . . . . 28
7.6. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 26 7.6. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 29
8. BIER-TE and Segment Routing . . . . . . . . . . . . . . . . . 26 8. BIER-TE and Segment Routing . . . . . . . . . . . . . . . . . 29
9. Security Considerations . . . . . . . . . . . . . . . . . . . 27 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 27 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 30
12. Change log [RFC Editor: Please remove] . . . . . . . . . . . 27 12. Change log [RFC Editor: Please remove] . . . . . . . . . . . 30
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 29 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 32
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 33
1. Introduction 1. Introduction
1.1. Overview 1.1. Overview
This document specifies the architecture for BIER-TE: traffic This document specifies the architecture for BIER-TE: traffic
engineering for Bit Index Explicit Replication (BIER). engineering for Bit Index Explicit Replication (BIER).
BIER-TE shares architecture and packet formats with BIER as described BIER-TE shares architecture and packet formats with BIER as described
in [RFC8279]. in [RFC8279].
skipping to change at page 5, line 26 skipping to change at page 5, line 26
Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr
|--------------------->| |--------------------->|
BIER-TE forwarding layer BIER-TE forwarding layer
|<- BIER-TE domain-->| |<- BIER-TE domain-->|
|<--------------------->| |<--------------------->|
Routing underlay Routing underlay
Figure 1: BIER-TE architecture
2.1. The Multicast Flow Overlay 2.1. The Multicast Flow Overlay
The Multicast Flow Overlay operates as in BIER. See [RFC8279]. The Multicast Flow Overlay operates as in BIER. See [RFC8279].
Instead of interacting with the BIER layer, it interacts with the Instead of interacting with the BIER layer, it interacts with the
BIER-TE Controller Host. BIER-TE Controller Host.
2.2. The BIER-TE Controller Host 2.2. The BIER-TE Controller Host
The BIER-TE controller host represents the control plane of The BIER-TE controller host represents the control plane of
BIER-TE. It communicates two sets of information with BFRs: BIER-TE. It communicates two sets of information with BFRs:
skipping to change at page 8, line 31 skipping to change at page 8, line 31
------------------------------------------------------------------ ------------------------------------------------------------------
| 0:5 | <empty> | | 0:5 | <empty> |
------------------------------------------------------------------ ------------------------------------------------------------------
| 0:6 | ECMP({adjacency1,...adjacencyN}, seed) | | 0:6 | ECMP({adjacency1,...adjacencyN}, seed) |
------------------------------------------------------------------ ------------------------------------------------------------------
... ...
| BitStringLength | ... | | BitStringLength | ... |
------------------------------------------------------------------ ------------------------------------------------------------------
Bit Index Forwarding Table Bit Index Forwarding Table
Figure 2: BIFT adjacencies
The BIFT is programmed into the data plane of BFRs by the BIER-TE The BIFT is programmed into the data plane of BFRs by the BIER-TE
controller host and used to forward packets, according to the rules controller host and used to forward packets, according to the rules
specified in the BIER-TE Forwarding Procedures. specified in the BIER-TE Forwarding Procedures.
Adjacencies for the same BP when populated in more than one BFR by Adjacencies for the same BP when populated in more than one BFR by
the controller do not have to have the same adjacencies. This is up the controller do not have to have the same adjacencies. This is up
to the controller. BPs for p2p links are one case (see below). to the controller. BPs for p2p links are one case (see below).
3.2. Adjacency Types 3.2. Adjacency Types
skipping to change at page 11, line 5 skipping to change at page 11, line 5
support existing advanced adjacency information such as "loose source support existing advanced adjacency information such as "loose source
routes" via eg: MPLS label stacks or appropriate header extensions routes" via eg: MPLS label stacks or appropriate header extensions
(eg: for IPv6). (eg: for IPv6).
3.4. Basic BIER-TE Forwarding Example 3.4. Basic BIER-TE Forwarding Example
Step by step example of basic BIER-TE forwarding. This does not use Step by step example of basic BIER-TE forwarding. This does not use
ECMP or forward_routed adjacencies nor does it try to minimize the ECMP or forward_routed adjacencies nor does it try to minimize the
number of required BitPositions for the topology. number of required BitPositions for the topology.
Picture 1: Forwarding Example
[Bier-Te Controller Host] [Bier-Te Controller Host]
/ | \ / | \
v v v v v v
| p13 p1 | | p13 p1 |
+- BFIR2 --+ | +- BFIR2 --+ |
| | p2 p6 | LAN2 | | p2 p6 | LAN2
| +-- BFR3 --+ | | +-- BFR3 --+ |
| | | p7 p11 | | | | p7 p11 |
Src -+ +-- BFER1 --+ Src -+ +-- BFER1 --+
skipping to change at page 11, line 31 skipping to change at page 11, line 29
| p14 p4 | | p14 p4 |
+- BFIR1 --+ | +- BFIR1 --+ |
| +-- BFR5 --+ p10 p12 | | +-- BFR5 --+ p10 p12 |
LAN1 | p5 p9 +-- BFER2 --+ LAN1 | p5 p9 +-- BFER2 --+
| +-- Rcv2 | +-- Rcv2
| |
LAN3 LAN3
IP |..... BIER-TE network......| IP IP |..... BIER-TE network......| IP
Figure 3: BIER-TE Forwarding Example
pXX indicates the BitPosition number assigned by the BIER-TE pXX indicates the BitPosition number assigned by the BIER-TE
controller host to adjacencies in the BIER-TE topology. For example, controller host to adjacencies in the BIER-TE topology. For example,
p9 is the adjacency towards BFR9 on the LAN connecting to BFER2. p9 is the adjacency towards BFR9 on the LAN connecting to BFER2.
BIFT BFIR2: BIFT BFIR2:
p13: local_decap() p13: local_decap()
p2: forward_connected(BFR3) p2: forward_connected(BFR3)
BIFT BFR3: BIFT BFR3:
p1: forward_connected(BFIR2) p1: forward_connected(BFIR2)
p7: forward_connected(BFER1) p7: forward_connected(BFER1)
p8: forward_connected(BFR4) p8: forward_connected(BFR4)
BIFT BFER1: BIFT BFER1:
p11: local_decap() p11: local_decap()
p6: forward_connected(BFR3) p6: forward_connected(BFR3)
p8: forward_connected(BFR4) p8: forward_connected(BFR4)
Figure 4: BIER-TE Forwarding Example Adjacencies
...and so on. ...and so on.
Traffic needs to flow from BFIR2 towards Rcv1, Rcv2. The controller Traffic needs to flow from BFIR2 towards Rcv1, Rcv2. The controller
determines it wants it to pass across the following paths: determines it wants it to pass across the following paths:
-> BFER1 ---------------> Rcv1 -> BFER1 ---------------> Rcv1
BFIR2 -> BFR3 BFIR2 -> BFR3
-> BFR4 -> BFR5 -> BFER2 -> Rcv2 -> BFR4 -> BFR5 -> BFER2 -> Rcv2
Figure 5: BIER-TE Forwarding Example Paths
These paths correspond to the following BitString: p2, p5, p7, p8, p10, These paths correspond to the following BitString: p2, p5, p7, p8, p10,
p11, p12. p11, p12.
This BitString is set up in BFIR2. Multicast packets arriving at This BitString is set up in BFIR2. Multicast packets arriving at
BFIR2 from Src are assigned this BitString. BFIR2 from Src are assigned this BitString.
BFIR2 forwards based on that BitString. It has p2 and p13 populated. BFIR2 forwards based on that BitString. It has p2 and p13 populated.
Only p2 is in the BitString, and it has an adjacency towards BFR3. BFIR2 Only p2 is in the BitString, and it has an adjacency towards BFR3. BFIR2
resets p2 in the BitString and sends a copy towards BFR3. resets p2 in the BitString and sends a copy towards BFR3.
skipping to change at page 12, line 37 skipping to change at page 12, line 42
instructs BFER1 to create a copy, decapsulate it from the BIER header instructs BFER1 to create a copy, decapsulate it from the BIER header
and pass it on to the NextProtocol, in this example IP multicast. IP and pass it on to the NextProtocol, in this example IP multicast. IP
multicast will then forward the packet out to LAN2 because it did multicast will then forward the packet out to LAN2 because it did
receive PIM or IGMP joins on LAN2 for the traffic. receive PIM or IGMP joins on LAN2 for the traffic.
Further processing of the packet in BFR4, BFR5 and BFER2 proceeds accordingly. Further processing of the packet in BFR4, BFR5 and BFER2 proceeds accordingly.
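The walk-through above can be traced with a small simulation. The following Python sketch is illustrative only and is not part of either draft revision. It models just the three BIFTs listed above (BFIR2, BFR3, BFER1), represents a BitString as a set of BitPosition numbers, and assumes the basic rule that a BFR resets every BitPosition present in its own BIFT before it sends copies. The names BIFT, forward and trace are hypothetical. BFR4 and later hops are not modelled because their BIFTs are not listed.

```python
# Hypothetical model of the three BIFTs shown in the example.
# Each entry maps a BitPosition to (adjacency type, neighbor).
BIFT = {
    "BFIR2": {13: ("local_decap", None),
              2: ("forward_connected", "BFR3")},
    "BFR3": {1: ("forward_connected", "BFIR2"),
             7: ("forward_connected", "BFER1"),
             8: ("forward_connected", "BFR4")},
    "BFER1": {11: ("local_decap", None),
              6: ("forward_connected", "BFR3"),
              8: ("forward_connected", "BFR4")},
}


def forward(node, bitstring, trace):
    """Process a packet at `node` and append each action to `trace`."""
    table = BIFT.get(node)
    if table is None:
        # BIFT not listed in the example, so stop tracing here.
        trace.append((node, "not modelled", tuple(sorted(bitstring))))
        return
    local_bits = set(table)
    hits = sorted(bitstring & local_bits)
    # Assumption: all locally populated BitPositions are reset first,
    # so every copy carries the same remaining BitString.
    remaining = frozenset(bitstring - local_bits)
    for bp in hits:
        kind, neighbor = table[bp]
        if kind == "local_decap":
            trace.append((node, "local_decap", ()))
        else:
            trace.append((node, "copy to " + neighbor,
                          tuple(sorted(remaining))))
            forward(neighbor, remaining, trace)


if __name__ == "__main__":
    steps = []
    forward("BFIR2", frozenset({2, 5, 7, 8, 10, 11, 12}), steps)
    for step in steps:
        print(step)
```

In this model BFER1 also has p8 populated, but p8 was already reset by BFR3, so BFER1 does not send a second copy towards BFR4.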
3.5. Forwarding comparison with BIER 3.5. Forwarding comparison with BIER
Forwarding of BIER-TE is designed to allow common forwarding hardware Forwarding of BIER-TE is designed to allow common forwarding hardware
with BIER. Like BIER, the core of BIER-TE forwarding are BIFTs with with BIER. In fact, one of the main goals of this document is to
bitstring size number of entries: One for each bit of the bitstring encourage the building of forwarding hardware that can not only
in the processed packet (consider that 256 is the most common size). support BIER, but also BIER-TE - to allow experimentation with BIER-
TE and support building of BIER-TE control plane code.
When a packet is received, the BIFT to process needs to be selected. The pseudocode in Section 6 shows how existing BIER/BIFT forwarding
This is based on SI and subdomain like in BIER. How SI and subdomain can be amended to support basic BIER-TE forwarding, by using BIER
are indicated is subject to the BIER-TE encapsulation, but not BIER-TE BIFT's F-BM. Only the masking of bits to avoid duplicates must
itself. It is expected that the mechanisms for encapsulation will be be skipped when forwarding is for BIER-TE.
very similar if not the same to BIER, but this is subject to followup
work.
There are some key differences between the BIFT in BIER and BIER-TE: Whether to use BIER or BIER-TE forwarding can simply be a configured
choice per subdomain and accordingly be set up by a BIER-TE
controller host. The BIER packet encapsulation [RFC8296] too can be
reused without changes except that the currently defined BIER-TE ECMP
adjacency does not leverage the entropy field so that field would be
unused when BIER-TE forwarding is used.
In BIER-TE, each entry in the BIFT can have a list of 0 or more 3.6. Requirements
adjacencies. A separate copy of the packet is made for each
adjacency. In BIER, each BIFT entry has at most one adjacency (BFR-
NBR). In BIER, different bits can not be processed independently
directly: Only one packet copy is to be sent for all bits in the
packet with the same adjacency, which is why the forwarding procedure
specifies how to sequentially identify those bits and avoid
duplication. In BIER-TE there are no mutual dependencies between bit
adjacencies, so all bits of a BIER-TE bitstring could be processed
independently in parallel.
In BIER the BIFT has adjacencies for all BFR-ids assigned to BFER and Basic BIER-TE forwarding MUST support to configure Subdomains to use
reachable in the IGP. In BIER-TE the BIFT only has adjacencies for basic BIER-TE forwarding rules (instead of BIER). With basic BIER-TE
bits that are adjacent hops - intermediate or BFER. In forwarding, forwarding, every bit MUST support to have zero or one adjacency. It
this can be treated via the same lookup logic except that in BIER-TE MUST support the adjacency types forward_connected without DNR flag,
there is no step modifying the original packet and the packet copy forward_routed and local_decap. All other BIER-TE forwarding
bitstring with the FBM. Instead, all the bits locally processed are features are optional. These basic BIER-TE requirements make BIER-TE
reset in the original packet before looking up bits in the BIFT forwarding exactly the same as BIER forwarding with the exception of
(~MyBitsOfInterest). Only for an adjacency with the "DNR" (Do Not skipping the aforementioned F-BM masking on egress.
Reset) bit set would the bit in the bitstring not be set again as
part of processing of the adjacency.
In summary, implementations of BIER forwarding that are to be BIER-TE forwarding SHOULD support the DNR flag, as this is highly
extended to also support BIER-TE forwarding primarily need to useful to save bits in rings (see Section 4.6).
consider how they can ensure that individual bit lookups can result
in a sequence of more than one copy to be made (as opposed to one in BIER-TE forwarding MAY support more than one adjacency on a bit and
BIER), and they need to see that they can accordingly reset bits in ECMP adjacencies. The importance of ECMP adjacencies is unclear when
the bitstring differently for BIER (per-packet) vs. BIER-TE (per- traffic engineering is used because it may be more desirable to
packet-copy). explicitly steer traffic across non-ECMP paths to make per-path
traffic calculation easier for controllers. Having more than one
adjacency for a bit allows further savings of bits in hub&spoke
scenarios, but unlike rings it is less "natural" to flood traffic
across multiple links unconditionally. Both ECMP and multiple
adjacencies are forwarding plane features that should be possible to
support later when needed as they do not impact the basic BIER-TE
replication loop. This is true because there is no inter-copy
dependency through resetting of F-BM as in BIER.
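The difference between the two replication loops can be sketched in a few lines. The following Python sketch is illustrative only and is not taken from either draft revision. It models a BitString as an integer bitmask and a BIFT as a mapping from a 1-based bit index to a hypothetical (F-BM, neighbor) pair. The function name forward_bitmask and the example tables are assumptions, and so is the way the BIER-TE F-BM is populated here (all bits set except the locally populated ones).

```python
def forward_bitmask(bitstring, bift, te):
    """Return the (neighbor, copy BitString) pairs a BFR would send.

    bift maps a 1-based bit index to (f_bm, neighbor).
    te=False: BIER behaviour, bits covered by F-BM are cleared in the
              original packet after each copy to avoid duplicates.
    te=True:  basic BIER-TE behaviour, that clearing step is skipped.
    """
    copies = []
    index = 1
    while bitstring >> (index - 1):
        if bitstring & (1 << (index - 1)) and index in bift:
            f_bm, neighbor = bift[index]
            copies.append((neighbor, bitstring & f_bm))
            if not te:
                bitstring &= ~f_bm  # inter-copy dependency in BIER
        index += 1
    return copies


# BIER: bits 1 and 2 are BFERs behind neighbor A, bit 3 is behind B.
BIER_BIFT = {1: (0b011, "A"), 2: (0b011, "A"), 3: (0b100, "B")}

# BIER-TE (assumed F-BM population): bits 1 and 2 are local adjacencies
# to X and Y, and each F-BM keeps only the non-local bit 3.
TE_BIFT = {1: (0b100, "X"), 2: (0b100, "Y")}

if __name__ == "__main__":
    print(forward_bitmask(0b111, BIER_BIFT, te=False))
    print(forward_bitmask(0b111, TE_BIFT, te=True))
    # Applying the BIER clearing step to the BIER-TE table strips
    # bit 3 from the second copy, which is why it must be skipped.
    print(forward_bitmask(0b111, TE_BIFT, te=False))
```

Because the BIER-TE variant never modifies the original BitString inside the loop, each copy depends only on its own BIFT entry, so the iterations could run in any order.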
4. BIER-TE Controller Host BitPosition Assignments 4. BIER-TE Controller Host BitPosition Assignments
This section describes how the BIER-TE controller host can use the This section describes how the BIER-TE controller host can use the
different BIER-TE adjacency types to define the BitPositions of a different BIER-TE adjacency types to define the BitPositions of a
BIER-TE domain. BIER-TE domain.
Because the size of the BitString is limiting the size of the BIER-TE Because the size of the BitString is limiting the size of the BIER-TE
domain, many of the options described exist to support larger domain, many of the options described exist to support larger
topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7, topologies with fewer BitPositions (4.1, 4.3, 4.4, 4.5, 4.6, 4.7,
skipping to change at page 14, line 40 skipping to change at page 14, line 46
unique BitPosition. The adjacency of this BitPosition is a unique BitPosition. The adjacency of this BitPosition is a
forward_connected adjacency towards the BFR and this BitPosition is forward_connected adjacency towards the BFR and this BitPosition is
populated into the BIFT of all the other BFRs on that LAN. populated into the BIFT of all the other BFRs on that LAN.
BFR1 BFR1
|p1 |p1
LAN1-+-+---+-----+ LAN1-+-+---+-----+
p3| p4| p2| p3| p4| p2|
BFR3 BFR4 BFR7 BFR3 BFR4 BFR7
Figure 6: LAN Example
If bandwidth on the LAN is not an issue and most BIER-TE traffic If bandwidth on the LAN is not an issue and most BIER-TE traffic
should be copied to all neighbors on a LAN, then BitPositions can be should be copied to all neighbors on a LAN, then BitPositions can be
saved by assigning just a single BitPosition to the LAN and saved by assigning just a single BitPosition to the LAN and
populating the BitPosition of the BIFTs of each BFR on the LAN with populating the BitPosition of the BIFTs of each BFR on the LAN with
a list of forward_connected adjacencies to all other neighbors on the a list of forward_connected adjacencies to all other neighbors on the
LAN. LAN.
This optimization does not work in the face of BFRs redundantly This optimization does not work in the face of BFRs redundantly
connected to more than one LAN with this optimization because these connected to more than one LAN with this optimization because these
BFRs would receive duplicates and forward those duplicates into the BFRs would receive duplicates and forward those duplicates into the
skipping to change at page 15, line 44 skipping to change at page 16, line 15
v v v v
| | | |
L1 | L2 | L3 L1 | L2 | L3
/-------- BFRa ---- BFRb --------------------\ /-------- BFRa ---- BFRb --------------------\
| | | |
\- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/ \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/
| | L4 | | | | L4 | |
p33| p15| p33| p15|
BFRd BFRc BFRd BFRc
Figure 7: Ring Example
Note that this example only permits packets to enter the ring at Note that this example only permits packets to enter the ring at
BFRa and BFRb, and that packets will always travel clockwise. If BFRa and BFRb, and that packets will always travel clockwise. If
packets should be allowed to enter the ring at any ring BFR, then one packets should be allowed to enter the ring at any ring BFR, then one
would have to use two ring BitPositions. One for clockwise, one for would have to use two ring BitPositions. One for clockwise, one for
counterclockwise. counterclockwise.
Both would be set up to stop rotating on the same link, eg: L1. When Both would be set up to stop rotating on the same link, eg: L1. When
the ingress ring BFR creates the clockwise copy, it will reset the the ingress ring BFR creates the clockwise copy, it will reset the
counterclockwise BitPosition because the DNR bit only applies to the counterclockwise BitPosition because the DNR bit only applies to the
bit for which the replication is done. Likewise for the clockwise bit for which the replication is done. Likewise for the clockwise
skipping to change at page 16, line 15 skipping to change at page 17, line 4
BitPosition for the counterclockwise copy. In result, the ring BitPosition for the counterclockwise copy. In result, the ring
ingress BFR will send a copy in both directions, serving BFRs on ingress BFR will send a copy in both directions, serving BFRs on
either side of the ring up to L1. either side of the ring up to L1.
4.7. Equal Cost MultiPath (ECMP) 4.7. Equal Cost MultiPath (ECMP)
The ECMP adjacency allows to use just one BP per link bundle between The ECMP adjacency allows to use just one BP per link bundle between
two BFRs instead of one BP for each p2p member link of that link two BFRs instead of one BP for each p2p member link of that link
bundle. In the following picture, one BP is used across L1,L2,L3 and bundle. In the following picture, one BP is used across L1,L2,L3 and
BFR1/BFR2 have for the BP BFR1/BFR2 have for the BP
--L1----- --L1-----
BFR1 --L2----- BFR2 BFR1 --L2----- BFR2
--L3----- --L3-----
BIFT entry in BFR1: BIFT entry in BFR1:
------------------------------------------------------------------ ------------------------------------------------------------------
| Index | Adjacencies | | Index | Adjacencies |
================================================================== ==================================================================
| 0:6 | ECMP({L1-to-BFR2,L2-to-BFR2,L3-to-BFR2}, seed) | | 0:6 | ECMP({L1-to-BFR2,L2-to-BFR2,L3-to-BFR2}, seed) |
------------------------------------------------------------------ ------------------------------------------------------------------
BIFT entry in BFR2: BIFT entry in BFR2:
------------------------------------------------------------------ ------------------------------------------------------------------
| Index | Adjacencies | | Index | Adjacencies |
================================================================== ==================================================================
| 0:6 | ECMP({L1-to-BFR1,L2-to-BFR1,L3-to-BFR1}, seed) | | 0:6 | ECMP({L1-to-BFR1,L2-to-BFR1,L3-to-BFR1}, seed) |
------------------------------------------------------------------ ------------------------------------------------------------------
Figure 8: ECMP Example
In the following example, all traffic from BFR1 towards BFR10 is In the following example, all traffic from BFR1 towards BFR10 is
intended to be ECMP load split equally across the topology. This intended to be ECMP load split equally across the topology. This
example is not meant as a likely setup, but to illustrate that ECMP example is not meant as a likely setup, but to illustrate that ECMP
can be used to share BPs not only across link bundles, and it can be used to share BPs not only across link bundles, and it
explains the use of the seed parameter. explains the use of the seed parameter.
BFR1 BFR1
/ \ / \
/L11 \L12 /L11 \L12
BFR2 BFR3 BFR2 BFR3
skipping to change at page 17, line 34 skipping to change at page 18, line 34
BIFT entry in BFR2: BIFT entry in BFR2:
------------------------------------------------------------------ ------------------------------------------------------------------
| 0:6 | ECMP({L21-to-BFR4,L22-to-BFR5}, seed) | | 0:6 | ECMP({L21-to-BFR4,L22-to-BFR5}, seed) |
------------------------------------------------------------------ ------------------------------------------------------------------
BIFT entry in BFR3: BIFT entry in BFR3:
------------------------------------------------------------------ ------------------------------------------------------------------
| 0:6 | ECMP({L31-to-BFR6,L32-to-BFR7}, seed) | | 0:6 | ECMP({L31-to-BFR6,L32-to-BFR7}, seed) |
------------------------------------------------------------------ ------------------------------------------------------------------
Figure 9: Polarization Example
With the setup of ECMP in above topology, traffic would not be With the setup of ECMP in above topology, traffic would not be
equally load-split. Instead, links L22 and L31 would see no traffic equally load-split. Instead, links L22 and L31 would see no traffic
at all: BFR2 will only see traffic from BFR1 for which the ECMP hash at all: BFR2 will only see traffic from BFR1 for which the ECMP hash
in BFR1 selected the first adjacency in a list of 2 adjacencies: link in BFR1 selected the first adjacency in a list of 2 adjacencies: link
L11-to-BFR2. When forwarding in BFR2 performs again an ECMP with two L11-to-BFR2. When forwarding in BFR2 performs again an ECMP with two
adjacencies on that subset of traffic, then it will again select the adjacencies on that subset of traffic, then it will again select the
first of its two adjacencies to it: L21-to-BFR4. And therefore L22 first of its two adjacencies to it: L21-to-BFR4. And therefore L22
and BFR5 see no traffic. and BFR5 see no traffic.
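The polarization effect described above can be reproduced with a short simulation. The following Python sketch is illustrative only and is not part of either draft revision. The hash function (SHA-256 over the seed and a flow number) and the names ECMP, pick and link_load are assumptions, since the actual ECMP hash is not specified in this excerpt. The sketch only relies on the property that the same seed and the same flow always select the same position in an adjacency list.

```python
import hashlib
from collections import Counter

# ECMP adjacency lists of the first two tiers of the example topology.
ECMP = {
    "BFR1": ["L11-to-BFR2", "L12-to-BFR3"],
    "BFR2": ["L21-to-BFR4", "L22-to-BFR5"],
    "BFR3": ["L31-to-BFR6", "L32-to-BFR7"],
}


def pick(flow, seed, adjacencies):
    """Select one adjacency from a hypothetical hash of seed and flow."""
    digest = hashlib.sha256(f"{seed}:{flow}".encode()).digest()
    return adjacencies[int.from_bytes(digest[:4], "big") % len(adjacencies)]


def link_load(seeds, flows=1000):
    """Count how many flows cross each link for the given per-BFR seeds."""
    load = Counter()
    for flow in range(flows):
        first = pick(flow, seeds["BFR1"], ECMP["BFR1"])
        load[first] += 1
        next_hop = first.split("-to-")[1]
        load[pick(flow, seeds[next_hop], ECMP[next_hop])] += 1
    return load


if __name__ == "__main__":
    # Same seed everywhere: the second tier repeats BFR1's choice.
    print(link_load({"BFR1": 7, "BFR2": 7, "BFR3": 7}))
    # Distinct seeds (one possible remedy, assumed here).
    print(link_load({"BFR1": 7, "BFR2": 8, "BFR3": 9}))
```

With identical seeds, every flow that BFR1 sends on its first adjacency also takes the first adjacency in BFR2, and likewise for the second adjacency in BFR3, so L22 and L31 stay idle in this model.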
To resolve this issue, the ECMP adjacency on BFR1 simply needs to be To resolve this issue, the ECMP adjacency on BFR1 simply needs to be
skipping to change at page 18, line 19 skipping to change at page 19, line 21
Routed adjacencies can reduce the number of BitPositions required Routed adjacencies can reduce the number of BitPositions required
when the traffic engineering requirement is not hop-by-hop explicit when the traffic engineering requirement is not hop-by-hop explicit
path selection, but loose-hop selection. path selection, but loose-hop selection.
............... ............... ............... ...............
BFR1--... Redundant ...--L1-- BFR2... Redundant ...--- BFR1--... Redundant ...--L1-- BFR2... Redundant ...---
\--... Network ...--L2--/ ... Network ...--- \--... Network ...--L2--/ ... Network ...---
BFR4--... Segment 1 ...--L3-- BFR3... Segment 2 ...--- BFR4--... Segment 1 ...--L3-- BFR3... Segment 2 ...---
............... ............... ............... ...............
Figure 10: Routed Adjacencies Example
Assume the requirement in above network is to explicitly engineer Assume the requirement in above network is to explicitly engineer
paths such that specific traffic flows are passed from segment 1 to paths such that specific traffic flows are passed from segment 1 to
segment 2 via link L1 (or via L2 or via L3). segment 2 via link L1 (or via L2 or via L3).
To achieve this, BFR1 and BFR4 are set up with a forward_routed To achieve this, BFR1 and BFR4 are set up with a forward_routed
adjacency BitPosition towards an address of BFR2 on link L1 (or link adjacency BitPosition towards an address of BFR2 on link L1 (or link
L2 BFR3 via L3). L2 BFR3 via L3).
For paths to be engineered through a specific node BFR2 (or BFR3), For paths to be engineered through a specific node BFR2 (or BFR3),
BFR1 and BFR4 are set up with a forward_routed adjacency BFR1 and BFR4 are set up with a forward_routed adjacency
skipping to change at page 19, line 30 skipping to change at page 20, line 35
loops, these can be inhibited by link layer addressing in loops, these can be inhibited by link layer addressing in
forward_connected adjacencies. forward_connected adjacencies.
If interface or loopback addresses used in forward_routed adjacencies If interface or loopback addresses used in forward_routed adjacencies
are moved from one BFR to another, duplicates can equally happen. are moved from one BFR to another, duplicates can equally happen.
Such re-addressing operations must be coordinated with the Such re-addressing operations must be coordinated with the
controller. controller.
6. BIER-TE Forwarding Pseudocode 6. BIER-TE Forwarding Pseudocode
draft-ietf-bier-te-arch-00 text (replaced in -01):

The following sections of Pseudocode are meant to illustrate the
BIER-TE forwarding plane. This code is not meant to be normative but
to serve both as a potentially easier to read and more precise
representation of the forwarding functionality and to illustrate how
simple BIER-TE forwarding is and that it can be efficiently be
implemented.
The following procedure is executed on a BFR whenever the BIFT is
changed by the BIER-TE controller host:

  global MyBitsOfInterest
  void BIFTChanged()
  {
    for (Index = 0; Index++ ; Index <= BitStringLength)
      if(BIFT[Index] != <empty>)
        MyBitsOfInterest != 2<<(Index-1)
  }

The following procedure is executed whenever a BIER-TE packet is to
be forwarded:

  void ForwardBierTePacket (Packet)
  {
    // We calculate in BitMask the subset of BPs of the BitString
    // for which we have adjacencies. This is purely an
    // optimization to avoid to replicate for every BP
    // set in BitString only to discover that for most of them,
    // the BIFT has no adjacency.
    local BitMask = Packet->BitString
    Packet->BitString &= ~MyBitsOfInterest
    BitMask &= MyBitsOfInterest
    // Replication
    for (Index = GetFirstBitPosition(BitMask); Index ;
         Index = GetNextBitPosition(BitMask, Index))
      foreach adjacency BIFT[Index]
        if(adjacency == ECMP(ListOfAdjacencies, seed) )
          I = ECMP_hash(sizeof(ListOfAdjacencies),
                        Packet->Entropy, seed)
          adjacency = ListOfAdjacencies[I]
        PacketCopy = Copy(Packet)
        switch(adjacency)
          case forward_connected(interface,neighbor,DNR):
            if(DNR)
              PacketCopy->BitString |= 2<<(Index-1)
            SendToL2Unicast(PacketCopy,interface,neighbor)
          case forward_routed([VRF],neighbor):
            SendToL3(PacketCopy,[VRF,]l3-neighbor)
          case local_decap([VRF],neighbor):
            DecapBierHeader(PacketCopy)
            PassTo(PacketCopy,[VRF,]Packet->NextProto)
  }

draft-ietf-bier-te-arch-01 text:

The following simplified pseudocode for BIER-TE forwarding uses the
BIER forwarding pseudocode of [RFC8279], section 6.5, with the one
modification necessary to support basic BIER-TE forwarding. Like the
BIER forwarding pseudocode, for simplicity it hides the details
of the adjacency processing inside PacketSend(), which can be
forward_connected, forward_routed or local_decap.

  void ForwardBitMaskPacket_withTE (Packet)
  {
    SI=GetPacketSI(Packet);
    Offset=SI*BitStringLength;
    for (Index = GetFirstBitPosition(Packet->BitString); Index ;
         Index = GetNextBitPosition(Packet->BitString, Index)) {
      F-BM = BIFT[Index+Offset]->F-BM;
      if (!F-BM) continue;
      BFR-NBR = BIFT[Index+Offset]->BFR-NBR;
      PacketCopy = Copy(Packet);
      PacketCopy->BitString &= F-BM;                  [2]
      PacketSend(PacketCopy, BFR-NBR);
      // The following must not be done for BIER-TE:
      // Packet->BitString &= ~F-BM;                  [1]
    }
  }

Figure 11: Simplified BIER-TE Forwarding Pseudocode

The difference is that in BIER-TE, step [1] must not be performed.

In BIER, this step is necessary to avoid duplicates when two or more
BFER are reachable via the same neighbor. The F-BM of all those BFER
bits will indicate each other's bits, and step [1] will reset all
these bits on the first copy made for the first of those BFER bits
set in the BitString, hence skipping any further copies to that
neighbor.

Whereas in BIER, the F-BM of bits toward a specific neighbor contains
only the bits of those BFER destined to be forwarded across this
neighbor, in BIER-TE the F-BM for a neighbor needs to have all bits
set except all those bits that are actual (non-empty) adjacencies of
this BFR. Step [2] will reset those adjacency bits to avoid loops,
but all the other bits that are not adjacencies of this BFR need to
stay untouched by [2] so that they can be processed by further BFR
along the path. If [1] were performed as in BIER, then those
non-adjacency bits would erroneously get reset during replication.

To support the DNR (Do Not Reset) flag of forward_connected
adjacencies, the F-BM of such an adjacency must also have the
adjacency's own bit set, so that the bit stays on in the packet copy
made for this adjacency. The bit is not set in the F-BM of other
bits, so it is reset in any other packet copy made.

Eliminating the need to perform [1] also makes processing of bits in
the BIER-TE bitstring independent of processing other bits, which may
also simplify forwarding plane implementations.

The following pseudocode is comprehensive:

o This pseudocode eliminates per-bit F-BM, therefore reducing state
by BitStringLength^2*SI and eliminating the need for a per-packet-copy
masking operation except for adjacencies with the DNR flag set:
* AdjacentBits[SI] are bits with a non-empty list of adjacencies.
This can be computed whenever the BIER-TE controller host
updates the adjacencies.
* Only the AdjacentBits need to be examined in the loop for
packet copies.
* The AdjacentBits are cleared from the packet's BitString on
ingress to avoid packet loops.
o The code loops over the adjacencies because there may be more than
one adjacency for a bit.
o When an adjacency has the DNR flag, the bit is set in the packet
copy (to save bits in rings for example).
o The ECMP adjacency is shown. Its parameters are a
ListOfAdjacencies, from which one is picked, and a seed.
o The forward_connected, forward_routed and local_decap adjacencies
are shown with their parameters.
  void ForwardBitMaskPacket_withTE (Packet)
  {
    SI=GetPacketSI(Packet);
    Offset=SI*BitStringLength;
    // Bits of the packet for which this BFR has adjacencies
    AdjacentBitstring = Packet->BitString & AdjacentBits[SI];
    // Reset those bits before copying, to avoid loops
    Packet->BitString &= ~AdjacentBits[SI];
    for (Index = GetFirstBitPosition(AdjacentBitstring); Index ;
         Index = GetNextBitPosition(AdjacentBitstring, Index)) {
      foreach adjacency BIFT[Index+Offset] {
        if(adjacency == ECMP(ListOfAdjacencies, seed) ) {
          I = ECMP_hash(sizeof(ListOfAdjacencies),
                        Packet->Entropy, seed);
          adjacency = ListOfAdjacencies[I];
        }
        PacketCopy = Copy(Packet);
        switch(adjacency) {
          case forward_connected(interface,neighbor,DNR):
            if(DNR)
              // BitPositions start at 1, so bit Index is 1<<(Index-1)
              PacketCopy->BitString |= 1<<(Index-1);
            SendToL2Unicast(PacketCopy,interface,neighbor);
            break;
          case forward_routed([VRF],neighbor):
            SendToL3(PacketCopy,[VRF,]neighbor);
            break;
          case local_decap([VRF],neighbor):
            DecapBierHeader(PacketCopy);
            PassTo(PacketCopy,[VRF,]Packet->NextProto);
            break;
        }
      }
    }
  }
Figure 12: BIER-TE Forwarding Pseudocode
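The forwarding steps of Figure 12 can also be sketched in runnable form. The Python below is a non-normative illustration under several assumptions: it handles a single SI (no Offset), it returns the packet copies instead of sending them, and the names Adjacency, Ecmp, adjacent_bits and forward, as well as the CRC32-based ECMP hash, are hypothetical choices that do not appear in the draft.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Adjacency:
    kind: str          # "forward_connected", "forward_routed" or "local_decap"
    target: str = ""   # illustrative neighbor label
    dnr: bool = False  # Do Not Reset flag, meaningful for forward_connected


@dataclass(frozen=True)
class Ecmp:
    choices: Tuple[Adjacency, ...]
    seed: int = 0


Bift = Dict[int, List[Union[Adjacency, Ecmp]]]


def bit(index: int) -> int:
    """Mask for a 1-based BitPosition."""
    return 1 << (index - 1)


def adjacent_bits(bift: Bift) -> int:
    """Bits with a non-empty adjacency list; recomputed when the BIFT changes."""
    mask = 0
    for index, adjacencies in bift.items():
        if adjacencies:
            mask |= bit(index)
    return mask


def forward(bitstring: int, entropy: int, bift: Bift) -> List[Tuple[Adjacency, int]]:
    """Return (adjacency, BitString of the copy) for every packet copy made."""
    adjacent = adjacent_bits(bift)
    to_process = bitstring & adjacent  # bits this BFR acts on
    remaining = bitstring & ~adjacent  # adjacency bits reset to avoid loops
    copies = []
    index = 1
    while to_process >> (index - 1):
        if to_process & bit(index):
            for entry in bift[index]:
                if isinstance(entry, Ecmp):
                    # Assumed hash: any deterministic function of seed and entropy works.
                    key = f"{entry.seed}:{entropy}".encode()
                    entry = entry.choices[zlib.crc32(key) % len(entry.choices)]
                copy_bits = remaining
                if entry.kind == "forward_connected" and entry.dnr:
                    copy_bits |= bit(index)  # DNR keeps this bit in the copy
                copies.append((entry, copy_bits))
        index += 1
    return copies


# Example BIFT: bit 2 carries the DNR flag, bit 5 is not an adjacency here.
example_bift: Bift = {
    1: [Adjacency("forward_connected", "BFR2")],
    2: [Adjacency("forward_connected", "BFR3", dnr=True)],
    3: [Adjacency("local_decap")],
}
example_copies = forward(0b10111, entropy=7, bift=example_bift)
```

In this example, bit 5 survives in every copy because it is not an adjacency of this BFR, and bit 2 is kept only in the copy sent over the DNR adjacency.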
7. Managing SI, subdomains and BFR-ids 7. Managing SI, subdomains and BFR-ids
When the number of bits required to represent the necessary hops in When the number of bits required to represent the necessary hops in
the topology and BFER exceeds the supported bitstring length, the topology and BFER exceeds the supported bitstring length,
multiple SI and/or subdomains must be used. This section discusses multiple SI and/or subdomains must be used. This section discusses
how. how.
BIER-TE forwarding does not require the concept of BFR-id, but BIER-TE forwarding does not require the concept of BFR-id, but
routing underlay, flow overlay and BIER headers may. This section routing underlay, flow overlay and BIER headers may. This section
skipping to change at page 24, line 40 skipping to change at page 27, line 40
area1 area2 area3 area1 area2 area3
BFR1a BFR1b BFR2a BFR2b BFR3a BFR3b BFR1a BFR1b BFR2a BFR2b BFR3a BFR3b
| \ / \ / | | \ / \ / |
................................ ................................
. Core . . Core .
................................ ................................
| / \ / \ | | / \ / \ |
BFR4a BFR4b BFR5a BFR5b BFR6a BFR6b BFR4a BFR4b BFR5a BFR5b BFR6a BFR6b
area4 area5 area6 area4 area5 area6
Figure 13: Scaling BIER-TE bits by reuse
With random allocation of BFR-id to BFER, each receiving area would With random allocation of BFR-id to BFER, each receiving area would
(most likely) have to receive all 4 copies of the BIER packet because (most likely) have to receive all 4 copies of the BIER packet because
there would be BFR-id for each of the 4 SI in each of the areas. there would be BFR-id for each of the 4 SI in each of the areas.
Only further towards each BFER would this duplication subside - when Only further towards each BFER would this duplication subside - when
each of the 4 trees runs out of branches. each of the 4 trees runs out of branches.
If BFR-id are allocated intelligently, then all the BFER in an area If BFR-id are allocated intelligently, then all the BFER in an area
would be given BFR-id with as few different SI as possible. Each would be given BFR-id with as few different SI as possible. Each
area would only have to forward one or two packets instead of 4. area would only have to forward one or two packets instead of 4.
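The effect of the allocation strategy can be illustrated with a short Python sketch. It assumes the BFR-id to SI mapping of [RFC8279], SI = (BFR-id - 1) / BitStringLength, and uses a deliberately tiny BitStringLength of 4 with made-up BFR-id values; the function names si_of and copies_per_area are hypothetical.

```python
from typing import Dict, List


def si_of(bfr_id: int, bit_string_length: int) -> int:
    """SI that a BFR-id falls into, per the RFC 8279 mapping."""
    return (bfr_id - 1) // bit_string_length


def copies_per_area(area_ids: Dict[str, List[int]],
                    bit_string_length: int) -> Dict[str, int]:
    """Number of distinct SIs, and so packet copies, each area must receive."""
    return {
        area: len({si_of(bfr_id, bit_string_length) for bfr_id in ids})
        for area, ids in area_ids.items()
    }


BSL = 4  # illustrative only; real BitStringLengths are much larger

# Scattered allocation: the BFER of each area land in all 4 SIs.
scattered = copies_per_area({"area1": [1, 5, 9, 13], "area2": [2, 6, 10, 14]}, BSL)

# Area-aware allocation: the BFER of each area share one SI.
grouped = copies_per_area({"area1": [1, 2, 3, 4], "area2": [5, 6, 7, 8]}, BSL)
```

With the scattered allocation each area needs all four copies; with the grouped allocation each area needs only one.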
skipping to change at page 27, line 5 skipping to change at page 30, line 9
in SR, it relies on source-routing - via the definition of a in SR, it relies on source-routing - via the definition of a
BitString. Like SR, it only requires to consider the "hops" on which BitString. Like SR, it only requires to consider the "hops" on which
either replication has to happen, or across which the traffic should either replication has to happen, or across which the traffic should
be steered (even without replication). Any other hops can be skipped be steered (even without replication). Any other hops can be skipped
via the use of routed adjacencies. via the use of routed adjacencies.
Instead of defining BitPositions for non-replicating hops, it is Instead of defining BitPositions for non-replicating hops, it is
equally possible to use segment routing encapsulations (eg: MPLS equally possible to use segment routing encapsulations (eg: MPLS
label stacks) for "forward_routed" adjacencies. label stacks) for "forward_routed" adjacencies.
Note that BIER itself is also similar to SR. It achieves the same as
a "Shortest Path SID", where the label stack uses only one SID to
indicate the egress node of the SR domain. Instead of routing such an
SR packet hop-by-hop based on that SID, BIER routes the packet
hop-by-hop based on the BFER-id bits of the egress nodes of the BIER
domain. What BIER does not allow is to indicate intermediate hops or,
in terms of SR, label stacks with more than one SID in the stack (for
the same SR domain). This is what BIER-TE provides.
9. Security Considerations 9. Security Considerations
The security considerations are the same as for BIER with the The security considerations are the same as for BIER with the
following differences: following differences:
BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures BFR-ids and BFR-prefixes are not used in BIER-TE, nor are procedures
for their distribution, so these are not attack vectors against BIER- for their distribution, so these are not attack vectors against BIER-
TE. TE.
10. IANA Considerations 10. IANA Considerations
skipping to change at page 27, line 27 skipping to change at page 30, line 40
11. Acknowledgements 11. Acknowledgements
The authors would like to thank Greg Shepherd, Ijsbrand Wijnands and The authors would like to thank Greg Shepherd, Ijsbrand Wijnands and
Neale Ranns for their extensive review and suggestions. Neale Ranns for their extensive review and suggestions.
12. Change log [RFC Editor: Please remove] 12. Change log [RFC Editor: Please remove]
draft-ietf-bier-te-arch: draft-ietf-bier-te-arch:
01: Added note comparing BIER and SR to also hopefully clarify
BIER-TE vs. BIER comparison re. SR.
- added requirements section mandating only most basic BIER-TE
forwarding features as MUST.
- reworked comparison with BIER forwarding section to only
summarize and point to pseudocode section.
- reworked pseudocode section to have one pseudocode that mirrors
the BIER forwarding pseudocode to make comparison easier and a
second pseudocode that shows the complete set of BIER-TE
forwarding options and simplification/optimization possible vs.
BIER forwarding.
- Added captions to pictures.
00: Changed target state to experimental (WG conclusion), updated 00: Changed target state to experimental (WG conclusion), updated
references, mod auth association. references, mod auth association.
- Source now on http://www.github.com/toerless/bier-te-arch - Source now on http://www.github.com/toerless/bier-te-arch
- Please open issues on the github for change/improvement requests - Please open issues on the github for change/improvement requests
to the document - in addition to posting them on the list to the document - in addition to posting them on the list
(bier@ietf.). Thanks!. (bier@ietf.). Thanks!.
draft-eckert-bier-te-arch: draft-eckert-bier-te-arch:
 End of changes. 40 change blocks. 
115 lines changed or deleted 219 lines changed or added
