| < draft-wijnands-bier-architecture-03.txt | draft-wijnands-bier-architecture-04.txt > | |||
|---|---|---|---|---|
| Internet Engineering Task Force IJ. Wijnands, Ed. | Internet Engineering Task Force IJ. Wijnands, Ed. | |||
| Internet-Draft Cisco Systems, Inc. | Internet-Draft Cisco Systems, Inc. | |||
| Intended status: Standards Track E. Rosen, Ed. | Intended status: Standards Track E. Rosen, Ed. | |||
| Expires: July 31, 2015 Juniper Networks, Inc. | Expires: August 6, 2015 Juniper Networks, Inc. | |||
| A. Dolganow | A. Dolganow | |||
| Alcatel-Lucent | Alcatel-Lucent | |||
| T. Przygienda | T. Przygienda | |||
| Ericsson | Ericsson | |||
| S. Aldrin | S. Aldrin | |||
| Huawei Technologies | Huawei Technologies | |||
| January 27, 2015 | February 2, 2015 | |||
| Multicast using Bit Index Explicit Replication | Multicast using Bit Index Explicit Replication | |||
| draft-wijnands-bier-architecture-03 | draft-wijnands-bier-architecture-04 | |||
| Abstract | Abstract | |||
| This document specifies a new architecture for the forwarding of | This document specifies a new architecture for the forwarding of | |||
| multicast data packets. It provides optimal forwarding of multicast | multicast data packets. It provides optimal forwarding of multicast | |||
| packets through a "multicast domain". However, it does not require | packets through a "multicast domain". However, it does not require | |||
| any explicit tree-building protocol, nor does it require intermediate | any explicit tree-building protocol, nor does it require intermediate | |||
| nodes to maintain any per-flow state. This architecture is known as | nodes to maintain any per-flow state. This architecture is known as | |||
| "Bit Index Explicit Replication" (BIER). When a multicast data | "Bit Index Explicit Replication" (BIER). When a multicast data | |||
| packet enters the domain, the ingress router determines the set of | packet enters the domain, the ingress router determines the set of | |||
| skipping to change at page 2, line 4 ¶ | skipping to change at page 2, line 4 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on July 31, 2015. | This Internet-Draft will expire on August 6, 2015. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2015 IETF Trust and the persons identified as the | Copyright (c) 2015 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 43 ¶ | skipping to change at page 2, line 43 ¶ | |||
| 6.2. BFR Neighbors . . . . . . . . . . . . . . . . . . . . . . 13 | 6.2. BFR Neighbors . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 6.3. The Bit Index Routing Table . . . . . . . . . . . . . . . 14 | 6.3. The Bit Index Routing Table . . . . . . . . . . . . . . . 14 | |||
| 6.4. The Bit Index Forwarding Table . . . . . . . . . . . . . 14 | 6.4. The Bit Index Forwarding Table . . . . . . . . . . . . . 14 | |||
| 6.5. The BIER Forwarding Procedure . . . . . . . . . . . . . . 15 | 6.5. The BIER Forwarding Procedure . . . . . . . . . . . . . . 15 | |||
| 6.6. Examples of BIER Forwarding . . . . . . . . . . . . . . . 17 | 6.6. Examples of BIER Forwarding . . . . . . . . . . . . . . . 17 | |||
| 6.6.1. Example 1 . . . . . . . . . . . . . . . . . . . . . . 18 | 6.6.1. Example 1 . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 6.6.2. Example 2 . . . . . . . . . . . . . . . . . . . . . . 18 | 6.6.2. Example 2 . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 6.7. Equal Cost Multi-path Forwarding . . . . . . . . . . . . 20 | 6.7. Equal Cost Multi-path Forwarding . . . . . . . . . . . . 20 | |||
| 6.7.1. Non-deterministic ECMP . . . . . . . . . . . . . . . 21 | 6.7.1. Non-deterministic ECMP . . . . . . . . . . . . . . . 21 | |||
| 6.7.2. Deterministic ECMP . . . . . . . . . . . . . . . . . 22 | 6.7.2. Deterministic ECMP . . . . . . . . . . . . . . . . . 22 | |||
| 6.8. Prevention of Loops and Duplicates . . . . . . . . . . . 23 | 6.8. Prevention of Loops and Duplicates . . . . . . . . . . . 24 | |||
| 6.9. When Some Nodes do not Support BIER . . . . . . . . . . . 24 | 6.9. When Some Nodes do not Support BIER . . . . . . . . . . . 24 | |||
| 6.10. Use of Different BitStringLengths within a Domain . . . . 25 | 6.10. Use of Different BitStringLengths within a Domain . . . . 26 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 26 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 26 | |||
| 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 27 | |||
| 10. Contributor Addresses . . . . . . . . . . . . . . . . . . . . 27 | 10. Contributor Addresses . . . . . . . . . . . . . . . . . . . . 27 | |||
| 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
| 11.1. Normative References . . . . . . . . . . . . . . . . . . 28 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 28 | |||
| 11.2. Informative References . . . . . . . . . . . . . . . . . 28 | 11.2. Informative References . . . . . . . . . . . . . . . . . 29 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
| 1. Introduction | 1. Introduction | |||
| This document specifies a new architecture for the forwarding of | This document specifies a new architecture for the forwarding of | |||
| multicast data packets. It provides optimal forwarding of multicast | multicast data packets. It provides optimal forwarding of multicast | |||
| data packets through a "multicast domain". However, it does not | data packets through a "multicast domain". However, it does not | |||
| require any explicit tree-building protocol, and does not require | require any explicit tree-building protocol, and does not require | |||
| intermediate nodes to maintain any per-flow state. This architecture | intermediate nodes to maintain any per-flow state. This architecture | |||
| is known as "Bit Index Explicit Replication" (BIER). | is known as "Bit Index Explicit Replication" (BIER). | |||
| skipping to change at page 6, line 17 ¶ | skipping to change at page 6, line 17 ¶ | |||
| domains of that BIER domain. | domains of that BIER domain. | |||
| A "BFR Identifier" (BFR-id) is a number in the range [1,65535]. In | A "BFR Identifier" (BFR-id) is a number in the range [1,65535]. In | |||
| general, each BFR in a given BIER sub-domain must be assigned a | general, each BFR in a given BIER sub-domain must be assigned a | |||
| unique number from this range (i.e., two BFRs in the same BIER sub- | unique number from this range (i.e., two BFRs in the same BIER sub- | |||
| domain MUST NOT have the same BFR-id in that sub-domain). However, | domain MUST NOT have the same BFR-id in that sub-domain). However, | |||
| if it is known that a given BFR will never need to function as a BFER | if it is known that a given BFR will never need to function as a BFER | |||
| in a given sub-domain, then it is not necessary to assign a BFR-id | in a given sub-domain, then it is not necessary to assign a BFR-id | |||
| for that sub-domain to that BFR. | for that sub-domain to that BFR. | |||
| Note that the value 0 is not a legal BFR-id. | ||||
| The procedure for assigning a particular BFR-id to a particular BFR | The procedure for assigning a particular BFR-id to a particular BFR | |||
| is outside the scope of this document. However, it is RECOMMENDED | is outside the scope of this document. However, it is RECOMMENDED | |||
| that the BFR-ids for each sub-domain be assigned "densely" from the | that the BFR-ids for each sub-domain be assigned "densely" from the | |||
| numbering space, as this will result in a more efficient encoding | numbering space, as this will result in a more efficient encoding | |||
| (see Section 3). That is, if there are 256 or fewer BFERs, it is | (see Section 3). That is, if there are 256 or fewer BFERs, it is | |||
| RECOMMENDED to assign all the BFR-ids from the range [1,256]. If | RECOMMENDED to assign all the BFR-ids from the range [1,256]. If | |||
| there are more than 256 BFERs, but less than 512, it is RECOMMENDED | there are more than 256 BFERs, but less than 512, it is RECOMMENDED | |||
| to assign all the BFR-ids from the range [1,512], with as few "holes" | to assign all the BFR-ids from the range [1,512], with as few "holes" | |||
| as possible in the earlier range. However, in some deployments, it | as possible in the earlier range. However, in some deployments, it | |||
| may be advantageous to depart from this recommendation; this is | may be advantageous to depart from this recommendation; this is | |||
| skipping to change at page 12, line 5 ¶ | skipping to change at page 12, line 5 ¶ | |||
| domain, has not yet advertised its ownership of BFR-id N for that | domain, has not yet advertised its ownership of BFR-id N for that | |||
| sub-domain, but has received an advertisement from a different BFR | sub-domain, but has received an advertisement from a different BFR | |||
| (say BFR-B) that is advertising ownership of BFR-id N for the same | (say BFR-B) that is advertising ownership of BFR-id N for the same | |||
| sub-domain, then BFR-A SHOULD log an error, and MUST NOT advertise | sub-domain, then BFR-A SHOULD log an error, and MUST NOT advertise | |||
| its own ownership of BFR-id N for that sub-domain as long as the | its own ownership of BFR-id N for that sub-domain as long as the | |||
| advertisement from BFR-B is extant. | advertisement from BFR-B is extant. | |||
| This procedure may prevent the accidental misconfiguration of a | This procedure may prevent the accidental misconfiguration of a | |||
| new BFR from impacting an existing BFR. | new BFR from impacting an existing BFR. | |||
| If a BFR advertises that it has a BFR-id of 0 in a particular sub- | ||||
| domain, other BFRs receiving the advertisement MUST interpret that | ||||
| advertisement as meaning that the advertising BFR does not have a | ||||
| BFR-id in that sub-domain. | ||||
| 6. BIER Intra-Domain Forwarding Procedures | 6. BIER Intra-Domain Forwarding Procedures | |||
| This section specifies the rules for forwarding a BIER-encapsulated | This section specifies the rules for forwarding a BIER-encapsulated | |||
| data packet within a BIER domain. | data packet within a BIER domain. | |||
| 6.1. Overview | 6.1. Overview | |||
| This section provides a brief overview of the BIER forwarding | This section provides a brief overview of the BIER forwarding | |||
| procedures. Subsequent sub-sections specify the procedures in more | procedures. Subsequent sub-sections specify the procedures in more | |||
| detail. | detail. | |||
| skipping to change at page 22, line 20 ¶ | skipping to change at page 22, line 20 ¶ | |||
| Note however that by the rules of Section 6.5, any packet destined | Note however that by the rules of Section 6.5, any packet destined | |||
| for both BFER-D and BFER-F will be sent via BFR-C. | for both BFER-D and BFER-F will be sent via BFR-C. | |||
| 6.7.2. Deterministic ECMP | 6.7.2. Deterministic ECMP | |||
| With the procedures of Section 6.7.1, where ECMP paths exist, the | With the procedures of Section 6.7.1, where ECMP paths exist, the | |||
| path a packet takes to reach any particular BFER depends not only on | path a packet takes to reach any particular BFER depends not only on | |||
| routing and on the packet's entropy, but also on the set of other | routing and on the packet's entropy, but also on the set of other | |||
| BFERs to which the packet is destined. | BFERs to which the packet is destined. | |||
| For example, consider the network in Figure 6. Suppose that there is | For example, consider the following scenario in the network of | |||
| a sequence of packets being transmitted by BFR A, some of which are | Figure 6. | |||
| destined for D and F, and some of which are destined for E and F. | ||||
| And suppose that all the packets in this sequence have the same | o There is a sequence of packets being transmitted by BFR-A, some of | |||
| entropy value. Using the forwarding procedures of Section 6.7.1, the | which are destined for both D and F, and some of which are | |||
| packets destined for both D and F would follow the path A-B-C-F, | destined only for F. | |||
| while the packets destined for both E and F would follow the path | ||||
| A-B-E-F. | o All the packets in this sequence have the same entropy value, call | |||
| it "Q". | ||||
| o At BFR-B, when a packet with entropy value Q is forwarded via | ||||
| entry 2 in the BIFT, the packet is sent to E. | ||||
| Using the forwarding procedure of Section 6.7.1, packets of this | ||||
| sequence that are destined for both D and F are forwarded according | ||||
| to entry 1 in the BIFT, and thus will reach F via the path A-B-C-F. | ||||
| However, packets of this sequence that are destined only for F are | ||||
| forwarded according to entry 2 in the BIFT, and thus will reach F via | ||||
| the path A-B-E-F. | ||||
| That procedure minimizes the number of packets transmitted by BFR B. | That procedure minimizes the number of packets transmitted by BFR B. | |||
| However, consider a particular multicast flow that initially needs to | However, consider the following scenario: | |||
| be received ONLY by BFER-F. Let's suppose that the packets of that | ||||
| flow have an entropy value that causes B to forward them along the | ||||
| path B-C-F. Now suppose that E needs to start receiving the flow. | ||||
| By the procedures of Section 6.7.1, B will now switch the packets to | ||||
| the path B-E-F. When E no longer needs to receive the flow, B will | ||||
| switch the packets back to the path B-C-F. | ||||
| The problem is that if E repeatedly joins and leaves the flow, the | o Beginning at time t0, the multicast flow in question needs to be | |||
| received ONLY by BFER-F. | ||||
| o Beginning at a later time, t1, the flow needs to be received by | ||||
| both BFER-D and BFER-F. | ||||
| o Beginning at a later time, t2, the flow no longer needs to be received | ||||
| by D, but still needs to be received by F. | ||||
| Then from t0 until t1, the flow will travel to F via the path | ||||
| A-B-E-F. From t1 until t2, the flow will travel to F via the path | ||||
| A-B-C-F. And from t2, the flow will again travel to F via the path | ||||
| A-B-E-F. | ||||
| The problem is that if D repeatedly joins and leaves the flow, the | ||||
| flow's path from B to F will keep switching. This could cause F to | flow's path from B to F will keep switching. This could cause F to | |||
| receive packets out of order. It also makes troubleshooting | receive packets out of order. It also makes troubleshooting | |||
| difficult. For example, if there is some problem on the C-F link, | difficult. For example, if there is some problem on the E-F link, | |||
| receivers at F will get good service when the flow is also going to E | receivers at F will get good service when the flow is also going to D | |||
| (avoiding the C-F link), but bad service when the flow is not going | (avoiding the E-F link), but bad service when the flow is not going | |||
| to E. Since it is hard to know which path is being used at any given | to D. Since it is hard to know which path is being used at any given | |||
| time, this may be hard to troubleshoot. Also, it is very difficult | time, this may be hard to troubleshoot. Also, it is very difficult | |||
| to perform a traceroute that is known to follow the path taken by the | to perform a traceroute that is known to follow the path taken by the | |||
| flow at any given time. | flow at any given time. | |||
| The source of this difficulty is that, in the procedures of | The source of this difficulty is that, in the procedures of | |||
| Section 6.7.1, the path taken by a particular flow to a particular | Section 6.7.1, the path taken by a particular flow to a particular | |||
| BFER depends upon whether there are "lower numbered" BFERs that are | BFER depends upon whether there are lower numbered BFERs that are | |||
| also receiving the flow. Thus the choice among the ECMP paths is | also receiving the flow. Thus the choice among the ECMP paths is | |||
| fundamentally non-deterministic. | fundamentally non-deterministic. | |||
| Deterministic forwarding can be achieved by using multiple BIFTs, | Deterministic forwarding can be achieved by using multiple BIFTs, | |||
| such that each row in a BIFT has only one path to each destination, | such that each row in a BIFT has only one path to each destination, | |||
| but the multiple ECMP paths to any particular destination are spread | but the multiple ECMP paths to any particular destination are spread | |||
| across the multiple tables. When a BIER-encapsulated packet arrives | across the multiple tables. When a BIER-encapsulated packet arrives | |||
| to be forwarded, the BFR uses a hash of the BIER Entropy field to | to be forwarded, the BFR uses a hash of the BIER Entropy field to | |||
| determine which BIFT to use, and then the normal BIER forwarding | determine which BIFT to use, and then the normal BIER forwarding | |||
| algorithm (as described in Sections 6.5 and 6.6) is used with the | algorithm (as described in Sections 6.5 and 6.6) is used with the | |||
| selected BIFT. | selected BIFT. | |||
| ECMP is achieved by having a particular path appear in multiple | As an example, suppose there are two paths to destination X (call | |||
| tables. For example, suppose there are two paths to destination X | them X1 and X2), and four paths to destination Y (call them Y1, Y2, | |||
| (call them X1 and X2), and four paths to destination Y (call them Y1, | Y3, and Y4). If there are, say, four BIFTs, one BIFT would have | |||
| Y2, Y3, and Y4). If there are, say, four BIFTs, one BIFT would have | ||||
| paths X1 and Y1, one would have X1 and Y2, one would have X2 and Y3, | paths X1 and Y1, one would have X1 and Y2, one would have X2 and Y3, | |||
| and one would have X2 and Y4. Note that if there are three paths to | and one would have X2 and Y4. If traffic to X is split evenly among | |||
| one destination and four paths to another, 12 BIFTs would be required | these four BIFTs, the traffic will be split evenly between the two | |||
| in order to get even splitting of the load. | paths to X; if traffic to Y is split evenly among these four BIFTs, | |||
| the traffic will be split evenly between the four paths to Y. | ||||
| Note that if there are three paths to one destination and four paths | ||||
| to another, 12 BIFTs would be required in order to get even splitting | ||||
| of the load to each of those two destinations. Of course, each BIFT | ||||
| uses some memory, and one might be willing to have less optimal | ||||
| splitting in order to have fewer BIFTs. How that tradeoff is made is | ||||
| an implementation or deployment decision. | ||||
| 6.8. Prevention of Loops and Duplicates | 6.8. Prevention of Loops and Duplicates | |||
| The BitString in a BIER-encapsulated packet specifies the set of | The BitString in a BIER-encapsulated packet specifies the set of | |||
| BFERs to which that packet is to be forwarded. When a BIER- | BFERs to which that packet is to be forwarded. When a BIER- | |||
| encapsulated packet is replicated, no two copies of the packet will | encapsulated packet is replicated, no two copies of the packet will | |||
| ever have a BFER in common. If one of the packet's BFERs forwards | ever have a BFER in common. If one of the packet's BFERs forwards | |||
| the packet further, that will first clear the bit that identifies | the packet further, that will first clear the bit that identifies | |||
| itself. As a result, duplicate delivery of packets is not possible | itself. As a result, duplicate delivery of packets is not possible | |||
| with BIER. | with BIER. | |||
| skipping to change at page 28, line 4 ¶ | skipping to change at page 28, line 26 ¶ | |||
| Email: Martin.Horneffer@telekom.de | Email: Martin.Horneffer@telekom.de | |||
| Uwe Joorde | Uwe Joorde | |||
| Deutsche Telekom | Deutsche Telekom | |||
| Hammer Str. 216-226 | Hammer Str. 216-226 | |||
| Muenster D-48153 | Muenster D-48153 | |||
| DE | DE | |||
| Email: Uwe.Joorde@telekom.de | Email: Uwe.Joorde@telekom.de | |||
| Luay Jalil | ||||
| Verizon | ||||
| 1201 E Arapaho Rd. | ||||
| Richardson, TX 75081 | ||||
| US | ||||
| Email: luay.jalil@verizon.com | ||||
| Jeff Tantsura | Jeff Tantsura | |||
| Ericsson | Ericsson | |||
| 300 Holger Way | 300 Holger Way | |||
| San Jose, CA 95134 | San Jose, CA 95134 | |||
| US | US | |||
| Email: jeff.tantsura@ericsson.com | Email: jeff.tantsura@ericsson.com | |||
| 11. References | 11. References | |||
| End of changes. 18 change blocks. | ||||
| 36 lines changed or deleted | 78 lines changed or added | |||
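
The multiple-BIFT scheme described in Section 6.7.2 can be illustrated with a short sketch. This is not part of either draft version, only one possible reading of the text. The function names (bifts_needed, build_bifts, select_bift), the dictionary representation of a BIFT, and the use of CRC32 over a 4-byte encoding of the entropy value are all assumptions made for illustration; the draft says only that "a hash of the BIER Entropy field" selects the BIFT. The sketch also assumes that the table count needed for an even split is the least common multiple of the per-destination path counts, which agrees with the draft's example of 12 BIFTs for three and four paths.

```python
from functools import reduce
from math import gcd
import zlib


def bifts_needed(path_counts):
    """Smallest number of BIFTs that splits load evenly for every destination.

    Assumption: this is the least common multiple of the path counts,
    which matches the draft's example (3 and 4 paths -> 12 BIFTs).
    """
    return reduce(lambda a, b: a * b // gcd(a, b), path_counts, 1)


def build_bifts(paths_by_dest, num_bifts):
    """Build num_bifts tables, each holding exactly one path per destination.

    Each destination's paths are laid out in contiguous blocks across the
    tables, so that with 4 tables and 2 paths the layout is X1, X1, X2, X2.
    The split is even only when num_bifts is a multiple of the path count.
    """
    tables = []
    for i in range(num_bifts):
        table = {}
        for dest, paths in paths_by_dest.items():
            table[dest] = paths[i * len(paths) // num_bifts]
        tables.append(table)
    return tables


def select_bift(tables, entropy):
    """Pick a BIFT from the packet's entropy value.

    CRC32 is a stand-in hash (hypothetical choice); the entropy value is
    assumed to be a non-negative integer that fits in 4 bytes. The same
    entropy always selects the same table, regardless of which other
    BFERs the packet is destined for.
    """
    index = zlib.crc32(entropy.to_bytes(4, "big")) % len(tables)
    return tables[index]


# The example from the text: two paths to X, four paths to Y, four BIFTs.
example_paths = {"X": ["X1", "X2"], "Y": ["Y1", "Y2", "Y3", "Y4"]}
example_bifts = build_bifts(example_paths, 4)
```

With these inputs, example_bifts holds the pairs (X1, Y1), (X1, Y2), (X2, Y3), and (X2, Y4), as in the draft's example. Because select_bift depends only on the entropy value, a flow keeps the same path to a given BFER when other BFERs join or leave, which is the property the non-deterministic procedure of Section 6.7.1 lacks.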