BIER WG                                                         Z. Zhang
Internet-Draft                                                 G. Mirsky
Intended status: Informational                                  Q. Xiong
Expires: October 31, 2021                                ZTE Corporation
                                                                  Y. Liu
                                                            China Mobile
                                                                   H. Li
                                                           China Telecom
                                                          April 29, 2021


 BIER (Bit Index Explicit Replication) Redundant Ingress Router Failover
                  draft-ietf-bier-source-protection-00

Abstract

   This document describes a failover in the Bit Index Explicit
   Replication domain with a redundant ingress router.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 31, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Keywords
   3.  The Redundant BFIR Failover Analysis
     3.1.  Node Failure Monitoring
     3.2.  Monitoring of the Working Path for a Failure
   4.  BFD and Ping
     4.1.  BIER Ping
     4.2.  BIER BFD
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   Bit Index Explicit Replication (BIER) [RFC8279] is an architecture
   that provides multicast forwarding through a BIER domain without
   requiring intermediate routers to maintain any multicast-related
   per-flow state. BIER also does not require any explicit
   tree-building protocol for its operation. A multicast data packet
   enters a BIER domain at a Bit-Forwarding Ingress Router (BFIR) and
   leaves the BIER domain at one or more Bit-Forwarding Egress Routers
   (BFERs).

   Redundant Ingress Router Failover is not specific to the BIER
   environment. To avoid a single point of failure, two or more
   ingress routers (BFIRs in a BIER environment) are connected to the
   same multicast source node. One of the BFIRs is selected to forward
   the flow from the multicast source node to the egress routers,
   i.e., BFERs in a BIER environment. The BFERs may choose the primary
   BFIR for a given multicast flow based on local policies. BFERs in
   the same multicast group may select the same or different BFIRs.
   The BFIR and the path in use are referred to as working, while all
   alternative available BFIRs and paths that can be used to receive
   the same multicast flow are referred to as protection.

   When either the working BFIR or the working path fails, a BFER can
   select one of the protection BFIRs to recover the multicast flow.
   The shorter the detection time, the faster the flow recovers.

   This document discusses the functions that can be used to detect a
   failure and trigger redundant ingress router failover in the BIER
   environment.

2.  Keywords

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  The Redundant BFIR Failover Analysis

   According to the BIER architecture [RFC8279], BIER overlay protocols,
   which among others include MVPN [RFC8556], MLD [I-D.ietf-bier-mld],
   and PIM [I-D.ietf-bier-pim-signaling], are used to exchange the
   multicast flow information. Based on that, a BFER selects the UMH
   (Upstream Multicast Hop) BFIR as the ingress router. The BFIR
   selected as the UMH learns, through a BIER overlay protocol, which
   BFERs have chosen it to receive the particular multicast flow. BIER
   transport is used to deliver the multicast packets to the
   destination BFERs. Detecting a defect in the BIER transport layer
   promptly keeps the interruption of the protected source flow to a
   minimum. The switchover is performed at the BIER overlay layer.
   Upon detecting the failure, an update in the BIER overlay can
   trigger BFIR re-selection by BFERs.

   As described in [I-D.szcl-mboned-redundant-ingress-failover], the
   root standby modes, i.e., Cold Standby, Warm Standby, and Hot
   Standby, can be used in the BIER environment. In Warm and Hot
   Standby modes, the protection BFIR needs to learn through BIER
   overlay protocols the identities of BFERs in the particular multicast
   group.
   In the Hot Standby mode, a BFER receives duplicate flows from the
   selected active BFIR and the protection BFIR. The BFER accepts the
   flow packets from the selected active BFIR, identified, for
   example, by the BFIR-id in the BIER header, and discards the
   multicast packets from the protection BFIR.

   The most important elements in the redundant BFIR failover mechanism
   are failure detection and switchover. Note that the scope of the
   failure detection includes both the node and the working path.
   Similarly, BFIR switching and BFER switching are both considered in
   the switchover scenario.

   The selected BFIR is referred to as Selected BFIR (S-BFIR), and the
   backup BFIR is referred to as Backup BFIR (B-BFIR). For simplicity,
   only one B-BFIR is considered in the following use case.

                  +--------+S1+-------+
                  |                   |
               +--v----+          +---v---+
        +------+ BFIR1 |..........| BFIR2 +-------+
        .      +-------+          +-------+       .
        .                                         .
        .                 ......                  .
        .                                         .
        . +-------+      +-------+      +-------+ .
        +--|BFER1|-......-|BFER2|-......-|BFER3|--+
          +--+----+      +---+---+      +--+----+
             |               |             |
             v               v             v
             R1              R2            R3

         Figure 1: An Example of the Redundant BFIR Failover

   In Figure 1, a multicast source S1 is connected to BFIR1 and BFIR2.
   In some deployments, only BFIR1 advertises S1 flow information to
   BFERs using a BIER overlay protocol, such as, among others,
   BGP (MVPN), MLD, or PIM. In this case, all BFERs that are directed
   to receive the S1 flow will select BFIR1 as the S-BFIR, and BFIR2 is
   considered the B-BFIR. In other deployments, BFIR1 and BFIR2 both
   advertise S1 flows to BFERs using a BIER overlay protocol. As a
   result, some BFERs may select BFIR1 as their S-BFIR while other
   BFERs select BFIR2. BFIR1 and BFIR2 are then each responsible for a
   different subset of BFERs, and each acts as the B-BFIR for the
   other's subset. We do not distinguish these two cases strictly.
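
   The Hot Standby filtering described earlier in this section can be
   sketched as follows. This is a minimal illustration, not part of
   any specification: the BierPacket and HotStandbyBfer names and
   their fields are hypothetical, and the only behavior taken from the
   text is that a BFER accepts packets whose BFIR-id matches its
   S-BFIR and discards the duplicate copy from the B-BFIR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BierPacket:
    """Hypothetical stand-in for a received BIER packet (illustration only)."""
    bfir_id: int      # BFIR-id carried in the BIER header
    payload: bytes


class HotStandbyBfer:
    """Hypothetical BFER model: accept the S-BFIR copy, drop the duplicate."""

    def __init__(self, selected_bfir_id: int, backup_bfir_id: int) -> None:
        self.selected_bfir_id = selected_bfir_id
        self.backup_bfir_id = backup_bfir_id

    def accept(self, packet: BierPacket) -> bool:
        # Only packets whose BFIR-id matches the current S-BFIR are accepted.
        return packet.bfir_id == self.selected_bfir_id

    def switch_over(self) -> None:
        # On S-BFIR failure, the former B-BFIR becomes the accepted source.
        self.selected_bfir_id, self.backup_bfir_id = (
            self.backup_bfir_id, self.selected_bfir_id)


# BFER1 initially selects BFIR1 (BFIR-id 1); BFIR2 (BFIR-id 2) is the backup.
bfer1 = HotStandbyBfer(selected_bfir_id=1, backup_bfir_id=2)
from_bfir1 = BierPacket(bfir_id=1, payload=b"S1 flow")
from_bfir2 = BierPacket(bfir_id=2, payload=b"S1 flow")

before = (bfer1.accept(from_bfir1), bfer1.accept(from_bfir2))
bfer1.switch_over()
after = (bfer1.accept(from_bfir1), bfer1.accept(from_bfir2))
print(before, after)
```

   Because both copies already arrive at the BFER in this mode, the
   switchover in the sketch is a purely local change of the accepted
   BFIR-id and needs no overlay signaling.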

   There are two types of failure monitoring:

   o  Node failure monitoring: It is used for BFIR failure detection.
      The BFER failure detection is out of the scope of this document.

   o  Working path failure monitoring: It is used for BIER transport
      path failure detection. It covers monitoring between BIER domain
      edge routers, i.e., BFIRs and BFERs, through BIER forwarding.

3.1.  Node Failure Monitoring

   For example, consider the case when S1 is connected to BFIR1 and
   BFIR2 on a shared media segment. BFIR1 is acting as S-BFIR for the
   multicast flow transmitted by S1. BFIR2 can monitor BFIR1 for node
   failure using a BFD session [RFC5880] built over the shared media
   segment. BFIR2 can also use ping methods, including, for example,
   IPv4 ping [RFC0792], IPv6 ping [RFC4443], and LSP-Ping [RFC8029] in
   a network with an IPv4, IPv6, or MPLS data plane, respectively.

   In case there is no shared media segment interconnecting S1, BFIR1,
   and BFIR2, BFIR2 may monitor the state of BFIR1 using a BIER BFD
   session [I-D.ietf-bier-bfd] or a ping protocol across the BIER
   domain. A ping protocol listed above or BIER ping
   [I-D.ietf-bier-ping] can be used. In case there is no direct
   connection between BFIR1 and BFIR2, multiple hops will be traversed.
   Similarly, any of the path continuity checking methods listed above
   can be used by a BFER to monitor the path to, and the state of, the
   S-BFIR. The case when the S-BFIR monitors the working path to a
   BFER is considered in more detail later in this document.

   The monitoring case between S-BFIR and B-BFIR, referred to as the
   Warm Standby mode, is described in Section 4.2 of
   [I-D.szcl-mboned-redundant-ingress-failover]. For the Cold and Hot
   Standby modes described in Sections 4.1 and 4.3 of
   [I-D.szcl-mboned-redundant-ingress-failover], the monitoring between
   S-BFIR and B-BFIR may not be necessary.
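
   The choices discussed in this section can be summarized in a short
   sketch. It is an illustration under stated assumptions only: the
   StandbyMode, bfir_monitoring_needed, and candidate_methods names
   are hypothetical, and the function simply restates which methods
   the text lists for each situation rather than defining any
   selection procedure.

```python
from enum import Enum
from typing import List


class StandbyMode(Enum):
    """Root standby modes named in the referenced failover draft."""
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


def bfir_monitoring_needed(mode: StandbyMode) -> bool:
    # Per the text, monitoring between S-BFIR and B-BFIR is the Warm
    # Standby case; for Cold and Hot Standby it may not be necessary.
    return mode is StandbyMode.WARM


def candidate_methods(shared_segment: bool) -> List[str]:
    # Hypothetical helper: lists the detection methods the text mentions.
    pings = ["IPv4 ping", "IPv6 ping", "LSP-Ping"]
    if shared_segment:
        # S1, BFIR1, and BFIR2 share a media segment.
        return ["BFD"] + pings
    # No shared segment: monitoring runs across the BIER domain.
    return ["BIER BFD", "BIER ping"] + pings


print(bfir_monitoring_needed(StandbyMode.WARM), candidate_methods(True))
print(bfir_monitoring_needed(StandbyMode.HOT), candidate_methods(False))
```

   Which ping variant applies in practice depends on the data plane in
   use (IPv4, IPv6, or MPLS), which the sketch does not model.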

   For the monitoring between BFIR and BFERs, the BFIR node failure
   detection can also be combined with working path failure detection.

3.2.  Monitoring of the Working Path for a Failure

                  +--------+S1+-------+
                  |                   |
               +--v----+          +---v---+
        +------+ BFIR1 |..........| BFIR2 +-------+
        |      +-----+-+ <------> +-------+       |
        |            |     bfd                    |
        |         +--v---+        +------+        |
        |         | BFR1 |        | BFR2 |        |
        |       +-+---+--+-       +------+        |
        |       |     |       ......              |
        |       |     +-----+                     |
        |       |           |                     |
        |    +--v---+     +-v----+      +------+  |
        |    | BFRx |     | BFRy |      | BFRz |  |
        |    ++-----+     ++--+--+      +------+  |
        |     |            |  |                   |
        |     |            |  +------------+      |
        |     |            |               |      |
        | +---v---+      +-v-----+      +--v----+ |
        +--|BFER1||......||BFER2||......||BFER3|--+
          +--+----+      +---+---+      +--+----+
             |               |             |
             v               v             v
             R1              R2            R3

      Figure 2: An Example of the Monitoring of the Working Path

   In the case of node failure detection, the path between B-BFIR and
   S-BFIR may not be the same as the path traversed by the data flow.
   For example, in Figure 2, the path from BFIR1 (S-BFIR) to all the
   BFERs is different from the path between BFIR1 and BFIR2 (B-BFIR).
   In Warm Standby mode, if the path between BFIR2 and BFIR1 is broken,
   BFIR2 will detect the failure and interpret it as BFIR1 being down.
   As a result, BFIR2 will take on the role of S-BFIR. But the path
   from BFIR1 to all or some of the BFERs may still be working and
   unaffected by the defect between BFIR1 and BFIR2. In this
   situation, the B-BFIR switches to S-BFIR unnecessarily, which
   causes packet duplication in the network and at BFERs.

   For the failure detection between BIER edge routers, i.e., BFIR and
   BFER, the path over which a test packet is steered from BFIR to BFER
   is the same as the path traversed by the monitored flow. In this
   way, the BFER simultaneously monitors the S-BFIR for node and
   working path failure.
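
   The unnecessary switchover described above can be reproduced with a
   small reachability check. The sketch below is illustrative only:
   the link set is an assumed reading of Figure 2 (a direct
   BFIR1-BFIR2 link carrying the BFD session, and a tree from BFIR1
   through BFR1, BFRx, and BFRy to the BFERs), and the reachable
   helper is a hypothetical name, not a BIER or BFD procedure.

```python
from collections import deque

# Assumed adjacency, loosely modelled on Figure 2 (illustration only).
LINKS = {
    ("BFIR1", "BFIR2"),   # direct path used for B-BFIR to S-BFIR monitoring
    ("BFIR1", "BFR1"),
    ("BFR1", "BFRx"),
    ("BFR1", "BFRy"),
    ("BFRx", "BFER1"),
    ("BFRy", "BFER2"),
    ("BFRy", "BFER3"),
}


def reachable(src, dst, links):
    """Breadth-first search over undirected links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Break only the path between the two BFIRs.
after_failure = LINKS - {("BFIR1", "BFIR2")}

# B-BFIR loses sight of S-BFIR and would wrongly conclude BFIR1 is down...
b_bfir_sees_s_bfir = reachable("BFIR2", "BFIR1", after_failure)
# ...while every BFER is still served by BFIR1 over the working path.
bfers_still_served = all(
    reachable("BFIR1", bfer, after_failure)
    for bfer in ("BFER1", "BFER2", "BFER3"))
print(b_bfir_sees_s_bfir, bfers_still_served)
```

   In this assumed topology the B-BFIR's view and the BFERs' view of
   BFIR1 disagree, which is why monitoring along the path actually
   used by the flow gives a more reliable switchover trigger.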

   There are two options to monitor the working multicast distribution
   tree in BIER:

   o  S-BFIR monitors all the BFERs;

   o  BFER monitors the S-BFIR.

   In the BIER transport environment, the defect detection is based on a
   BIER-specific mechanism, e.g., BIER Ping [I-D.ietf-bier-ping] or
   BIER BFD [I-D.ietf-bier-bfd]. BIER BFD [I-D.ietf-bier-bfd] reduces
   the number of BFD sessions between S-BFIR and each of the BFERs.
   Only one multipoint BFD session will be built among S-BFIR, all the
   BFERs, and B-BFIR. When MVPN is used as the BIER overlay protocol,
   the BFD Discriminator attribute, defined in Section 3.1.6 of
   [I-D.ietf-bess-mvpn-fast-failover], can be used to bootstrap the
   multipoint BFD session between a BFIR and BFERs. In this situation,
   only S-BFIR sends the BFD Discriminator attribute and transmits
   periodic BFD Control messages. BFERs and B-BFIR can monitor S-BFIR,
   but S-BFIR does not monitor the BFERs or B-BFIR.

   Consider the case when S-BFIR monitors the paths to, and the state
   of, all BFERs in the particular multicast group. Once S-BFIR
   detects that a BFER is unreachable, S-BFIR notifies B-BFIR, and the
   latter may start forwarding the multicast packets to that BFER. The
   monitoring can be achieved by a P2P BFD session between S-BFIR and
   each of the BFERs. Alternatively, a P2MP BFD session with active
   tails between S-BFIR and BFERs can be used. This behavior can be
   used for the Warm Standby mode.

   When a BFER monitors S-BFIR, a B-BFIR can also monitor S-BFIR.
   Consider that a BFER or B-BFIR detects the failure of the S-BFIR. In
   the Cold Standby mode, the BFER MUST select B-BFIR as the new S-BFIR
   and signal to B-BFIR using a BIER overlay protocol as soon as
   possible. In the Hot Standby mode, the BFER MUST switch to accept
   and forward the multicast flow received from B-BFIR. In the Warm
   Standby mode, B-BFIR becomes the S-BFIR and begins to forward the
   flow to BFERs.

4.  BFD and Ping

   BFD and Ping can be used in failure detection, but there are
   differences between them. A network administrator can select the
   appropriate mechanism according to the actual network.

4.1.  BIER Ping

   [I-D.ietf-bier-ping] describes the mechanism and basic BIER
   Operation, Administration, and Maintenance packet format that can be
   used to perform failure detection and isolation on the BIER data
   plane without any dependency on other layers like the IP layer.

   In the example of Figure 1, a BFER can monitor the status of the
   BFIR and of the path between the BFER and the BFIR. BFER1 sends
   BIER Ping packets to BFIR1. Suppose BFER1 does not receive several
   consecutive responses from BFIR1 within an expected period (which
   may be a multiple of the average round-trip time). In that case,
   BFER1 concludes that BFIR1 is a failed UMH and selects BFIR2 as the
   UMH. In the Cold Standby mode, BFER1 signals to BFIR2 to start
   receiving the multicast flow. In the Hot Standby mode, BFER1 begins
   to accept the flow from BFIR2. If B-BFIR monitors S-BFIR in the
   Warm Standby mode and detects the failure, B-BFIR takes the role of
   S-BFIR and begins to forward the flow.

   In this example, BFER1, BFER2, BFER3, and B-BFIR send the BIER ping
   packets to BFIR1 separately. The timeout period MAY be set to
   different values depending on the local performance requirement on
   each BFER. In the Warm Standby mode, if the timeout periods on a
   BFER and on B-BFIR differ and the period on B-BFIR is longer than
   that on the BFER, multicast packets could be lost.

   In the general case of a more complex BIER topology, it cannot be
   guaranteed that the path used from BFIR1 to BFER1 is the same as in
   the reverse direction, i.e., from BFER1 to BFIR1. If that is not
   guaranteed and the paths are not co-routed, then this method may
   produce false results, both false negative and false positive.
   The former is when ping fails while the multicast path and flow are
   OK. The latter is when the multicast path has a defect, but ping
   works. Thus, to improve the consistency of this method of detecting
   a failure in multicast flow transport, the path that the echo
   request from BFER1 traverses to BFIR1 must be co-routed with the
   path that the monitored multicast flow traverses through the BIER
   domain from BFIR1 to BFER1.

4.2.  BIER BFD

   [I-D.ietf-bier-bfd] describes the application of P2MP BFD in a BIER
   network. It also describes the procedures for using this mode of
   the BFD protocol to verify multipoint or multicast connectivity
   between a sender (BFIR) and one or more receivers (BFERs and a
   redundant BFIR).

   In the same example, BFIR1 sends the BIER Echo request packet to
   BFERs to bootstrap a P2MP BFD session. After BFER1, BFER2, and
   BFER3 receive the Echo request packet with the BFD Discriminator
   and the Target SI-Bitstring TLVs, the BFERs create a BFD session of
   type MultipointTail [RFC8562] to monitor the status of BFIR1 and
   the working path. If BFER1 has not received a BFD packet from BFIR1
   for the Detection Time [RFC8562], BFER1 will treat BFIR1 as a
   failed UMH. In the Cold Standby mode, BFER1 re-selects the UMH and
   then signals to BFIR2. As a result, BFIR2 begins to forward the
   multicast flow. In the Hot Standby mode, BFER1 switches to accept
   the flow from BFIR2. B-BFIR (BFIR2) monitors S-BFIR (BFIR1) in the
   Warm Standby mode, using the same P2MP BFD session. After B-BFIR
   detects the failure, it takes on the role of S-BFIR and begins to
   forward the multicast flow to BFERs.

5.  IANA Considerations

   This document does not have any requests for IANA allocation. This
   section can be deleted before the publication of the draft.

6.  Security Considerations

   Security considerations discussed in [RFC8279], [RFC8562],
   [I-D.ietf-bier-ping], [I-D.ietf-bess-mvpn-fast-failover], and
   [I-D.ietf-bier-bfd] apply to this document.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

7.2.  Informative References

   [I-D.ietf-bess-mvpn-fast-failover]
              Morin, T., Kebler, R., and G. Mirsky, "Multicast VPN Fast
              Upstream Failover", draft-ietf-bess-mvpn-fast-failover-15
              (work in progress), January 2021.

   [I-D.ietf-bier-bfd]
              Xiong, Q., Mirsky, G., Hu, F., and C. Liu, "BIER BFD",
              draft-ietf-bier-bfd-01 (work in progress), April 2021.

   [I-D.ietf-bier-mld]
              Pfister, P., Wijnands, I., Venaas, S., Wang, C., Zhang,
              Z., and M. Stenberg, "BIER Ingress Multicast Flow Overlay
              using Multicast Listener Discovery Protocols",
              draft-ietf-bier-mld-05 (work in progress), February 2021.

   [I-D.ietf-bier-pim-signaling]
              Bidgoli, H., Xu, F., Kotalwar, J., Wijnands, I., Mishra,
              M., and Z. Zhang, "PIM Signaling Through BIER Core",
              draft-ietf-bier-pim-signaling-11 (work in progress),
              November 2020.

   [I-D.ietf-bier-ping]
              Nainar, N., Pignataro, C., Akiya, N., Zheng, L., Chen, M.,
              and G. Mirsky, "BIER Ping and Trace",
              draft-ietf-bier-ping-07 (work in progress), May 2020.

   [I-D.szcl-mboned-redundant-ingress-failover]
              Shepherd, G., Zhang, Z., Liu, Y., and Y. Cheng, "Multicast
              Redundant Ingress Router Failover",
              draft-szcl-mboned-redundant-ingress-failover-00 (work in
              progress), October 2020.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017.

   [RFC8279]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Przygienda, T., and S. Aldrin, "Multicast Using Bit Index
              Explicit Replication (BIER)", RFC 8279,
              DOI 10.17487/RFC8279, November 2017.

   [RFC8556]  Rosen, E., Ed., Sivakumar, M., Przygienda, T., Aldrin, S.,
              and A. Dolganow, "Multicast VPN Using Bit Index Explicit
              Replication (BIER)", RFC 8556, DOI 10.17487/RFC8556,
              April 2019.

   [RFC8562]  Katz, D., Ward, D., Pallagatti, S., Ed., and G. Mirsky,
              Ed., "Bidirectional Forwarding Detection (BFD) for
              Multipoint Networks", RFC 8562, DOI 10.17487/RFC8562,
              April 2019.

Authors' Addresses

   Zheng Zhang
   ZTE Corporation

   Email: zhang.zheng@zte.com.cn


   Greg Mirsky
   ZTE Corporation

   Email: gregory.mirsky@ztetx.com


   Quan Xiong
   ZTE Corporation

   Email: xiong.quan@zte.com.cn


   Yisong Liu
   China Mobile

   Email: liuyisong@chinamobile.com


   Huanan Li
   China Telecom

   Email: lihn6@chinatelecom.cn