idnits 2.17.1

draft-ietf-mboned-redundant-ingress-failover-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (7 April 2022) is 750 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-05) exists of
     draft-ietf-bier-bfd-01

  == Outdated reference: A later version (-13) exists of
     draft-ietf-bier-ping-07

  == Outdated reference: A later version (-08) exists of
     draft-ietf-pim-sr-p2mp-policy-04

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

MBONED WG                                                    G. Shepherd
Internet-Draft                                       Cisco Systems, Inc.
Intended status: Informational                             Z. Zhang, Ed.
Expires: 9 October 2022                                  ZTE Corporation
                                                                  Y. Liu
                                                            China Mobile
                                                                Y. Cheng
                                                            China Unicom
                                                            7 April 2022

               Multicast Redundant Ingress Router Failover
             draft-ietf-mboned-redundant-ingress-failover-00

Abstract

   This document discusses redundant ingress router failover in a
   multicast domain.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 October 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Multicast Redundant Ingress Router Failover
     3.1.  Switchover
     3.2.  Failure detection
   4.  Stand-by Modes
     4.1.  Cold
     4.2.  Warm
     4.3.  Hot
     4.4.  Summary
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   Failover between redundant ingress routers is an important issue in
   multicast deployment.  This document examines it within a multicast
   domain.  A multicast domain is a domain that forwards multicast
   flows using a specific multicast technology, such as PIM
   ([RFC7761]), BIER ([RFC8279]), P2MP TE tunnels ([RFC4875]), or MLDP
   ([RFC6388]).  The domain may or may not connect to the multicast
   source and receivers directly.

   The ingress router is close to the multicast source.  It may connect
   to the multicast source directly, or there may be multiple hops
   between the ingress router and the multicast source.  Within the
   multicast domain, the ingress router is the router nearest to the
   multicast source.  It is also called the first-hop router in PIM,
   the BFIR in BIER, or the ingress LSR in P2MP TE tunnels or MLDP.

   Failover between the multicast source and the ingress router can be
   achieved in many ways and is out of scope for this document.

   The egress router is close to the multicast receiver.  It may
   connect to the multicast receiver directly, or there may be multiple
   hops between the egress router and the multicast receiver.  Within
   the multicast domain, the egress router is the router nearest to the
   multicast receiver.
   It is also called the last-hop router in PIM, the BFER in BIER, or
   the egress LSR in P2MP TE tunnels or MLDP.

   Other mechanisms may also be deployed in the multicast domain, such
   as static configuration, AMT ([RFC7450]), or SR P2MP Policy
   ([I-D.ietf-pim-sr-p2mp-policy]).

   This document does not discuss the details of these technologies.
   It discusses general approaches to redundant ingress router failover
   in the multicast domain.

2.  Terminology

   The following abbreviations are used in this document:

   IR:  Ingress router.  The router closest to the multicast source in
      the multicast domain.

   ER:  Egress router.  The router closest to the multicast receiver in
      the multicast domain.

   SIR:  Selected-IR.  The IR that is in charge of sending the
      multicast flow, or the IR whose flow is accepted by the ERs.

   BIR:  Backup-IR.  An IR that is not in charge of sending the
      multicast flow, or whose flow is not accepted by the ERs, but
      that takes over the role of SIR once the SIR fails.

3.  Multicast Redundant Ingress Router Failover

                       source
                        ...
                +-----+      +-----+
     +----------+ IR1 +------+ IR2 +---------+
     |multicast +-----+      +-----+         |
     |domain            ...                  |
     |                                       |
     |          +-----+      +-----+         |
     |          | Rm  |      | Rn  |         |
     |          ++---++      +--+--+         |
     |           |   |          |            |
     |     +-----+   +---+      +-----+      |
     |     |             |            |      |
     |   +-v---+      +--v--+      +--v--+   |
     +---+ ER1 +------+ ER2 +------+ ER3 +---+
         +-----+      +-----+      +-----+
           ...          ...          ...
         receiver     receiver     receiver

                     Figure 1

   Usually, a multicast source connects to two IRs, either directly or
   across multiple hops, to avoid a single point of failure.  As shown
   in Figure 1, there are two IRs close to the multicast source.  The
   two IRs are UMH (Upstream Multicast Hop) candidates for the ERs.
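
   The roles defined in Section 2 can be pictured with a minimal data
   model.  The sketch below is only an illustration: the class name,
   its fields, and the "prefer the first candidate" policy are
   assumptions made for this example and are not defined by this
   document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EgressRouter:
    """An ER together with the IRs it could receive the flow from."""
    name: str
    umh_candidates: List[str]      # IRs that are UMH candidates for this ER
    sir: Optional[str] = None      # the Selected-IR, once one is chosen

    def select_sir(self) -> str:
        # Hypothetical policy: prefer the first candidate in the list.
        self.sir = self.umh_candidates[0]
        return self.sir

    def birs(self) -> List[str]:
        # Every other candidate is a Backup-IR from this ER's point of view.
        return [ir for ir in self.umh_candidates if ir != self.sir]


# Figure 1: IR1 and IR2 are UMH candidates for ER1, ER2 and ER3.
ers = [EgressRouter(name, ["IR1", "IR2"]) for name in ("ER1", "ER2", "ER3")]
for er in ers:
    er.select_sir()

# Map each ER to its (SIR, list of BIRs) pair.
roles = {er.name: (er.sir, er.birs()) for er in ers}
```

   Here every ER ends up with IR1 as its SIR and IR2 as its BIR.  A
   real deployment could use any selection policy, and different ERs
   could select different SIRs.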

   The two IRs get the multicast flow from the multicast source.  How
   the flow is forwarded to the ERs depends on the technology deployed
   in the multicast domain.  For example, when PIM is used in the
   domain, two PIM trees, each rooted at one of the two IRs, may be
   built separately.

   The IRs work with the other routers in the multicast domain, such as
   the ERs, to minimize multicast packet loss during an IR switchover.

3.1.  Switchover

   Failures such as link or node failures may occur in the domain.  If
   the failed link or node is on the multicast forwarding path,
   multicast packets may be lost.

   If there are multiple paths from the IR to the ERs, there is no need
   to switch IRs when some nodes or links fail:

   *  When PIM is used in the domain as the multicast forwarding
      protocol, the forwarding tree for (S, G) or (*, G) is built in
      advance.  When a node or link in the forwarding tree fails, the
      tree is partially rebuilt.

   *  When BIER is used in the domain as the multicast forwarding
      protocol, there is no need to rebuild a forwarding tree after a
      node or link failure.  BIER forwarding recovers as IGP routing
      converges.

   *  When P2MP TE tunnels or MLDP are used in the domain as the
      multicast forwarding protocol, the forwarding LSP is built in
      advance.  When a node or link in the LSP fails, the LSP may be
      partially rebuilt.

   *  When a static multicast tree or SR P2MP policy is used in the
      domain, the controller needs to compute a new forwarding path
      that bypasses the failed node or link.

   In some situations, the network contains key nodes or links whose
   failure leaves no way to recover the multicast path.  In that case,
   an IR switchover is needed.

                       source
                        ...
                +-----+      +-----+
     +----------+ IR1 +------+ IR2 +---------+
     |          +--+--+      +--+--+         |
     |             |            |            |
     |          +--+--+      +--+--+         |
     |          | Rx  |      | Ry  |         |
     |          +-+-+-+      ++---++         |
     |            | |         |   |          |
     |            | +-----------+ |          |
     |            |           | | |          |
     |            | +---------+ | |          |
     |            | |           | |          |
     |          +-v-v-+      +--v-v+         |
     |          | Rm  |      | Rn  |         |
     |          ++---++      +--+--+         |
     |           |   |          |            |
     |     +-----+   +---+      +-----+      |
     |     |             |            |      |
     |   +-v---+      +--v--+      +--v--+   |
     +---+ ER1 +------+ ER2 +------+ ER3 +---+
         +-----+      +-----+      +-----+
           ...          ...          ...
         receiver     receiver     receiver

                     Figure 2

   For example, in Figure 2, part of the network has only a single
   path.  IR1 and Rx are key nodes in the domain: when IR1 or Rx fails,
   no other path exists between IR1 and the ERs.

   *  When PIM is used in the domain, Rm and Rn may choose Ry as the
      upstream node and send Join messages to it to build a new tree
      rooted at IR2.

   *  When BIER is used in the domain, IR2 should take over the
      forwarding role and forward the flow to the ERs.

   *  When P2MP TE tunnels or MLDP are used in the domain, an LSP
      starting from IR2 can be built to replace the LSP starting from
      IR1 when that LSP stops working.

   *  When a static multicast tree or SR P2MP policy is used in the
      domain, the controller should direct IR2 to forward the multicast
      flow to the ERs.

3.2.  Failure detection

   To achieve a successful IR switchover, some method is needed to
   monitor for IR node failure or path failure between the IR and the
   ERs, so that the IR can be switched once a failure occurs.  BFD or
   ping methods can be used for this.

   BFD [RFC5880] can be used in all deployments.  Multipoint BFD
   [RFC8562] can also be used for failure detection between the IR and
   the ERs.  BFD for MPLS LSPs [RFC5884] can be used in P2MP TE tunnel
   or MLDP deployments.  BIER BFD [I-D.ietf-bier-bfd] can be used in
   BIER deployments.
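
   As a rough illustration of how BFD monitoring turns into a failure
   signal, the sketch below applies the asynchronous-mode Detection
   Time rule of [RFC5880]: the Detect Mult received from the remote
   system multiplied by the agreed transmit interval, which is the
   greater of the local Required Min RX Interval and the remote Desired
   Min TX Interval.  The function names and the timer values are
   assumptions chosen for this example, and the sketch covers only the
   timer arithmetic, not a BFD session.

```python
def detection_time_ms(remote_detect_mult: int,
                      local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int) -> int:
    """Asynchronous-mode BFD Detection Time, in milliseconds."""
    # The remote system sends no faster than the slower of the two intervals.
    agreed_tx_interval_ms = max(local_required_min_rx_ms,
                                remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_tx_interval_ms


def path_is_down(last_packet_rx_ms: int, now_ms: int, detect_ms: int) -> bool:
    """Declare a failure once a full Detection Time passes without a packet."""
    return now_ms - last_packet_rx_ms >= detect_ms


# Hypothetical timers: Detect Mult 3, local RX 50 ms, remote TX 100 ms.
detect_ms = detection_time_ms(3, 50, 100)

still_up = path_is_down(last_packet_rx_ms=1000, now_ms=1200, detect_ms=detect_ms)
failed = path_is_down(last_packet_rx_ms=1000, now_ms=1300, detect_ms=detect_ms)
```

   With these assumed timers the Detection Time is 300 ms, so this is
   also the minimum time before any switchover can be triggered by the
   monitoring session.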

   IPv4 ping [RFC0792] and IPv6 ping [RFC4443] can also be used in all
   deployments.  LSP Ping [RFC8029] can be used in P2MP TE tunnel or
   MLDP deployments.  BIER Ping [I-D.ietf-bier-ping] can be used in
   BIER deployments.

   The BIR and the ERs can readily detect SIR node and path failures
   using the BFD and ping methods.  If the monitoring runs between the
   SIR and the ER, the challenge is how to trigger the switchover
   quickly when the BIR needs to start forwarding the multicast flow.
   If the monitoring runs between the BIR and the SIR, the path between
   them may fail even though it is not on the path from the SIR to the
   ERs.  The BIR may then trigger a switchover by mistake, which causes
   unnecessary duplicate flow.  In this case, the ER must support
   selective receiving so that it can tolerate the mistaken IR
   switchover.  To minimize mistaken switchovers, the reliability of
   SIR/BIR detection needs to be improved, for example by using
   redundant reliable paths for detection.

4.  Stand-by Modes

   When more than one IR can serve as the UMH, and a failure leaves no
   other path from the current IR to the ERs, the IR needs to be
   switched.

   Usually there are three types of stand-by modes in multicast IR
   protection.  [RFC9026] gives some description of them.  This
   document discusses the three modes in detail.

   The ER may send a request to an upstream router or to the IR when it
   detects a node or path failure.  The request may take the form of
   PIM tree building, BIER overlay protocol signaling, LSP building, or
   some other means of letting the IR know whether to forward the
   multicast flow.

4.1.  Cold

   In Cold standby mode, the ER selects one IR, for example IR1 in
   Figure 1, as the SIR and signals to it to get the multicast flow.

   When the ER finds that the SIR is down, or that it can no longer
   receive the flow from IR1, the ER signals to IR2 to get the
   multicast flow.

   *  For the IRs: both the SIR and the BIR simply perform the regular
      operation of forwarding the flow according to requests from the
      ER.

   *  For the ER: the ER must select an IR as the SIR and signal to it.
      When the SIR fails or the path between the SIR and the ER fails,
      the ER must signal to the BIR to get the flow.

   *  For the intermediate routers: they know nothing about the roles
      of the IRs and simply forward packets.  There are no duplicate
      packets in the domain.

   When an IR switchover occurs, the ER detects the failure of the SIR
   and signals to the BIR.  Packets are lost during the signaling,
   until the ER receives the flow from the BIR.

4.2.  Warm

   In Warm standby mode, the ER signals to both IR1 and IR2.

   If IR1 is the SIR, IR1 forwards the flow to the ER.  The BIR, for
   example IR2, must not forward the flow to the ER until the SIR is
   down.

   *  For the IRs: each IR should take the role of SIR or BIR.  The BIR
      must not forward the flow to the ER.  When the SIR fails or the
      path between the SIR and the ER fails, the BIR must start
      forwarding the flow to the ER.  However, it is hard for the BIR
      to learn of the failure by itself, so some method is needed to
      deliver a failure notification to the BIR.

   *  For the ER: the ER does not select the SIR or BIR.  It simply
      signals to both of them.

   *  For the intermediate routers: they know nothing about the roles
      of the IRs and simply forward packets.  There are no duplicate
      packets in the domain.

   When an IR switchover occurs, the BIR detects the failure of the SIR
   and takes over as the SIR.  Packets are lost during the IR
   switchover.

   In some deployments, the SIR and BIR may be in charge of different
   multicast flows.  For one multicast flow the SIR may be IR1, and for
   another multicast flow the SIR may be IR2.  In this way the two IRs
   can share the multicast forwarding load.
   Another possible deployment is for the two IRs to be in charge of
   different ERs for one multicast flow.  For example, IR1 sends the
   multicast flow to some of the ERs, and IR2 sends the multicast flow
   to the other ERs.  If IR1 detects that something is wrong between
   IR1 and its ERs, IR1 may notify IR2 to take over responsibility for
   forwarding the multicast flow to the ERs that previously received
   the flow from IR1.

4.3.  Hot

   In Hot standby mode, the ER signals to both IRs.

   Both IRs send the flow to the ER.  The ER must discard the duplicate
   flow from one of the IRs.

   In this mode, the IRs themselves have no SIR or BIR role.  Only the
   ER knows which IR it treats as the SIR.

   *  For the IRs: an IR does not need to know the roles of SIR or BIR.
      It simply forwards the flow according to the request received
      from the ER.

   *  For the ER: the ER signals to both IRs to get the flow, and it
      must discard the duplicate flow from the BIR.  When the SIR fails
      or the path between the SIR and the ER fails, the ER must switch
      its forwarding plane to forward the flow packets that come from
      the BIR.  Note that different ERs may choose different IRs as SIR
      and BIR.

   *  For the intermediate routers: they know nothing about the roles
      of the IRs and simply forward packets.  Duplicate packets are
      forwarded in the domain.

   When an IR switchover occurs, the ER detects the failure of the SIR.
   Because duplicate flow packets already arrive at the ER, the ER
   simply switches to forwarding the flow that comes from the BIR.
   There may be packet loss during the switching.

4.4.  Summary

   Table 1 is a brief comparison of the three modes.  'SIR failover'
   means that the SIR fails or the path between the SIR and the ER
   fails.

   +==============+================+================+=================+
   | Role         | Cold Mode      | Warm Mode      | Hot Mode        |
   +==============+================+================+=================+
   | IR           | Forwards flow  | Takes the role | Need not know   |
   |              | according to   | of SIR or BIR. | the roles of    |
   |              | the request    | BIR must not   | SIR or BIR;     |
   |              | from ER.       | forward flow   | forwards flow   |
   |              |                | to ER until    | according to    |
   |              |                | SIR failover.  | the request     |
   |              |                |                | from ER.        |
   +--------------+----------------+----------------+-----------------+
   | ER           | Must select an | Does not       | Signals to both |
   |              | IR as SIR and  | select the SIR | IRs. Discards   |
   |              | signal the     | or BIR; just   | the duplicate   |
   |              | request to it; | signals to     | flow from BIR   |
   |              | signals to the | both of them.  | until SIR       |
   |              | BIR to request |                | failover.       |
   |              | the flow upon  |                |                 |
   |              | SIR failover.  |                |                 |
   +--------------+----------------+----------------+-----------------+
   | Intermediate | Knows nothing  | Knows nothing  | Knows nothing   |
   | Router       | about SIR or   | about SIR or   | about SIR or    |
   |              | BIR. No        | BIR. No        | BIR. Duplicate  |
   |              | duplicate flow | duplicate flow | flow is         |
   |              | is forwarded.  | is forwarded.  | forwarded.      |
   +--------------+----------------+----------------+-----------------+

                                 Table 1

   The Cold stand-by mode is the easiest to implement, but it has the
   longest convergence time.

   The Hot stand-by mode has the least packet loss, but duplicate
   packets are forwarded in the domain, so more bandwidth is consumed.

   The Warm stand-by mode falls in between for packet loss and
   convergence time, but it is hard for the BIR to learn of a failure
   between the SIR and the ERs.
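
   The trade-off between the three modes can be made concrete with a
   toy timeline.  The sketch below is only an illustration under stated
   assumptions: one multicast packet per tick, a single SIR failure,
   and invented delay values.  None of the names or numbers come from
   this document, and real convergence times depend on the deployed
   protocols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delays:
    """Hypothetical delays, in ticks (one multicast packet per tick)."""
    detect: int = 3    # time to notice the SIR or path failure
    signal: int = 10   # Cold: ER signals the BIR and the tree or LSP is built
    notify: int = 2    # Warm: BIR learns of the failure and starts forwarding
    switch: int = 0    # Hot: ER switches to the copy it already receives


def simulate(mode: str, ticks: int = 100, fail_at: int = 40,
             d: Delays = Delays()) -> dict:
    """Count lost packets and duplicate packets carried in the domain."""
    recovery = {"cold": d.signal, "warm": d.notify, "hot": d.switch}[mode]
    outage_end = fail_at + d.detect + recovery
    lost = 0
    duplicates = 0
    for t in range(ticks):
        if t < fail_at:
            # Before the failure only Hot mode carries a second copy,
            # which the ER discards.
            if mode == "hot":
                duplicates += 1
        elif t < outage_end:
            # Between the failure and the end of recovery, the ER gets nothing.
            lost += 1
    return {"lost": lost, "duplicates": duplicates}


results = {mode: simulate(mode) for mode in ("cold", "warm", "hot")}
```

   With these assumed delays the sketch loses 13 packets in Cold mode,
   5 in Warm mode, and 3 in Hot mode, while only Hot mode carries
   duplicate packets (40 of them) through the domain before the
   failure.  The ordering matches the comparison above, but the
   absolute numbers are artifacts of the chosen delays.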

   No single mode is therefore best for multicast redundant ingress
   router failover in every case.  The network administrator should
   select the most suitable mode for the network deployment.

5.  IANA Considerations

   This document does not have any requests for IANA allocation.

6.  Security Considerations

   This document adds no new security considerations.

7.  References

7.1.  Normative References

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
              Yasukawa, Ed., "Extensions to Resource Reservation
              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
              DOI 10.17487/RFC4875, May 2007.

   [RFC6388]  Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B.
              Thomas, "Label Distribution Protocol Extensions for Point-
              to-Multipoint and Multipoint-to-Multipoint Label Switched
              Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011.

   [RFC7450]  Bumgardner, G., "Automatic Multicast Tunneling", RFC 7450,
              DOI 10.17487/RFC7450, February 2015.

   [RFC7761]  Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
              Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
              Multicast - Sparse Mode (PIM-SM): Protocol Specification
              (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March
              2016.

   [RFC8279]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Przygienda, T., and S. Aldrin, "Multicast Using Bit Index
              Explicit Replication (BIER)", RFC 8279,
              DOI 10.17487/RFC8279, November 2017.

7.2.  Informative References

   [I-D.ietf-bier-bfd]
              Xiong, Q., Mirsky, G., Hu, F., and C. Liu, "BIER BFD",
              Work in Progress, Internet-Draft, draft-ietf-bier-bfd-01,
              8 April 2021.

   [I-D.ietf-bier-ping]
              Kumar, N., Pignataro, C., Akiya, N., Zheng, L., Chen, M.,
              and G. Mirsky, "BIER Ping and Trace", Work in Progress,
              Internet-Draft, draft-ietf-bier-ping-07, 11 May 2020.

   [I-D.ietf-pim-sr-p2mp-policy]
              Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H.,
              and Z. Zhang, "Segment Routing Point-to-Multipoint
              Policy", Work in Progress, Internet-Draft, draft-ietf-pim-
              sr-p2mp-policy-04, 7 March 2022.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC5884]  Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow,
              "Bidirectional Forwarding Detection (BFD) for MPLS Label
              Switched Paths (LSPs)", RFC 5884, DOI 10.17487/RFC5884,
              June 2010.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017.

   [RFC8562]  Katz, D., Ward, D., Pallagatti, S., Ed., and G. Mirsky,
              Ed., "Bidirectional Forwarding Detection (BFD) for
              Multipoint Networks", RFC 8562, DOI 10.17487/RFC8562,
              April 2019.

   [RFC9026]  Morin, T., Ed., Kebler, R., Ed., and G. Mirsky, Ed.,
              "Multicast VPN Fast Upstream Failover", RFC 9026,
              DOI 10.17487/RFC9026, April 2021.

Authors' Addresses

   Greg Shepherd
   Cisco Systems, Inc.
   170 W. Tasman Dr.
   San Jose,
   United States of America
   Email: gjshep@gmail.com

   Zheng Zhang (editor)
   ZTE Corporation
   Nanjing
   China
   Email: zhang.zheng@zte.com.cn

   Yisong Liu
   China Mobile
   Beijing
   Email: liuyisong@chinamobile.com

   Ying Cheng
   China Unicom
   Beijing
   China
   Email: chengying10@chinaunicom.cn