LSR Working Group                                                A. Wang
Internet-Draft                                             China Telecom
Intended status: Standards Track                               G. Mishra
Expires: April 18, 2022                                     Verizon Inc.
                                                                   Z. Hu
                                                                 Y. Xiao
                                                     Huawei Technologies
                                                        October 15, 2021


                    Prefix Unreachable Announcement
            draft-wang-lsr-prefix-unreachable-annoucement-08

Abstract

   This document describes a mechanism that solves an existing issue
   with Longest Prefix Match (LPM) in operator domains that are divided
   into multiple areas or levels and use summarization.  It addresses a
   fail-over issue in multi-area or multi-level domains: when a link or
   node goes down, a component prefix covered by the summary is omitted
   from the FIB while the summary is still advertised, which results in
   black-hole routing and loss of connectivity.  This draft introduces
   a new control plane convergence signaling mechanism based on a
   negative prefix, called the Prefix Unreachable Announcement
   Mechanism (PUAM).  PUAM is used to detect a link or node down event
   and to signal the RIB that the event has occurred, forcing immediate
   control plane convergence.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 18, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
   3.  Scenario Description
     3.1.  Inter-Area Node Failure Scenario
     3.2.  Inter-Area Links Failure Scenario
   4.  PUA (Prefix Unreachable Advertisement) Procedures
   5.  MPLS and SRv6 LPM based BGP Next-hop Failure Application
   6.  PUAM Capabilities Announcement
   7.  Implementation Consideration
   8.  Deployment Considerations
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgement
   12. Normative References
   Authors' Addresses

1.  Introduction

   A critical requirement in operator network design is to limit the
   Shortest Path First (SPF) churn that occurs within a single OSPF
   area or ISIS level.  This is accomplished by subdividing the IGP
   domain into multiple areas, so that intra-area prefixes are
   contained within each discrete area and domain-wide flooding is
   avoided.

   OSPF and ISIS have default and summary route mechanisms, which are
   performed on the OSPF area border router (ABR) or the ISIS L1-L2
   node.
   The OSPF summary route is advertised conditionally, when at least
   one component prefix exists within the non-zero area.  An ISIS L1-L2
   node likewise generates a summary prefix into the Level-2 backbone
   for Level-1 area prefixes, which is advertised conditionally when at
   least one component prefix exists within the Level-1 area.  An ISIS
   L1-L2 node with the attach bit set also generates a default route
   into each Level-1 area, along with summary prefixes generated for
   other Level-1 areas.

   Operators have historically relied on the MPLS architecture, which
   is based on exact-match host route FEC binding within a single area.
   The LDP inter-area extension [RFC5283] adds the ability to use LPM,
   so that the RIB match can be a summary match rather than an exact
   match of a host route of the egress PE when an inter-area LSP is
   instantiated.  The SRv6 routing framework utilizes the IPv6 data
   plane and standard IGP LPM.  When operators migrate from MPLS LSPs
   bootstrapped by host route FEC binding to the SRv6 routing
   framework, IGP LPM comes into play together with summarization.
   This influences the forwarding of traffic when a link or node event
   occurs for a component prefix within the summary range, and results
   in black-hole routing of traffic.

   The motivation behind this draft is a deployment that uses either
   MPLS LPM FEC binding or an SRv6 BGP service overlay over a
   traditional unicast routing (uRIB) LPM forwarding plane, where the
   IGP domain has been carved up into OSPF or ISIS areas and
   summarization is utilized.  In this scenario, failure conditions
   result in a black hole of traffic when multiple ABRs exist and
   either the area is partitioned or other link or node failures occur,
   so that the component prefix host route is missing within the
   summary range.
   Summaries of inter-area routes that are propagated into the backbone
   area for flood reduction are made up of component prefixes.  It is
   these component prefixes that the PUAM tracks, to ensure that
   traffic is not black-holed due to a PE or ABR failure.  The PUA
   mechanism ensures immediate control plane convergence, with
   switchover to another ABR or PE node, when an area is partitioned or
   an ABR has services down, and so avoids black-holing of traffic.

2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Scenario Description

   Figure 1 illustrates the topology when OSPF or ISIS is running in a
   multi-area or multi-level domain.  R0-R4 are routers in the backbone
   area; S1-S4 and T1-T4 are internal routers in area 1 and area 2
   respectively.  R1 and R3 are area border routers or ISIS Level 1-2
   border nodes between area 0 and area 1.  R2 and R4 are area border
   routers between area 0 and area 2.

   S1/S4 and T2/T4 are PEs that peer with customer CEs for overlay
   VPNs.  Ps1/Ps4 are the loopback0 addresses of S1/S4, and Pt2/Pt4 are
   the loopback0 addresses of T2/T4.

    +---------------------+------+--------+-----+--------------+
    | +--+        +--+   ++-+   ++-+    +-++   ++-+        +--+|
    | |S1+--------+S2+---+R1+---|R0+----+R2+---+T1+--------+T2||
    | +-++Ps1     +-++   ++-+   +--+    +-++   ++++    Pt2 +-++|
    |   |           |     |               |     ||           | |
    |   |           |     |               |     ||           | |
    | +-++Ps4     +-++   ++-+            +-++   ++++     Pt4+-++|
    | |S4+--------+S3+---+R3+-----------+R4+---+T3+--------+T4||
    | +--+        +--+   ++-+            +-++   ++-+        +--+|
    |                     |               |                    |
    |                     |               |                    |
    |       Area 1        |    Area 0     |       Area 2       |
    +---------------------+---------------+--------------------+

    Figure 1: OSPF Inter-Area Prefix Unreachable Announcement Scenario

3.1.  Inter-Area Node Failure Scenario

   If the area border routers R2/R4 perform summarization, then one
   summary address that covers the prefixes of area 2 will be announced
   to area 0 and area 1, instead of the detailed addresses.  When node
   T2 is down, the BGP next hop Pt2 becomes unreachable while the LPM
   summary prefix continues to be advertised into the backbone area.
   Except for the border routers R2/R4, the routers within area 0 and
   area 1 do not know that the BGP next-hop prefix Pt2 is unreachable.
   Traffic destined to Pt2 will continue to be forwarded on the LPM
   match of the summary and will be dropped on the ABR or Level 1-2
   border node, resulting in black-hole routing and connectivity loss.
   For a customer overlay VPN that is dual-homed to both S1/S4 and
   T2/T4, traffic will not be able to fail over to the alternate egress
   PE T4 (BGP next hop Pt4) because of the summarization.

3.2.  Inter-Area Links Failure Scenario

   In a link failure scenario, if the links T1-T2 and T1-T3 are both
   down, R2 will not be able to reach node T2.  However, because R2 and
   R4 both announce the summary, and the summary address covers the BGP
   next-hop prefix Pt2, other nodes in area 0 and area 1 will still
   send traffic for Pt2 via the border router R2, which black-holes the
   traffic.

   In such a situation, the border router R2 should notify other
   routers that it cannot reach the prefix Pt2, and the other ABRs that
   can reach Pt2 (here R4) should advertise a specific route to Pt2.
   The internal routers will then select R4 as the bypass router to
   reach prefix Pt2.

4.  PUA (Prefix Unreachable Advertisement) Procedures

   [RFC7794] and [I-D.ietf-lsr-ospf-prefix-originator] each define a
   sub-TLV that announces the originator of a prefix advertised from a
   specific node.  This draft uses that sub-TLV, for both OSPF and
   ISIS, to signal the negative prefix in the respective PUAM when a
   link or node goes down.
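   To make the effect of a negative prefix concrete, the following is a
   minimal sketch of the Section 3.1 scenario.  It is not part of the
   specification.  The addresses (10.2.0.0/16 as the area 2 summary,
   10.2.0.2 and 10.2.0.4 standing in for Pt2 and Pt4), the function
   names, and the rule that a negative prefix overrides an LPM match on
   the summary are all assumptions made for illustration.

```python
import ipaddress


def longest_prefix_match(rib, destination):
    """Return the most specific prefix in rib that covers destination, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [prefix for prefix in rib if dest in prefix]
    return max(candidates, key=lambda prefix: prefix.prefixlen, default=None)


def next_hop_reachable(rib, unreachable, destination):
    """Assumed receiver behavior: a BGP next hop is usable only if an LPM
    route covers it and no negative prefix (PUAM) covers it."""
    dest = ipaddress.ip_address(destination)
    if any(dest in prefix for prefix in unreachable):
        return False
    return longest_prefix_match(rib, destination) is not None


# Hypothetical addressing: the ABRs advertise only the area 2 summary.
rib = [ipaddress.ip_network("10.2.0.0/16")]
pt2 = "10.2.0.2"  # stands in for the loopback of T2
pt4 = "10.2.0.4"  # stands in for the loopback of T4

# Without a PUAM, the summary still matches Pt2 after T2 fails (black hole).
without_puam = next_hop_reachable(rib, [], pt2)

# With a PUAM for Pt2, only Pt2 is treated as unreachable; Pt4 is unaffected.
puam = [ipaddress.ip_network("10.2.0.2/32")]
pt2_with_puam = next_hop_reachable(rib, puam, pt2)
pt4_with_puam = next_hop_reachable(rib, puam, pt4)
```

   In this sketch the summary alone keeps Pt2 looking reachable, which
   is the black-hole case; once the negative prefix is known, a BGP
   speaker could discard Pt2 as a next hop and fall back to Pt4.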

   The ABR detects the link or node down event and floods the PUAM
   negative prefix advertisement along with the summary advertisement,
   according to the prefix-originator specification.  The ABR or ISIS
   L1-L2 border node is responsible for adding the prefix originator
   information when it receives the Router LSA from other routers in
   the same area or level.

   When the ABR or ISIS L1-L2 border node generates the summary
   advertisement based on component prefixes, it announces one new
   summary LSA or LSP that carries the down prefix, with the prefix
   originator set to NULL.  The number of PUAMs equals the number of
   links or nodes that are down.  The LSA or LSP is propagated with
   standard flooding procedures.

   If a node in the area receives the PUAM flood from all of its ABRs,
   it starts the BGP convergence process if a BGP session exists on the
   PUAM prefix.  The PUAM creates a forced fail-over action that
   initiates an immediate control plane convergence and switchover to
   the alternate egress PE.  Without the convergence forced by the
   PUAM, the down prefix would yield black-hole routing and loss of
   connectivity.

   When only some of the ABRs cannot reach the failed node or link, as
   described in Section 3.2, an ABR that can still reach the PUAM
   prefix should advertise a specific route to it.  The internal
   routers within another area can then reach the PUAM prefix by
   bypassing the ABRs that cannot reach it.

5.  MPLS and SRv6 LPM based BGP Next-hop Failure Application

   In an MPLS or SR-MPLS service provider core, scalability has been a
   concern for operators, who have split up the IGP domain into
   multiple areas to avoid SPF churn.  Normally, MPLS FEC binding for
   LSP instantiation is based on an exact match of the egress PE's
   Loopback0 host route.
   The LDP inter-area extension [RFC5283] adds the ability to use LPM,
   so that the RIB match can be a summary match rather than an exact
   match of the host route of the egress PE when an inter-area LSP is
   instantiated.  The caveat that has prevented operators from using
   the [RFC5283] LDP inter-area extension is that the component
   prefixes are hidden in the summary prefix, and thus the visibility
   of the BGP next-hop attribute is lost.

   When a PE is down and the [RFC5283] LPM summary is used to build the
   inter-area LSP, the LSP remains partially established and traffic is
   black-holed on the ABR performing the summarization.  This major gap
   in the [RFC5283] inter-area extension forces operators into the
   workaround of flooding the BGP next hops domain-wide.  In a small
   network this is fine.  However, with thousands of PEs and many
   areas, domain-wide flooding can be painful for operators in terms of
   memory consumption and the computational requirements for RIB / FIB
   / LFIB label binding control plane state.  The ramifications of
   domain-wide flooding of host routes are described in detail in
   Section 1.2 (Scalability) of [RFC5302], "Domain-Wide Prefix
   Distribution with Two-Level IS-IS".  As SRv6 utilizes LPM, the same
   problem exists with SRv6 when the IGP domain is broken up into areas
   and summarization is utilized.

   With PUAM, the negative component prefix is flooded across the
   backbone to the other areas along with the summary prefix, and is
   immediately programmed into the RIB.  An MPLS LSP (exact match) or
   an SRv6 path (LPM match) can then be established over the fail-over
   path to the alternate egress PE.  With PUAM, no disruption in
   traffic or loss of connectivity results.  Further optimizations such
   as LFA and BFD can be applied to make the data plane convergence
   hitless.
   The PUAM solution applies to MPLS or SR-MPLS where the LDP
   inter-area extension is utilized for LPM aggregate FECs, as well as
   to SRv6, where the IPv6 control plane uses an LPM match on a summary
   that covers the BGP next hop.

6.  PUAM Capabilities Announcement

   When not all of the nodes in an area support the PUAM information,
   traffic loops may form.  To avoid this, the ABR should not send PUAM
   information into an area until it has ensured that all of the nodes
   in that area can parse it.  To accomplish this, this draft defines
   the following capability indications:

   For OSPFv2, one bit (bit number TBD, suggested bit 6, 0x20) should
   be carried in the "OSPF Router-LSA Option", as described in
   [RFC2328].  For OSPFv3, one bit (bit number TBD, suggested bit 8)
   should be defined to indicate the router's capability to support
   PUAM as described in this draft; this bit should be carried in the
   "OSPF Router Informational Capabilities" TLV, which is described in
   [RFC7770].  For ISIS, one new sub-TLV (type TBD, suggested 29), the
   PUAM Capabilities sub-TLV, is carried in the "IS-IS Router
   CAPABILITY TLV" [RFC7981] and is defined as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |            Flags              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type: TBD, suggested value 29, to be assigned by IANA
   Length: 2
   Flags: 2 octets

   The following flags are defined:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |P|                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   where:

   P-flag: If set, the router supports PUA information.

               Figure 2: PUA Capabilities sub-TLV format

7.  Implementation Consideration

   To balance the announcement of reachable and unreachable
   information, an implementation of this mechanism should provide a
   configurable MAX_Address_Announcement (MAA) threshold.  The ABR
   should then decide what to announce as follows:

   1.  If the number of unreachable prefixes is less than MAA, the ABR
       should advertise the summary address and the PUAM.

   2.  If the number of reachable addresses is less than MAA, the ABR
       should advertise only the detailed reachable addresses.

   3.  If the number of reachable prefixes and the number of
       unreachable prefixes both exceed MAA, the ABR should advertise
       the summary address with MAX metric.

8.  Deployment Considerations

   To support PUAM advertisement, the ABRs should be upgraded according
   to the procedures described in Section 4.  The PEs that need to
   perform the BGP switchover described in Section 3.1 and Section 5
   should also be upgraded to act upon receipt of the PUAM message.
   Other nodes within the network can ignore the PUAM message if they
   do not need or do not support it.

   As described in Section 4, the ABR advertises the PUAM message once
   it detects a link or node down event within the summary address.  To
   reduce unnecessary PUAM advertisements, the ABRs should support the
   configuration of protected prefixes.  Based on this configuration,
   the ABR advertises the PUAM message only when a protected prefix
   within the summary address (for example, the loopback address of a
   PE that runs BGP) is missing.

   The advertisement of a PUAM message should last only for a
   configurable period, long enough for the services that run on the
   failed prefixes to converge or switch over.  If a prefix is already
   missing before the PUAM takes effect, the ABR will not declare its
   absence via the PUAM.

9.  Security Considerations

   Advertisement of PUAM information follows the same procedures as a
   traditional LSA.  The actions based on the PUAM are defined in this
   document for the ABR or Level 1-2 router and for receivers that run
   BGP.

   There are no changes to the forwarding behavior of other internal
   routers.

10.  IANA Considerations

   IANA is requested to register the following in the "OSPF Router
   Properties Registry" and the "OSPF Router Informational Capability
   Bits Registry" respectively.

   +------------+------------------+-------------+
   | Bit Number | Capability Name  | Reference   |
   +============+==================+=============+
   | TBD(0x20)  | OSPF PUA Support |this document|
   +------------+------------------+-------------+

       Table 1: P-Bit in OSPF Router-LSA Option

   +------------+------------------+-------------+
   | Bit Number | Capability Name  | Reference   |
   +============+==================+=============+
   | TBD(bit 8) | OSPF PUA Support |this document|
   +------------+------------------+-------------+

    Table 2: OSPF Router PUA Capability Support Bit

   IANA is requested to register the following in "Sub-TLVs for TLV 242
   (IS-IS Router CAPABILITY TLV)":

   Type: 29 (Suggested - to be assigned by IANA)

   Description: PUA Support Capabilities

11.  Acknowledgement

   Thanks to Peter Psenak, Les Ginsberg, Acee Lindem, Shraddha Hegde,
   Robert Raszuk, Tony Li, Jeff Tantsura, Tony Przygienda and Bruno
   Decraene for their suggestions and comments on this draft.

12.  Normative References

   [I-D.ietf-lsr-ospf-prefix-originator]
              Wang, A., Lindem, A., Dong, J., Psenak, P., and K.
              Talaulikar, "OSPF Prefix Originator Extensions", draft-
              ietf-lsr-ospf-prefix-originator-12 (work in progress),
              April 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

   [RFC5283]  Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension
              for Inter-Area Label Switched Paths (LSPs)", RFC 5283,
              DOI 10.17487/RFC5283, July 2008.

   [RFC5302]  Li, T., Smit, H., and T. Przygienda, "Domain-Wide Prefix
              Distribution with Two-Level IS-IS", RFC 5302,
              DOI 10.17487/RFC5302, October 2008.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008.

   [RFC5709]  Bhatia, M., Manral, V., Fanto, M., White, R., Barnes, M.,
              Li, T., and R. Atkinson, "OSPFv2 HMAC-SHA Cryptographic
              Authentication", RFC 5709, DOI 10.17487/RFC5709, October
              2009.

   [RFC7770]  Lindem, A., Ed., Shen, N., Vasseur, JP., Aggarwal, R., and
              S. Shaffer, "Extensions to OSPF for Advertising Optional
              Router Capabilities", RFC 7770, DOI 10.17487/RFC7770,
              February 2016.

   [RFC7794]  Ginsberg, L., Ed., Decraene, B., Previdi, S., Xu, X., and
              U. Chunduri, "IS-IS Prefix Attributes for Extended IPv4
              and IPv6 Reachability", RFC 7794, DOI 10.17487/RFC7794,
              March 2016.

   [RFC7981]  Ginsberg, L., Previdi, S., and M. Chen, "IS-IS Extensions
              for Advertising Router Information", RFC 7981,
              DOI 10.17487/RFC7981, October 2016.

Authors' Addresses

   Aijun Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing  102209
   China

   Email: wangaj3@chinatelecom.cn


   Gyan Mishra
   Verizon Inc.

   Email: gyan.s.mishra@verizon.com


   Zhibo Hu
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: huzhibo@huawei.com


   Yaqun Xiao
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing  100095
   China

   Email: xiaoyaqun@huawei.com
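
   As an illustration of the PUAM Capabilities sub-TLV described in
   Section 6 (Figure 2), the following sketch encodes and decodes the
   four octets of the sub-TLV.  It is not part of the specification.
   It assumes a one-octet Type and a one-octet Length, as is usual for
   IS-IS sub-TLVs, and it assumes that bit 0 of the Flags field (the
   P-flag) is the most significant bit in network byte order.  The type
   value 29 is only the value suggested in this draft, and the function
   and constant names are hypothetical.

```python
import struct

# Suggested value from the draft; the real code point is to be assigned by IANA.
PUAM_CAP_SUBTLV_TYPE = 29
# Assumption: bit 0 in the diagram is the most significant bit of the 2-octet Flags.
P_FLAG = 0x8000
FLAGS_LENGTH = 2


def encode_puam_capability(supports_pua):
    """Build the sub-TLV: Type (1 octet), Length (1 octet), Flags (2 octets)."""
    flags = P_FLAG if supports_pua else 0
    return struct.pack("!BBH", PUAM_CAP_SUBTLV_TYPE, FLAGS_LENGTH, flags)


def decode_puam_capability(data):
    """Return True if the P-flag is set in a PUAM Capabilities sub-TLV."""
    sub_type, length, flags = struct.unpack("!BBH", data[:4])
    if sub_type != PUAM_CAP_SUBTLV_TYPE or length != FLAGS_LENGTH:
        raise ValueError("not a PUAM Capabilities sub-TLV")
    return bool(flags & P_FLAG)


encoded = encode_puam_capability(True)
```

   Under these assumptions an ABR could check the decoded P-flag of
   every node in an area before sending PUAM information into it, as
   Section 6 requires.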