Network Working Group                                            J. Dong
Internet-Draft                                                  X. Zhang
Intended status: Informational                       Huawei Technologies
Expires: May 4, 2017                                               Z. Li
                                                            China Mobile
                                                        October 31, 2016


                  OSPF LSA Flushing Problem Statement
           draft-dong-ospf-maxage-flush-problem-statement-01

Abstract

   In the OSPF protocol, Link State Advertisements (LSAs) are exchanged
   in Link State Update (LSU) packets to achieve link state database
   (LSDB) synchronization and consistent route calculation.
   The OSPF protocol specifies several scenarios in which an LSA is
   flushed with the LS age field set to MaxAge. In some cases, the
   flushing of MaxAge LSAs may cause a flooding storm of OSPF packets
   and severely impact the services provided by the network.

   This document describes the problem of OSPF LSA flushing and asks
   for solutions to this problem.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Typical Scenarios of LSA Flushing
   3.  Consequence of LSA Flushing
   4.  Requirements on Potential Solutions
     4.1.  Solution for Problem Localization
     4.2.  Solution for Impact Mitigation
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   In the OSPF protocol [RFC2328], Link State Advertisements (LSAs) are
   exchanged in Link State Update (LSU) packets to achieve link state
   database (LSDB) synchronization and consistent route calculation.
   OSPF specifies several scenarios in which an LSA is flushed with the
   LS age field set to MaxAge. In some cases, the flushing of MaxAge
   LSAs may cause a flooding storm of OSPF packets and severely impact
   the services in the network. Since a MaxAge LSA may be flushed by
   any OSPF router, troubleshooting usually takes a long time, during
   which the flushing can cause serious damage to both the network
   provider and its customers.

2.  Typical Scenarios of LSA Flushing

   [RFC2328] specifies several scenarios in which an LSA should be
   flushed with the LS age field set to MaxAge.
   Under normal circumstances, LSA flushing happens when the LS age of
   an LSA naturally reaches MaxAge; this flushing can be done by any
   OSPF router. Since an OSPF router generates a new instance of a
   self-originated LSA when its LS age reaches LSRefreshTime, which is
   half the value of MaxAge, natural aging to MaxAge only happens when
   the originator of the LSA is not reachable in the network and cannot
   refresh the LSA.

   Another case of LSA flushing is "premature aging", in which a router
   sets the LS age of a self-originated LSA to MaxAge and then floods
   the LSA. Premature aging is used when the self-originated LSA's
   sequence number field is about to wrap, or when all the external
   routes previously advertised by the LSA are no longer reachable.
   Premature aging and flushing of LSAs can also happen when a router
   changes from Designated Router (DR) to non-DR or, in some rare
   cases, when the router's Router ID is changed.

   Field experience has shown several circumstances in which MaxAge LSA
   flushing is generated by a misbehaving router in the network. For
   example, the LS age may be corrupted so that it reaches MaxAge much
   earlier than normally expected. This is difficult to detect with
   the existing OSPF checksum mechanism, as the LS age field is excluded
   from the LSA checksum calculation. In addition, OSPF cryptographic
   authentication cannot detect corruption of the LS age field if the
   corruption happens before the LSA is assembled into an LSU packet.

3.  Consequence of LSA Flushing

   MaxAge LSA flushing is important for fast convergence and for the
   consistency of the link state database (LSDB) across all OSPF
   routers. However, as several incidents in production networks have
   shown, improper LSA flushing can have a severe impact on the network
   and the services it provides. This section evaluates the impact of
   MaxAge LSA flushing.
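
   Part of what makes improper flushing hard to catch is the
   observation in Section 2 that the LS age field is excluded from the
   LSA checksum. The following Python sketch illustrates this under
   stated assumptions: it uses the Fletcher checksum that [RFC2328]
   specifies for LSAs, computed over the LSA with the first two bytes
   (LS age) skipped, and a hand-built Router-LSA with no links. The
   helper names (fletcher_sums, set_checksum, checksum_ok, demo) and
   the sample Router ID are hypothetical and are not taken from any
   implementation.

```python
import struct

MAX_AGE = 3600        # MaxAge in seconds (one hour, per RFC 2328)
AGE_LEN = 2           # LS age: first two bytes of the LSA, not checksummed
CHECKSUM_OFFSET = 16  # byte offset of the LS checksum field in the LSA


def fletcher_sums(data):
    """Return the two running Fletcher sums (mod 255) over data."""
    c0 = c1 = 0
    for b in data:
        c0 = (c0 + b) % 255
        c1 = (c1 + c0) % 255
    return c0, c1


def set_checksum(lsa):
    """Fill in the LS checksum of a mutable LSA, skipping the LS age."""
    lsa[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    body = bytes(lsa[AGE_LEN:])
    c0, c1 = fletcher_sums(body)
    pos = CHECKSUM_OFFSET - AGE_LEN  # checksum position inside body
    x = ((len(body) - pos - 1) * c0 - c1) % 255
    if x == 0:
        x = 255
    y = 510 - c0 - x
    if y > 255:
        y -= 255
    lsa[CHECKSUM_OFFSET] = x
    lsa[CHECKSUM_OFFSET + 1] = y


def checksum_ok(lsa):
    """A valid LSA yields zero for both sums over everything after LS age."""
    return fletcher_sums(bytes(lsa[AGE_LEN:])) == (0, 0)


def demo():
    router_id = bytes([192, 0, 2, 1])  # illustrative address only
    # LSA header: age, options, type, LS ID, adv. router, seq, cksum, length
    header = struct.pack("!HBB4s4sIHH", 10, 0x02, 1,
                         router_id, router_id, 0x80000001, 0, 24)
    lsa = bytearray(header + bytes(4))  # Router-LSA body with zero links
    set_checksum(lsa)
    fresh_ok = checksum_ok(lsa)

    aged = bytearray(lsa)
    struct.pack_into("!H", aged, 0, MAX_AGE)  # corrupt only the LS age
    aged_ok = checksum_ok(aged)

    damaged = bytearray(lsa)
    damaged[4] ^= 0x01  # flip one bit in the LS ID instead
    damaged_ok = checksum_ok(damaged)
    return fresh_ok, aged_ok, damaged_ok


if __name__ == "__main__":
    print(demo())
```

   In this sketch, overwriting the LS age with MaxAge leaves the
   checksum valid, while a single-bit change in the LS ID is rejected.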

   According to Section 14 of [RFC2328], a MaxAge LSA can be flushed by
   any router, whether or not the LSA is self-originated. Depending on
   the flooding scope of the LSA, the MaxAge LSA is flooded either
   throughout the routing domain or within a specific area. On every
   router that receives the MaxAge LSA, the old LSA instance is
   replaced, which in turn triggers route calculation and installation.
   When the MaxAge LSA is received by the originating router of the
   LSA, the originating router sets the LSA's LS sequence number to one
   past the received LS sequence number and originates a new instance
   of the LSA. If the LSA flushing is due to a systematic problem and
   cannot recover automatically, this flooding and processing continues
   indefinitely, which severely impacts network reachability and
   stability. Since OSPF is the fundamental protocol on which other
   protocols such as BGP and LDP, and the services provided by the
   network, are built, this causes serious damage to both the network
   provider and its customers.

   As a MaxAge LSA may be flushed by any OSPF router, it usually takes
   a long time to locate the misbehaving router in the network, and
   during this time the LSA flushing can cause serious damage to both
   the network provider and its customers.

4.  Requirements on Potential Solutions

   Considering the importance of the OSPF protocol to networks and the
   services they carry, and the potentially severe impact of MaxAge LSA
   flooding, this document calls for solutions that protect against or
   mitigate the impact of improper MaxAge LSA flushing.

   The potential solutions can be classified into two categories. The
   requirements for each are given in the following sections.

4.1.  Solution for Problem Localization

   Since OSPF allows the flushing of non-self-originated LSAs, a
   mechanism is needed to identify the misbehaving router quickly for
   troubleshooting and problem localization. If the improper MaxAge
   LSA flushing is caused by a systematic problem, operators need to
   locate the misbehaving router and shut it down to stop the flooding
   storm.

   [RFC6232] defines the Purge Originator Identification (POI) TLV,
   which is added to IS-IS purge LSPs to identify the originator of
   IS-IS purges. Although a similar TLV may be added to the OSPF
   extended LSAs defined in [RFC7684] and
   [I-D.ietf-ospf-ospfv3-lsa-extend], the legacy OSPF LSAs defined in
   [RFC2328] are not TLV-based, so such a mechanism does not apply to
   them. A problem localization solution that is backward compatible
   and applicable to all OSPF LSAs would be preferred.

4.2.  Solution for Impact Mitigation

   Since the flooding storm caused by improper LSA flushing can have a
   severe impact on network stability and the services provided by the
   network, it is important to alleviate this impact even before the
   root cause or the misbehaving router can be identified. In
   addition, some problem localization mechanisms may rely on the
   availability of the network, so an impact mitigation mechanism is
   necessary to ensure that they still work when a severe flooding
   storm caused by LSA flushing occurs.

   It is important that the impact mitigation solution is backward
   compatible and supports incremental deployment. Preferably, the
   mitigation solution should not delay the route convergence triggered
   by normal LSA flushing.

5.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

6.  Security Considerations

   This document describes the problem of MaxAge LSA flushing, which in
   some cases is due to the lack of integrity protection of the LS age
   field. The LS age field may be altered as a result of a software or
   hardware problem, and such a modification can be detected neither by
   the LSA checksum nor by OSPF packet cryptographic authentication.
   LSA flushing can have a severe impact on network stability and the
   services provided by the network. This may be considered a security
   vulnerability.

7.  Acknowledgements

   The authors would like to thank Bruno Decraene, Acee Lindem and Les
   Ginsberg for the discussion on this topic.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

8.2.  Informative References

   [I-D.ietf-ospf-ospfv3-lsa-extend]
              Lindem, A., Mirtorabi, S., Roy, A., and F. Baker, "OSPFv3
              LSA Extendibility", draft-ietf-ospf-ospfv3-lsa-extend-13
              (work in progress), October 2016.

   [RFC6232]  Wei, F., Qin, Y., Li, Z., Li, T., and J. Dong, "Purge
              Originator Identification TLV for IS-IS", RFC 6232,
              DOI 10.17487/RFC6232, May 2011.

   [RFC7684]  Psenak, P., Gredler, H., Shakir, R., Henderickx, W.,
              Tantsura, J., and A. Lindem, "OSPFv2 Prefix/Link Attribute
              Advertisement", RFC 7684, DOI 10.17487/RFC7684, November
              2015.

Authors' Addresses

   Jie Dong
   Huawei Technologies
   Huawei Campus, No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: jie.dong@huawei.com

   Xudong Zhang
   Huawei Technologies
   Huawei Campus, No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: zhangxudong@huawei.com

   Zhenqiang Li
   China Mobile
   No.32 Xuanwumenxi Ave., Xicheng District
   Beijing 100032
   China

   Email: li_zhenqiang@hotmail.com